Skylogix Transportation is a logistics and delivery company managing a fleet of over 1200 delivery trucks. The company aims to focus on ensuring timely deliveries, minimizing risks, and optimizing routes in diverse African markets.
This project implements an end-to-end data pipeline for ingesting, staging, replicating, and modeling weather API data for analytics and dashboarding use cases. The pipeline uses MongoDB as a staging database, a Python ingestion service for polling external weather APIs, Airbyte (Docker) for data replication, and DuckDB as the analytical warehouse.
The ingestion service:
- Periodically polls one or more weather APIs
- Normalizes API responses
- Performs upserts into MongoDB to avoid duplicates
- Tracks record freshness using an updatedAt timestamp
Key features
- Idempotent
- Handles updates to existing weather observations
- Designed to be schedulable (cron / Airflow-ready)
MongoDB acts as a raw but structured staging layer, optimized for:
- Flexible schema evolution from APIs
- Fast upserts
- Reliable CDC-style replication via Airbyte
Document Structure:
- Orchestrate ingestion and syncs with Airflow
- Add data quality checks (freshness, nulls, anomalies)
- Partition analytical tables by date
- Extend to multiple weather API providers
- Publish dashboards using Metabase or Superset

