Projects with this topic
Automated Urban Heat Island (UHI) data pipeline using ERA5 climate data and OpenStreetMap, orchestrated with Apache Airflow, processed with GeoPandas/xarray, stored in PostGIS, and visualized with Folium.
End-to-end IoT data platform on Kubernetes (kind) using FastAPI, MongoDB, and a background processor with a telemetry simulator. Includes Terraform IaC example.
End-to-end AWS data lake pipeline for fleet telemetry data using S3, Spark, and Athena. Includes partitioned Parquet ETL, vehicle safety analytics, and SQL queries for overspeed and harsh braking detection.
Real-time connected vehicle telemetry pipeline using Kafka, MongoDB, FastAPI, and Grafana with anomaly detection and fleet monitoring dashboards.
Multi-source geospatial ETL pipeline integrating bike lanes, traffic sensors, and GTFS data to analyze urban mobility infrastructure coverage in Berlin.
An automated data pipeline for migrating and synchronizing patient records from HOSxP (MySQL) to the Buddy Care platform, featuring SQL optimization and data integrity validation for healthcare services.
End-to-end solution for data migration and analysis using Python, FastAPI, Kafka, and PostgreSQL. Implements an asynchronous data pipeline and a RESTful API for analytics, all fully containerized with Docker Compose for easy, reproducible deployment.
End-to-end design of a Hadoop-based ecosystem for healthcare data at scale (50 TB, IoT streams, medical imaging). Proposed a 10-node cluster architecture integrating HDFS, Spark, Hive, NiFi, Kafka, and Docker with HIPAA-compliant security (Kerberos, TLS, Apache Ranger). Delivered a proof-of-concept Docker deployment and professional proposal document.
A collection of Airflow operators, hooks, and utilities to elevate dbt to a first-class citizen of Airflow.
See the documentation at: https://airflow-dbt-python.readthedocs.io/