Projects with this topic
Automated Urban Heat Island (UHI) data pipeline using ERA5 climate data and OpenStreetMap, orchestrated with Apache Airflow, processed with GeoPandas/xarray, stored in PostGIS, and visualized with Folium.
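The core UHI metric such a pipeline computes can be sketched as follows. This is a minimal stdlib-only illustration, not the project's code: the function name, the flat-grid representation, and the toy values are all assumptions; in the real pipeline the temperatures would come from ERA5 via xarray and the urban mask from OpenStreetMap land use.

```python
from statistics import mean

def uhi_intensity(temps_k, is_urban):
    """Urban Heat Island intensity for one timestep: mean urban 2 m
    temperature minus mean rural temperature, over a flattened grid.

    temps_k  -- flat list of 2 m temperatures in kelvin (ERA5-style)
    is_urban -- parallel list of booleans (e.g. from OSM land use)
    """
    urban = [t for t, u in zip(temps_k, is_urban) if u]
    rural = [t for t, u in zip(temps_k, is_urban) if not u]
    return mean(urban) - mean(rural)

# Toy 2x2 grid: two urban cells slightly warmer than two rural cells.
temps = [301.5, 301.0, 299.0, 298.5]
urban_mask = [True, True, False, False]
print(round(uhi_intensity(temps, urban_mask), 2))  # 2.5
```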
Real-time stock market data lakehouse using Kafka, Bronze/Silver/Gold architecture, and Postgres feature serving.
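The Bronze/Silver/Gold (medallion) layering can be sketched with plain Python. This is an illustrative sketch under assumed record shapes, not the project's implementation: Bronze holds raw ticks as received from Kafka, Silver drops malformed and duplicate records, and Gold aggregates per-symbol features for serving.

```python
from collections import defaultdict

def bronze_to_silver(raw_ticks):
    """Silver layer: drop malformed records, dedupe on (symbol, ts)."""
    seen, silver = set(), []
    for t in raw_ticks:
        key = (t.get("symbol"), t.get("ts"))
        if None in key or t.get("price") is None or key in seen:
            continue
        seen.add(key)
        silver.append(t)
    return silver

def silver_to_gold(silver):
    """Gold layer: per-symbol OHLC aggregates ready for feature serving."""
    by_symbol = defaultdict(list)
    for t in sorted(silver, key=lambda t: t["ts"]):
        by_symbol[t["symbol"]].append(t["price"])
    return {s: {"open": p[0], "high": max(p), "low": min(p), "close": p[-1]}
            for s, p in by_symbol.items()}

ticks = [
    {"symbol": "ACME", "ts": 1, "price": 10.0},
    {"symbol": "ACME", "ts": 1, "price": 10.0},   # duplicate -> dropped
    {"symbol": "ACME", "ts": 2, "price": 11.5},
    {"symbol": "ACME", "ts": 3, "price": 9.8},
    {"symbol": "ACME", "ts": 4, "price": None},   # malformed -> dropped
]
print(silver_to_gold(bronze_to_silver(ticks)))
# {'ACME': {'open': 10.0, 'high': 11.5, 'low': 9.8, 'close': 9.8}}
```

In the real lakehouse each layer would be a persisted table and the Gold output would land in Postgres for serving; the transformations themselves are this shape.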
End-to-end IoT data platform on Kubernetes (kind) using FastAPI, MongoDB, and a background processor with a telemetry simulator. Includes Terraform IaC example.
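A telemetry simulator of the kind described can be sketched in a few lines. The field names and the endpoint shape here are illustrative assumptions, not the project's schema; in the real platform this payload would be POSTed to the FastAPI ingest endpoint and stored in MongoDB.

```python
import json
import random
import time

def simulate_reading(device_id):
    """One synthetic IoT telemetry message (field names illustrative)."""
    return {
        "device_id": device_id,
        "ts": time.time(),
        "temperature_c": round(random.gauss(22.0, 1.5), 2),
        "humidity_pct": round(random.uniform(30.0, 60.0), 1),
    }

# Serialize as the simulator would before sending to the ingest API.
payload = json.dumps(simulate_reading("sensor-001"))
print(payload)
```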
End-to-end AWS data lake pipeline for fleet telemetry data using S3, Spark, and Athena. Includes partitioned Parquet ETL, vehicle safety analytics, and SQL queries for overspeed and harsh braking detection.
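The overspeed and harsh-braking detection logic can be sketched in Python; the thresholds (90 km/h limit, 25 km/h drop between consecutive samples) and field names are illustrative assumptions, mirroring the kind of windowed SQL the Athena queries would run over partitioned Parquet.

```python
def overspeed_events(samples, limit_kmh=90.0):
    """Telemetry samples exceeding a fleet speed limit (illustrative)."""
    return [s for s in samples if s["speed_kmh"] > limit_kmh]

def harsh_braking_events(samples, drop_kmh=25.0):
    """Consecutive-sample speed drops larger than drop_kmh, per vehicle."""
    events = []
    ordered = sorted(samples, key=lambda s: (s["vehicle"], s["ts"]))
    for prev, cur in zip(ordered, ordered[1:]):
        if (prev["vehicle"] == cur["vehicle"]
                and prev["speed_kmh"] - cur["speed_kmh"] > drop_kmh):
            events.append(cur)
    return events

telemetry = [
    {"vehicle": "v1", "ts": 0, "speed_kmh": 95.0},
    {"vehicle": "v1", "ts": 1, "speed_kmh": 60.0},  # -35 km/h: harsh braking
    {"vehicle": "v2", "ts": 0, "speed_kmh": 70.0},
    {"vehicle": "v2", "ts": 1, "speed_kmh": 68.0},
]
print(len(overspeed_events(telemetry)), len(harsh_braking_events(telemetry)))  # 1 1
```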
Real-time connected vehicle telemetry pipeline using Kafka, MongoDB, FastAPI, and Grafana with anomaly detection and fleet monitoring dashboards.
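One common form of telemetry anomaly detection is a z-score filter; the sketch below is an assumed stand-in for whatever method the project actually uses, with an illustrative threshold chosen for the toy data.

```python
from statistics import mean, stdev

def anomalies(values, z_threshold=2.0):
    """Flag readings whose z-score exceeds the threshold (illustrative)."""
    mu, sigma = mean(values), stdev(values)
    return [v for v in values if abs(v - mu) / sigma > z_threshold]

# Engine temperatures with one clear outlier.
temps = [90.1, 90.4, 89.9, 90.2, 90.0, 90.3, 140.0]
print(anomalies(temps))  # [140.0]
```

In a streaming setup the mean and standard deviation would be maintained over a sliding window per vehicle rather than recomputed over a fixed list.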
Distributed geospatial data pipeline using Apache Spark and Apache Sedona to analyze NYC taxi demand hotspots from millions of trip records.
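Hotspot detection of this kind typically reduces to binning pickup coordinates into grid cells and ranking cell counts. The sketch below shows that aggregation on toy data (cell size and coordinates are illustrative); Spark with Sedona would run the same grouping distributed over millions of trip records.

```python
from collections import Counter

def hotspot_cells(pickups, cell_deg=0.01, top_n=3):
    """Bin (lat, lon) pickups into a regular grid, rank busiest cells."""
    counts = Counter(
        (int(lat // cell_deg), int(lon // cell_deg)) for lat, lon in pickups
    )
    return counts.most_common(top_n)

# Toy pickups: three clustered points, one far away.
pickups = [(40.7580, -73.9850), (40.7585, -73.9851), (40.7581, -73.9849),
           (40.6410, -73.7780)]
print(hotspot_cells(pickups, top_n=1))  # busiest cell has count 3
```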
Multi-source geospatial ETL pipeline integrating bike lanes, traffic sensors, and GTFS data to analyze urban mobility infrastructure coverage in Berlin.
Fundamental theory and practice in Data Science (DS).
An automated data pipeline for migrating and synchronizing patient records from HOSxP (MySQL) to the Buddy Care platform, featuring SQL optimization and data integrity validation for healthcare services.
End-to-end solution for data migration and analytics using Python, FastAPI, Kafka, and PostgreSQL. Implements an asynchronous data pipeline and a RESTful analytics API, fully containerized with Docker Compose for easy, reproducible deployment.
End-to-end design of a Hadoop-based ecosystem for healthcare data at scale (50 TB, IoT streams, medical imaging). Proposed a 10-node cluster architecture integrating HDFS, Spark, Hive, NiFi, Kafka, and Docker with HIPAA-compliant security (Kerberos, TLS, Apache Ranger). Delivered a proof-of-concept Docker deployment and professional proposal document.
A collection of Airflow operators, hooks, and utilities to elevate dbt to a first-class citizen of Airflow.
See the documentation at: https://airflow-dbt-python.readthedocs.io/