Projects with this topic
This is a study project built using FastAPI to practice microservice architecture, data normalization techniques, and clean API design.
The service receives raw payloads from different simulated sources and transforms them into a standardized and validated structure.
It centralizes normalization logic and demonstrates how to build a scalable, maintainable, and test-friendly data processing layer.
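As a rough illustration of the centralized normalization this description mentions, the sketch below maps payloads from two simulated sources onto one validated structure. It is not taken from the project: the source names (`a`, `b`), the payload keys, and the `NormalizedEvent` fields are all hypothetical, and it uses only the standard library rather than FastAPI.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from decimal import Decimal


@dataclass(frozen=True)
class NormalizedEvent:
    """Standardized structure (hypothetical field names)."""
    source: str
    user_id: str
    amount_cents: int
    occurred_at: datetime


def _from_source_a(payload: dict) -> NormalizedEvent:
    # Hypothetical source "a": decimal-string amount, ISO 8601 timestamp.
    return NormalizedEvent(
        source="a",
        user_id=str(payload["userId"]),
        amount_cents=int(Decimal(payload["amount"]) * 100),
        occurred_at=datetime.fromisoformat(payload["timestamp"]),
    )


def _from_source_b(payload: dict) -> NormalizedEvent:
    # Hypothetical source "b": integer cents, Unix epoch seconds.
    return NormalizedEvent(
        source="b",
        user_id=str(payload["uid"]),
        amount_cents=int(payload["cents"]),
        occurred_at=datetime.fromtimestamp(payload["ts"], tz=timezone.utc),
    )


# One registry holds every per-source normalizer, so adding a source
# means adding one function and one entry here.
NORMALIZERS = {"a": _from_source_a, "b": _from_source_b}


def normalize(source: str, payload: dict) -> NormalizedEvent:
    """Dispatch to the right normalizer, then apply shared validation."""
    try:
        normalizer = NORMALIZERS[source]
    except KeyError:
        raise ValueError(f"unknown source: {source!r}") from None
    event = normalizer(payload)
    if event.amount_cents < 0:
        raise ValueError("amount must be non-negative")
    return event
```

Keeping the per-source functions small and the validation shared is one way to make such a layer easy to test: each normalizer can be exercised with a single dictionary.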
Configuration and data workflows for an instance of Apache Airflow for the DDRplatform
DataRider block for ETL streaming with Spark and Scala
End-to-end solution for data migration and analysis using Python, FastAPI, Kafka, and PostgreSQL. It implements an asynchronous data pipeline and a RESTful analytics API, all fully containerized with Docker Compose for easy, reproducible deployment.
This project/library contains common elements related to ETL processes...
Unified project demonstrating both batch analytics and real-time streaming pipelines with Apache Spark:
Batch (PySpark/Jupyter): Processed S&P 500 stock data, applied transformations, and ran distributed computations.
Streaming (Spark + Kafka): Built a streaming pipeline to consume Kafka topics, process messages in real-time, and visualize outputs.
Deployed using Docker and Jupyter for reproducibility.
Analyzed decades of historical weather station data (1920–1940) using Hadoop MapReduce. Filtered operable stations, computed descriptive statistics (min, max, mean, median), and produced reports/graphs. Designed modular MRJobs to chain tasks together for scalable processing.
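The filter-then-summarize flow described here can be sketched in plain Python. This is an illustration only, not the project's code: it does not use Hadoop or any MapReduce library, the record layout `(station_id, operable, value)` is an assumption, and the grouping step stands in for the shuffle phase a real framework would perform.

```python
from collections import defaultdict
from statistics import mean, median


def mapper(record):
    """Emit (station_id, value) only for operable stations."""
    station_id, operable, value = record  # hypothetical record layout
    if operable:
        yield station_id, value


def reducer(station_id, values):
    """Compute descriptive statistics for one station."""
    values = list(values)
    return station_id, {
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        "median": median(values),
    }


def run(records):
    """Map, group by key (a stand-in for the shuffle), then reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return dict(reducer(key, values) for key, values in groups.items())
```

Calling `run` on a list of such tuples returns one statistics dictionary per operable station; stations flagged as not operable never reach the reducer.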
Advanced data synchronization framework.
Reporting for MIT Club of Northern California
This project contains two assignments written in Python that implement the execution of task chains (DAGs) in Airflow.
Crawl and extract Home Depot's schema.org/Products.
Official weather data ETL for wind energy project evaluation in El Calafate, Argentina.
target-core is a Singer target intended to work with regular Singer taps. The goal is to use this package as a foundation for building other targets, focusing on the core features and reducing the effort spent maintaining the common parts.
ETL to get energy load shapes for Pecan Street Dataport.