This project extracts raw sales data from CSV files, applies data quality transformations using Pandas, and loads clean data into a PostgreSQL database. The pipeline is idempotent — it can run ...
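One common way to make a load step idempotent is an upsert keyed on a primary key, so re-running the pipeline rewrites the same rows instead of duplicating them. The sketch below illustrates that idea only; it is not taken from the project. It uses Python's built-in `sqlite3` as a stand-in so it runs without a database server, and the table name `sales_order` and its columns are hypothetical. PostgreSQL accepts the same `ON CONFLICT ... DO UPDATE` clause, though a PostgreSQL driver such as psycopg uses `%s` placeholders rather than `?`.

```python
import sqlite3

# Hypothetical schema: the source does not list the table or its columns.
SCHEMA = """
CREATE TABLE IF NOT EXISTS sales_order (
    order_id INTEGER PRIMARY KEY,
    customer TEXT NOT NULL,
    amount   REAL NOT NULL
)
"""

# Upsert keyed on order_id: a repeated load updates rows in place.
UPSERT = """
INSERT INTO sales_order (order_id, customer, amount)
VALUES (?, ?, ?)
ON CONFLICT (order_id) DO UPDATE SET
    customer = excluded.customer,
    amount   = excluded.amount
"""


def load_rows(conn, rows):
    """Insert or update rows so repeated runs leave the same table state."""
    with conn:  # commits on success, rolls back on error
        conn.execute(SCHEMA)
        conn.executemany(UPSERT, rows)


def table_snapshot(conn):
    """Return all rows in a stable order, for comparing runs."""
    query = "SELECT order_id, customer, amount FROM sales_order ORDER BY order_id"
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")
rows = [(1, "Acme", 120.0), (2, "Globex", 75.5)]

load_rows(conn, rows)
first = table_snapshot(conn)

load_rows(conn, rows)  # second run with the same input
second = table_snapshot(conn)
```

Here `first` and `second` hold the same rows, which is the property the word "idempotent" refers to: running the load twice on the same input produces the same table as running it once.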
In December 2025, we shared the first-ever The State of Trusted Open Source report, featuring insights from our product data and customer base on open source consumption across our catalog of ...
Implemented pandas-based cleaning rules in data_preprocessing.py, the transformation of salesorder.csv into clean_salesorder.csv, and pipeline testing via multiple DAG runs.
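The source does not list the actual cleaning rules, so the following is only a sketch of what pandas-based rules of this kind might look like. The function name `clean_sales_orders`, the column names, and the three rules (trim whitespace, coerce amounts to numbers, keep the last row per order) are all assumptions made for illustration, not the contents of data_preprocessing.py.

```python
import io

import pandas as pd


def clean_sales_orders(df: pd.DataFrame) -> pd.DataFrame:
    """Apply a few illustrative data-quality rules (all hypothetical)."""
    out = df.copy()
    # Rule 1: trim stray whitespace around customer names.
    out["customer"] = out["customer"].str.strip()
    # Rule 2: coerce amounts to numbers; unparseable values become NaN.
    out["amount"] = pd.to_numeric(out["amount"], errors="coerce")
    # Rule 3: drop rows missing a key or an amount.
    out = out.dropna(subset=["order_id", "amount"])
    # Rule 4: keep only the last row seen for each order_id.
    out = out.drop_duplicates(subset="order_id", keep="last")
    return out.reset_index(drop=True)


# Inline sample data standing in for salesorder.csv.
raw_csv = io.StringIO(
    "order_id,customer,amount\n"
    "1, Acme ,120.0\n"
    "2,Globex,not_a_number\n"
    "1,Acme,130.0\n"
    "3,Initech,42\n"
)

raw = pd.read_csv(raw_csv)
clean = clean_sales_orders(raw)
```

In a real run, the cleaned frame could then be written out with `clean.to_csv("clean_salesorder.csv", index=False)`. Applying the function to its own output returns the same frame, which makes repeated DAG runs easier to compare.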