This project implements an ETL (Extract, Transform, Load) process that migrates data from an OLTP (Online Transaction Processing) system and transforms it into a star schema in a data warehouse. The ETL jobs are written in Kotlin using Apache Spark and are orchestrated with Apache Airflow.
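To make the OLTP-to-star-schema idea concrete, here is a minimal sketch in plain Python. It is not the project's Kotlin/Spark code: the row layout, the column names, and the helpers `surrogate_key` and `to_star_schema` are all hypothetical and chosen only for illustration. It shows denormalized order rows being split into two dimension tables with integer surrogate keys and one fact table that references them.

```python
# Hypothetical denormalized OLTP rows; the column names are assumptions
# made for this illustration, not the project's actual MySQL schema.
oltp_orders = [
    {"order_id": 1, "customer": "Alice", "product": "Keyboard", "quantity": 2, "unit_price": 30.0},
    {"order_id": 2, "customer": "Bob", "product": "Mouse", "quantity": 1, "unit_price": 15.0},
    {"order_id": 3, "customer": "Alice", "product": "Mouse", "quantity": 3, "unit_price": 15.0},
]


def surrogate_key(dimension, natural_key):
    """Return the surrogate key for natural_key, assigning the next integer if unseen."""
    if natural_key not in dimension:
        dimension[natural_key] = len(dimension) + 1
    return dimension[natural_key]


def to_star_schema(rows):
    """Split flat order rows into two dimensions and one fact table."""
    dim_customer = {}  # natural key (customer name) -> surrogate key
    dim_product = {}   # natural key (product name) -> surrogate key
    fact_sales = []
    for row in rows:
        fact_sales.append({
            "order_id": row["order_id"],
            "customer_key": surrogate_key(dim_customer, row["customer"]),
            "product_key": surrogate_key(dim_product, row["product"]),
            "quantity": row["quantity"],
            "amount": row["quantity"] * row["unit_price"],
        })
    return dim_customer, dim_product, fact_sales


dim_customer, dim_product, fact_sales = to_star_schema(oltp_orders)
```

In a Spark job the same split would typically be expressed as DataFrame selections and joins rather than Python dictionaries; the sketch only illustrates the shape of the result.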
ETL Processing of MySQL tables:
Data Transformation:
Orchestration:
git clone https://github.com/MFurmanczyk/wh-sales.git
cd wh-sales
./gradlew shadowJar
cd airflow
docker-compose up -d
Copy dag.py to Airflow's DAGs folder.
Access the Airflow UI at http://localhost:8080 and trigger the ETL DAG (sales_dag).
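As an alternative to clicking through the UI, the DAG run could also be triggered over HTTP. The sketch below is an assumption-heavy illustration, not part of this repository: it assumes an Airflow 2.x deployment with the stable REST API and basic authentication enabled, and the credentials `airflow` / `airflow` are placeholders that may not match your docker-compose setup. The helper names `build_trigger_request` and `send` are hypothetical.

```python
import base64
import json
import urllib.request


def build_trigger_request(base_url, dag_id, username, password):
    """Build a POST request that asks Airflow 2.x to start a new run of dag_id."""
    url = f"{base_url}/api/v1/dags/{dag_id}/dagRuns"
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        url,
        data=json.dumps({"conf": {}}).encode(),  # empty run configuration
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


def send(request):
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(request) as response:
        return response.status


# The username and password here are placeholders (assumption).
request = build_trigger_request("http://localhost:8080", "sales_dag", "airflow", "airflow")

# Uncomment once the Airflow containers are running:
# print(send(request))
```

Building the request separately from sending it keeps the example inspectable without a running Airflow instance.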
This project is licensed under the MIT License - see the LICENSE file for details.