Udacity Data Engineering Nanodegree Program: Data Pipeline with Airflow project using MinIO and PostgreSQL.
dags: Directory containing Airflow DAG scripts.
data: Directory for storing project source data.
plugins: Directory for custom Airflow plugins.
operators: Subdirectory for operator scripts.
helpers: Subdirectory for helper scripts.
$ git clone https://github.com/akarce/Udacity-Data-Pipeline-with-Airflow
$ cd Udacity-Data-Pipeline-with-Airflow
$ docker-compose up airflow-init
$ docker compose up -d
$ docker exec -it postgresdev psql -U postgres_user -d postgres_db
postgres_db=# CREATE DATABASE sparkifydb;
Airflow webserver: http://localhost:8080/
Username: airflow
Password: airflow

MinIO WebUI: http://localhost:9001/
Connection Id: postgres_conn
Connection Type: Postgres
Host: 172.18.0.1
Database: sparkifydb
Login: postgres_user
Password: postgres_password
Port: 5432
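The fields above map directly onto a standard PostgreSQL connection URI, which is handy for testing the database outside Airflow. A minimal sketch using only the values listed in this README (the host 172.18.0.1 is the Docker bridge gateway assumed by this compose setup; adjust it if your Docker network differs):

```python
# Build a PostgreSQL connection URI from the Airflow connection fields above.
# All values are copied from this README; change Host for a different network.
conn = {
    "login": "postgres_user",
    "password": "postgres_password",
    "host": "172.18.0.1",
    "port": 5432,
    "database": "sparkifydb",
}

uri = "postgresql://{login}:{password}@{host}:{port}/{database}".format(**conn)
print(uri)  # postgresql://postgres_user:postgres_password@172.18.0.1:5432/sparkifydb
```

The same URI works with `psql "$URI"` or any SQLAlchemy-based client, so you can confirm credentials before wiring them into the Airflow connection form.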
Go to the MinIO WebUI at http://localhost:9001/. From the Access Keys section, choose Create Access Key -> Create. Store your Access Key and Secret Key using Download for Import.
Connection Id: minio_conn
Connection Type: Amazon Web Services
AWS Access Key ID: <your_key_id>
AWS Secret Access Key: <your_secret_key>
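For an Airflow "Amazon Web Services" connection to reach MinIO instead of AWS itself, the connection's Extra field typically needs the MinIO API endpoint. This is a sketch under the assumption that the MinIO S3 API is on the default port 9000 (the WebUI port 9001 above is not the API port); check your docker-compose port mapping before using it:

```python
import json

# Assumed Extra JSON for the minio_conn Airflow connection, routing S3 calls
# to a local MinIO API endpoint. The host/port here are an assumption based
# on MinIO's default API port (9000), not something stated in this README.
extra = {"endpoint_url": "http://localhost:9000"}
print(json.dumps(extra))
```

Paste the printed JSON into the Extra box of the connection form in the Airflow UI.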
This will create two buckets, named udacity-dend and processed, and seven Postgres tables: artists, songplays, songs, staging_events, staging_songs, times, and users.
sparkifydb=# \d <table_name>
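Rather than eyeballing the `\d` output table by table, you can compare the table names psql reports (e.g. from `\dt`) against the seven expected above. A small hypothetical helper, not part of the repo:

```python
# Hypothetical helper: given table names reported by psql's \dt in sparkifydb,
# return any of the seven expected tables that are missing.
EXPECTED_TABLES = {
    "artists", "songplays", "songs",
    "staging_events", "staging_songs", "times", "users",
}

def missing_tables(reported):
    """Return expected tables absent from the reported name list, sorted."""
    return sorted(EXPECTED_TABLES - set(reported))

# Example: staging_songs has not been created yet.
print(missing_tables(["artists", "songplays", "songs",
                      "staging_events", "times", "users"]))  # ['staging_songs']
```

An empty result means all seven tables exist and the DAG's create-table steps completed.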
MinIO WebUI credentials:
Username: minioadmin
Password: minioadmin