A Docker Compose-based IoT data pipeline for local development, featuring MQTT, MinIO, Cassandra, FastAPI, and Airflow for easy testing and expansion.
MIT License
This repository helps you understand the basic components needed to build a data pipeline for IoT data and how they work together. Use this setup to test individual components or see how they function as a complete system. You can also expand this setup to create a more complex pipeline and deploy it to cloud platforms like AWS, Azure, or Google Cloud.
I chose Docker Compose for local deployment to focus on understanding the components and their interactions without the complexity of cloud providers. This approach also makes it easy to share the setup and run it on any machine with minimal effort.
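The pipeline and infrastructure include:
- MQTT broker (Mosquitto)
- Data lake (MinIO)
- Database (Cassandra)
- REST API (FastAPI)
- Workflow orchestration (Airflow)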
The components are connected as follows:
Once you have clean data in the database, you can use it for analytics, machine learning, or other applications.
Prerequisites
Component Testing
git clone https://github.com/daleonpz/iot_cloud_test.git
cd iot_cloud_test
cd mqtt
docker build -t my-broker .
docker run -d --name my-broker -p 1883:1883 my-broker
docker exec -it my-broker mosquitto_sub -h localhost -t test
In another terminal:
docker exec -it my-broker mosquitto_pub -h localhost -t test -m "hello"
cd datalake
docker build -t my-datalake .
docker run -d --name my-datalake -p 9000:9000 -p 9001:9001 -e "MINIO_ACCESS_KEY=minio" -e "MINIO_SECRET_KEY=minio123" my-datalake server /data --console-address ":9001"
Open http://localhost:9001 in your browser to reach the MinIO console (port 9000 serves the S3 API).
If it is not accessible via localhost, use the container's IP address, which MinIO prints in its startup logs:
docker logs my-datalake
cd database
docker build -t my-db .
docker run -d --name my-db -p 9042:9042 my-db
docker exec -it my-db cqlsh localhost
CREATE KEYSPACE iot WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };
USE iot;
CREATE TABLE measurements (id UUID PRIMARY KEY, temperature float, battery_level float);
INSERT INTO measurements (id, temperature, battery_level) VALUES (uuid(), 25.0, 50.0);
SELECT * FROM measurements;
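The same insert can also be issued from application code. The following is a minimal sketch, assuming the DataStax cassandra-driver package and the iot.measurements table created above. The Measurement class and the helper names are illustrative and are not taken from the repository.

```python
import uuid
from dataclasses import dataclass, field

# Parameterized CQL matching the table created above. cassandra-driver
# uses %s placeholders for simple (non-prepared) statements.
INSERT_CQL = (
    "INSERT INTO iot.measurements (id, temperature, battery_level) "
    "VALUES (%s, %s, %s)"
)


@dataclass
class Measurement:
    """Illustrative row model (hypothetical, not from the repository)."""
    temperature: float
    battery_level: float
    id: uuid.UUID = field(default_factory=uuid.uuid4)

    def as_params(self):
        # Order must match the column list in INSERT_CQL.
        return (self.id, self.temperature, self.battery_level)


def insert_measurement(session, measurement):
    """Insert one row via any object exposing execute(query, params)."""
    session.execute(INSERT_CQL, measurement.as_params())
    return measurement.id


def demo():
    """Run against the my-db container (needs: pip install cassandra-driver)."""
    # Imported lazily so the helpers above work without the driver installed.
    from cassandra.cluster import Cluster

    cluster = Cluster(["127.0.0.1"], port=9042)
    try:
        session = cluster.connect()
        print("inserted", insert_measurement(session, Measurement(25.0, 50.0)))
    finally:
        cluster.shutdown()
```

With the my-db container running and the keyspace created, calling demo() should add one row that then shows up in the SELECT above.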
cd restapi
docker build -t api .
docker run -d --name api -p 8000:8000 --link my-db:my-db api
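For debugging, start the container with an interactive shell instead (remove any existing api container first with docker rm -f api, since the name is reused):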
docker run -it --name api -p 8000:8000 --link my-db:my-db api bash
curl -X POST "http://localhost:8000/data/{id}" -H "accept: application/json" -H "Content-Type: application/json" -d '{"temperature": 25.0, "battery_level": 50.0}'
curl -X GET "http://localhost:8000/data/{id}" -H "accept: application/json"
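The same requests can be made from Python using only the standard library. This is a sketch under assumptions: it takes the /data/{id} route and the JSON fields from the curl calls above, and assumes that POST stores a measurement, GET reads it back, and the API answers with JSON. The helper names are hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # port published by the docker run above


def build_post(record_id, temperature, battery_level):
    """Build (but do not send) a POST carrying the measurement as JSON."""
    body = json.dumps(
        {"temperature": temperature, "battery_level": battery_level}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/data/{record_id}",
        data=body,
        method="POST",
        headers={
            "accept": "application/json",
            "Content-Type": "application/json",
        },
    )


def build_get(record_id):
    """Build (but do not send) a GET for one record; no request body."""
    return urllib.request.Request(
        f"{BASE_URL}/data/{record_id}",
        method="GET",
        headers={"accept": "application/json"},
    )


def send(request, timeout=5.0):
    """Send a prepared request and decode the response (assumed to be JSON)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the api container running, send(build_post(some_id, 25.0, 50.0)) followed by send(build_get(some_id)) mirrors the two curl calls, where some_id stands for whatever identifier the API expects.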
docker-compose -f docker-compose.yml.etl_test up --build
docker exec -it my-db cqlsh localhost
USE iot;
SELECT * FROM measurements;
docker-compose -f docker-compose.yml.mqtt_app_test up --build
cd mqtt/
python mqtt_publisher_test.py
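The contents of mqtt_publisher_test.py are not reproduced here. A publisher of this kind might look roughly like the sketch below, which assumes the paho-mqtt package, the broker on localhost:1883, and the topic name from the earlier mosquitto_pub test. The payload fields mirror the measurements table; the actual script may use a different topic and format.

```python
import json

# Assumption: same topic as the mosquitto_pub/mosquitto_sub smoke test.
TOPIC = "test"


def make_payload(temperature, battery_level):
    """Encode one reading as a JSON string with stable key order."""
    return json.dumps(
        {
            "temperature": float(temperature),
            "battery_level": float(battery_level),
        },
        sort_keys=True,
    )


def publish_reading(temperature, battery_level, host="localhost", port=1883):
    """Publish one reading and disconnect (needs: pip install paho-mqtt)."""
    # Imported lazily so make_payload works without paho-mqtt installed.
    import paho.mqtt.publish as publish

    publish.single(
        TOPIC,
        payload=make_payload(temperature, battery_level),
        hostname=host,
        port=port,
    )
```

Running mosquitto_sub on the same topic while calling publish_reading(25.0, 50.0) is a quick way to confirm the message reaches the broker.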
docker-compose -f docker-compose.yml up --build
cd mqtt/
python mqtt_publisher_test.py
Log in to http://localhost:8080 with:
Trigger the DAG:
docker exec -it my-db cqlsh localhost
Run the following commands in cqlsh:
USE iot;
SELECT * FROM measurements;
./tools/delete_containers.sh
./tools/delete_docker_images.sh
The repository includes a .env file in the root directory that sets environment variables for the services. You can modify this file as needed.
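To check which values a .env file defines before starting the services, a small parser is enough. This sketch handles only plain KEY=VALUE lines, blank lines, and # comments, not the full syntax Docker Compose accepts. The variable names in the usage example are borrowed from the MinIO command above, and whether the repository's .env actually defines them is an assumption.

```python
def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may contain "=".
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes from the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_env_file(path=".env"):
    """Read and parse a .env file from the given path."""
    with open(path, encoding="utf-8") as handle:
        return parse_env(handle.read())
```

For example, parse_env('MINIO_ACCESS_KEY=minio\nMINIO_SECRET_KEY="minio123"') yields a dictionary with both keys and unquoted values.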