Study project for big data (Hadoop, Zookeeper, Kafka, Flink, Spark)
MIT License
Supported Technologies:
- Hadoop 3.3.6 (with JDK 8.0.352-zulu, Maven 3.6.3)
- Zookeeper 3.9.2
- Kafka 3.7.1 (Scala 2.12 build)
git clone https://github.com/mcddhub/mcdd-big-data-study.git --depth=1 && cd mcdd-big-data-study
cd docker
docker build -t caobaoqi1029/big-data-study:x.x.x .
Note: Replace x.x.x with the appropriate version number.
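To avoid retyping the tag, the version number can be passed in once and reused. The following is a minimal sketch: the helper names `image_tag` and `build_image` are illustrative assumptions and not part of the repository; only the image name comes from the build command above.

```shell
# Hypothetical helpers; only the image name is taken from this README.
image_tag() {
  # $1: version number to publish, e.g. 1.0.0 (placeholder value)
  printf 'caobaoqi1029/big-data-study:%s\n' "$1"
}

build_image() {
  # Run from the docker/ directory so that "." is the build context.
  docker build -t "$(image_tag "$1")" .
}
```

From the docker/ directory, `build_image 1.0.0` would then build the image under the tag printed by `image_tag 1.0.0`.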
docker compose up -d
docker exec -it master bash
hdfs namenode -format
start-all.sh
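Two checks are useful around these commands. Formatting the NameNode discards existing HDFS metadata, so it is meant for a fresh cluster only, and after start-all.sh the `jps` tool from the JDK lists the running Java daemons. The sketch below shows one way to guard both steps; the function names are hypothetical, and which daemons should appear on the master container depends on the cluster layout, which this README does not specify.

```shell
# Succeeds when the given NameNode storage directory has never been formatted.
# A formatted directory contains the file current/VERSION.
needs_format() {
  [ ! -f "$1/current/VERSION" ]
}

# Reads `jps` output on stdin and prints each expected daemon that is absent.
# grep -w keeps "NameNode" from matching inside "SecondaryNameNode".
missing_daemons() {
  jps_output=$(cat)
  for daemon in "$@"; do
    printf '%s\n' "$jps_output" | grep -qw "$daemon" || printf '%s\n' "$daemon"
  done
}
```

For example, `jps | missing_daemons NameNode ResourceManager` prints nothing when both daemons are up. The path given to `needs_format` should be the local directory configured as `dfs.namenode.name.dir`; its value in this image is not stated here, so look it up with `hdfs getconf -confKey dfs.namenode.name.dir`.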
vim input.txt
hdfs dfs -put -f ./input.txt /
hdfs dfs -ls /
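If you prefer not to open an editor inside the container, the input file can be created non-interactively before the upload. This is a small sketch; the two sample lines are placeholders chosen for illustration, not content prescribed by the project.

```shell
# Placeholder sample text; any plain-text content works as job input.
cat > input.txt <<'EOF'
hello hadoop
hello big data
EOF

# Sanity check before uploading: print the number of lines written.
wc -l < input.txt
```

The `hdfs dfs -put -f ./input.txt /` command above then uploads this file, with `-f` overwriting any earlier copy in HDFS.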
mvn clean package
cd target/
hadoop jar big-data.jar
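MapReduce jobs that write through FileOutputFormat refuse to start when their output directory already exists, so a second run usually needs the old results removed first. The sketch below assumes the job writes to /output (the path inspected later in this README) and that it does not clean up after itself; `run`, `rerun_job`, `DRY_RUN`, and `OUTPUT_DIR` are hypothetical names introduced for illustration.

```shell
OUTPUT_DIR="${OUTPUT_DIR:-/output}"   # assumed job output path

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

rerun_job() {
  run hdfs dfs -rm -r -f "$OUTPUT_DIR"   # -f: no error if the directory is absent
  run hadoop jar big-data.jar            # run from target/, as above
}
```

Calling `DRY_RUN=1 rerun_job` only prints the two commands, which is a safe way to review them before running `rerun_job` for real.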
Tip: To run compiled classes with the java command directly, add their directory to the CLASSPATH environment variable:
export CLASSPATH=$CLASSPATH:/tmp/ # Add this to .bashrc for persistence.
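Appending the export to .bashrc more than once leaves duplicate lines behind. A minimal sketch of an idempotent append follows; `append_once` is a hypothetical helper, not something shipped with the project.

```shell
# Append a line to a file only if that exact line is not already present.
append_once() {
  line=$1
  file=$2
  touch "$file"
  # -x: match the whole line, -F: treat the line as a fixed string
  grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}
```

Usage: `append_once 'export CLASSPATH=$CLASSPATH:/tmp/' ~/.bashrc`. The single quotes keep `$CLASSPATH` unexpanded, so it is evaluated each time a shell starts.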
hdfs dfs -ls /output
hdfs dfs -cat /output/part-r-00000
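part-r-00000 is the file written by the first reducer. This README does not say what big-data.jar computes; if it is a word count (an assumption), each output line should be a word, a tab, and its count, sorted by word. Under that assumption, the result can be cross-checked against a local pipeline. `local_wordcount` is a hypothetical helper.

```shell
# Count whitespace-separated words in a local file and print "word<TAB>count",
# sorted in byte order.
local_wordcount() {
  tr -s '[:space:]' '\n' < "$1" \
    | grep -v '^$' \
    | LC_ALL=C sort \
    | uniq -c \
    | awk '{ print $2 "\t" $1 }'
}
```

Comparing `local_wordcount input.txt` with the output of `hdfs dfs -cat /output/part-r-00000` should show matching lines when the job splits words on whitespace only; a job that tokenizes differently (for example, stripping punctuation) will diverge.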
We welcome contributions! Feel free to submit a pull request. For more details, see the Contribution Guide.
This project is licensed under the MIT License. See the LICENSE file for details.
If you find this project helpful, consider giving it a ⭐️ on GitHub!