✅ Sets up a Hadoop ecosystem and builds a pipeline.
Usage of the various Hadoop components, continuously updated
Streaming data processing using Hadoop HDFS, Spark, Kafka, MinIO, Elasticsearch
Dockerizing an Apache Spark Standalone Cluster
Data analytics pipeline built with Apache Spark and Hadoop for processing and analyzing large-sca...
This is the GitHub repo for Learning Spark: Lightning-Fast Data Analytics [2nd Edition]
How to build a do-it-yourself cluster for Spark & Hadoop
Companion to the Learning Hadoop and Learning Spark courses on LinkedIn Learning
Base Docker Compose setup for a local data engineering environment
This repo contains implementations of PySpark for real-world use cases for batch data processing,...
PySpark 🍒🥭 is delicious, just eat it! 😋😋
Apache Spark - A unified analytics engine for large-scale data processing
A collection of test cases and related materials gathered while using Scala and Spark
Apache Spark™ and Scala Workshops
Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython /...
A complete example of a big data application using: Kubernetes (kops/aws), Apache Spark SQL/Stre...