Build a complex Spark execution plan by composing many different Spark operations.
Apache-2.0 License
The Internals of Apache Spark
ETL pipeline using PySpark (Spark - Python)
Includes notes on using Apache Spark in general, notes on using Spark for Physics, how to run TPC...
SparkSQL.jl enables Julia programs to work with Apache Spark data using just SQL.
A collection of test cases and related reference material gathered while using Scala and Spark
Sample analysis of the latest Yelp dataset using Spark
Apache Spark - A unified analytics engine for large-scale data processing
A tool for monitoring and tuning Spark jobs for efficiency.
A Spark SQL extension which provides SQL Standard Authorization for Apache Spark | This repo is c...
Haskell on Apache Spark.
Implementing core components of a data-driven architecture using Spark: Data Management and Data ...
A simple Spark-powered ETL framework that just works 🍺
Ensemble Learning for Apache Spark 🌲
Spark extension for processing large-scale 3D data sets: Astrophysics, High Energy Physics, Meteo...
The Internals of Spark SQL