A pure Python implementation of Apache Spark's RDD and DStream interfaces.
PySpark + Scikit-learn = Sparkit-learn
pyspark🍒🥭 is delicious, just eat it!😋😋
PySpark-Tutorial provides basic algorithms using PySpark
A collection of utilities for handling PySpark's SparkContext
Haskell on Apache Spark.
A PySpark companion for data science tasks.
Spark extension for processing large-scale 3D data sets: Astrophysics, High Energy Physics, Meteo...
Jupyter magics and kernels for working with remote Spark clusters
Fundamentals of Spark with Python (using PySpark), code examples
Apache Spark - A unified analytics engine for large-scale data processing
Apache Spark (PySpark) Practice on Real Data
My notebook on using Python with Jupyter Notebook, PySpark, etc.
A pure Python mock of PySpark's RDD
Asynchronous actions for PySpark
A repository to keep track of all the code that I end up writing for my blog posts.