PySpark + pandas. This may get merged into the SparklingPandas project.
Apache Spark - A unified analytics engine for large-scale data processing
My notebook on using Python with Jupyter Notebook, PySpark, etc.
A pure Python implementation of Apache Spark's RDD and DStream interfaces.
A repository to keep track of all the code that I end up writing for my blog posts.
Fundamentals of Spark with Python (using PySpark), code examples
Apache Spark Machine Learning project using MLlib and Linear Regression on Databricks!
A simple example of a PySpark-based project.
Pandas and Spark DataFrame comparison for humans and more!
TensorFlowOnSpark brings TensorFlow programs to Apache Spark clusters.
Apache Spark & Python (PySpark) tutorials for Big Data Analysis and Machine Learning as IPython /...
A GUI for Pandas DataFrames
This repo contains implementations of PySpark for real-world use cases for batch data processing,...
pronounced sUrplus as it's simply better if not best!
PySpark RDD, DataFrame, and Dataset examples in Python