Implementing best practices for PySpark ETL jobs and applications.
PySpark 🍒🥭 is delicious, just eat it! 😋😋
Fundamentals of Spark with Python (using PySpark), code examples
This repo contains implementations of PySpark for real-world use cases for batch data processing,...
Apache Spark Machine Learning project using MLlib and Linear Regression on Databricks!
Apache Spark: Data Analysis, Transformation, and Visualisation with PySpark, IPL Data Analysis
Jupyter magics and kernels for working with remote Spark clusters
Implementing core components of a data-driven architecture using Spark: Data Management and Data ...
A command-line tool for analysis of big biological data sets for distributed HPC clusters.
Some notebook examples related to Apache Spark, IPython / Jupyter, Zeppelin
Apache DevLake is an open-source dev data platform to ingest, analyze, and visualize the fragment...
Includes notes on using Apache Spark in general, notes on using Spark for Physics, how to run TPC...
A free tutorial for Apache Spark.
A simple VS Code devcontainer setup for local PySpark development
Spark extension for processing large-scale 3D data sets: Astrophysics, High Energy Physics, Meteo...