Open Source LeetCode for PySpark, Spark, Pandas and DBT/Snowflake
Apache-2.0 License
A boilerplate for Spark projects with Docker support for local development and scripts for EMR su...
A complete example of a big data application using: Kubernetes (kops/aws), Apache Spark SQL/Stre...
A free tutorial for Apache Spark.
Apache Spark docker image
A Python package to submit and manage Apache Spark applications on Kubernetes.
Experiment tracking server focused on speed and scalability
Dockerizing an Apache Spark Standalone Cluster
Apache Spark on AWS Lambda
Master's thesis on Big Data
Jupyter magics and kernels for working with remote Spark clusters
Apache Spark - A unified analytics engine for large-scale data processing
Data Accelerator for Apache Spark simplifies onboarding to Streaming of Big Data. It offers a ric...
Data Lakehouse local stack with PySpark, Trino, and MinIO. Includes an example to process Raygun ...
This is a comprehensive solution for real-time football analytics, leveraging Apache Spark execut...
Dockerizing and Consuming an Apache Livy environment