Quickstart PySpark with Anaconda on AWS/EMR
MIT License
This construct builds some elements for you to quickly launch an EMR Serverless application. Afte...
Study Guide for AWS Big Data Specialty Certification
A reverse proxy server which allows secure connectivity to a Spark Connect server
👷🌇 Set up and build a big data processing pipeline with Apache Spark, 📦 AWS services (S3, EMR, EC...
This project demonstrates data cleaning, processing with Apache Spark and Apache Flink, both loca...
This project integrates real-time data processing and analytics using Apache NiFi, Kafka, Spark, ...
A guide on how to set up Jupyter with PySpark painlessly on AWS EC2 clusters, with S3 I/O support
The goal of this project is to offer an AWS EMR template using Spot Fleet and On-Demand Instances...
A Grafana-based application to assist Big Data infrastructure optimization initiatives where Spar...
pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, Qu...