This project demonstrates data cleaning and processing with Apache Spark and Apache Flink, both locally and on AWS EMR.
Project FraudCatch leverages AI to predict and prevent financial fraud in real time. It uses Apac...
A guide on how to set up Jupyter with PySpark painlessly on AWS EC2 clusters, with S3 I/O support
A few projects related to Data Engineering, including Data Modeling, Infrastructure setup on cloud, ...
Udacity Data Engineering Nanodegree (DEND)
Quickstart PySpark with Anaconda on AWS/EMR
Study Guide for AWS Big Data Specialty Certification
A data engineering training project to build an end-to-end pipeline for real-time processing of ...
pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, Qu...
More than 2000 data engineer interview questions.
A Grafana-based application to assist Big Data infrastructure optimization initiatives where Spar...
Projects done in the Data Engineering Nanodegree by Udacity.com
This project integrates real-time data processing and analytics using Apache NiFi, Kafka, Spark, ...
The goal of this project is to offer an AWS EMR template using Spot Fleet and On-Demand Instances...
👷🌇 Set up and build a big data processing pipeline with Apache Spark, 📦 AWS services (S3, EMR, EC...
This construct builds some elements for you to quickly launch an EMR Serverless application. Afte...