Quickstart PySpark with Anaconda on AWS/EMR
MIT License
This code helps you jump-start PySpark with Anaconda on AWS EMR.
Create the conda environment: conda env create -f environment.yml
Copy config.yml.example to config.yml and adjust its values for your setup.
Run the loader: python emr_loader.py
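The config step above can be sketched as a small helper that creates config.yml from the example file without overwriting an existing one. This is only an illustration: the function name ensure_config is hypothetical and not part of this repository, and it assumes both files live in the current working directory unless other paths are passed in.

```python
import shutil
from pathlib import Path


def ensure_config(example="config.yml.example", target="config.yml"):
    """Copy the example config to `target` unless `target` already exists.

    Hypothetical helper, not part of the repository.
    Returns True if a copy was made, False if an existing file was left alone.
    """
    example_path, target_path = Path(example), Path(target)
    if target_path.exists():
        # Never overwrite a config that may already have been edited.
        return False
    if not example_path.exists():
        raise FileNotFoundError(f"missing template: {example_path}")
    shutil.copyfile(example_path, target_path)
    return True
```

Calling ensure_config() once before python emr_loader.py would leave a config.yml in place to edit by hand. The values in that file still have to be filled in manually.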
See LICENSE for details. Copyright (c) 2016 Dat Tran.