Ensemble Learning for Apache Spark 🌲
APACHE-2.0 License
Library of Meta-Estimators à la scikit-learn for Ensemble Learning for Apache Spark MLlib
A recommender system for discovering GitHub repos, built with Apache Spark
Ranking algorithms for Spark machine learning pipelines
MLeap: Deploy ML Pipelines to Production
A collection of Apache Parquet add-on modules
Spark extension for processing large-scale 3D data sets: Astrophysics, High Energy Physics, Meteo...
Base classes to use when writing tests with Spark
Apache Spark - A unified analytics engine for large-scale data processing
C4E, a JVM-friendly library written in Scala for both local and distributed (Spark) clustering.
Expressive types for Spark.
Boilerplate framework to use Spark and ZIO together.
A collection of test cases and related reference material gathered while using Scala and Spark
Lightning-fast cluster computing in Java, Scala and Python.
Sample code for the book 「詳解Apache Spark」, published by 技術評論社 (Gijutsu-Hyohron)
Simple and Distributed Machine Learning
A simple Spark-powered ETL framework that just works 🍺