Ensemble Learning for Apache Spark 🌲
APACHE-2.0 License
Ranking algorithms for Spark machine learning pipeline
Lightning-fast cluster computing in Java, Scala and Python.
Expressive types for Spark.
A simple Spark-powered ETL framework that just works 🍺
Apache Spark - A unified analytics engine for large-scale data processing
Sample code for the book 詳解Apache Spark (Apache Spark in Detail), published by Gijutsu-Hyohron
Boilerplate framework to use Spark and ZIO together.
MLeap: Deploy ML Pipelines to Production
A recommender system for discovering GitHub repos, built with Apache Spark
Spark extension for processing large-scale 3D data sets: Astrophysics, High Energy Physics, Meteo...
Base classes to use when writing tests with Spark
A collection of Apache Parquet add-on modules
A collection of test cases and related materials gathered while using Scala and Spark
C4E, a JVM-friendly library written in Scala for both local and distributed (Spark) clustering.
Simple and Distributed Machine Learning