Monitor the stability of a Pandas or Spark dataframe ⚙︎
MIT License
Code Library for My Blog
Python clone of Spark, a MapReduce-like framework in Python
Apache TinkerPop - a graph computing framework
PySpark + Scikit-learn = Sparkit-learn
Distributed TensorFlow, Keras and PyTorch on Apache Spark/Flink & Ray
Koalas: pandas API on Apache Spark
IT Knowledge Base from 20 years in DevOps, Linux, Cloud, Big Data, AWS, GCP etc - gradually porti...
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Ka...
List of Data Science Cheatsheets to rule the world
Simple and Distributed Machine Learning
Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark
Some Data Science examples using Groovy
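Cube Studio: an open-source, cloud-native, one-stop machine learning/deep learning AI platform; supports SSO login, multi-tenancy/multiple project groups, big data platform integration, online notebook development, drag-and-drop task-flow pipeline orchestration, multi-node multi-GPU distributed training...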
Apache Avro is a data serialization system.
Data Lakehouse local stack with PySpark, Trino, and Minio. Includes an example to process Raygun ...