An open source multi-tool for exploring and publishing data
A personal project that builds an end-to-end data pipeline using 2024 Olympics data.
Data quality profiling and exploratory data analysis for Pandas and Spark DataFrames in one line of code.
Data Lakehouse local stack with PySpark, Trino, and MinIO. Includes an example to process Raygun ...
♃ Debian packaging of JupyterHub, a multi-user server for Jupyter notebooks
A Machine Learning API with native Redis caching and export + import using S3. Analyze entire dat...
Library to create Dagster jobs from YAML
DagsHub client libraries
🌳 WALD Stack Demo 🏎️
An end-to-end GoodReads data pipeline for building a data lake, data warehouse, and analytics platform.
Software and instructions for setting up and running a self-driving lab (autonomous experimentati...
🛠 All-in-one web-based IDE specialized for machine learning and data science.
Open-source Platform for Scientific and Technical Data Processing and Visualization
An orchestration platform for the development, production, and observation of data assets.
The fastest way to iterate and deploy AI workloads on your own infra. Unobtrusive, debuggable, Py...