Out-of-core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization, and exploration of big tabular data at a billion rows per second 🚀
Modern columnar data format for ML and LLMs implemented in Rust. Convert from parquet in 2 lines ...
A collection of code snippets from the publication Daily Dose of Data Science on Substack: http:/...
A repository to keep track of all the code that I end up writing for my blog posts.
Collection of useful data science topics along with articles, videos, and code
Type System for Data Analysis in Python
Carefully curated resource links for data science in one place
VILA - a multi-image visual language model with training, inference and evaluation recipe, deploy...
Complete Roadmap For Data Science
A Collection of Cheatsheets, Books, Questions, and Portfolio For DS/ML Interview Prep
Data quality profiling and exploratory data analysis for Pandas and Spark DataFrames in one line of code.
Visualize and compare datasets, target values and associations, with one line of code.
A terminal spreadsheet multitool for discovering and arranging data
Some fundamental machine learning and data-analysis techniques are explained through realistic ex...
Data Science Roadmap from A to Z