Superfast CUDA implementation of Word2Vec and Latent Dirichlet Allocation (LDA)
Apache-2.0 License
A collection of GICP-based fast point cloud registration algorithms
CUDA C++ Core Libraries
Instant-ngp in pytorch+cuda trained with pytorch-lightning (high quality with high speed, with on...
An architecture for continual learning and long-term memory in LLMs
3D Gaussian Splatting, reimagined: Unleashing unmatched speed with C++ and CUDA from the ground up!
A high-performance inference system for large language models, designed for production environments.
ThunderGBM: Fast GBDTs and Random Forests on GPUs
A PyTorch-like dynamic computation graph and neural network framework implemented in NumPy (MLP, CNN, RNN, Transformer)
Open source, local, and self-hosted highly optimized language inference server supporting ASR/STT...
Some CUDA design patterns and a bit of template magic for CUDA
Implementation of the Apriori and Eclat algorithms, two of the best-known basic algorithms for mi...
An unofficial Julia wrapper for the RAPIDS.ai ecosystem using PythonCall.jl
Arch Linux PKGBUILDs for Data Science, Machine Learning, Deep Learning, NLP and Computer Vision
Object detection for video surveillance
Simple tests for JAX, PyTorch, and TensorFlow to test if the installed NVIDIA drivers are being p...