Weighted matrix factorization on the GPU with Theano and scikits.cuda
MIT License
Train fastai models faster (and other useful tools)
Original reference implementation of "3D Gaussian Splatting for Real-Time Radiance Field Rendering"
A rudimentary wrapper around the fast Maxwell kernels for GEMM and convolution operations provide...
Unofficial PyTorch implementation of Masked Autoencoders Are Scalable Vision Learners
Convolution op for Theano based on CuFFT using scikits.cuda
Matrix Factorization Library
Rapid large-scale fractional differencing with NVIDIA RAPIDS and GPU to minimize memory loss whil...
Weighted matrix factorization in Python
Explore training for quantized models
Experiments for the blog post "No, We Don't Have to Choose Batch Sizes As Powers Of 2"
Minimal PyTorch implementation of BM25 (with sparse tensors)
Some preliminary explorations of Mamba's context scaling.
Fast Python Collaborative Filtering for Implicit Feedback Datasets
Theano implementation of different optimization algorithms