Low-latency CUDA JPEG decoder by parallelizing Huffman decoding
MIT License
Statistics for this project are still being loaded, please check back later.
Simple tests for JAX, PyTorch, and TensorFlow to test if the installed NVIDIA drivers are being p...
Some CUDA design patterns and a bit of template magic for CUDA
Weighted MinHash implementation on CUDA (multi-gpu).
Implementation of the Apriori and Eclat algorithms, two of the best-known basic algorithms for mi...
Real-time large scale dense visual SLAM system
SYCL accelerated BLAKE3 Hash Implementation
Real-time dense visual SLAM system
A nvImageCodec library of GPU- and CPU- accelerated codecs featuring a unified interface
Computer vision library with focus on heterogeneous systems
Decoding Attention is specially optimized for multi head attention (MHA) using CUDA core for the ...
Instant-ngp in pytorch+cuda trained with pytorch-lightning (high quality with high speed, with on...
ILGPU JIT Compiler for high-performance .Net GPU programs
A library to create kaleidoscope effect on images with CUDA. You can build on all platforms using...
GPU Framework for Radio Astronomical Image Synthesis
A collection of GICP-based fast point cloud registration algorithms