Tutorial on building a GPU compiler backend in LLVM
MIT License
The fastest Tropical number matrix multiplication on GPU
A simple profiler to count NVIDIA PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofl...
ILGPU JIT Compiler for high-performance .NET GPU programs
Computer vision library with focus on heterogeneous systems
Rust bindings to the NVIDIA NVBIT binary instrumentation API
A highly optimised C++ library for mathematical applications and neural networks.
CUDA C++ Core Libraries
A curated list of awesome GPGPU (CUDA/OpenCL/Vulkan) resources
3D Gaussian Splatting, reimagined: Unleashing unmatched speed with C++ and CUDA from the ground up!
Simple tests for JAX, PyTorch, and TensorFlow to test if the installed NVIDIA drivers are being p...
Sparse Boolean linear algebra for NVIDIA CUDA, OpenCL and CPU computations
An introduction to learning CUDA programming
BQN virtual machine
SDK for GPU accelerated genome assembly and analysis
Some CUDA design patterns and a bit of template magic for CUDA