Gradient Descent Optimizers and Genetic Algorithms using GPUs, CPUs, and FPGAs via CUDA, OpenCL, and oneAPI
GPL-3.0 License
An architecture for LLMs' continual-learning and long-term memories
Sparse Boolean linear algebra for NVIDIA CUDA, OpenCL and CPU computations
Weighted MinHash implementation on CUDA (multi-GPU).
Programmable CUDA/C++ GPU Graph Analytics
A PyTorch-like dynamic computation graph and neural network framework implemented in NumPy (MLP, CNN, RNN, Transformer)
A simple yet sufficiently fast (attenuated) Radon and backproject implementation using KernelAbst...
A curated list of awesome GPGPU (CUDA/OpenCL/Vulkan) resources
This repository lists some awesome public Rust projects, videos, blogs, and jobs.
C++ library for solving large sparse linear systems with the algebraic multigrid method
An unofficial Julia wrapper for the RAPIDS.ai ecosystem using PythonCall.jl
CUDA C++ Core Libraries
3D Gaussian Splatting, reimagined: Unleashing unmatched speed with C++ and CUDA from the ground up!
The fastest Tropical number matrix multiplication on GPU
A Rust library integrated with ONNXRuntime, providing a collection of Computer Vision and Vision-L...
Some CUDA design patterns and a bit of template magic for CUDA