CUDA Ecosystem

Community Repos: 5.3K
Experts: 549
Created by: NVIDIA
Released: June 23, 2007

QuickCluster

A k-means implementation in C++ with Python bindings and GPU acceleration

11 Apr 2024 6

https://github.com/jamjamjon/usls

A Rust library integrated with ONNXRuntime, providing a collection of Computer Vision and Vision-Language models

29 Mar 2024 34

https://github.com/viktour19/culingam

CULiNGAM accelerates LiNGAM analysis on GPUs

07 Feb 2024 7

jpeggpu

Low-latency CUDA JPEG decoder by parallelizing Huffman decoding

20 Nov 2023 5

https://github.com/BrosnanYuen/RayBNN_Optimizer

Gradient Descent Optimizers and Genetic Algorithms using GPUs, CPUs, and FPGAs via CUDA, OpenCL, and oneAPI

19 Oct 2023 2

docker-cuda-desktop

Ubuntu PyTorch CUDA Docker image with KDE Plasma Desktop & VNC

16 Oct 2023 11

https://github.com/NVIDIA/nvImageCodec

nvImageCodec: a library of GPU- and CPU-accelerated codecs featuring a unified interface

04 Oct 2023 58

ezlocalai

ezlocalai is an easy-to-set-up local artificial intelligence server with OpenAI-style endpoints

02 Oct 2023 72

https://github.com/MrNeRF/gaussian-splatting-cuda

3D Gaussian Splatting, reimagined: Unleashing unmatched speed with C++ and CUDA from the ground up!

30 Jul 2023 862

ScaleLLM

A high-performance inference system for large language models, designed for production environments

24 Jul 2023 289

bmf

Cross-platform, customizable multimedia/video processing framework

15 Jul 2023 773

https://github.com/TensorBFS/CuTropicalGEMM.jl

The fastest Tropical number matrix multiplication on GPU

29 Jun 2023 1

RadonKA.jl

A simple yet sufficiently fast (attenuated) Radon transform and backprojection implementation using KernelAbstractions

20 Jun 2023 7

llama-dfdx

LLaMA 7B with CUDA acceleration, implemented in Rust

28 Apr 2023 100

https://github.com/mantasu/glasses-detector

Glasses detection, classification and segmentation

06 Mar 2023 49

autoregressive-linear-attention-cuda

CUDA implementation of autoregressive linear attention, with all the latest research findings

07 Feb 2023 43

willow-inference-server

Open-source, local, self-hosted, highly optimized language inference server supporting ASR/STT, TTS, and LLM across WebRTC, REST, and WS

03 Feb 2023 375

PTXprofiler

A simple profiler that counts NVIDIA PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis

11 Jan 2023 37

https://github.com/DefTruth/CUDA-Learn-Notes

🎉 Modern CUDA Learn Notes with PyTorch: fp32/tf32, fp16/bf16, fp8/int8, flash_attn, rope, sgemm, sgemv, warp/block reduce, dot, elementwise, softmax, layernorm, rmsnorm

17 Dec 2022 1,308

https://github.com/romnn/nvbit-rs

Rust bindings to the NVIDIA NVBIT binary instrumentation API

08 Nov 2022 2