CUDA Ecosystem

CUDA® is a parallel computing platform and programming model developed by NVIDIA for general-purpose computing on graphics processing units (GPUs). With CUDA, developers can dramatically speed up computing applications by harnessing the power of GPUs.

Community Repos: 5.3K

Experts: 549

Created by: NVIDIA

Released: June 23, 2007

QuickCluster

A KMeans implemented in C++ with Python bindings and GPU acceleration

11 Apr 2024 6

https://github.com/jamjamjon/usls

A Rust library integrated with ONNXRuntime, providing a collection of Computer Vision and Vision-Language models

29 Mar 2024 34

https://github.com/viktour19/culingam

CULiNGAM accelerates LiNGAM analysis on GPUs

07 Feb 2024 7

jpeggpu

Low-latency CUDA JPEG decoder by parallelizing Huffman decoding

20 Nov 2023 5

https://github.com/BrosnanYuen/RayBNN_Optimizer

Gradient Descent Optimizers and Genetic Algorithms using GPUs, CPUs, and FPGAs via CUDA, OpenCL, and oneAPI

19 Oct 2023 2

docker-cuda-desktop

Ubuntu PyTorch CUDA Docker image with KDE Plasma Desktop & VNC

16 Oct 2023 11

https://github.com/NVIDIA/nvImageCodec

nvImageCodec: a library of GPU- and CPU-accelerated codecs featuring a unified interface

04 Oct 2023 58

ezlocalai

ezlocalai is an easy to set up local artificial intelligence server with OpenAI Style Endpoints

02 Oct 2023 72

https://github.com/MrNeRF/gaussian-splatting-cuda

3D Gaussian Splatting, reimagined: Unleashing unmatched speed with C++ and CUDA from the ground up!

30 Jul 2023 862

ScaleLLM

A high-performance inference system for large language models, designed for production environments

24 Jul 2023 289

bmf

Cross-platform, customizable multimedia/video processing framework

15 Jul 2023 773

https://github.com/TensorBFS/CuTropicalGEMM.jl

The fastest Tropical number matrix multiplication on GPU

29 Jun 2023 1

RadonKA.jl

A simple yet sufficiently fast (attenuated) Radon transform and backprojection implementation using KernelAbstractions

20 Jun 2023 7

llama-dfdx

LLaMA 7B with CUDA acceleration, implemented in Rust

28 Apr 2023 100

https://github.com/mantasu/glasses-detector

Glasses detection, classification and segmentation

06 Mar 2023 49

autoregressive-linear-attention-cuda

CUDA implementation of autoregressive linear attention, with all the latest research findings

07 Feb 2023 43

willow-inference-server

Open source, local, and self-hosted highly optimized language inference server supporting ASR/STT, TTS, and LLM across WebRTC, REST, and WS

03 Feb 2023 375

PTXprofiler

A simple profiler to count NVIDIA PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis

11 Jan 2023 37

https://github.com/DefTruth/CUDA-Learn-Notes

🎉 Modern CUDA Learn Notes with PyTorch: fp32/tf32, fp16/bf16, fp8/int8, flash_attn, rope, sgemm, sgemv, warp/block reduce, dot, elementwise, softmax, layernorm, rmsnorm

17 Dec 2022 1,308

https://github.com/romnn/nvbit-rs

Rust bindings to the NVIDIA NVBIT binary instrumentation API

08 Nov 2022 2
