CUDA Ecosystem

CUDA® is a parallel computing platform and programming model developed by NVIDIA for general-purpose computing on graphics processing units (GPUs). With CUDA, developers can dramatically speed up computing applications by harnessing the power of GPUs.

Community Repos

Created by: NVIDIA

Released: June 23, 2007

Keywords

gpu (159), python (89), pytorch (69), cpp (58), nvidia (57), machine-learning (50), deep-learning (48), opencl (36), gpgpu (31), rust (31)

Languages

C++ (132), Python (122), Cuda (63), Rust (30), Jupyter Notebook (26), C (26), Shell (16), Dockerfile (12), Julia (11), Go (6)

Licenses

MIT (158), APACHE-2.0 (73), OTHER (55), GPL-3.0 (41), BSD-3-CLAUSE (24), GPL-2.0 (6), BSD-2-CLAUSE (5), CC0-1.0 (4), LGPL-3.0 (4), AGPL-3.0 (2)

https://github.com/SamuraiBUPT/CUDA_Code

Code for learning CUDA

11 Aug 2024 2

https://github.com/LambdaLabsML/distributed-training-guide

Best practices & guides on how to write distributed pytorch training code

31 Jul 2024 190

https://github.com/Kentakoong/mtnlog

A simple multinode performance logger for Python

29 Jul 2024 0

https://github.com/neoheartbeats/neoheartbeats-kernel

An architecture for LLMs' continual-learning and long-term memories

26 Jul 2024 4

PyAV-CUDA

Extension of PyAV (ffmpeg bindings) with hardware decoding support

15 Jul 2024 0

tinyGPUlang

Tutorial on building a GPU compiler backend in LLVM

14 Jul 2024 7

whisper-onnx-python

A low-footprint GPU accelerated Speech to Text Python package for the Jetpack 5 era bolstered by an optimized graph

24 Jun 2024 1

QuickCluster

A k-means implementation in C++ with Python bindings and GPU acceleration

11 Apr 2024 6

https://github.com/jamjamjon/usls

A Rust library integrated with ONNXRuntime, providing a collection of Computer Vision and Vision-Language models

29 Mar 2024 34

https://github.com/viktour19/culingam

CULiNGAM accelerates LiNGAM analysis on GPUs

07 Feb 2024 7

jpeggpu

Low-latency CUDA JPEG decoder by parallelizing Huffman decoding

20 Nov 2023 5

ezlocalai

ezlocalai is an easy to set up local artificial intelligence server with OpenAI Style Endpoints

02 Oct 2023 72

https://github.com/TensorBFS/CuTropicalGEMM.jl

The fastest Tropical number matrix multiplication on GPU

29 Jun 2023 1

RadonKA.jl

A simple yet sufficiently fast (attenuated) Radon and backproject implementation using KernelAbstractions

20 Jun 2023 7

llama-dfdx

LLaMA 7B with CUDA acceleration, implemented in Rust

28 Apr 2023 100

https://github.com/mantasu/glasses-detector

Glasses detection, classification and segmentation

06 Mar 2023 49

autoregressive-linear-attention-cuda

CUDA implementation of autoregressive linear attention, with all the latest research findings

07 Feb 2023 43

https://github.com/romnn/nvbit-rs

Rust bindings to the NVIDIA NVBIT binary instrumentation API

08 Nov 2022 2

dockerdl

Deep Learning Docker Image

13 Sep 2022 73

EasyAI

Make your own AI easily!

19 Aug 2022 2