CUDA Ecosystem

CUDA® is a parallel computing platform and programming model developed by NVIDIA for general-purpose computing on graphics processing units (GPUs). With CUDA, developers can dramatically speed up compute-intensive applications by offloading parallel work to the GPU.

Created by: NVIDIA

Released: June 23, 2007

Keywords

gpu (159), python (89), pytorch (69), cpp (58), nvidia (57), machine-learning (50), deep-learning (48), opencl (36), gpgpu (31), rust (31)

Languages

C++ (132), Python (122), Cuda (63), Rust (30), Jupyter Notebook (26), C (26), Shell (16), Dockerfile (12), Julia (11), Go (6)

Licenses

MIT (158), APACHE-2.0 (73), OTHER (55), GPL-3.0 (41), BSD-3-CLAUSE (24), GPL-2.0 (6), BSD-2-CLAUSE (5), CC0-1.0 (4), LGPL-3.0 (4), AGPL-3.0 (2)

https://github.com/MAJ0RRR/parallel-processing-cpu-and-gpu-env-and-lib-with-powercap

(2024/2025) A library and environment for parallel processing in a power-limited CPU+GPU cluster environment

16 Aug 2024 2

https://github.com/Bruce-Lee-LY/decoding_attention

Decoding Attention is specially optimized for multi-head attention (MHA), using CUDA cores for the decoding stage of LLM inference

14 Aug 2024 14

https://github.com/LambdaLabsML/distributed-training-guide

Best practices & guides on how to write distributed PyTorch training code

31 Jul 2024 190

QuickCluster

A k-means implementation in C++ with Python bindings and GPU acceleration

11 Apr 2024 6

https://github.com/BrosnanYuen/RayBNN_Optimizer

Gradient Descent Optimizers and Genetic Algorithms using GPUs, CPUs, and FPGAs via CUDA, OpenCL, and oneAPI

19 Oct 2023 2

docker-cuda-desktop

Ubuntu PyTorch CUDA Docker image with KDE Plasma Desktop & VNC

16 Oct 2023 11

https://github.com/NVIDIA/nvImageCodec

nvImageCodec: a library of GPU- and CPU-accelerated codecs featuring a unified interface

04 Oct 2023 58

ScaleLLM

A high-performance inference system for large language models, designed for production environments

24 Jul 2023 289

bmf

Cross-platform, customizable multimedia/video processing framework

15 Jul 2023 773

RadonKA.jl

A simple yet sufficiently fast (attenuated) Radon transform and backprojection implementation using KernelAbstractions

20 Jun 2023 7

https://github.com/mantasu/glasses-detector

Glasses detection, classification and segmentation

06 Mar 2023 49

PTXprofiler

A simple profiler to count NVIDIA PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofline model analysis

11 Jan 2023 37

cudarc

Safe Rust wrapper around the CUDA toolkit

16 Sep 2022 597

squad-mortar-helper

💣 SMH – a computer vision project for automatic, precision mortar strike calculations in Squad

24 May 2022 19

hpc

My experiments with MPI and OpenMP

04 May 2022 3

blake3

SYCL-accelerated BLAKE3 hash implementation

06 Jan 2022 10

https://github.com/jatinx/PyHIP

Python interface to the HIP and hiprtc libraries

31 Oct 2021 6

https://github.com/UpsettingBoy/gpgpu-rs

Simple experimental async GPGPU framework for Rust

12 Aug 2021 145

gpufetch

Simple yet fancy GPU architecture fetching tool

10 Aug 2021 133

librapid

A highly optimised C++ library for mathematical applications and neural networks

25 May 2021 163