Decoding Attention is specially optimized for multi-head attention (MHA) using CUDA cores for the decoding stage of LLM inference
A high-performance inference system for large language models, designed for production environments
A simple profiler that counts NVIDIA PTX assembly instructions in OpenCL/SYCL/CUDA kernels for roofline model analysis
A highly optimised C++ library for mathematical applications and neural networks