Decoding Attention is specially optimized for multi-head attention (MHA) using CUDA cores for the decoding stage of LLM inference.
BSD-3-Clause License
Object tracking using yolov8, fast-reid, and deepsort
CUDA C++ Core Libraries
A Rust library integrated with ONNXRuntime, providing a collection of Computer Vision and Vision-L...
Low-latency CUDA JPEG decoder by parallelizing Huffman decoding
Implementation of the Apriori and Eclat algorithms, two of the best-known basic algorithms for mi...
Instant-ngp in pytorch+cuda trained with pytorch-lightning (high quality with high speed, with on...
Superfast CUDA implementation of Word2Vec and Latent Dirichlet Allocation (LDA)
🎉 Modern CUDA Learn Notes with PyTorch: fp32/tf32, fp16/bf16, fp8/int8, flash_attn, rope, sgemm, ...
Real-time dense visual SLAM system
Real-time large-scale dense visual SLAM system
A high-performance inference system for large language models, designed for production environments.
A curated list of awesome GPGPU (CUDA/OpenCL/Vulkan) resources
Some CUDA design patterns and a bit of template magic for CUDA
LLaMa 7b with CUDA acceleration implemented in Rust. Minimal GPU memory needed!
An architecture for continual learning and long-term memory in LLMs
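One of the entries above mentions the Apriori algorithm for mining frequent itemsets. As a rough illustration of the idea behind that entry (not the linked project's actual code), here is a minimal sketch in Python: count k-itemsets level by level, and prune any candidate whose (k-1)-subsets are not all frequent.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Minimal Apriori sketch: return all itemsets appearing in at
    least `min_support` transactions, mapped to their support counts.

    transactions: list of sets of items
    min_support: minimum absolute support (transaction count)
    """
    # Level 1: count individual items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        # Candidate generation: k-subsets of items seen in frequent sets.
        items = sorted(set().union(*frequent))
        candidates = {frozenset(c) for c in combinations(items, k)}
        # Apriori pruning: every (k-1)-subset must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in frequent
                             for s in combinations(c, k - 1))}
        # Count support with one scan over the transactions.
        counts = {c: sum(1 for t in transactions if c <= t)
                  for c in candidates}
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result
```

For example, `apriori([{'a','b'}, {'a','c'}, {'a','b','c'}, {'b'}], 2)` reports `{'a','b'}` and `{'a','c'}` as frequent pairs but drops `{'b','c'}`, which occurs only once. Eclat reaches the same result with a vertical layout (item-to-transaction-id sets intersected depth-first) instead of repeated horizontal scans.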