KuiperLLama

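A great project for campus recruiting, fall and spring hiring, and internships: build an LLM inference framework with LLama support from scratch, hands-on.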

Stars

191

Ecosystems: Llama, Cuda

Related Projects

cuda-learning

An introduction to learning CUDA programming

02 Feb 2022 28

cccl

CUDA C++ Core Libraries

17 Sep 2020 743

awesome-gpgpu

A curated list of awesome GPGPU (CUDA/OpenCL/Vulkan) resources

20 Jun 2018 63

https://github.com/DefTruth/CUDA-Learn-Notes

🎉 Modern CUDA Learn Notes with PyTorch: fp32/tf32, fp16/bf16, fp8/int8, flash_attn, rope, sgemm, ...

17 Dec 2022 1,308

https://github.com/SamuraiBUPT/CUDA_Code

Codes for learning cuda. Implementation of multiple kernels.

11 Aug 2024 2

ScaleLLM

A high-performance inference system for large language models, designed for production environments.

24 Jul 2023 289

https://github.com/ACDSLab/MPPI-Generic

Templated C++/CUDA implementation of Model Predictive Path Integral Control (MPPI)

24 Jul 2024 19

https://github.com/lebedov/scikit-cuda

Python interface to GPU-powered libraries

27 Sep 2010 986

Sparky-2

A Discord bot running on llama.cpp with the Llama 3 model and image generation

27 Apr 2024 5

Infero

An easy-to-use, high-performance, CUDA-powered LLM inference library.

05 Jun 2024 12

https://github.com/neoheartbeats/neoheartbeats-kernel

An architecture for LLMs' continual-learning and long-term memories

26 Jul 2024 4

https://github.com/dancing-ui/uestc_vhm

Object tracking using yolov8, fast-reid, and deepsort

10 Aug 2024 6

https://github.com/sony/nnabla-ext-cuda

A CUDA Extension of Neural Network Libraries

21 Jun 2017 92

https://github.com/Bruce-Lee-LY/decoding_attention

Decoding Attention is specially optimized for multi head attention (MHA) using CUDA core for the ...

14 Aug 2024 14

llama-dfdx

LLaMa 7b with CUDA acceleration, implemented in Rust. Minimal GPU memory needed!

28 Apr 2023 100