A fast llama2 decoder in pure Rust.
Yet another `llama.cpp` Rust wrapper
LLaMA 7B with CUDA acceleration implemented in Rust. Minimal GPU memory needed!
An ecosystem of Rust libraries for working with large language models
Unofficial Python bindings for the Rust `llm` library. 🐍❤️🦀
Efficient platform for inference and serving of local LLMs, including an OpenAI-compatible API server.