LLaMA Server combines the power of LLaMA C++ (via PyLLaMACpp) with the beauty of Chatbot UI.
UPDATE: Greatly simplified implementation thanks to the awesome Pythonic APIs of PyLLaMACpp 2.0.0!
UPDATE: Now supports better streaming through PyLLaMACpp!
UPDATE: Now supports streaming!
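Under the hood, the server feeds prompts to a LLaMA model through PyLLaMACpp and streams the generated tokens back to the UI. The sketch below shows the general idea, assuming PyLLaMACpp 2.x's `Model(model_path=...)` constructor and a generator-style `generate()`; it is not the actual llama-server code, so check the PyLLaMACpp documentation for the exact signatures.

```python
# Minimal streaming sketch with an assumed PyLLaMACpp 2.x API;
# verify the constructor and generate() parameters against the PyLLaMACpp docs.
from pyllamacpp.model import Model

model = Model(model_path="/path/to/your/models/7B/ggml-model-q4_0.bin")

# generate() is assumed to yield tokens one at a time, which is what makes
# streaming responses to a chat UI straightforward.
for token in model.generate("Name three South American camelids.", n_predict=64):
    print(token, end="", flush=True)
```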
Get your favorite LLaMA models (quantized GGML files such as `ggml-model-q4_0.bin`).
Create a `models.yml` file to provide your `model_home` directory and add your favorite South American camelids, e.g.:

```yaml
model_home: /path/to/your/models
models:
  llama-7b:
    name: LLAMA-7B
    path: 7B/ggml-model-q4_0.bin  # relative to `model_home` or an absolute path
```
See `models.yml` for an example.
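For illustration, a configuration like the one above could be loaded with PyYAML along these lines. This is a hypothetical sketch rather than llama-server's actual loading code, and the helper name `resolve_model_path` is made up for the example.

```python
# Hypothetical sketch of reading a models.yml like the one above;
# llama-server's real configuration handling may differ.
import os
import yaml  # PyYAML

def resolve_model_path(config: dict, model_id: str) -> str:
    """Return a model's absolute path, resolving relative paths against model_home."""
    entry = config["models"][model_id]
    path = entry["path"]
    # `path` may be relative to `model_home` or already absolute.
    return path if os.path.isabs(path) else os.path.join(config["model_home"], path)

with open("models.yml") as f:
    config = yaml.safe_load(f)

print(resolve_model_path(config, "llama-7b"))
# -> /path/to/your/models/7B/ggml-model-q4_0.bin
```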
Set up a Python environment (e.g., with Conda):

```bash
conda create -n llama python=3.9
conda activate llama
```
Install LLaMA Server:

```bash
python -m pip install llama-server
```

Or install directly from the GitHub repository:

```bash
python -m pip install git+https://github.com/nuance1979/llama-server.git
```
Start LLaMA Server with your `models.yml` file:

```bash
llama-server --models-yml models.yml --model-id llama-7b
```
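Since Chatbot UI is built around the OpenAI chat API, the server presumably exposes a compatible endpoint that you can also query directly. The sketch below assumes the server listens on `http://localhost:8000` and serves a `/v1/chat/completions` route; both are assumptions for illustration, so check the server's startup output for the actual address and routes.

```python
# Hypothetical direct query against the server's OpenAI-compatible API.
# The port (8000) and route (/v1/chat/completions) are assumptions; adjust
# them to whatever llama-server actually reports on startup.
import requests

resp = requests.post(
    "http://localhost:8000/v1/chat/completions",
    json={
        "model": "llama-7b",
        "messages": [{"role": "user", "content": "Say hello in one sentence."}],
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```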
Check out the `llama` branch of the Chatbot UI fork and start the app:

```bash
git clone https://github.com/nuance1979/chatbot-ui
cd chatbot-ui
git checkout llama
npm i
npm run dev
```
Create a `.env.local` file and restart:

```bash
cp .env.local.example .env.local
<edit .env.local to add your OPENAI_API_KEY>
```
Play with a different model by restarting the server with another `model_id`:

```bash
llama-server --models-yml models.yml --model-id llama-13b  # or any `model_id` defined in `models.yml`
```

Toggle streaming for the UI with the `LLAMA_STREAM_MODE` environment variable before restarting it:

```bash
export LLAMA_STREAM_MODE=0  # 1 to enable streaming
npm run dev
```
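When streaming is enabled, an OpenAI-style endpoint typically sends the reply as a series of `data:` chunks over server-sent events. A client could consume them roughly as follows, again assuming the hypothetical address, route, and response format from the sketch above.

```python
# Hypothetical streaming consumer; assumes OpenAI-style SSE chunks from the
# same (assumed) /v1/chat/completions endpoint as above.
import json
import requests

with requests.post(
    "http://localhost:8000/v1/chat/completions",
    json={
        "model": "llama-13b",
        "messages": [{"role": "user", "content": "Tell me a short llama fact."}],
        "stream": True,
    },
    stream=True,
    timeout=120,
) as resp:
    for line in resp.iter_lines():
        if not line or not line.startswith(b"data: "):
            continue
        payload = line[len(b"data: "):]
        if payload == b"[DONE]":
            break
        chunk = json.loads(payload)
        print(chunk["choices"][0]["delta"].get("content", ""), end="", flush=True)
```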
I am not fluent in JavaScript at all, but I was able to make the changes in Chatbot UI by chatting with ChatGPT; no more Stack Overflow.