LLaMA Server combines the power of LLaMA C++ (via PyLLaMACpp) with the beauty of Chatbot UI.
UPDATE: Greatly simplified implementation thanks to the awesome Pythonic APIs of PyLLaMACpp 2.0.0!
UPDATE: Now supports better streaming through PyLLaMACpp!
UPDATE: Now supports streaming!
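Under the hood, the server feeds prompts to a LLaMA model through PyLLaMACpp and streams the generated tokens back to the UI. The sketch below shows the general idea, assuming PyLLaMACpp 2.x's `Model(model_path=...)` constructor and a generator-style `generate()`; it is not the actual llama-server code, so check the PyLLaMACpp documentation for the exact signatures.

```python
# Minimal streaming sketch with an assumed PyLLaMACpp 2.x API;
# verify the constructor and generate() parameters against the PyLLaMACpp docs.
from pyllamacpp.model import Model

model = Model(model_path="/path/to/your/models/7B/ggml-model-q4_0.bin")

# generate() is assumed to yield tokens one at a time, which is what makes
# streaming responses to a chat UI straightforward.
for token in model.generate("Name three South American camelids.", n_predict=64):
    print(token, end="", flush=True)
```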
Get your favorite LLaMA models (quantized GGML files such as `ggml-model-q4_0.bin`).
Create a `models.yml` file to provide your `model_home` directory and add your favorite South American camelids, e.g.:

```yaml
model_home: /path/to/your/models
models:
  llama-7b:
    name: LLAMA-7B
    path: 7B/ggml-model-q4_0.bin  # relative to `model_home` or an absolute path
```
See `models.yml` for an example.
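For illustration, a configuration like the one above could be loaded with PyYAML along these lines. This is a hypothetical sketch rather than llama-server's actual loading code, and the helper name `resolve_model_path` is made up for the example.

```python
# Hypothetical sketch of reading a models.yml like the one above;
# llama-server's real configuration handling may differ.
import os
import yaml  # PyYAML

def resolve_model_path(config: dict, model_id: str) -> str:
    """Return a model's absolute path, resolving relative paths against model_home."""
    entry = config["models"][model_id]
    path = entry["path"]
    # `path` may be relative to `model_home` or already absolute.
    return path if os.path.isabs(path) else os.path.join(config["model_home"], path)

with open("models.yml") as f:
    config = yaml.safe_load(f)

print(resolve_model_path(config, "llama-7b"))
# -> /path/to/your/models/7B/ggml-model-q4_0.bin
```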
Set up a Python environment (e.g., with Conda):

```bash
conda create -n llama python=3.9
conda activate llama
```
Install LLaMA Server:

```bash
python -m pip install llama-server
```

Or install directly from the GitHub repository:

```bash
python -m pip install git+https://github.com/nuance1979/llama-server.git
```
Start LLaMA Server with your `models.yml` file:

```bash
llama-server --models-yml models.yml --model-id llama-7b
```
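Since Chatbot UI is built around the OpenAI chat API, the server presumably exposes a compatible endpoint that you can also query directly. The sketch below assumes the server listens on `http://localhost:8000` and serves a `/v1/chat/completions` route; both are assumptions for illustration, so check the server's startup output for the actual address and routes.

```python
# Hypothetical direct query against the server's OpenAI-compatible API.
# The port (8000) and route (/v1/chat/completions) are assumptions; adjust
# them to whatever llama-server actually reports on startup.
import requests

resp = requests.post(
    "http://localhost:8000/v1/chat/completions",
    json={
        "model": "llama-7b",
        "messages": [{"role": "user", "content": "Say hello in one sentence."}],
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```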
Check out the `llama` branch of the Chatbot UI fork and start the app:

```bash
git clone https://github.com/nuance1979/chatbot-ui
cd chatbot-ui
git checkout llama
npm i
npm run dev
```
Create a `.env.local` file and restart:

```bash
cp .env.local.example .env.local
<edit .env.local to add your OPENAI_API_KEY>
```
Play with a different model by restarting the server with another `model_id`:

```bash
llama-server --models-yml models.yml --model-id llama-13b  # or any `model_id` defined in `models.yml`
```

Toggle streaming for the UI with the `LLAMA_STREAM_MODE` environment variable before restarting it:

```bash
export LLAMA_STREAM_MODE=0  # 1 to enable streaming
npm run dev
```
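When streaming is enabled, an OpenAI-style endpoint typically sends the reply as a series of `data:` chunks over server-sent events. A client could consume them roughly as follows, again assuming the hypothetical address, route, and response format from the sketch above.

```python
# Hypothetical streaming consumer; assumes OpenAI-style SSE chunks from the
# same (assumed) /v1/chat/completions endpoint as above.
import json
import requests

with requests.post(
    "http://localhost:8000/v1/chat/completions",
    json={
        "model": "llama-13b",
        "messages": [{"role": "user", "content": "Tell me a short llama fact."}],
        "stream": True,
    },
    stream=True,
    timeout=120,
) as resp:
    for line in resp.iter_lines():
        if not line or not line.startswith(b"data: "):
            continue
        payload = line[len(b"data: "):]
        if payload == b"[DONE]":
            break
        chunk = json.loads(payload)
        print(chunk["choices"][0]["delta"].get("content", ""), end="", flush=True)
```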
I am not fluent in JavaScript at all, but I was able to make the changes in Chatbot UI by chatting with ChatGPT; no more Stack Overflow.