LLaMA Server

LLaMA Server combines the power of LLaMA C++ (via PyLLaMACpp) with the beauty of Chatbot UI.

MIT License


UPDATE: Greatly simplified implementation thanks to the awesome Pythonic APIs of PyLLaMACpp 2.0.0!

UPDATE: Now supports better streaming through PyLLaMACpp!

UPDATE: Now supports streaming!

Demo

  • Better Streaming

https://user-images.githubusercontent.com/10931178/231539194-052f7c5f-c7a3-42b7-9f8b-142422e42a67.mov

  • Streaming

https://user-images.githubusercontent.com/10931178/229980159-61546fa6-2985-4cdc-8230-5dcb6a69c559.mov

  • Non-streaming

https://user-images.githubusercontent.com/10931178/229408428-5b6ef72d-28d0-427f-ae83-e23972e2dcff.mov

Setup

  • Get your favorite LLaMA models:

    • Download them from Hugging Face;
    • Or follow the instructions at LLaMA C++;
    • Make sure the models are converted and quantized;
  • Create a models.yml file to provide your model_home directory and add your favorite South American camelids, e.g.:

model_home: /path/to/your/models
models:
  llama-7b:
    name: LLAMA-7B
    path: 7B/ggml-model-q4_0.bin  # relative to `model_home` or an absolute path

See models.yml for an example.
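As noted in the example, a model's `path` may be relative to `model_home` or absolute. A minimal sketch of how that resolution could work (the dict below mirrors the example config; `resolve_model_path` is an illustrative helper, not part of llama-server's API):

```python
from pathlib import Path

# Hypothetical parsed form of models.yml (paths are examples only).
config = {
    "model_home": "/path/to/your/models",
    "models": {
        "llama-7b": {"name": "LLAMA-7B", "path": "7B/ggml-model-q4_0.bin"},
        "llama-13b": {"name": "LLAMA-13B", "path": "/abs/13B/ggml-model-q4_0.bin"},
    },
}

def resolve_model_path(config: dict, model_id: str) -> Path:
    """Join relative entries to model_home; use absolute entries as-is."""
    path = Path(config["models"][model_id]["path"])
    if path.is_absolute():
        return path
    return Path(config["model_home"]) / path

print(resolve_model_path(config, "llama-7b"))
# /path/to/your/models/7B/ggml-model-q4_0.bin
```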

  • Set up a Python environment:
conda create -n llama python=3.9
conda activate llama
  • Install LLaMA Server:

    • From PyPI:
    python -m pip install llama-server
    
    • Or from source:
    python -m pip install git+https://github.com/nuance1979/llama-server.git
    
  • Start LLaMA Server with your models.yml file:

llama-server --models-yml models.yml --model-id llama-7b
  • Check out my fork of Chatbot UI and start the app;
git clone https://github.com/nuance1979/chatbot-ui
cd chatbot-ui
git checkout llama
npm i
npm run dev
  • Open the link http://localhost:3000 in your browser;
    • Click "OpenAI API Key" at the bottom left corner and enter your OpenAI API Key;
    • Or follow instructions at Chatbot UI to put your key into a .env.local file and restart;
    cp .env.local.example .env.local
    <edit .env.local to add your OPENAI_API_KEY>
    
  • Enjoy!
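Since Chatbot UI talks to the server in place of the OpenAI API, you can also query it directly. A hedged sketch of what such a request might look like, assuming llama-server mimics the OpenAI-style chat completion API; the endpoint path and port below are illustrative guesses, not documented values:

```python
# Assumption: llama-server speaks the same OpenAI-style chat API that
# Chatbot UI expects; endpoint and port below are illustrative guesses.

def build_chat_request(prompt: str, model_id: str = "llama-7b",
                       stream: bool = True) -> dict:
    """Build an OpenAI-style chat completion payload."""
    return {
        "model": model_id,
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    }

payload = build_chat_request("Name three South American camelids.")

# Sending it requires a running server, so the call is left as a sketch:
# import json, urllib.request
# req = urllib.request.Request(
#     "http://localhost:8000/v1/chat/completions",
#     data=json.dumps(payload).encode(),
#     headers={"Content-Type": "application/json"},
# )
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode())
```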

More

  • Try a larger model if you have it:
llama-server --models-yml models.yml --model-id llama-13b  # or any `model_id` defined in `models.yml`
  • Try non-streaming mode by restarting Chatbot UI:
export LLAMA_STREAM_MODE=0  # 1 to enable streaming
npm run dev
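The Chatbot UI fork reads `LLAMA_STREAM_MODE` at startup, with `0` meaning off and `1` meaning on. A minimal Python analogue of that on/off convention (the helper is illustrative, not part of either codebase):

```python
import os

def stream_enabled(default: bool = True) -> bool:
    """Mirror the LLAMA_STREAM_MODE convention: '0' turns streaming off;
    anything else (or an unset variable) leaves it at the default."""
    value = os.environ.get("LLAMA_STREAM_MODE")
    if value is None:
        return default
    return value != "0"

os.environ["LLAMA_STREAM_MODE"] = "0"
print(stream_enabled())  # False
```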

Fun facts

I am not fluent in JavaScript at all, but I was able to make the changes in Chatbot UI by chatting with ChatGPT; no more Stack Overflow.
