Open Source Ecosystems

Ollama works well for prompting a knowledge base offline, but at times it gets very slow, too slow to be useful in a chatbot app.
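As a rough illustration of what prompting a local model involves, here is a minimal sketch that talks to Ollama's REST API using only the standard library. It assumes an Ollama server is running on its default address (`http://localhost:11434`) and that the model has already been pulled. The helper names `build_generate_request` and `ask_ollama` are made up for this example and are not part of this project.

```python
import json
import urllib.request

# Assumption: Ollama is listening on its default local address.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model: str, prompt: str,
                           url: str = OLLAMA_URL) -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_ollama(model: str, prompt: str, timeout: float = 300.0) -> str:
    """Send the prompt to the local server and return the generated text.

    The generous timeout reflects how slow local generation can be.
    """
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["response"]
```

With the server running, a call such as `ask_ollama("mistral-nemo", "Summarize this document.")` would block until the full answer is generated, which is where the slowness shows up in a chatbot setting.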

Average response time per model, running locally:

llama-3-405b: 30 seconds
mistral-nemo: > 4 minutes

A cloud deployment offloads all of the computation.

Average response time per model in the cloud:

llama-3-405b-instruct: 3 seconds
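Averages like the ones listed here could be collected with a small timing helper. The sketch below is not the measurement code used for these numbers; it is one possible approach, and `average_response_time` and the `ask` callable are hypothetical names. `ask` stands in for whatever function sends a prompt to a model, local or cloud.

```python
import time
from statistics import mean
from typing import Callable, Iterable


def average_response_time(ask: Callable[[str], str],
                          prompts: Iterable[str]) -> float:
    """Return the mean wall-clock time in seconds that `ask` takes per prompt.

    Raises statistics.StatisticsError if `prompts` is empty.
    """
    durations = []
    for prompt in prompts:
        start = time.perf_counter()  # monotonic, high-resolution clock
        ask(prompt)                  # the response itself is discarded
        durations.append(time.perf_counter() - start)
    return mean(durations)
```

Because the backend is passed in as a plain callable, the same helper can time a local model and a cloud model with identical prompts, which keeps the comparison fair.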

The number of models to choose from in the cloud is impressive. Scaling gets expensive, but a free trial on IBM Cloud lets you test with small token counts.

Other available models:

Model	Path
MT0_XXL	bigscience/mt0-xxl
CODELLAMA_34B_INSTRUCT_HF	codellama/codellama-34b-instruct-hf
FLAN_T5_XL	google/flan-t5-xl
FLAN_T5_XXL	google/flan-t5-xxl
FLAN_UL2	google/flan-ul2
MERLINITE_7B	ibm-mistralai/merlinite-7b
GRANITE_13B_CHAT_V2	ibm/granite-13b-chat-v2
GRANITE_13B_INSTRUCT_V2	ibm/granite-13b-instruct-v2
GRANITE_20B_CODE_INSTRUCT	ibm/granite-20b-code-instruct
GRANITE_20B_MULTILINGUAL	ibm/granite-20b-multilingual
GRANITE_34B_CODE_INSTRUCT	ibm/granite-34b-code-instruct
GRANITE_3B_CODE_INSTRUCT	ibm/granite-3b-code-instruct
GRANITE_7B_LAB	ibm/granite-7b-lab
GRANITE_8B_CODE_INSTRUCT	ibm/granite-8b-code-instruct
LLAMA_2_13B_CHAT	meta-llama/llama-2-13b-chat
LLAMA_2_70B_CHAT	meta-llama/llama-2-70b-chat
LLAMA_3_405B_INSTRUCT	meta-llama/llama-3-405b-instruct
LLAMA_3_70B_INSTRUCT	meta-llama/llama-3-70b-instruct
LLAMA_3_8B_INSTRUCT	meta-llama/llama-3-8b-instruct
MISTRAL_LARGE	mistralai/mistral-large
MIXTRAL_8X7B_INSTRUCT_V01	mistralai/mixtral-8x7b-instruct-v01

Related Projects

nvim-llama

🦙 Ollama interfaces for Neovim

26 Aug 2023 251

hosting-7B-llm-on-google-cloud

Speed Benchmarking 7B LLM on different gcloud VMs

22 Jul 2024 0

open-llm-webui

This repository contains a web application designed to execute relatively compact, locally-operat...

17 May 2023 39

query-llm

Query LLM with Chain-of-Thought

22 Jun 2024 3

Play-with-LLMs

Tutorial on training, evaluating LLM, as well as utilizing RAG, Agent, Chain to build entertainin...

24 Jun 2023 510

aikit

🏗️ Fine-tune, build, and deploy open-source LLMs easily!

20 Sep 2023 164

docker-llama2-chat

Play LLaMA2 (official / Chinese version / INT4 / llama2.cpp) Together! ONLY 3 STEPS! ( non GPU / 5GB vRAM / 8...

19 Jul 2023 533

Llama-Chinese

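Llama Chinese community. Llama3 online demo and fine-tuned models are now available, the latest Llama3 learning resources are collected in real time, and all code has been updated for Llama3. Building the best Chinese Llama large model, fully open source and available for commercial use.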

19 Jul 2023 12,154

llama-server

LLaMA Server combines the power of LLaMA C++ with the beauty of Chatbot UI.

03 Apr 2023 111

docs

Access 14k+ open source AI models across 30+ tasks with the Bytez inference API ✨

28 May 2024 2

lightllm

LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable fo...

22 Jul 2023 1,967

libre-chat

🦙 Free and Open Source Large Language Model (LLM) chatbot web UI and API. Self-hosted, offline ca...

26 Jul 2023 128

Get-Things-Done-with-Prompt-Engineering-and-LangChain

LangChain & Prompt Engineering tutorials on Large Language Models (LLMs) such as ChatGPT with cus...

12 Apr 2023 1,094

easy-llms

Easy "1-line" calling of all LLMs from OpenAI, MS Azure, AWS Bedrock, GCP Vertex, and Ollama

05 Jun 2024 43

Local_LLM_Deployment_Guide_Chinese

A Chinese-language tutorial on deploying large language models locally

09 May 2024 25