Speed Benchmarking 7B LLM on different gcloud VMs
An innovative library for efficient LLM inference via low-bit quantization
LLM-Inference-Bench
An open-source, cloud-native serving framework for large multi-modal models (LMMs).
A Chinese-language tutorial on deploying large language models locally
Access 14k+ open source AI models across 30+ tasks with the Bytez inference API ✨
A high-throughput and memory-efficient inference and serving engine for LLMs
This repository demonstrates how to do inference with llama-2-7b-chat using llama.cpp on a machine...
Run any Large Language Model behind a unified API
Practical Llama 3 inference in Java
Deploying Qwen2 (or any other GGUF models) into AWS Lambda
The open-source community's first downloadable, runnable Chinese LLaMA2 model!
Train Llama 2 & 3 on the SQuAD v2 task as an example of how to specialize a generalized (foundational) model.
LLM Benchmark for Throughput via Ollama (Local LLMs)
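The throughput such a benchmark reports is typically generated tokens divided by generation time. As a minimal sketch (not the repository's actual code), Ollama's `/api/generate` endpoint returns `eval_count` (tokens generated) and `eval_duration` (nanoseconds), from which tokens/sec follows directly; the model name and localhost endpoint below are assumptions:

```python
import json
import urllib.request

# Default local Ollama endpoint (assumption: a server is running here).
OLLAMA_URL = "http://localhost:11434/api/generate"


def tokens_per_second(eval_count: int, eval_duration_ns: int) -> float:
    """Throughput: tokens generated divided by generation time in seconds."""
    return eval_count / (eval_duration_ns / 1e9)


def benchmark(model: str, prompt: str) -> float:
    """Request one non-streamed completion from a local Ollama server and
    compute throughput from the eval_count / eval_duration response fields."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return tokens_per_second(body["eval_count"], body["eval_duration"])


# Example (requires a running server): benchmark("llama3", "Why is the sky blue?")
```

Averaging over several prompts and discarding the first (cold-start) run gives a more stable number.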
Chatbot Builds
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.