FastMLX is a high-performance, production-ready API for hosting MLX models.
Provides an enterprise-grade LLM-based development framework, tools, and fine-tuned models.
MiniCPM-Llama3-V 2.5: A GPT-4V Level Multimodal LLM on Your Phone
Run any Large Language Model behind a unified API
Chat language model that can use tools and interpret the results
RayLLM - LLMs on Ray
LLaMA: Open and Efficient Foundation Language Models
The official repo of the Qwen (通义千问) chat and pretrained large language models proposed by Alibaba Cloud.
An all-in-one LLM chat UI for Apple Silicon Macs, built on the MLX framework.
SGLang is a structured generation language designed for large language models (LLMs). It makes yo...
Python bindings for llama.cpp
Plugin for LLM adding support for the GPT4All collection of models
MLX-Embeddings is the best package for running Vision and Language Embedding models locally on yo...
Explore large language models in 512MB of RAM