Run LLMs (Llama, Mamba, Nemo, Mistral) at native speeds from JavaScript and TypeScript.
Inference Llama 2 in one file of pure Zig
LLM inference in C/C++
JavaScript implementation of LiteLLM.
LLM-powered code documentation generation
Run large language models in Godot.
Claude Artifacts, but with Llama.
LLaMA 7B with CUDA acceleration implemented in Rust. Minimal GPU memory needed!
Simple LLM library for JavaScript
Run AI models locally on your machine with Node.js bindings for llama.cpp. Enforce a JSON schema ...
This repository contains a web application designed to execute relatively compact, locally-operat...
A multi-platform desktop application to evaluate and compare LLM models, written in Rust and React.
End-to-end scripting workflow to automatically generate show notes from audio/video transcripts w...
Tutorial on training, evaluating LLM, as well as utilizing RAG, Agent, Chain to build entertainin...
LLaMA Server combines the power of LLaMA C++ with the beauty of Chatbot UI.
WebAssembly binding for llama.cpp - Enabling in-browser LLM inference