Find better generation parameters for your LLM
LongLLaMA is a large language model capable of handling long contexts. It is based on OpenLLaMA a...
My Digital Palace - A Personal Journal for Reflection - A place to store all my thoughts
Chain-of-Hindsight, A Scalable RLHF Method
Ongoing research training transformer models at scale
Home of StarCoder: fine-tuning & inference!
This includes the original implementation of SELF-RAG: Learning to Retrieve, Generate and Critiqu...
Home of StarCoder2!
Embed arbitrary modalities (images, audio, documents, etc.) into large language models.
LangChain chat model abstractions for dynamic failover, load balancing, chaos engineering, and more!
Transformer with Untied Positional Encoding (TUPE). Code of paper "Rethinking Positional Encoding...
Unsupervised Language Modeling at scale for robust sentiment classification
Implementations of Transformer models for training from scratch.
Hybrid Discriminative-Generative Training via Contrastive Learning
Generate textbook-quality synthetic LLM pretraining data