Home of StarCoder: fine-tuning & inference!
Code for the paper "Fine-tune BERT for Extractive Summarization"
4-bit quantization of LLaMA using GPTQ
Running large language models on a single GPU for throughput-oriented scenarios.
A simple, performant, and scalable JAX LLM!
End-to-end recipes for optimizing diffusion models with torchao and diffusers (inference and FP8 ...
LLM inference benchmark
Home of StarCoder2!
Find better generation parameters for your LLM
Ongoing research training transformer language models at scale, including: BERT & GPT-2