RWKV-v2-RNN trained on the Pile. See https://github.com/BlinkDL/RWKV-LM for details.
LLaMA: Open and Efficient Foundation Language Models
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
ToolkenGPT: Augmenting Frozen Language Models with Massive Tools via Tool Embeddings - NeurIPS 20...
Home of StarCoder: fine-tuning & inference!
Some preliminary explorations of Mamba's context scaling.
Reproduce the results of the "Neuroevolution of Self-Interpretable Agents" paper.
Pipeline for training language models using PyTorch.
Unofficial PyTorch implementation of "Masked Autoencoders Are Scalable Vision Learners"
TensorFlow implementation of contextualized word representations from bi-directional language models
Home of StarCoder2!
Experiments for XLM-V Transformers Integration
Code for the paper "Fine-tune BERT for Extractive Summarization"
A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training