Code for "Towards Binary-Valued Gates for Robust LSTM Training".
Video+code lecture on building nanoGPT from scratch
GPT, but made only out of MLPs
Official repository of the xLSTM.
Sequence to Sequence from Scratch Using PyTorch
Implementation of gMLP, an all-MLP replacement for Transformers, in PyTorch
A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
GLM-130B: An Open Bilingual Pre-Trained Model (ICLR 2023)
Code for training and evaluation of the model from "Language Generation with Recurrent Generative...
TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones
Pipeline for training Language Models using PyTorch.
Implementation of Gated State Spaces, from the paper "Long Range Language Modeling via Gated Stat...
minichatgpt - To Train ChatGPT In 5 Minutes
Plug and Play Language Model implementation. Allows steering the topic and attributes of GPT-2 models.
Minimal, clean example of LSTM neural network training in Python, for learning purposes.