Implementation of N-Grammer, augmenting Transformers with latent n-grams, in Pytorch
A simple and working implementation of Electra, the fastest way to pretrain language models from scratch, in Pytorch (a replaced-token-detection sketch follows this list).
Plug and Play Language Model implementation. Allows steering the topic and attributes of GPT-2 models.
Implementation of RQ Transformer, proposed in the paper "Autoregressive Image Generation using Residual Quantization"
Pipeline for training Language Models using PyTorch.
Adversarially Learned Inference in Pytorch
Implementation of Hourglass Transformer (from Google and OpenAI), in Pytorch
Minimalistic large language model 3D-parallelism training
A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
Modular Python implementation of encoder-only, decoder-only and encoder-decoder transformer architectures
Implementation of Fast Transformer in Pytorch
Implementation of H-Transformer-1D, Hierarchical Attention for Sequence Learning
An implementation of masked language modeling for Pytorch, made as concise and simple as possible (see the masking sketch after this list)
Implementation of Feedback Transformer in Pytorch
Temporarily remove unused tokens during training to save RAM and speed up training (see the sketch at the end of this list).
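
Electra's pretraining speed comes from replaced token detection: a small generator fills in masked positions and a discriminator classifies every token as original or replaced, so the training signal covers the whole sequence rather than only the ~15% of masked positions. Below is a minimal sketch of the discriminator objective, assuming `generator` and `discriminator` are user-supplied modules; the names and token ids are illustrative, not the electra-pytorch API:

    import torch
    import torch.nn.functional as F

    def electra_discriminator_loss(generator, discriminator, tokens,
                                   mask_prob=0.15, mask_id=103):
        # corrupt the input as in masked language modeling
        mask = torch.rand(tokens.shape, device=tokens.device) < mask_prob
        masked = tokens.masked_fill(mask, mask_id)

        # a small generator proposes plausible replacements for the masked slots
        # (frozen here for brevity; the full recipe trains it jointly with an MLM loss)
        with torch.no_grad():
            gen_logits = generator(masked)                      # (batch, seq, vocab)
            sampled = torch.distributions.Categorical(logits=gen_logits).sample()
        corrupted = torch.where(mask, sampled, tokens)

        # the discriminator predicts, for EVERY position, original vs. replaced
        disc_logits = discriminator(corrupted).squeeze(-1)      # (batch, seq)
        is_replaced = (corrupted != tokens).float()
        return F.binary_cross_entropy_with_logits(disc_logits, is_replaced)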
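
The heart of masked language modeling is the corruption step: select roughly 15% of positions, apply BERT's 80/10/10 split between [MASK], a random token, and leaving the token unchanged, and train the model to recover the originals only at the selected positions. A minimal sketch under those assumptions; `mask_id` and the model's output shape are placeholders, not taken from the repo:

    import torch
    import torch.nn.functional as F

    def mlm_loss(model, tokens, vocab_size, mask_id=103, mask_prob=0.15):
        labels = tokens.clone()
        mask = torch.rand(tokens.shape, device=tokens.device) < mask_prob
        labels[~mask] = -100                                  # score only masked positions

        corrupted = tokens.clone()
        r = torch.rand(tokens.shape, device=tokens.device)
        corrupted[mask & (r < 0.8)] = mask_id                 # 80%: [MASK]
        rand_slot = mask & (r >= 0.8) & (r < 0.9)             # 10%: random token
        corrupted[rand_slot] = torch.randint(vocab_size, tokens.shape,
                                             device=tokens.device)[rand_slot]
        # remaining 10% of masked positions keep the original token

        logits = model(corrupted)                             # (batch, seq, vocab)
        return F.cross_entropy(logits.view(-1, vocab_size), labels.view(-1),
                               ignore_index=-100)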
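
One way to read the unused-token trick above: when a corpus touches only a subset of the vocabulary, the embedding table can be shrunk to the rows that actually occur, ids remapped to the compact range, and the updated rows scattered back afterwards. The sketch below is an assumption about the mechanism, not the repo's actual code:

    import torch

    def shrink_embedding(embedding, corpus_tokens):
        # keep only the vocabulary rows that actually occur in the corpus
        used = torch.unique(corpus_tokens)                    # sorted unique token ids
        small = torch.nn.Embedding(len(used), embedding.embedding_dim)
        small.weight.data.copy_(embedding.weight.data[used])

        # remap original ids to the compact range so batches index the small table
        remap = torch.full((embedding.num_embeddings,), -1,
                           dtype=torch.long, device=corpus_tokens.device)
        remap[used] = torch.arange(len(used), device=corpus_tokens.device)
        return small, remap[corpus_tokens], used

    # after training, scatter the updated rows back into the full table:
    # embedding.weight.data[used] = small.weight.data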