Implementation of the GBST block from the Charformer paper, in PyTorch
MIT License
GLM (General Language Model)
Implementation of Fast Transformer in PyTorch
An implementation of masked language modeling for PyTorch, made as concise and simple as possible
Implementation of Hourglass Transformer, in PyTorch, from Google and OpenAI
Minimal PyTorch implementation of BM25 (with sparse tensors)
Implementation of Token Shift GPT - An autoregressive model that solely relies on shifting the se...
Implementation of MARGE, Pre-training via Paraphrasing, in PyTorch
Implementation of Feedback Transformer in PyTorch
Implementation of Parti, Google's pure attention-based text-to-image neural network, in PyTorch
Implementation of Block Recurrent Transformer - PyTorch
Implementation of MEGABYTE, Predicting Million-byte Sequences with Multiscale Transformers, in Py...
Implementation of MagViT2 Tokenizer in PyTorch
A simple and working implementation of Electra, the fastest way to pretrain language models from ...
Implementation of SoundStorm, Efficient Parallel Audio Generation from Google DeepMind, in PyTorch
Home of StarCoder2!