Implementation of Perceiver AR, DeepMind's new long-context attention network based on the Perceiver architecture, in Pytorch
Implementation of Long-Short Transformer, combining local and global inductive biases for attenti...
Implementation of Block Recurrent Transformer - Pytorch
A simple cross attention that updates both the source and target in one step
Implementation of Hourglass Transformer, in Pytorch, from Google and OpenAI
An implementation of local windowed attention for language modeling
Implementation of RLHF (Reinforcement Learning from Human Feedback) on top of the PaLM architectu...
Implementation of gMLP, an all-MLP replacement for Transformers, in Pytorch
Implementation of OmniNet, Omnidirectional Representations from Transformers, in Pytorch
GPT, but made only out of MLPs
Memory optimization and training recipes to extrapolate language models' context length to 1 mill...
Implementation of Perceiver, General Perception with Iterative Attention, in TensorFlow
Implementation of Perceiver, General Perception with Iterative Attention, in Pytorch
Implementation of Feedback Transformer in Pytorch
(Unofficial) Implementation of dilated attention from "LongNet: Scaling Transformers to 1,000,000...
Unofficial implementation of Perceiver IO