Implementation of OmniNet, Omnidirectional Representations from Transformers, in Pytorch
Implementation of the Equiformer, SE3/E3 equivariant attention network that reaches new SOTA, and...
Implementation of RETRO, DeepMind's retrieval-based attention net, in Pytorch
Implementation of gMLP, an all-MLP replacement for Transformers, in Pytorch
An implementation of Performer, a linear attention-based transformer, in Pytorch
Implementation of Fast Transformer in Pytorch
Implementation of the 😇 Attention layer from the paper, Scaling Local Self-Attention For Paramete...
Implementation of Block Recurrent Transformer, in Pytorch
(Unofficial) Implementation of dilated attention from "LongNet: Scaling Transformers to 1,000,000...
Implementation of Make-A-Video, new SOTA text-to-video generator from Meta AI, in Pytorch
An implementation of local windowed attention for language modeling
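The core idea behind local windowed attention is simple: each token attends only to a fixed-size window of preceding tokens instead of the full sequence, reducing cost from quadratic to linear in sequence length. A minimal NumPy sketch of the technique (not the repo's actual API; the function name and causal-window choice here are illustrative assumptions):

```python
import numpy as np

def local_windowed_attention(q, k, v, window):
    # q, k, v: (seq, dim). Each position attends only to itself and the
    # previous `window - 1` positions (a causal local window) -- an
    # illustrative sketch, not the repository's implementation.
    seq, dim = q.shape
    scores = q @ k.T / np.sqrt(dim)
    i = np.arange(seq)[:, None]
    j = np.arange(seq)[None, :]
    mask = (j <= i) & (j > i - window)          # causal window mask
    scores = np.where(mask, scores, -np.inf)    # block out-of-window pairs
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((8, 4)) for _ in range(3))
out = local_windowed_attention(q, k, v, window=3)  # out.shape == (8, 4)
```

Since position 0 can only attend to itself, its output row equals `v[0]` exactly; later rows mix at most `window` value vectors.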
Implementation of Perceiver, General Perception with Iterative Attention, in Pytorch
Implementation of a Transformer, but completely in Triton
Implementation of Perceiver AR, DeepMind's new long-context attention network based on Perceiver ...
Implementation of H-Transformer-1D, Hierarchical Attention for Sequence Learning
Implementation of the Point Transformer layer, in Pytorch