Implementation of Infini-Transformer in Pytorch
Implementation of RQ Transformer, proposed in the paper "Autoregressive Image Generation using Residual Quantization"
Implementation of Long-Short Transformer, combining local and global inductive biases for attention
Implementation of H-Transformer-1D, Hierarchical Attention for Sequence Learning
Implementation of Perceiver AR, DeepMind's new long-context attention network based on Perceiver ...
Implementation of Memorizing Transformers (ICLR 2022), attention net augmented with indexing and retrieval of memories using approximate nearest neighbors
A variant of Transformer-XL where the memory is updated not with a queue, but with attention
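The attention-based memory update described above can be sketched as follows. This is a minimal illustration, not code from the repository: the module name `AttentionMemoryUpdate` and its shapes are my own assumptions. The idea is that instead of enqueueing the newest hidden states and evicting the oldest (Transformer-XL's queue), the memory slots themselves attend over the old memory plus the new hidden states and are updated residually.

```python
import torch
from torch import nn

class AttentionMemoryUpdate(nn.Module):
    """Hypothetical sketch: memory slots update themselves by attending
    over the concatenation of the old memory and new hidden states,
    rather than being replaced FIFO-style as in Transformer-XL."""

    def __init__(self, dim):
        super().__init__()
        self.scale = dim ** -0.5
        self.to_q = nn.Linear(dim, dim, bias=False)
        self.to_kv = nn.Linear(dim, dim * 2, bias=False)

    def forward(self, memory, hiddens):
        # memory:  (batch, num_mem_slots, dim)
        # hiddens: (batch, seq_len, dim) - hidden states of the current segment
        context = torch.cat((memory, hiddens), dim=1)

        q = self.to_q(memory)                        # memory slots are the queries
        k, v = self.to_kv(context).chunk(2, dim=-1)  # old memory + new hiddens are keys/values

        attn = (q @ k.transpose(-1, -2) * self.scale).softmax(dim=-1)

        # residual update: memory is refined, not overwritten
        return memory + attn @ v
```

Because the update is a (differentiable) attention step rather than a queue operation, the model can learn which parts of the incoming segment to fold into each memory slot.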
Implementation of Block Recurrent Transformer - Pytorch
Implementation of Memformer, a Memory-augmented Transformer, in Pytorch
An implementation of local windowed attention for language modeling
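Local windowed attention as described above can be sketched in a few lines. This is a simplified illustration under stated assumptions (non-overlapping windows, sequence length divisible by the window size, no cross-window lookback), not the repository's implementation:

```python
import torch

def local_windowed_attention(x, window_size):
    """Sketch of causal local attention: tokens attend only to earlier
    tokens inside their own non-overlapping window.
    Assumes seq_len is divisible by window_size."""
    b, n, d = x.shape
    w = x.view(b, n // window_size, window_size, d)  # bucket sequence into windows

    # attention scores within each window
    scores = torch.einsum('bwid,bwjd->bwij', w, w) * (d ** -0.5)

    # causal mask: each position sees only itself and earlier positions in the window
    causal = torch.triu(torch.ones(window_size, window_size, dtype=torch.bool), diagonal=1)
    scores = scores.masked_fill(causal, float('-inf'))

    attn = scores.softmax(dim=-1)
    out = torch.einsum('bwij,bwjd->bwid', attn, w)
    return out.reshape(b, n, d)
```

Restricting attention to a window reduces the cost from O(n²) to O(n · window_size); practical implementations typically also let each window attend to the previous one so information can flow across bucket boundaries.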
An implementation of Transformer with Expire-Span, a circuit for learning which memories to retain
Implementation of Feedback Transformer in Pytorch
Implementation of Hierarchical Transformer Memory (HTM) for Pytorch
Implementation of Lie Transformer, Equivariant Self-Attention, in Pytorch
Implementation of Fast Transformer in Pytorch
Implementation of MEGABYTE, Predicting Million-byte Sequences with Multiscale Transformers, in Pytorch