Implementation of Nyström Self-attention, from the paper Nyströmformer
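As a rough illustration, a minimal single-head sketch of the Nyström approximation: landmark queries and keys are segment means, and torch.linalg.pinv stands in for the paper's iterative Moore-Penrose approximation. Function name and hyperparameters are illustrative, not the repository's API.

```python
import torch
import torch.nn.functional as F

def nystrom_attention(q, k, v, num_landmarks=64):
    # q, k, v: (batch, seq_len, dim); seq_len must divide evenly by num_landmarks
    b, n, d = q.shape
    scale = d ** -0.5
    # landmarks are segment means of the queries and keys
    q_land = q.reshape(b, num_landmarks, n // num_landmarks, d).mean(dim=2)
    k_land = k.reshape(b, num_landmarks, n // num_landmarks, d).mean(dim=2)
    # the three softmax kernels of the Nystrom approximation
    kernel_1 = F.softmax(q @ k_land.transpose(-1, -2) * scale, dim=-1)       # (b, n, m)
    kernel_2 = F.softmax(q_land @ k_land.transpose(-1, -2) * scale, dim=-1)  # (b, m, m)
    kernel_3 = F.softmax(q_land @ k.transpose(-1, -2) * scale, dim=-1)       # (b, m, n)
    # softmax(QK^T / sqrt(d)) is approximated by kernel_1 @ pinv(kernel_2) @ kernel_3
    return kernel_1 @ torch.linalg.pinv(kernel_2) @ (kernel_3 @ v)

out = nystrom_attention(torch.randn(2, 512, 64), torch.randn(2, 512, 64), torch.randn(2, 512, 64))
```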
Implementation of the Point Transformer layer, in Pytorch
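A minimal sketch of the layer's vector self-attention, computed over all pairs of points rather than the k nearest neighbors used in the paper; MLP sizes and module names are illustrative.

```python
import torch
import torch.nn as nn

class PointTransformerLayer(nn.Module):
    def __init__(self, dim, pos_hidden=64):
        super().__init__()
        self.to_q = nn.Linear(dim, dim)
        self.to_k = nn.Linear(dim, dim)
        self.to_v = nn.Linear(dim, dim)
        # theta: relative positional encoding delta_ij = theta(p_i - p_j)
        self.pos_mlp = nn.Sequential(nn.Linear(3, pos_hidden), nn.ReLU(), nn.Linear(pos_hidden, dim))
        # gamma: maps the vector difference to per-channel attention weights
        self.attn_mlp = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, x, pos):
        # x: (batch, n, dim) point features; pos: (batch, n, 3) point coordinates
        q, k, v = self.to_q(x), self.to_k(x), self.to_v(x)
        rel_pos = self.pos_mlp(pos.unsqueeze(2) - pos.unsqueeze(1))   # (b, n, n, dim)
        # vector attention: per-channel weights from gamma(q_i - k_j + delta_ij)
        attn = self.attn_mlp(q.unsqueeze(2) - k.unsqueeze(1) + rel_pos).softmax(dim=2)
        # y_i = sum_j attn_ij * (v_j + delta_ij)
        return (attn * (v.unsqueeze(1) + rel_pos)).sum(dim=2)

layer = PointTransformerLayer(32)
out = layer(torch.randn(2, 128, 32), torch.randn(2, 128, 3))
```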
Implementation of Deformable Attention in Pytorch from the paper "Vision Transformer with Deformable Attention"
Implementation of a memory efficient multi-head attention as proposed in the paper, "Self-attention Does Not Need O(n²) Memory"
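A minimal sketch of the chunked computation: keys and values are processed blockwise with a running log-sum-exp, so the full n×n attention matrix is never materialized and the result stays exact. Single head, no masking; chunk sizes are illustrative.

```python
import torch

def chunked_attention(q, k, v, q_chunk=128, k_chunk=128):
    # q, k, v: (batch, seq_len, dim); exact attention without n x n score matrices
    scale = q.shape[-1] ** -0.5
    outs = []
    for qc in q.split(q_chunk, dim=1):
        acc = torch.zeros_like(qc)  # running weighted sum of values
        lse = torch.full(qc.shape[:-1], float('-inf'), device=q.device, dtype=q.dtype)
        for kc, vc in zip(k.split(k_chunk, dim=1), v.split(k_chunk, dim=1)):
            scores = qc @ kc.transpose(-1, -2) * scale          # (b, q_chunk, k_chunk)
            new_lse = torch.logaddexp(lse, scores.logsumexp(dim=-1))
            # rescale the old accumulator, then add this key block's contribution
            acc = acc * (lse - new_lse).exp().unsqueeze(-1) \
                + (scores - new_lse.unsqueeze(-1)).exp() @ vc
            lse = new_lse
        outs.append(acc)
    return torch.cat(outs, dim=1)
```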
Implementation of various self-attention mechanisms focused on computer vision. An ongoing, actively updated repository.
Fast and memory-efficient exact attention
(Unofficial) Implementation of dilated attention from "LongNet: Scaling Transformers to 1,000,000,000 Tokens"
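A minimal sketch of a single (segment length, dilation) branch of the idea: the sequence is folded into segments, every r-th position within a segment attends to the others, and results are scattered back. The paper combines several such branches with different segment lengths, rates, and offsets to cover all positions; parameters here are illustrative.

```python
import torch
import torch.nn.functional as F

def dilated_attention(q, k, v, segment_len=64, dilation=4):
    # q, k, v: (batch, seq_len, dim); seq_len divisible by segment_len
    b, n, d = q.shape
    s, r = segment_len, dilation
    # fold into segments, then keep every r-th position within each segment
    fold = lambda t: t.reshape(b, n // s, s, d)[:, :, ::r, :]
    qs, ks, vs = fold(q), fold(k), fold(v)                       # (b, n/s, s/r, d)
    attn = F.softmax(qs @ ks.transpose(-1, -2) * d ** -0.5, dim=-1)
    # scatter the sparse results back; other positions stay zero in this branch
    out = torch.zeros(b, n // s, s, d, device=q.device, dtype=q.dtype)
    out[:, :, ::r, :] = attn @ vs
    return out.reshape(b, n, d)
```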
A Pytorch implementation of Attention on Attention module (both self and guided variants), for Visual Question Answering
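A minimal sketch of the AoA gating on top of standard scaled dot-product attention: an information vector and a sigmoid gate are both computed from the concatenated query and attended result. Module names are illustrative, not the repository's API.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AoA(nn.Module):
    def __init__(self, dim):
        super().__init__()
        # both branches read the concatenated [query; attended result]
        self.to_info = nn.Linear(dim * 2, dim)  # "information" vector
        self.to_gate = nn.Linear(dim * 2, dim)  # sigmoid "attention gate"

    def forward(self, q, k, v):
        # ordinary scaled dot-product attention first
        attn = F.softmax(q @ k.transpose(-1, -2) * q.shape[-1] ** -0.5, dim=-1)
        attended = attn @ v
        # then attention on attention: gate the information vector with the query
        qa = torch.cat((q, attended), dim=-1)
        return torch.sigmoid(self.to_gate(qa)) * self.to_info(qa)

aoa = AoA(64)
out = aoa(torch.randn(2, 16, 64), torch.randn(2, 16, 64), torch.randn(2, 16, 64))
```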
Implementation of Flash Attention in Jax
Implementation of Kronecker Attention in Pytorch
Pytorch reimplementation of Molecule Attention Transformer, which uses a transformer to tackle the graph-like structure of molecules
Implementation of gMLP, an all-MLP replacement for Transformers, in Pytorch
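A minimal sketch of one gMLP block with its Spatial Gating Unit, which mixes tokens via a learned linear map across positions, initialized near an identity gate (weights near zero, bias at one, as the paper recommends). Fixed sequence length; hyperparameters illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class gMLPBlock(nn.Module):
    def __init__(self, dim, dim_ff, seq_len):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.proj_in = nn.Linear(dim, dim_ff * 2)
        # spatial gating unit: a learned linear map across sequence positions
        self.sgu_norm = nn.LayerNorm(dim_ff)
        self.spatial = nn.Conv1d(seq_len, seq_len, 1)
        nn.init.zeros_(self.spatial.weight)  # near-identity gate at init:
        nn.init.ones_(self.spatial.bias)     # weights ~ 0, bias = 1
        self.proj_out = nn.Linear(dim_ff, dim)

    def forward(self, x):                    # x: (batch, seq_len, dim)
        residual = x
        x = F.gelu(self.proj_in(self.norm(x)))
        u, v = x.chunk(2, dim=-1)
        v = self.spatial(self.sgu_norm(v))   # token mixing along seq_len
        return self.proj_out(u * v) + residual

block = gMLPBlock(dim=128, dim_ff=512, seq_len=64)
out = block(torch.randn(2, 64, 128))
```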
A variant of Transformer-XL where the memory is updated not with a queue, but with attention
An implementation of local windowed attention for language modeling
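A minimal sketch of the simplest variant: non-overlapping windows, each attending only within itself, with no cross-window look-back and no causal masking. Window size is illustrative.

```python
import torch
import torch.nn.functional as F

def local_attention(q, k, v, window_size=64):
    # q, k, v: (batch, seq_len, dim); seq_len must divide evenly into windows
    b, n, d = q.shape
    w = window_size
    # fold the sequence into windows and attend within each window only
    q, k, v = (t.reshape(b, n // w, w, d) for t in (q, k, v))
    attn = F.softmax(q @ k.transpose(-1, -2) * d ** -0.5, dim=-1)  # (b, n/w, w, w)
    return (attn @ v).reshape(b, n, d)
```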
An implementation of Performer, a linear attention-based transformer, in Pytorch
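A minimal sketch of Performer-style linear attention with positive random features approximating the softmax kernel; a plain Gaussian projection stands in for the paper's orthogonal random features, and the causal variant is omitted.

```python
import torch

def performer_attention(q, k, v, num_features=256):
    # q, k, v: (batch, seq_len, dim)
    b, n, d = q.shape
    q, k = q * d ** -0.25, k * d ** -0.25  # fold the 1/sqrt(d) scaling into q and k
    proj = torch.randn(num_features, d, device=q.device, dtype=q.dtype)

    def feature_map(x):
        # positive random features: exp(w.x - |x|^2 / 2) / sqrt(m)
        return torch.exp(x @ proj.T - x.pow(2).sum(-1, keepdim=True) / 2) / num_features ** 0.5

    q, k = feature_map(q), feature_map(k)        # (b, n, m)
    # linear attention: O(n) in sequence length rather than O(n^2)
    context = k.transpose(-1, -2) @ v            # (b, m, d)
    normalizer = q @ k.sum(dim=1).unsqueeze(-1)  # (b, n, 1)
    return q @ context / (normalizer + 1e-8)
```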
Implementation of Make-A-Video, new SOTA text-to-video generator from Meta AI, in Pytorch