Implementation of the Halo (😇) Attention layer from the paper, Scaling Local Self-Attention for Parameter Efficient Visual Backbones
A Pytorch implementation of Attention on Attention module (both self and guided variants), for Visual Question Answering
Implementation of Nyström Self-attention, from the paper Nyströmformer
Implementation of a memory efficient multi-head attention as proposed in the paper, "Self-attention Does Not Need O(n²) Memory"
Implementation of OmniNet, Omnidirectional Representations from Transformers, in Pytorch
Pytorch implementation of the hamburger module from the ICLR 2021 paper "Is Attention Better Than Matrix Decomposition?"
Implementation of Deformable Attention in Pytorch from the paper "Vision Transformer with Deformable Attention"
A Pytorch implementation of Global Self-Attention Network, a fully-attention backbone for vision tasks
An implementation of (Induced) Set Attention Block, from the Set Transformers paper
Implementation of Hierarchical Transformer Memory (HTM) for Pytorch
(Unofficial) Implementation of dilated attention from "LongNet: Scaling Transformers to 1,000,000,000 Tokens"
Implementation of Kronecker Attention in Pytorch
Implementation of the Point Transformer layer, in Pytorch
A simple cross attention that updates both the source and target in one step (see the sketch after this list)
Implementation of Bottleneck Transformer in Pytorch
Implementation of the Hybrid Perception Block and Dual-Pruned Self-Attention block from the ITTR paper (Unpaired Image-to-Image Translation with Transformers)
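As an illustration of the bidirectional cross attention entry above, here is a minimal sketch of the general idea: a single similarity matrix between the two sequences is softmaxed along each axis, so the source attends over the target and the target attends over the source in the same step. This is a sketch under those assumptions, not the repository's implementation, and the class and parameter names are illustrative only.

```python
import torch
import torch.nn as nn

class BidirectionalCrossAttention(nn.Module):
    # Illustrative sketch: one shared similarity matrix drives attention in both directions.
    def __init__(self, dim, heads=8, dim_head=64):
        super().__init__()
        self.heads = heads
        self.scale = dim_head ** -0.5
        inner = heads * dim_head
        # separate query/key and value projections for each sequence
        self.to_qk_src = nn.Linear(dim, inner, bias=False)
        self.to_qk_tgt = nn.Linear(dim, inner, bias=False)
        self.to_v_src = nn.Linear(dim, inner, bias=False)
        self.to_v_tgt = nn.Linear(dim, inner, bias=False)
        self.to_out_src = nn.Linear(inner, dim)
        self.to_out_tgt = nn.Linear(inner, dim)

    def forward(self, src, tgt):
        b, h = src.shape[0], self.heads
        split = lambda t: t.view(b, -1, h, t.shape[-1] // h).transpose(1, 2)
        qk_s, qk_t = split(self.to_qk_src(src)), split(self.to_qk_tgt(tgt))
        v_s, v_t = split(self.to_v_src(src)), split(self.to_v_tgt(tgt))
        # one similarity matrix serves both attention directions
        sim = torch.einsum('bhid,bhjd->bhij', qk_s, qk_t) * self.scale
        attn_src = sim.softmax(dim=-1)                    # source attends over target
        attn_tgt = sim.transpose(-1, -2).softmax(dim=-1)  # target attends over source
        out_s = torch.einsum('bhij,bhjd->bhid', attn_src, v_t)
        out_t = torch.einsum('bhji,bhid->bhjd', attn_tgt, v_s)
        merge = lambda t: t.transpose(1, 2).reshape(b, -1, h * t.shape[-1])
        # both sequences are updated and returned in a single step
        return self.to_out_src(merge(out_s)), self.to_out_tgt(merge(out_t))
```

For example, `src_out, tgt_out = BidirectionalCrossAttention(dim=512)(src, tgt)` with `src` of shape (batch, n, 512) and `tgt` of shape (batch, m, 512) returns updated tensors of the same shapes.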