Implementation of "compositional attention" from MILA, a multi-head attention variant that is reframed as a two-step attention process with disentangled search and retrieval head aggregation, in Pytorch
MIT License
An implementation of "Compositional Attention: Disentangling Search and Retrieval" by MILA
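To make the two-step structure concrete, here is a minimal from-scratch sketch of the mechanism in Pytorch: a set of search heads computes ordinary dot-product attention maps, a smaller set of retrieval heads provides candidate value read-outs, and a second soft attention lets each search head pick its retrieval per query position. The class and parameter names below (`CompositionalAttention`, `num_searches`, `num_retrievals`) are assumptions for illustration and may not match this repository's actual API.

```python
import torch
from torch import nn, einsum

class CompositionalAttention(nn.Module):
    # sketch only: S search heads decide where to attend, R retrieval heads decide
    # what to read out, and a second attention composes the two per position
    def __init__(self, dim, num_searches = 8, num_retrievals = 2, dim_head = 64):
        super().__init__()
        self.S, self.R, self.dh = num_searches, num_retrievals, dim_head
        self.scale = dim_head ** -0.5

        self.to_q = nn.Linear(dim, num_searches * dim_head, bias = False)
        self.to_k = nn.Linear(dim, num_searches * dim_head, bias = False)
        self.to_v = nn.Linear(dim, num_retrievals * dim_head, bias = False)

        # second attention: a retrieval-query per search head, keys derived from retrieved outputs
        self.to_rq = nn.Linear(dim, num_searches * dim_head, bias = False)
        self.to_rk = nn.Linear(dim_head, dim_head, bias = False)

        self.to_out = nn.Linear(num_searches * dim_head, dim)

    def forward(self, x):
        b, n, _ = x.shape
        S, R, dh = self.S, self.R, self.dh

        q = self.to_q(x).view(b, n, S, dh).transpose(1, 2)   # (b, S, n, dh)
        k = self.to_k(x).view(b, n, S, dh).transpose(1, 2)   # (b, S, n, dh)
        v = self.to_v(x).view(b, n, R, dh).transpose(1, 2)   # (b, R, n, dh)

        # step 1: search - standard dot-product attention per search head
        sim = einsum('b s i d, b s j d -> b s i j', q, k) * self.scale
        attn = sim.softmax(dim = -1)

        # every (search, retrieval) pair produces a candidate output
        retrieved = einsum('b s i j, b r j d -> b s r i d', attn, v)   # (b, S, R, n, dh)

        # step 2: retrieval - soft selection over the R candidates, per search head and position
        rq = self.to_rq(x).view(b, n, S, dh).transpose(1, 2)           # (b, S, n, dh)
        rk = self.to_rk(retrieved)                                     # (b, S, R, n, dh)
        sel = einsum('b s i d, b s r i d -> b s r i', rq, rk) * self.scale
        sel = sel.softmax(dim = 2)                                     # softmax over retrievals

        out = einsum('b s r i, b s r i d -> b s i d', sel, retrieved)  # (b, S, n, dh)
        out = out.transpose(1, 2).reshape(b, n, S * dh)
        return self.to_out(out)

x = torch.randn(1, 128, 512)
attn = CompositionalAttention(dim = 512)
print(attn(x).shape)  # torch.Size([1, 128, 512])
```

Note the asymmetry: the number of search heads and the number of retrieval heads are independent hyperparameters, which is the point of disentangling the two roles that a standard attention head normally bundles together.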
Implementation of a memory-efficient multi-head attention as proposed in the paper, "Self-attention Does Not Need O(n²) Memory"
Implementation of Slot Attention from GoogleAI
Implementation of the 😇 Attention layer from the paper, "Scaling Local Self-Attention For Parameter Efficient Visual Backbones"
Implementation of Nyström Self-attention, from the paper Nyströmformer
Implementation of Memory-Compressed Attention, from the paper "Generating Wikipedia By Summarizing Long Sequences"
Implementation of Kronecker Attention in Pytorch
A Pytorch implementation of the Attention on Attention module (both self and guided variants), for Visual Question Answering
Some personal experiments around routing tokens to different autoregressive attention, akin to mixture-of-experts
Implementation of Agent Attention in Pytorch
Implementation of an Attention layer where each head can attend to more than just one token, usin...
A simple cross attention that updates both the source and target in one step (see the sketch after this list)
Implementation of Perceiver, General Perception with Iterative Attention, in Pytorch
Implementation of Make-A-Video, new SOTA text-to-video generator from Meta AI, in Pytorch
An implementation of the (Induced) Set Attention Block, from the Set Transformer paper
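The bidirectional cross attention item above ("updates both the source and target in one step") can be illustrated with a short sketch: a single similarity matrix between the two sequences is computed once, then softmaxed along each axis so that each sequence attends to, and is updated by, the other in the same pass. The module name and interface below are assumptions for illustration, not the referenced repository's exact code.

```python
import torch
from torch import nn, einsum

class BidirectionalCrossAttention(nn.Module):
    # sketch only: one shared similarity matrix, two softmaxes, both sequences updated together
    def __init__(self, dim, heads = 8, dim_head = 64):
        super().__init__()
        inner = heads * dim_head
        self.heads, self.scale = heads, dim_head ** -0.5

        self.src_to_qk = nn.Linear(dim, inner, bias = False)
        self.tgt_to_qk = nn.Linear(dim, inner, bias = False)
        self.src_to_v = nn.Linear(dim, inner, bias = False)
        self.tgt_to_v = nn.Linear(dim, inner, bias = False)
        self.src_out = nn.Linear(inner, dim)
        self.tgt_out = nn.Linear(inner, dim)

    def split(self, t):
        b, n, _ = t.shape
        return t.view(b, n, self.heads, -1).transpose(1, 2)   # (b, h, n, d)

    def forward(self, src, tgt):
        qk_s, qk_t = self.split(self.src_to_qk(src)), self.split(self.tgt_to_qk(tgt))
        v_s, v_t = self.split(self.src_to_v(src)), self.split(self.tgt_to_v(tgt))

        # single similarity matrix shared by both directions
        sim = einsum('b h i d, b h j d -> b h i j', qk_s, qk_t) * self.scale

        attn_s = sim.softmax(dim = -1)   # source positions attend over target positions
        attn_t = sim.softmax(dim = -2)   # target positions attend over source positions

        out_s = einsum('b h i j, b h j d -> b h i d', attn_s, v_t)
        out_t = einsum('b h i j, b h i d -> b h j d', attn_t, v_s)

        merge = lambda t: t.transpose(1, 2).reshape(t.shape[0], -1, self.heads * t.shape[-1])
        return self.src_out(merge(out_s)), self.tgt_out(merge(out_t))

src, tgt = torch.randn(1, 64, 512), torch.randn(1, 100, 512)
cross = BidirectionalCrossAttention(dim = 512)
new_src, new_tgt = cross(src, tgt)   # (1, 64, 512), (1, 100, 512)
```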