Implementation of Compositional Attention from MILA in PyTorch - a multi-head attention variant reframed as a two-step attention process that disentangles search from retrieval head aggregation
MIT License
Implementation of Compositional Attention from MILA. They reframe the "heads" of multi-head attention as "searches", and once the multi-headed/searched values are aggregated, there is an extra retrieval step (using attention) over the searched results. They then show this variant of attention yields better OOD results on a toy task. Their ESBN results still leave a lot to be desired, but I like the general direction of the paper.
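To make the two steps concrete, below is a minimal sketch of the computation in plain PyTorch. The `compositional_attention` helper is hypothetical and not the library's internals; masking, dropout, and the final output projection are omitted, and the projection layers are created inline purely for illustration.

import torch
from torch import nn, einsum

def compositional_attention(x, num_searches = 8, num_retrievals = 2, dim_head = 64):
    # x: (batch, seq, dim) tokens
    b, n, dim = x.shape
    s, r, d = num_searches, num_retrievals, dim_head
    scale = d ** -0.5

    # projections - queries / keys come in `s` search heads, values in `r` retrieval heads
    to_q = nn.Linear(dim, s * d, bias = False)
    to_k = nn.Linear(dim, s * d, bias = False)
    to_v = nn.Linear(dim, r * d, bias = False)
    to_rq = nn.Linear(dim, s * d, bias = False)   # retrieval query, one per search head
    to_rk = nn.Linear(d, d, bias = False)         # retrieval key, computed from each retrieved value

    q = to_q(x).reshape(b, n, s, d).transpose(1, 2)   # (b, s, n, d)
    k = to_k(x).reshape(b, n, s, d).transpose(1, 2)   # (b, s, n, d)
    v = to_v(x).reshape(b, n, r, d).transpose(1, 2)   # (b, r, n, d)

    # step 1: search - each of the `s` heads computes an ordinary attention pattern
    attn = (einsum('b s i d, b s j d -> b s i j', q, k) * scale).softmax(dim = -1)

    # apply every search pattern to every set of retrieval values -> (b, s, r, n, d)
    retrieved = einsum('b s i j, b r j d -> b s r i d', attn, v)

    # step 2: retrieval - per token and per search head, softly choose among the `r` retrievals
    rq = to_rq(x).reshape(b, n, s, d).transpose(1, 2)   # (b, s, n, d)
    rk = to_rk(retrieved)                               # (b, s, r, n, d)
    retrieval_attn = (einsum('b s i d, b s r i d -> b s r i', rq, rk) * scale).softmax(dim = 2)

    out = einsum('b s r i, b s r i d -> b s i d', retrieval_attn, retrieved)
    return out.transpose(1, 2).reshape(b, n, s * d)     # concatenate searches; output projection omitted

The retrieval softmax runs over the `r` retrievals, so each search head can dynamically pick which set of values to read from, rather than being tied to a fixed value projection as in standard multi-head attention.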
$ pip install compositional-attention-pytorch
import torch
from compositional_attention_pytorch import CompositionalAttention
attn = CompositionalAttention(
dim = 1024, # input dimension
dim_head = 64, # dimension per attention 'head' - a head is now either a search or a retrieval
num_searches = 8, # number of searches
num_retrievals = 2, # number of retrievals
dropout = 0., # dropout for the search and retrieval attention
)
tokens = torch.randn(1, 512, 1024) # tokens
mask = torch.ones((1, 512)).bool() # mask
out = attn(tokens, mask = mask) # (1, 512, 1024)
@article{Mittal2021CompositionalAD,
title = {Compositional Attention: Disentangling Search and Retrieval},
author = {Sarthak Mittal and Sharath Chandra Raparthy and Irina Rish and Yoshua Bengio and Guillaume Lajoie},
journal = {ArXiv},
year = {2021},
volume = {abs/2110.09419}
}