Transformers with Arbitrarily Large Context
Apache-2.0 License
Fast and memory-efficient exact attention
Implementation of Agent Attention in Pytorch
Graph neural network message passing reframed as a Transformer with local attention
Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture
Implementation of 💍 Ring Attention, from Liu et al. at Berkeley AI, in Pytorch
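Ring Attention shards the sequence across devices and rotates key/value blocks around a ring while each device accumulates exact attention for its own query block with an online softmax. Below is a minimal single-process sketch of that blockwise accumulation; the function name, the non-causal simplification, and the absence of actual ring communication are my assumptions, not the repo's API.

```python
# Single-process simulation of the blockwise online-softmax accumulation behind Ring Attention:
# the sequence is split into blocks (one per "device"), kv blocks arrive one hop at a time,
# and each query block folds them in without ever materializing the full N x N matrix.
import torch

def ring_attention_sim(q, k, v, num_blocks=4):
    # q, k, v: (batch, heads, seq_len, dim_head); non-causal for simplicity
    scale = q.shape[-1] ** -0.5
    q_blocks = q.chunk(num_blocks, dim=-2)
    k_blocks = list(k.chunk(num_blocks, dim=-2))
    v_blocks = list(v.chunk(num_blocks, dim=-2))
    outs = []
    for qb in q_blocks:                              # each "device" owns one query block
        m = torch.full(qb.shape[:-1], float('-inf'), device=q.device)  # running row max
        l = torch.zeros(qb.shape[:-1], device=q.device)                # running softmax denominator
        acc = torch.zeros_like(qb)                                     # running weighted values
        for kb, vb in zip(k_blocks, v_blocks):       # kv blocks visit every query block
            s = qb @ kb.transpose(-2, -1) * scale
            m_new = torch.maximum(m, s.amax(dim=-1))
            p = torch.exp(s - m_new[..., None])
            correction = torch.exp(m - m_new)        # rescale previous partial sums
            l = l * correction + p.sum(dim=-1)
            acc = acc * correction[..., None] + p @ vb
            m = m_new
        outs.append(acc / l[..., None])
    return torch.cat(outs, dim=-2)

q = k = v = torch.randn(1, 4, 1024, 64)
out = ring_attention_sim(q, k, v, num_blocks=4)
# agrees with dense softmax attention up to floating-point error
```

A real implementation overlaps the ring send/receive of kv blocks with the attention compute on each hop; the math above is just the exactness argument in code.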
An implementation of local windowed attention for language modeling
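Local windowed attention restricts each position to a fixed-size window of recent tokens. A minimal sketch of the causal sliding-window mask, assuming a standard (batch, heads, length, dim) layout rather than the repo's actual interface:

```python
# Causal sliding-window attention: each query attends only to the previous `window` tokens.
import torch
import torch.nn.functional as F

def local_windowed_attention(q, k, v, window=256):
    # q, k, v: (batch, heads, seq_len, dim_head)
    n = q.shape[-2]
    scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5          # (b, h, n, n)
    i = torch.arange(n, device=q.device)
    # allow position i to see positions j with i - window < j <= i
    allowed = (i[None, :] <= i[:, None]) & (i[:, None] - i[None, :] < window)
    scores = scores.masked_fill(~allowed, float('-inf'))
    return F.softmax(scores, dim=-1) @ v

q = k = v = torch.randn(1, 8, 1024, 64)
out = local_windowed_attention(q, k, v, window=128)                # (1, 8, 1024, 64)
```

This toy version still materializes the full attention matrix; practical implementations bucket queries and keys into blocks so the cost stays proportional to sequence length times window size.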
An implementation of Performer, a linear attention-based transformer, in Pytorch
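Performer's core trick is replacing softmax attention with a kernel feature map so the computation can be reassociated into linear time and memory in sequence length. Performer proper uses FAVOR+ random features to approximate softmax; the sketch below substitutes a simple elu(x) + 1 feature map purely to show the O(N) reordering, and is not the repo's API.

```python
# Non-causal linear attention: compute sum_n phi(k_n) v_n^T once, then apply it to every query.
import torch
import torch.nn.functional as F

def linear_attention(q, k, v, eps=1e-6):
    # q, k, v: (batch, heads, seq_len, dim_head)
    q = F.elu(q) + 1                                     # stand-in feature map phi(.)
    k = F.elu(k) + 1
    kv = torch.einsum('bhnd,bhne->bhde', k, v)           # (dim_head, dim_head) summary, O(N d^2)
    z = 1 / (torch.einsum('bhnd,bhd->bhn', q, k.sum(dim=-2)) + eps)   # per-query normalizer
    return torch.einsum('bhnd,bhde,bhn->bhne', q, kv, z)

q = k = v = torch.randn(1, 8, 4096, 64)
out = linear_attention(q, k, v)                          # (1, 8, 4096, 64)
```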
Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens
Implementation of Block Recurrent Transformer - Pytorch
Implementation of MEGABYTE, Predicting Million-byte Sequences with Multiscale Transformers, in Pytorch
Implementation of the transformer proposed in "Building Blocks for a Complex-Valued Transformer Architecture"
Implementation of E(n)-Transformer, which incorporates attention mechanisms into Welling's E(n)-Equivariant Graph Neural Network
(Unofficial) Implementation of dilated attention from "LongNet: Scaling Transformers to 1,000,000,000 Tokens"
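Dilated attention splits the sequence into segments and subsamples each segment with a dilation stride before running dense attention, mixing several (segment length, dilation) pairs across heads. A rough sketch of a single such branch follows; the function name and the zero-fill for unselected positions are illustrative assumptions, not the repo's interface.

```python
# One (segment_len, dilation) branch of LongNet-style dilated attention.
import torch
import torch.nn.functional as F

def dilated_attention_branch(q, k, v, segment_len=256, dilation=2):
    b, h, n, d = q.shape
    assert n % segment_len == 0
    # reshape into segments, then keep every `dilation`-th position inside each segment
    def sparsify(x):
        x = x.view(b, h, n // segment_len, segment_len, d)
        return x[:, :, :, ::dilation, :]
    qs, ks, vs = map(sparsify, (q, k, v))
    scores = qs @ ks.transpose(-2, -1) / d ** 0.5        # dense attention within each sparse segment
    out_s = F.softmax(scores, dim=-1) @ vs
    # scatter results back to their original positions; positions this branch skipped stay zero
    # (in the full method, other branches with different strides/offsets cover them)
    out = torch.zeros(b, h, n // segment_len, segment_len, d, device=q.device)
    out[:, :, :, ::dilation, :] = out_s
    return out.view(b, h, n, d)

q = k = v = torch.randn(1, 8, 1024, 64)
out = dilated_attention_branch(q, k, v, segment_len=256, dilation=2)
```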
A variant of Transformer-XL where the memory is updated not with a queue, but with attention
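Per the description, the recurrent memory here is refreshed by attention rather than by enqueueing the newest hidden states as Transformer-XL does. A hedged sketch of one way such an update could look, with every module and parameter name being an illustrative assumption rather than the repo's code:

```python
# Fixed-size memory slots that attend over the new segment's hidden states to update themselves,
# instead of a FIFO queue of past activations.
import torch
import torch.nn as nn

class AttentionMemoryUpdate(nn.Module):
    def __init__(self, dim, num_mem=64, heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)
        self.init_mem = nn.Parameter(torch.randn(num_mem, dim))

    def initial_memory(self, batch):
        return self.init_mem.unsqueeze(0).expand(batch, -1, -1)

    def forward(self, memory, segment_hiddens):
        # memory: (batch, num_mem, dim), segment_hiddens: (batch, seg_len, dim)
        # old memory queries the freshly computed segment states (and itself)
        context = torch.cat((memory, segment_hiddens), dim=1)
        update, _ = self.attn(memory, context, context)
        return self.norm(memory + update)        # residual update keeps the memory size fixed

updater = AttentionMemoryUpdate(dim=512)
mem = updater.initial_memory(batch=2)
hiddens = torch.randn(2, 128, 512)
mem = updater(mem, hiddens)                      # (2, 64, 512), carried into the next segment
```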
Implementation of GateLoop Transformer in Pytorch and Jax