Implementation of the transformer proposed in "Building Blocks for a Complex-Valued Transformer Architecture"
Implementation of Q-Transformer, Scalable Offline Reinforcement Learning via Autoregressive Q-Fun...
Experiments around a simple idea for inducing multiple hierarchical predictive models within a GPT
Implementation of the Transformer variant proposed in "Transformer Quality in Linear Time"
Implementation of E(n)-Transformer, which incorporates attention mechanisms into Welling's E(n)-E...
An implementation of Performer, a linear attention-based transformer, in Pytorch
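The core idea behind linear attention is to replace the softmax kernel with a feature map φ so that attention can be computed as φ(Q)(φ(K)ᵀV), reordering the matmuls from O(n²·d) to O(n·d²). A minimal NumPy sketch of that reordering follows; it is an illustration, not the Performer codebase's API, and it uses a simple positive feature map (elu(x) + 1) where Performer proper approximates the softmax kernel with FAVOR+ random features:

```python
import numpy as np

def linear_attention(q, k, v):
    # Simple positive feature map (elu(x) + 1); Performer instead uses
    # random-feature (FAVOR+) approximations of the softmax kernel.
    phi = lambda x: np.where(x > 0, x + 1.0, np.exp(x))
    q, k = phi(q), phi(k)
    kv = k.T @ v                  # (d, d_v): aggregate keys/values once
    z = q @ k.sum(axis=0)         # (n,): per-query normalizer
    return (q @ kv) / z[:, None]  # O(n * d^2) instead of O(n^2 * d)
```

Because the feature map is applied elementwise, (φ(Q)φ(K)ᵀ)V and φ(Q)(φ(K)ᵀV) give identical results, so the quadratic attention matrix is never materialized.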
Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architectu...
Exploring an idea where one forgets about efficiency and carries out attention across each edge o...
Implementation of Block Recurrent Transformer - Pytorch
Pytorch implementation of Compressive Transformers, from Deepmind
A variant of Transformer-XL where the memory is updated not with a queue, but with attention
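One way such an attention-based memory update might look, sketched in NumPy under my own assumptions (the repo's actual update rule and names may differ): instead of Transformer-XL's FIFO queue, the fixed-size memory acts as queries over the concatenation of itself and the new hidden states, deciding what to retain.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def update_memory(memory, new_hidden):
    # Transformer-XL appends new hidden states to a queue and drops the
    # oldest. Here the fixed-size memory instead attends over itself
    # concatenated with the new states, keeping its shape constant.
    m, d = memory.shape
    context = np.concatenate([memory, new_hidden], axis=0)
    scores = memory @ context.T / np.sqrt(d)  # memory slots as queries
    return softmax(scores) @ context          # still (m, d)
```

The memory stays a fixed (m, d) tensor regardless of how many new tokens arrive, but old information decays by attention weight rather than by position in a queue.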
An implementation of local windowed attention for language modeling
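Local windowed attention restricts each token to attending over a small window of preceding tokens, cutting cost from O(n²) to roughly O(n·w). A minimal causal sketch in NumPy (an illustration of the technique, not this repo's API; the function and parameter names are mine):

```python
import numpy as np

def local_causal_attention(q, k, v, window=4):
    # q, k, v: (seq_len, dim). Each position attends only to itself and
    # the (window - 1) positions immediately before it.
    n, d = q.shape
    scores = q @ k.T / np.sqrt(d)          # (n, n) similarity scores
    pos = np.arange(n)
    dist = pos[:, None] - pos[None, :]     # query index minus key index
    # Mask future positions and positions outside the local window.
    scores = np.where((dist < 0) | (dist >= window), -np.inf, scores)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v
```

For clarity this masks a full n×n score matrix; an efficient implementation would instead bucket the sequence into windows so the quadratic matrix is never formed.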
Implementation of Recurrent Memory Transformer, NeurIPS 2022 paper, in Pytorch
Implementation of Parti, Google's pure attention-based text-to-image neural network, in Pytorch
Unofficial implementation of iTransformer - SOTA Time Series Forecasting using Attention networks...
Implementation of Agent Attention in Pytorch