An attempt to merge ESBN with Transformers, to endow Transformers with the ability to emergently bind symbols
Implementation of MEGABYTE, Predicting Million-byte Sequences with Multiscale Transformers, in Pytorch
Implementation of H-Transformer-1D, Hierarchical Attention for Sequence Learning
Implementation of Q-Transformer, Scalable Offline Reinforcement Learning via Autoregressive Q-Functions
Implementation of the transformer proposed in "Building Blocks for a Complex-Valued Transformer Architecture"
Pytorch implementation of Compressive Transformers, from Deepmind
Experiments around a simple idea for inducing multiple hierarchical predictive models within a GPT
A variant of Transformer-XL where the memory is updated not with a queue, but with attention (sketched after this list)
Implementation of Memorizing Transformers (ICLR 2022), attention net augmented with indexing and retrieval of memories using approximate nearest neighbors (sketched after this list)
Unofficial implementation of iTransformer - SOTA Time Series Forecasting using Attention networks
Implementation of a Transformer that Ponders, using the scheme from the PonderNet paper
An implementation of local windowed attention for language modeling (sketched after this list)
Implementation of Parti, Google's pure attention-based text-to-image neural network, in Pytorch
Usable implementation of Emerging Symbol Binding Network (ESBN), in Pytorch
Explorations into the recently proposed Taylor Series Linear Attention (sketched after this list)
Implementation of Recurrent Memory Transformer, NeurIPS 2022 paper, in Pytorch
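
For the Transformer-XL variant above, a minimal sketch of one way a fixed-size memory could be updated with attention instead of a FIFO queue: the memory tokens attend over each new segment's hidden states and absorb the result. The class and parameter names are illustrative assumptions, not the repo's actual API.

```python
import torch
from torch import nn

class AttentionMemoryUpdate(nn.Module):
    """Update a fixed-size memory by letting the memory tokens attend
    to the hidden states of the newly processed segment (illustrative)."""
    def __init__(self, dim):
        super().__init__()
        self.scale = dim ** -0.5
        self.to_q = nn.Linear(dim, dim, bias=False)       # queries come from memory
        self.to_kv = nn.Linear(dim, dim * 2, bias=False)  # keys / values from new hiddens

    def forward(self, memory, hiddens):
        # memory: (batch, mem_len, dim), hiddens: (batch, seg_len, dim)
        q = self.to_q(memory)
        k, v = self.to_kv(hiddens).chunk(2, dim=-1)
        sim = torch.einsum('b i d, b j d -> b i j', q, k) * self.scale
        update = torch.einsum('b i j, b j d -> b i d', sim.softmax(dim=-1), v)
        # residual write: memory length stays constant, unlike a queue
        return memory + update
```

The residual write is one plausible choice; a gated update would also keep the memory bounded while letting it forget.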
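For the Memorizing Transformers entry, a rough sketch of the retrieval step, with exact top-k search standing in for the approximate-nearest-neighbor index used in the paper; the function and store names are assumptions for illustration.

```python
import torch

def knn_retrieve(queries, key_store, value_store, topk=4):
    # queries: (batch, seq_len, dim); stores: (num_memories, dim)
    sims = torch.einsum('b n d, m d -> b n m', queries, key_store)
    scores, indices = sims.topk(topk, dim=-1)     # nearest memories per query
    retrieved = value_store[indices]              # (batch, seq_len, topk, dim)
    weights = scores.softmax(dim=-1)
    # attention over just the retrieved memories
    return torch.einsum('b n k, b n k d -> b n d', weights, retrieved)
```

In the paper, the retrieved memory values are merged with the output of local attention through a learned gate.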
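For the local windowed attention entry, a minimal causal sketch that restricts each token to its own block, assuming the sequence length divides evenly by the window size; real implementations typically also let tokens look back into the previous window.

```python
import torch

def local_windowed_attention(q, k, v, window_size):
    # q, k, v: (batch, seq_len, dim); seq_len assumed divisible by window_size
    b, n, d = q.shape
    w = window_size
    # fold the sequence into non-overlapping windows: (batch, n // w, w, dim)
    q, k, v = (t.reshape(b, n // w, w, d) for t in (q, k, v))
    sim = torch.einsum('b w i d, b w j d -> b w i j', q, k) * d ** -0.5
    # causal mask within each window
    mask = torch.ones(w, w, dtype=torch.bool, device=sim.device).triu(1)
    attn = sim.masked_fill(mask, float('-inf')).softmax(dim=-1)
    out = torch.einsum('b w i j, b w j d -> b w i d', attn, v)
    return out.reshape(b, n, d)
```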
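For the Taylor series linear attention entry, a hedged sketch of the core idea: approximate exp(q·k) by its second-order Taylor expansion 1 + q·k + (q·k)²/2, which factors into a feature map so attention can be computed in time linear in sequence length. Non-causal for brevity; all names are illustrative.

```python
import torch

def taylor_feature_map(x):
    # phi(x) = concat(1, x, vec(x xT) / sqrt(2)), so that
    # phi(q) . phi(k) = 1 + q.k + (q.k)^2 / 2
    ones = torch.ones(*x.shape[:-1], 1, device=x.device, dtype=x.dtype)
    second = torch.einsum('... i, ... j -> ... i j', x, x).flatten(-2) * 2 ** -0.5
    return torch.cat((ones, x, second), dim=-1)

def taylor_linear_attention(q, k, v):
    # q, k, v: (batch, seq_len, dim); scale dot products before expanding
    q = q * q.shape[-1] ** -0.25
    k = k * k.shape[-1] ** -0.25
    q, k = taylor_feature_map(q), taylor_feature_map(k)
    kv = torch.einsum('b n e, b n d -> b e d', k, v)        # summarize keys / values
    # 1 + x + x^2/2 is strictly positive, so the normalizer never vanishes
    z = torch.einsum('b n e, b e -> b n', q, k.sum(dim=1))
    return torch.einsum('b n e, b e d -> b n d', q, kv) / z.unsqueeze(-1)
```

The feature dimension grows as 1 + d + d², so the linear-time trade-off pays off mainly for small head dimensions.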