Implementation of Q-Transformer, Scalable Offline Reinforcement Learning via Autoregressive Q-Functions, out of Google DeepMind
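The core idea, autoregressive Q-functions, treats each discretized action dimension as a token and predicts Q-values for it conditioned on the state and the previously chosen dimensions. A minimal sketch of that idea (not the repo's API; all module names and sizes below are hypothetical):

```python
import torch
from torch import nn

state_dim, num_bins, dim = 17, 256, 64  # assumed sizes for illustration

class AutoregressiveQ(nn.Module):
    def __init__(self):
        super().__init__()
        self.state_proj = nn.Linear(state_dim, dim)
        self.bin_embed = nn.Embedding(num_bins, dim)
        layer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=2)
        self.to_q = nn.Linear(dim, num_bins)  # Q-value per bin of the next action dim

    def forward(self, state, prev_action_bins):
        # tokens: [state] followed by one token per already-chosen action dimension
        tokens = torch.cat([self.state_proj(state).unsqueeze(1),
                            self.bin_embed(prev_action_bins)], dim=1)
        mask = nn.Transformer.generate_square_subsequent_mask(tokens.size(1))
        h = self.transformer(tokens, mask=mask)
        return self.to_q(h)  # (batch, seq, num_bins): Q over bins at each step

q_net = AutoregressiveQ()
state = torch.randn(2, state_dim)
chosen = torch.randint(0, num_bins, (2, 3))  # first 3 action dims already picked
q_values = q_net(state, chosen)              # greedy pick: q_values[:, -1].argmax(-1)
```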
A variant of Transformer-XL where the memory is updated not with a queue, but with attention
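The gist of the variant: instead of pushing hidden states into a FIFO memory queue as in Transformer-XL, a fixed set of memory tokens attends over the old memory plus the new segment's hidden states to produce the updated memory. A minimal sketch under assumed shapes (names and modules are hypothetical, not the repo's API):

```python
import torch
from torch import nn

dim, mem_len = 64, 16  # assumed sizes

class AttentionMemoryUpdate(nn.Module):
    def __init__(self):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, memory, hiddens):
        # memory tokens are queries; keys/values span old memory + new hidden states
        context = torch.cat([memory, hiddens], dim=1)
        updated, _ = self.attn(memory, context, context)
        return self.norm(memory + updated)  # residual keeps memory stable

update = AttentionMemoryUpdate()
memory = torch.zeros(2, mem_len, dim)
hiddens = torch.randn(2, 128, dim)  # hidden states from the current segment
memory = update(memory, hiddens)    # memory stays (2, mem_len, dim), no queue needed
```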
An implementation of local windowed attention for language modeling
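The mechanism itself is simple: split the sequence into windows and attend only within each window, dropping the quadratic cost over the full sequence. A minimal sketch of the non-overlapping case (the actual repo also handles causal masking and look-back to neighboring windows):

```python
import torch
import torch.nn.functional as F

def windowed_attention(q, k, v, window_size):
    b, n, d = q.shape
    assert n % window_size == 0
    # reshape so attention is computed independently inside each window
    q, k, v = (t.reshape(b, n // window_size, window_size, d) for t in (q, k, v))
    scores = q @ k.transpose(-2, -1) / d ** 0.5
    out = F.softmax(scores, dim=-1) @ v
    return out.reshape(b, n, d)

q = k = v = torch.randn(2, 1024, 64)
out = windowed_attention(q, k, v, window_size=128)  # (2, 1024, 64)
```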
Unofficial implementation of iTransformer - SOTA Time Series Forecasting using Attention networks, out of Tsinghua / Ant group
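iTransformer's inversion is easy to state: embed each variate's entire lookback series as a single token, so attention runs across variates rather than across time steps. A minimal sketch, with all sizes and module names assumed:

```python
import torch
from torch import nn

num_variates, lookback, horizon, dim = 7, 96, 12, 64  # assumed sizes
embed = nn.Linear(lookback, dim)  # one variate's whole series -> one token
layer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
encoder = nn.TransformerEncoder(layer, num_layers=2)
predict = nn.Linear(dim, horizon)  # forecast per variate

series = torch.randn(2, lookback, num_variates)
tokens = embed(series.transpose(1, 2))  # (batch, num_variates, dim)
forecast = predict(encoder(tokens))     # (batch, num_variates, horizon)
```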
Experiments around a simple idea for inducing multiple hierarchical predictive models within a GPT
Implementation of MeshGPT, SOTA Mesh generation using Attention, in Pytorch
Implementation of the Llama architecture with RLHF + Q-learning
Implementation of Memorizing Transformers (ICLR 2022), attention net augmented with indexing and retrieval of memories using approximate nearest neighbors, in Pytorch
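The augmentation stores keys and values from earlier segments in an external memory and lets each query retrieve its top-k nearest entries; the paper and repo use approximate nearest neighbors for scale, while the sketch below does exact search for brevity (all shapes assumed):

```python
import torch
import torch.nn.functional as F

def knn_memory_attend(queries, mem_keys, mem_values, topk=4):
    # queries: (n, d); memory: (m, d) cached from earlier segments
    sims = queries @ mem_keys.t()          # (n, m) similarities
    scores, idx = sims.topk(topk, dim=-1)  # nearest memories per query
    attn = F.softmax(scores / queries.shape[-1] ** 0.5, dim=-1)
    return (attn.unsqueeze(-1) * mem_values[idx]).sum(dim=-2)

mem_keys = torch.randn(4096, 64)    # keys/values from past context, kept outside
mem_values = torch.randn(4096, 64)  # the usual attention window
queries = torch.randn(128, 64)
retrieved = knn_memory_attend(queries, mem_keys, mem_values)  # (128, 64)
```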
Implementation of Recurrent Memory Transformer, NeurIPS 2022 paper, in Pytorch
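The recurrence works by concatenating special memory tokens to each segment and carrying their output states forward as the next segment's incoming memory. A minimal sketch with assumed sizes (not the repo's API):

```python
import torch
from torch import nn

dim, num_mem, seg_len = 64, 8, 128  # assumed sizes
layer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
transformer = nn.TransformerEncoder(layer, num_layers=2)

def process_segment(segment, memory):
    # [read memory | segment | write memory] processed together
    x = torch.cat([memory, segment, memory], dim=1)
    out = transformer(x)
    new_memory = out[:, -num_mem:]  # written memory, carried to the next segment
    return out[:, num_mem:num_mem + seg_len], new_memory

memory = torch.zeros(2, num_mem, dim)
for segment in torch.randn(2, 4 * seg_len, dim).split(seg_len, dim=1):
    out, memory = process_segment(segment, memory)
```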
Implementation of Block Recurrent Transformer - Pytorch
Implementation of MagViT2 Tokenizer in Pytorch
Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Research, in Pytorch
Implementation of Parti, Google's pure attention-based text-to-image neural network, in Pytorch
Implementation of E(n)-Transformer, which incorporates attention mechanisms into Welling's E(n)-Equivariant Graph Neural Network
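Equivariance comes from making the attention weights depend only on invariants (features and pairwise distances) and updating coordinates along relative vectors. A rough sketch of that pattern, with hypothetical names and sizes (not the repo's API):

```python
import torch
from torch import nn
import torch.nn.functional as F

dim = 64  # assumed feature size
to_qk = nn.Linear(dim, dim * 2, bias=False)

def en_attention(feats, coors):
    # feats: (b, n, dim) invariant features; coors: (b, n, 3) coordinates
    rel = coors.unsqueeze(2) - coors.unsqueeze(1)        # (b, n, n, 3) relative vectors
    dist = (rel ** 2).sum(dim=-1)                        # invariant squared distances
    q, k = to_qk(feats).chunk(2, dim=-1)
    logits = q @ k.transpose(-2, -1) / dim ** 0.5 - dist # invariant attention logits
    attn = F.softmax(logits, dim=-1)
    feats_out = attn @ feats
    coors_out = coors + (attn.unsqueeze(-1) * rel).sum(dim=2)  # equivariant update
    return feats_out, coors_out

feats, coors = torch.randn(2, 16, dim), torch.randn(2, 16, 3)
feats, coors = en_attention(feats, coors)  # rotating/translating coors rotates/translates the output
```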
An implementation of Performer, a linear attention-based transformer, in Pytorch
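Performer's trick is to approximate softmax(QK^T)V with phi(Q)(phi(K)^T V) for a random feature map phi, making attention linear in sequence length. The sketch below uses a simplified positive feature map rather than the full FAVOR+ estimator (all sizes assumed):

```python
import torch

def feature_map(x, projection):
    # positive random features, a simplified stand-in for FAVOR+
    x = x @ projection / x.shape[-1] ** 0.25
    return torch.exp(x - x.amax(dim=-1, keepdim=True))

def linear_attention(q, k, v, projection):
    q, k = feature_map(q, projection), feature_map(k, projection)
    kv = k.transpose(-2, -1) @ v  # (features, d): cost is linear in sequence length
    normalizer = q @ k.sum(dim=-2, keepdim=True).transpose(-2, -1)
    return (q @ kv) / normalizer.clamp(min=1e-6)

n, d, num_features = 1024, 64, 256
projection = torch.randn(d, num_features)
q = k = v = torch.randn(2, n, d)
out = linear_attention(q, k, v, projection)  # (2, 1024, 64)
```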
Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM