Implementation of TiTok, proposed by ByteDance in "An Image is Worth 32 Tokens for Reconstruction and Generation"
MIT License
Implementation of NWT, audio-to-video generation, in PyTorch
PyTorch extensions for fast R&D prototyping and Kaggle farming
Implementation of MEGABYTE, Predicting Million-byte Sequences with Multiscale Transformers, in Py...
Implementation of Token Shift GPT - An autoregressive model that solely relies on shifting the se...
Implementation of Hourglass Transformer, in PyTorch, from Google and OpenAI
Implementation of the Hybrid Perception Block and Dual-Pruned Self-Attention block from the ITTR ...
Implementation of the GBST block from the Charformer paper, in PyTorch
Implementation of MagViT2 Tokenizer in PyTorch
pix2tex: Using a ViT to convert images of equations into LaTeX code.
Usable Implementation of "Bootstrap Your Own Latent" self-supervised learning, from DeepMind, in ...
Implementation of Zorro, Masked Multimodal Transformer, in PyTorch
Implementation of Fast Transformer in PyTorch
Unofficial PyTorch implementation of Masked Autoencoders Are Scalable Vision Learners
Implementation of Transformer in Transformer, pixel level attention paired with patch level atten...
Implementation of ResMLP, an all-MLP solution to image classification, in PyTorch