Implementation of MaMMUT, a simple vision-encoder text-decoder architecture for multimodal tasks from Google, in Pytorch
MIT License
Implementation of the specific Transformer architecture from PaLM - Scaling Language Modeling wit...
Experiments around a simple idea for inducing multiple hierarchical predictive models within a GPT
Implementation of MagViT2 Tokenizer in Pytorch
Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in Pytorch
Implementation of Voicebox, new SOTA Text-to-speech network from MetaAI, in Pytorch
Implementation of RETRO, DeepMind's retrieval-based attention net, in Pytorch
Implementation of 🦩 Flamingo, state-of-the-art few-shot visual question answering attention net o...
Implementation of Phenaki Video, which uses MaskGIT to produce text-guided videos of up to 2 min...
Implementation of Muse: Text-to-Image Generation via Masked Generative Transformers, in Pytorch
Implementation of RLHF (Reinforcement Learning from Human Feedback) on top of the PaLM architectu...
Implementation of Parti, Google's pure attention-based text-to-image neural network, in Pytorch
DALL·E Mini - Generate images from a text prompt
Implementation of CoCa, Contrastive Captioners are Image-Text Foundation Models, in Pytorch