Implementation of a single layer of the MMDiT, proposed in Stable Diffusion 3, in PyTorch
MIT License
Implementation of Parti, Google's pure attention-based text-to-image neural network, in PyTorch
Implementation of Imagen, Google's Text-to-Image Neural Network, in PyTorch
Implementation of SoundStorm, Efficient Parallel Audio Generation from Google Deepmind, in PyTorch
Experiments around a simple idea for inducing multiple hierarchical predictive models within a GPT
Implementation of Voicebox, new SOTA Text-to-speech network from Meta AI, in PyTorch
Implementation of Autoregressive Diffusion in PyTorch
Implementation of Muse: Text-to-Image Generation via Masked Generative Transformers, in PyTorch
Implementation of Recurrent Interface Network (RIN), for highly efficient generation of images an...
Audio generation using diffusion models, in PyTorch.
PyTorch implementation of Transfusion, "Predict the Next Token and Diffuse Images with One Multi-...
Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in PyTorch
Implementation of Video Diffusion Models, Jonathan Ho's new paper extending DDPMs to Video Genera...
Implementation of MagViT2 Tokenizer in PyTorch
Implementation of Make-A-Video, new SOTA text-to-video generator from Meta AI, in PyTorch
Implementation of 🌻 Mirasol, SOTA Multimodal Autoregressive model out of Google Deepmind, in PyTorch