Implementation of MusicLM, Google's new SOTA model for music generation using attention networks, in PyTorch
MIT License
Implementation of Imagen, Google's Text-to-Image Neural Network, in PyTorch
A concise but complete implementation of CLIP with various experimental improvements from recent ...
Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Re...
Implementation of Phenaki Video, which uses Mask GIT to produce text guided videos of up to 2 min...
A simple but complete full-attention transformer with a set of promising experimental features fr...
Implementation of E(n)-Transformer, which incorporates attention mechanisms into Welling's E(n)-E...
An implementation of Performer, a linear attention-based transformer, in PyTorch
Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in PyTorch
Implementation of the Equiformer, SE3/E3 equivariant attention network that reaches new SOTA, and...
Implementation of Muse: Text-to-Image Generation via Masked Generative Transformers, in PyTorch
Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architectu...
Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in PyTorch
Implementation of SoundStorm, Efficient Parallel Audio Generation from Google DeepMind, in PyTorch
DALL·E Mini - Generate images from a text prompt
Implementation of NÜWA, a state-of-the-art attention network for text-to-video synthesis, in PyTorch