Implementation of 🌻 Mirasol, SOTA Multimodal Autoregressive model out of Google Deepmind, in Pytorch
Implementation of Spear-TTS - multi-speaker text-to-speech attention network, in Pytorch
Implementation of Voicebox, new SOTA Text-to-speech network from MetaAI, in Pytorch
Experiments around a simple idea for inducing multiple hierarchical predictive models within a GPT
Implementation of MagViT2 Tokenizer in Pytorch
Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in Pytorch
Implementation of E2-TTS, "Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS", in Pytorch
Implementation of RLHF (Reinforcement Learning from Human Feedback) on top of the PaLM architectu...
Implementation of SoundStorm, Efficient Parallel Audio Generation from Google Deepmind, in Pytorch
Implementation of MusicLM, Google's new SOTA model for music generation using attention networks,...
Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Re...
Implementation of Muse: Text-to-Image Generation via Masked Generative Transformers, in Pytorch
Implementation of MeshGPT, SOTA Mesh generation using Attention, in Pytorch
A concise but complete implementation of CLIP with various experimental improvements from recent ...
Implementation of Phenaki Video, which uses MaskGIT to produce text guided videos of up to 2 min...
Implementation of Parti, Google's pure attention-based text-to-image neural network, in Pytorch