Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with PyTorch
MIT License
PyTorch implementation of convolutional neural network-based text-to-speech synthesis models
End-to-End Speech Processing Toolkit
A PyTorch implementation of sound classification supporting EcapaTdnn, PANNS, TDNN, Res2Net, ResN...
An Open Source text-to-speech system built by inverting Whisper.
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training w...
Quasi-Periodic Parallel WaveGAN PyTorch implementation
Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)
ICASSP 2023 Accepted
This project uses a variety of advanced voiceprint recognition models such as EcapaTdnn, ResNetSE...
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
Foundational model for human-like, expressive TTS
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
WaveNet vocoder
so-vits-svc fork with realtime support, improved interface and more features.
Stable Diffusion web UI