VITS-based Voice Conversion focused on simplicity, quality and performance.
MIT License
Data manipulation and transformation for audio signal processing, powered by PyTorch
A Web UI for easy subtitle generation using the Whisper model.
Unofficial PyTorch implementation of Google AI's VoiceFilter system
Multilingual Voice Understanding Model
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
An Open Source text-to-speech system built by inverting Whisper.
A Modular and Extensible Deep Learning Toolkit for Computer Audition Tasks.
Clone a voice in 5 seconds to generate arbitrary speech in real-time
so-vits-svc fork with realtime support, improved interface and more features.
GUI for a Vocal Remover that uses Deep Neural Networks.
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training w...
Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
The best Gradio web UI for AI subtitling, translation and dubbing. Automatic subtitle creation usin...
Foundational model for human-like, expressive TTS