Silero VAD: pre-trained enterprise-grade Voice Activity Detector
Multilingual Voice Understanding Model
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)
so-vits-svc fork with realtime support, improved interface and more features.
Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarr...
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training w...
PyTorch implementation of convolutional neural network-based text-to-speech synthesis models
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
Foundational model for human-like, expressive TTS
Unofficial PyTorch implementation of Google AI's VoiceFilter system
A collection of Audio and Speech pre-trained models.
An open-source text-to-speech system built by inverting Whisper.
End-to-End Speech Processing Toolkit
Multilingual Automatic Speech Recognition with word-level timestamps and confidence
This project uses a variety of advanced voiceprint recognition models such as EcapaTdnn, ResNetSE...