🚀 AI voice cloning: clone your voice in 5 seconds to generate arbitrary speech in real-time
VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
A collection of Audio and Speech pre-trained models.
Two neural networks joined by a single module: one for speech recognition, the other for voice generation.
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
An Open Source text-to-speech system built by inverting Whisper.
This project uses a variety of advanced voiceprint recognition models such as EcapaTdnn, ResNetSE...
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training w...
A PyTorch implementation of Location-Relative Attention Mechanisms For Robust Long-Form Speech Sy...
A Web UI for easy subtitle generation using the Whisper model.
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
PyTorch implementation of convolutional neural network-based text-to-speech synthesis models
Official Implementation of Mockingjay in PyTorch
Clone a voice in 5 seconds to generate arbitrary speech in real-time
PyTorch implementation of Tacotron, an end-to-end generative TTS model for speech synthesis.
Unofficial PyTorch implementation of Google AI's VoiceFilter system