Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training w...
This project uses a variety of advanced voiceprint recognition models such as EcapaTdnn, ResNetSE...
An Open Source text-to-speech system built by inverting Whisper.
Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with PyTorch
PyTorch implementation of convolutional neural networks-based text-to-speech synthesis models
This is where I put things I find useful that speed up my work with Machine Learning. Ever looked...
Multilingual Voice Understanding Model
End-to-End Speech Processing Toolkit
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
so-vits-svc fork with realtime support, improved interface and more features.
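Similarities: a toolkit for similarity calculation and semantic search. A similarity calculation and matching search toolkit that supports text-to-text search over hundreds of millions of records, ...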
Generate images from text, in Russian
LLM (Large Language Model) FineTuning
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)