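This is a PyTorch-based speech synthesis project that uses VITS, an end-to-end speech synthesis method. The model is very simple to use: it needs no complicated steps such as text alignment, and it supports one-click training and generation, which greatly lowers the barrier to entry.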
APACHE-2.0 License
Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)
Fine-tune the Whisper speech recognition model to support training without timestamp data, traini...
PyTorch implementation of convolutional neural networks-based text-to-speech synthesis models
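A PyTorch implementation of a streaming and non-streaming automatic speech recognition framework, compatible with both online and offline recognition. It currently supports the Conformer, Squeezeformer, and DeepSpeech2 models, along with multiple data augmentation methods.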
This is a PyTorch repository of YOLOv4, attentive YOLOv4 and mobilenet YOLOv4 with PASCAL VOC and...
ChatGLM-6B fine-tuning and Alpaca fine-tuning
VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
so-vits-svc fork with realtime support, improved interface and more features.
This project uses a variety of advanced voiceprint recognition models such as EcapaTdnn, ResNetSE...
The PyTorch implementation of sound classification supports EcapaTdnn, PANNS, TDNN, Res2Net, ResN...
End-to-End Speech Processing Toolkit