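A PyTorch implementation of a streaming and non-streaming automatic speech recognition framework, compatible with both online and offline recognition. It currently supports the Conformer, Squeezeformer, and DeepSpeech2 models, as well as multiple data augmentation methods.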
APACHE-2.0 License
A collection of Audio and Speech pre-trained models.
Unofficial PyTorch implementation of Google AI's VoiceFilter system
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
An Open Source text-to-speech system built by inverting Whisper.
PyTorch implementation of convolutional neural networks-based text-to-speech synthesis models
A PyTorch implementation of sound classification that supports EcapaTdnn, PANNS, TDNN, Res2Net, ResN...
This project uses a variety of advanced voiceprint recognition models such as EcapaTdnn, ResNetSE...
Fine-tune the Whisper speech recognition model to support training without timestamp data, traini...
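This is a PyTorch-based speech synthesis project that uses VITS, a speech synthesis method. This end-to-end model is very simple to use: it needs no complicated steps such as text alignment and offers one-click training and generation, which greatly lowers the learning barrier.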
Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with PyTorch
End-to-End Speech Processing Toolkit