SincNet is a neural architecture for efficiently processing raw audio samples.
MIT License
Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)
Final project for the Speaker Recognition course on Udemy, 机器之心, 深蓝学院 and 语音之家
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training w...
A PyTorch implementation of sound classification that supports EcapaTdnn, PANNS, TDNN, Res2Net, ResN...
CTC end-to-end ASR for the TIMIT and 863 corpora.
End-to-End Speech Processing Toolkit
pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition syst...
WaveNet vocoder
PyTorch implementation of convolutional neural networks-based text-to-speech synthesis models
PyTorch implementation of Tacotron, an end-to-end generative text-to-speech (TTS) model.
Unofficial PyTorch implementation of Google AI's VoiceFilter system
Foundational model for human-like, expressive TTS