PyTorch Dataset for Speech and Music audio
Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)
VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
A PyTorch implementation of Location-Relative Attention Mechanisms For Robust Long-Form Speech Sy...
Audio Captioning datasets for PyTorch.
PyTorch implementation of convolutional neural network-based text-to-speech synthesis models
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training w...
Data manipulation and transformation for audio signal processing, powered by PyTorch
Models, data loaders and abstractions for language processing, powered by PyTorch
This project uses a variety of advanced voiceprint recognition models such as EcapaTdnn, ResNetSE...
Clone a voice in 5 seconds to generate arbitrary speech in real-time
Unofficial PyTorch implementation of Google AI's VoiceFilter system
A collection of Audio and Speech pre-trained models.
Audio processing using PyTorch 1D convolutional networks
PyTorch implementation of Tacotron, an end-to-end generative text-to-speech (TTS) synthesis model.
ReconVAT: a semi-supervised automatic music transcription (AMT) model