A simple package for guided source separation (GSS)
MIT License
Parse and process the Demixing Secrets Dataset (DSD100)
Zero-Mean Convolutions for Level-Invariant Singing Voice Detection
Ongoing research training transformer language models at scale, including: BERT & GPT-2
TristouNet: Triplet Loss for Speaker Turn Embedding
Code for the INTERSPEECH 2023 paper "Learning When to Speak: Latency and Quality Trade-offs for S...
A generative speech model for daily dialogue.
Generalized Minimal Distortion Principle for Blind Source Separation
Text-to-Audio/Music Generation
AudioLDM: Generate speech, sound effects, music and beyond, with text.
Inference and training library for high-quality TTS models.
Even 1 minute of voice data can be used to train a good TTS model (few-shot voice cloning)
Fast algorithm for determined blind source separation with update of demixing filters with joint ...
Single channel speech source separation by diffusion process (ICASSP 2023)
A lightweight evaluation suite tailored specifically for assessing Indic LLMs across a diverse ra...
PyTorch implementation of convolutional neural networks-based text-to-speech synthesis models