A Python package to build AI-powered real-time audio applications
MIT License
Fast algorithm for determined blind source separation with update of demixing filters with joint ...
MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting
Digital Avatar Conversational System - Linly-Talker. 😄✨ Linly-Talker is an intelligent AI system ...
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to supp...
How to use OpenAI's Whisper to transcribe and diarize audio files
In defence of metric learning for speaker recognition
Inference and training library for high-quality TTS models.
Program to benchmark various speech recognition APIs
Python re-implementation of the (constrained) spectral clustering algorithms used in Google's spe...
Single channel speech source separation by diffusion process (ICASSP 2023)
Text-to-Audio/Music Generation
A generative speech model for daily dialogue.
WhisperPlus: Advancing Speech-to-Text Processing 🚀
AudioLDM: Generate speech, sound effects, music and beyond, with text.
Multilingual Automatic Speech Recognition with word-level timestamps and confidence