AI-powered speech denoising and enhancement
MIT License
Published by enhuiz 10 months ago
Versatile audio super resolution (any -> 48kHz) with AudioSR.
A collection of Audio and Speech pre-trained models.
Single channel speech source separation by diffusion process (ICASSP 2023)
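End-to-end Chinese speech recognition implemented with PaddlePaddle, from getting started to practical use, with very simple introductory examples and highly practical enterprise projects. Supports the currently most popular models: DeepSpeech2, Conformer, and Squeezeformer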
AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation
WhisperPlus: Advancing Speech-to-Text Processing 🚀
Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation
How to use OpenAI's Whisper to transcribe and diarize audio files
AudioLDM: Generate speech, sound effects, music and beyond, with text.
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
http://www.facegood.cc
MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting
Inference and training library for high-quality TTS models.
Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)