Just 1 minute of voice data is enough to train a good TTS model! (few-shot voice cloning)
MIT License
A generative speech model for daily dialogue.
This project uses several advanced voiceprint recognition models, including EcapaTdnn, ResNetSE, ERes2Net, and CAM++, and also supports MelSpectrogram, Spectrogram, MFCC, Fban...
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to supp...
Core Engine of Singing Voice Conversion & Singing Voice Clone
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
Digital Avatar Conversational System - Linly-Talker. 😄✨ Linly-Talker is an intelligent AI system ...
Automatic Speech Recognition (ASR) and Text-To-Speech (TTS) engine. Chinese and English speech recognition, multi-speaker speech synthesis, multilingual support, high accuracy
Audio classification implemented with PaddlePaddle, supporting models such as EcapaTdnn, PANNS, TDNN, Res2Net, and ResNetSE, along with multiple preprocessing methods
An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available in h...
TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for Tensorflow 2 (supported including ...
Best practice TTS based on BERT and VITS with some Natural Speech Features Of Microsoft; Support ...
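End-to-end Chinese speech recognition implemented with PaddlePaddle, from beginner to production use, with very simple introductory examples and practical enterprise projects. Supports the currently popular DeepSpeech2, Conformer, and Squeezeformer models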
Speech-to-Text-WaveNet : End-to-end sentence level English speech recognition based on DeepMind's...
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
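Netflix-quality subtitle segmentation, translation, alignment, and even dubbing: a one-click, fully automated AI subtitle team for porting videos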