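This project implements Wav2Lip-style video lip synthesis on top of SadTalker. It takes a video file as input and generates speech-driven lip shapes, applies a configurable enhancement method to the face region to sharpen the synthesized lip (face) area, and uses DAIN, a deep-learning frame-interpolation algorithm, to interpolate frames in the generated video, filling in the motion transitions between frames so that the synthesized lips look smoother, more realistic, and more natural.
This project uses several advanced speaker recognition (voiceprint) models, including EcapaTdnn, ResNetSE, ERes2Net, and CAM++, and also supports MelSpectrogram, Spectrogram, MFCC, Fban...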
Awesome video understanding toolkits based on PaddlePaddle. It supports video data annotation too...
Speech emotion recognition
Official implementations for paper: DreamTalk: When Expressive Talking Head Generation Meets Diff...
Wav2Lip UHQ extension for Automatic1111
Open source short video automatic generation tool
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
Real time interactive streaming digital human
Audio classification implemented with PaddlePaddle, supporting models such as EcapaTdnn, PANNS, TDNN, Res2Net, and ResNetSE, along with multiple preprocessing methods
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
Digital Avatar Conversational System - Linly-Talker. 😄✨ Linly-Talker is an intelligent AI system ...
[CVPR 2023] SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single ...
Depth-Aware Video Frame Interpolation (CVPR 2019)
This repository contains the codes of "A Lip Sync Expert Is All You Need for Speech to Lip Genera...
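End-to-end Chinese speech recognition implemented with PaddlePaddle, from getting started to production use, with very simple introductory examples and practical enterprise projects. Supports the currently most popular models: DeepSpeech2, Conformer, and Squeezeformer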