Scripts for recreating the Replication Dataset for Fundamental Frequency Estimation. Part of the dissertation "Pitch of Voiced Speech in the Short-Time Fourier Transform". © 2020, Bastian Bechtold. All rights reserved.
GPL-3.0 License
Python audio and music signal processing library
A Python Tool for Analysis of Mouse Vocal Communication
Python API & command-line tool to easily transcribe speech-based video files into clean text
48-Channel Anechoic Audio Recordings of 3D Sources
Audio classification implemented with PaddlePaddle, supporting models such as EcapaTdnn, PANNS, TDNN, Res2Net, and ResNetSE, along with a variety of preprocessing methods
Codebase of the submitted work in ICASSP 2023
Self-Supervised Speech Pre-training and Representation Learning Toolkit
Program to benchmark various speech recognition APIs
Text-to-Audio/Music Generation
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to supp...
Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in PyTorch
WaveNet vocoder
We provide a PyTorch implementation of the paper Voice Separation with an Unknown Number of Multi...
A wrapper around speech quality metrics MOSNet, BSSEval, STOI, PESQ, SRMR, SISDR
Core Engine of Singing Voice Conversion & Singing Voice Clone