Synthesis of percussion sounds using sinusoidal modelling, DDSP noise synthesis, and a neural source-filter approach.
Easiest way of fine-tuning HuggingFace video classification models
Generative models for conditional audio generation
AeCC: Autoencoders for Compressed Communication
DCASE2024 Challenge Task 6 baseline system (Automated Audio Captioning)
This project is built on the PaddleDetection object detection development kit, uses the 1.3M ultra-lightweight PPYOLO tiny model, and is deployed on Windows.
We present NoticIA, a dataset consisting of 850 Spanish news articles featuring prominent clickba...
Official PyTorch implementation of Contrastive Learning of Musical Representations
Embed arbitrary modalities (images, audio, documents, etc.) into large language models.
Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)
Abandoned bag detection using multi-dataset training with COCO and ADE20K
Zero-Mean Convolutions for Level-Invariant Singing Voice Detection
PyTorch implementation of Tacotron speech synthesis model.