State-of-the-art deep-learning-based audio codec supporting both mono 24 kHz and stereo 48 kHz audio.
MIT License
A PyTorch library and evaluation platform for end-to-end compression research
Data manipulation and transformation for audio signal processing, powered by PyTorch
AudioLDM: Generate speech, sound effects, music and beyond, with text.
A JAX Implementation of the Descript Audio Codec
Multi-lingual large voice generation model, providing inference, training and deployment full-sta...
The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model propose...
Audiocraft is a library for audio processing and generation with deep learning. It features the s...
Home of StarCoder: fine-tuning & inference!
Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
Audio super resolution using neural networks
Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)
Contrastive Language-Audio Pretraining
DCASE2024 Challenge Task 6 baseline system (Automated Audio Captioning)
Inference and training library for high-quality TTS models.
Single channel speech source separation by diffusion process (ICASSP 2023)