Reproducing code for Learning Disentangled Representations of Timbre and Pitch for Musical Instrument Sounds Using Gaussian Mixture Variational Autoencoders
MIT License
This repository reproduces the paper *Learning Disentangled Representations of Timbre and Pitch for Musical Instrument Sounds Using Gaussian Mixture Variational Autoencoders*. It is based on the (old) project template.
Put the `data` directory at the root of this repo, then run `python train.py -c config.json`. The best checkpoint, `model_best.pth`, will be saved under `saved/gmvae-synth`. After the training completes, play with `ismir19-217-sup-material.ipynb` to see the results.
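`train.py` reads its settings from `config.json`. As a rough orientation, a hypothetical fragment of such a config is sketched below, showing where the `label_portion` knob mentioned later would sit; all keys and values other than `trainer` and `label_portion` are assumptions, not the repo's actual config:

```json
{
  "name": "gmvae-synth",
  "trainer": {
    "epochs": 100,
    "save_dir": "saved/",
    "label_portion": 1.0
  }
}
```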
A pitch classifier which takes as input the pitch latent variable is added on top of the pitch space.
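As a rough illustration (not the repo's actual code), such a classifier can be as simple as a softmax layer over the pitch latent. A minimal NumPy sketch follows, with the latent dimension, number of pitch classes, and random weights all chosen arbitrarily:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
latent_dim, n_pitches = 16, 12          # hypothetical sizes

# Linear classifier parameters (would be learned in practice).
W = rng.normal(size=(latent_dim, n_pitches))
b = np.zeros(n_pitches)

z_pitch = rng.normal(size=(4, latent_dim))  # a batch of 4 pitch latents
probs = softmax(z_pitch @ W + b)            # (4, n_pitches); rows sum to 1
pred = probs.argmax(axis=1)                 # predicted pitch class per sample
```

In the paper's setting, the classifier's cross-entropy loss on labeled examples is what encourages the pitch latent to actually encode pitch.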
`spec` and `spec-norm` refer to the extracted mel-spectrograms and their normalized counterparts, respectively.

`config.json` corresponds to the fully-supervised model in the paper. To train a semi-supervised model, change `label_portion` under the `trainer` tag in `config.json`.

Please kindly cite the paper as follows if you find it useful.
@inproceedings{DBLP:conf/ismir/LuoAH19,
author = {Yin{-}Jyun Luo and
Kat Agres and
Dorien Herremans},
editor = {Arthur Flexer and
Geoffroy Peeters and
Juli{\'{a}}n Urbano and
Anja Volk},
title = {Learning Disentangled Representations of Timbre and Pitch for Musical
Instrument Sounds Using Gaussian Mixture Variational Autoencoders},
booktitle = {Proceedings of the 20th International Society for Music Information
Retrieval Conference, {ISMIR} 2019, Delft, The Netherlands, November
4-8, 2019},
pages = {746--753},
year = {2019},
url = {http://archives.ismir.net/ismir2019/paper/000091.pdf},
timestamp = {Thu, 12 Mar 2020 11:32:59 +0100},
biburl = {https://dblp.org/rec/conf/ismir/LuoAH19.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}