# Self-Supervised VQ-VAE for One-Shot Music Style Transfer

License: Apache-2.0
This is the code repository for the ICASSP 2021 paper *Self-Supervised VQ-VAE for One-Shot Music Style Transfer* by Ondřej Cífka, Alexey Ozerov, Umut Şimşekli, and Gaël Richard.

Copyright 2020 InterDigital R&D and Télécom Paris.
- 🔬 Paper preprint [pdf]
- 🎵 Supplementary website with audio examples
- 🎤 Demo notebook
- 🧠 Trained model parameters (212 MB)
## Contents

- `src` – the main codebase (the `ss-vq-vae` package); install with `pip install ./src`; usage details below
- `data` – Jupyter notebooks for data preparation (details below)
- `experiments` – model configuration, evaluation, and other experimental stuff

## Setup

```sh
pip install -r requirements.txt
pip install ./src
```
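
To quickly check that the install worked, you can try importing the package (a sanity check, not an official setup step; the module name `ss_vq_vae` matches the training command below):

```sh
python -c "import ss_vq_vae; print('ss_vq_vae imported successfully')"
```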
## Usage

To train the model, go to the `experiments` directory, then run:

```sh
python -m ss_vq_vae.models.vqvae_oneshot --logdir=model train
```

This assumes the training data has been prepared (see below).
To run the trained model on a dataset, substitute `run` for `train` and specify the input and output paths as arguments (use `run --help` for more information).
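
For illustration, an invocation could look like the sketch below; the placeholder paths and the argument order are assumptions, so defer to the `run --help` output for the actual interface:

```sh
# Hypothetical invocation – the input/output arguments here are placeholders;
# consult the `run --help` output for the real argument names and order.
python -m ss_vq_vae.models.vqvae_oneshot --logdir=model run \
    input_list.txt outputs/
```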
Alternatively, see the `colab_demo.ipynb` notebook for how to run the model from Python code.
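
The notebook is the authoritative example; purely as a sketch of the one-shot interface (two audio inputs, one output), the workflow looks roughly like the following. The `style_transfer` function and the 16 kHz sample rate are stand-ins introduced for illustration, not the package's actual API:

```python
# Minimal sketch of the one-shot style transfer workflow.
# `style_transfer` is a hypothetical stand-in for the model call shown in
# colab_demo.ipynb; it is NOT a function exported by the ss-vq-vae package.
import librosa
import soundfile as sf

SR = 16000  # assumed sample rate; use whatever the model config specifies

def style_transfer(content, style):
    """Stand-in for the real model call demonstrated in colab_demo.ipynb."""
    raise NotImplementedError("see colab_demo.ipynb for the actual API")

content, _ = librosa.load("content.wav", sr=SR)  # audio providing the content
style, _ = librosa.load("style.wav", sr=SR)      # audio providing the style
output = style_transfer(content, style)          # one-shot: a single style example
sf.write("output.wav", output, SR)
```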
## Datasets

Each dataset used in the paper has a corresponding directory in `data`, containing a Jupyter notebook called `prepare.ipynb` for preparing the dataset:
- `data/comb` – the training dataset, combined from LMD and RT (see below)
- `data/lmd` – LMD (the Lakh MIDI Dataset), with separate notebooks under `audio_train`, `audio_test`, and `note_seq` (e.g. `data/lmd/note_seq/prepare.ipynb`); synthesizing the audio requires the following SoundFonts: `FluidR3_GM.sf2`, `TimGM6mb.sf2`, `Arachno SoundFont - Version 1.0.sf2`, `Timbres Of Heaven (XGM) 3.94.sf2` (see the rendering sketch after this list)
- `data/rt` – the RT dataset
- `data/mixing_secrets/test` – the Mixing Secrets test set
- `data/mixing_secrets/metric_train` – the Mixing Secrets metric training set; both Mixing Secrets subsets are downloaded using `data/mixing_secrets/download.ipynb`
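
The `prepare.ipynb` notebooks drive the synthesis themselves; as a standalone illustration of what the SoundFonts above are used for, a MIDI file can be rendered to audio with the FluidSynth command line like this (an example for orientation, not the notebooks' exact procedure):

```sh
# Render a MIDI file to WAV using one of the SoundFonts listed above.
# Standalone illustration only; the prepare.ipynb notebooks perform the
# actual synthesis for the dataset.
fluidsynth -ni FluidR3_GM.sf2 input.mid -F output.wav -r 44100
```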
## Acknowledgment

This work has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No. 765068.