
Vector-Quantized Timbre Representation (Bitton et al. 2020)

Implementation/exploration of the paper Vector-Quantized Timbre Representation

Developed with Adán Benito in early 2021

  • This repository contains a PyTorch Lightning framework for training a model similar to the one described in the paper.
  • The main difference is the use of an exponential moving-average (EMA) codebook update.
  • A conventional VQ codebook (updated through a codebook loss) collapsed during training (perplexity = 1); a sketch of the EMA variant follows this list.
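
For reference, below is a minimal sketch of an EMA-updated vector quantizer in PyTorch, following the VQ-VAE formulation of van den Oord et al. The class name, default hyperparameters, and tensor layout are illustrative assumptions, not this repository's actual API.

import torch
import torch.nn as nn
import torch.nn.functional as F

class EMAVectorQuantizer(nn.Module):
    def __init__(self, num_codes=512, code_dim=64, decay=0.99, eps=1e-5):
        super().__init__()
        self.decay, self.eps = decay, eps
        codebook = torch.randn(num_codes, code_dim)
        self.register_buffer("codebook", codebook)
        self.register_buffer("ema_counts", torch.ones(num_codes))
        self.register_buffer("ema_sum", codebook.clone())

    def forward(self, z):
        # z: (batch, time, code_dim)
        flat = z.reshape(-1, z.shape[-1])
        # Squared Euclidean distance from each input vector to every code.
        dist = (flat.pow(2).sum(1, keepdim=True)
                - 2 * flat @ self.codebook.t()
                + self.codebook.pow(2).sum(1))
        idx = dist.argmin(dim=1)
        onehot = F.one_hot(idx, self.codebook.shape[0]).type(flat.dtype)
        quantized = self.codebook[idx].view_as(z)

        if self.training:
            # EMA updates of cluster counts and sums replace the
            # gradient-based codebook loss term.
            self.ema_counts.mul_(self.decay).add_(onehot.sum(0), alpha=1 - self.decay)
            self.ema_sum.mul_(self.decay).add_(onehot.t() @ flat, alpha=1 - self.decay)
            n = self.ema_counts.sum()
            counts = ((self.ema_counts + self.eps)
                      / (n + self.codebook.shape[0] * self.eps) * n)
            self.codebook.copy_(self.ema_sum / counts.unsqueeze(1))

        # Straight-through estimator: gradients pass to the encoder unchanged.
        quantized = z + (quantized - z).detach()
        # Perplexity measures codebook usage; 1.0 means collapse to one code.
        probs = onehot.mean(0)
        perplexity = torch.exp(-(probs * (probs + 1e-10).log()).sum())
        return quantized, perplexity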

Setup

pip install -r requirements.txt
pip install -e .

Data preparation

Data splits and preprocessing are implemented for the URMP western musical instrument dataset. To split the URMP dataset into 3-second NumPy files for efficient loading:

python scripts/urmp_numpy_segments.py <path_to_URMP> <numpy_out_dir>

Then, set the URMP_DATA_DIR variable in gin_configs/vq_timbre.gin to <numpy_out_dir>.
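
As a rough illustration of what this segmentation step does, the sketch below cuts each WAV file into fixed-length NumPy arrays. The sample rate, use of librosa, file discovery, and naming scheme are all assumptions about scripts/urmp_numpy_segments.py, not its actual contents.

import sys
from pathlib import Path
import numpy as np
import librosa

SR = 22050           # assumed sample rate
SEG_SECONDS = 3      # segment length from this README

def segment_file(wav_path: Path, out_dir: Path) -> None:
    # Load mono audio at the target rate, then save contiguous 3 s chunks.
    audio, _ = librosa.load(wav_path, sr=SR, mono=True)
    seg_len = SR * SEG_SECONDS
    for i in range(len(audio) // seg_len):
        seg = audio[i * seg_len:(i + 1) * seg_len].astype(np.float32)
        np.save(out_dir / f"{wav_path.stem}_{i:04d}.npy", seg)

if __name__ == "__main__":
    urmp_dir, out_dir = Path(sys.argv[1]), Path(sys.argv[2])
    out_dir.mkdir(parents=True, exist_ok=True)
    for wav in sorted(urmp_dir.rglob("*.wav")):
        segment_file(wav, out_dir)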

Training

python scripts/run_train.py

  • Most hyperparameters are configured with gin-config.
  • Set URMP.instr_ids = [<instrument_id>] to train on a single target instrument.
  • To log to wandb, enable lightning_run.logger = True in gin_configs/vq_timbre.gin; an example config excerpt follows this list.
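
For example, a run targeting a single instrument with wandb logging enabled might add lines like these to gin_configs/vq_timbre.gin. The parameter names come from the bullets above; the instrument id value is a placeholder.

URMP.instr_ids = [0]          # placeholder id; substitute the target instrument's id
lightning_run.logger = True   # enable wandb logging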

Evaluation

  • Download the checkpoint for a model trained on violin: https://drive.google.com/file/d/1fJ9bkM5eAuCNz4DClfeTm6b-IKjkTs0y/view?usp=sharing
  • Place it in the checkpoints/ directory.
  • Launch jupyter notebook.
  • See notebooks/eval_model.ipynb for simple timbre transfer and feature-based synthesis.
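
For use outside the notebook, a hedged sketch of loading the checkpoint is shown below. The import path, class name, checkpoint filename, and forward signature are all hypothetical; only load_from_checkpoint is standard PyTorch Lightning API.

import torch
from vq_timbre.model import VQTimbre  # hypothetical import path and class name

model = VQTimbre.load_from_checkpoint("checkpoints/violin.ckpt")  # assumed filename
model.eval()

with torch.no_grad():
    clip = torch.randn(1, 3 * 22050)   # stand-in for a 3 s audio clip
    reconstruction = model(clip)       # hypothetical forward pass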