
Convenience functions to fine-tune the XLSR-Wav2Vec2 speech transcription model

APACHE-2.0 License


XLSR Finetune

A bunch of handy functions to make fine-tuning the XLSR-Wav2Vec2 speech recognition model much easier

Much of the code in this project is taken from Patrick von Platen's brilliant Hugging Face blog post on fine-tuning XLSR-Wav2Vec2. Thanks to the team!

(Work in Progress)

Quickly Explore Your Audio Dataset

With a couple of lines you can upload some or all of your dataset to Weights & Biases and quickly explore it. You can play the audio as well as sort and group columns.

!pip install git+https://github.com/morganmcg1/xlsr_finetune.git

import wandb
from xlsr.wandbutils import *


# test_ds is your own audio dataset, loaded beforehand
explore = WandbDataExplorer(ds=test_ds, n_samples=100,
                            artifact_name='my_new_artifact', artifact_type='audio_dataset',
                            table_name='explore_samples', wandb_project='xlsr',
                            )

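The code above logs the sampled rows as an interactive table in Weights & Biases, where you can play each clip and sort or group the columns.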
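
As a rough illustration of what a dataset explorer like this involves, the sketch below samples rows and logs them to a W&B table with playable audio. It is not the code behind `WandbDataExplorer`: the helper names `pick_rows` and `log_audio_table` and the column names `speech`, `sentence` and `sampling_rate` are assumptions made for the example. Only standard `wandb` calls are used (`wandb.init`, `wandb.Table`, `wandb.Audio`, `wandb.Artifact`), and running `log_audio_table` needs `wandb`, `numpy`, typically `soundfile`, and a logged-in W&B account.

```python
import random


def pick_rows(ds, n_samples, seed=42):
    """Return up to n_samples rows chosen at random, kept in dataset order."""
    indices = list(range(len(ds)))
    random.Random(seed).shuffle(indices)
    return [ds[i] for i in sorted(indices[:n_samples])]


def log_audio_table(rows, project, artifact_name, artifact_type, table_name):
    """Log rows to a W&B table stored inside an artifact.

    Assumes each row is a dict with 'speech' (waveform samples),
    'sampling_rate' (Hz) and 'sentence' (transcript). These column
    names are hypothetical; adapt them to your dataset.
    """
    # Imported lazily so pick_rows can be used without these packages.
    import numpy as np
    import wandb

    run = wandb.init(project=project, job_type="explore")
    table = wandb.Table(columns=["audio", "sentence", "duration_s"])
    for row in rows:
        waveform = np.asarray(row["speech"], dtype="float32")
        duration = len(waveform) / row["sampling_rate"]
        table.add_data(
            wandb.Audio(waveform, sample_rate=row["sampling_rate"]),
            row["sentence"],
            round(duration, 2),
        )

    artifact = wandb.Artifact(artifact_name, type=artifact_type)
    artifact.add(table, table_name)  # the table is stored under table_name
    run.log_artifact(artifact)
    run.finish()


# Example call (needs a W&B login, so it is left commented out):
# rows = pick_rows(test_ds, n_samples=100)
# log_audio_table(rows, "xlsr", "my_new_artifact", "audio_dataset", "explore_samples")
```

Sampling before upload keeps the artifact small, which matters because every row carries an audio clip.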

Training Demo


This repo also includes an end-to-end training demo, based on the Hugging Face ASR blog post. It shows how to filter your data, save it to Weights & Biases Artifacts, and download it again.
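
To give a feel for the filtering step, here is a minimal sketch that keeps only clips within a duration range. It is an assumption about what such a filter might look like, not the demo's actual code: the function names, the `speech`, `sampling_rate` and `sentence` keys, and the 1 to 10 second bounds are all hypothetical.

```python
def duration_seconds(sample):
    """Clip length in seconds, from the raw waveform and its sampling rate."""
    return len(sample["speech"]) / sample["sampling_rate"]


def keep_sample(sample, min_s=1.0, max_s=10.0):
    """True if the clip is neither too short nor too long to train on."""
    return min_s <= duration_seconds(sample) <= max_s


# Tiny stand-in dataset: silent clips of 0.5 s, 3 s and 12 s at 16 kHz.
samples = [
    {"sentence": "too short", "speech": [0.0] * 8_000, "sampling_rate": 16_000},
    {"sentence": "just right", "speech": [0.0] * 48_000, "sampling_rate": 16_000},
    {"sentence": "too long", "speech": [0.0] * 192_000, "sampling_rate": 16_000},
]

filtered = [s for s in samples if keep_sample(s)]
print([s["sentence"] for s in filtered])  # prints ['just right']
```

With a Hugging Face `datasets.Dataset` that has matching columns, the same predicate could be passed to `ds.filter(keep_sample)`.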


git clone https://github.com/morganmcg1/xlsr_finetune.git

cd xlsr_finetune

pip install -e .

Or you can install it directly from a notebook:

pip install git+https://github.com/morganmcg1/xlsr_finetune.git


To contribute, make sure you have the latest version of nbdev installed and check out the CONTRIBUTING.md file