Speech recognition with federated learning
Convolutional neural networks for the Google Speech Commands dataset, implemented in PyTorch.
Run federated_train_speech_commands_gpu.py to simulate federated training with multiple clients, for example:

python federated_train_speech_commands.py --model=vgg11 --optim=sgd --lr-scheduler=plateau --learning-rate=0.01 --lr-scheduler-patience=5 --max-epochs=1 --batch-size=156 --clients=2 --matrix-size=500

CUDA_VISIBLE_DEVICES=0,1 python federated_train_speech_commands_cpu_v5_collect_gradient.py --model=conv --optim=sgd --lr-scheduler=plateau --learning-rate=0.01 --lr-scheduler-patience=5 --max-epochs=1 --batch-size=128 --clients=3 --matrix-size=100 --num-threads=10
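Federated training with multiple clients generally ends each round with the server aggregating the clients' model updates. As a minimal, hypothetical sketch (the function and toy model below are illustrative, not this repository's actual code), FedAvg-style parameter averaging in PyTorch can look like this:

```python
from collections import OrderedDict

import torch
import torch.nn as nn

def federated_average(state_dicts):
    """Average the parameters of several client models (FedAvg-style).

    Assumes every client shares the same architecture, so all state_dicts
    have identical keys and tensor shapes.
    """
    avg = OrderedDict()
    for key in state_dicts[0]:
        avg[key] = torch.stack([sd[key].float() for sd in state_dicts]).mean(dim=0)
    return avg

# Two toy "clients" with the same tiny architecture and known weights.
clients = [nn.Linear(2, 1) for _ in range(2)]
with torch.no_grad():
    clients[0].weight.fill_(1.0)
    clients[0].bias.fill_(0.0)
    clients[1].weight.fill_(3.0)
    clients[1].bias.fill_(2.0)

# The server averages the client parameters into a new global model.
global_state = federated_average([c.state_dict() for c in clients])
server_model = nn.Linear(2, 1)
server_model.load_state_dict(global_state)
```

A real deployment would weight each client by its local sample count; plain averaging is the simplest variant.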
We, xuyuan and tugstugi, participated in the Kaggle competition TensorFlow Speech Recognition Challenge and reached 10th place. This repository contains a simplified and cleaned-up version of our team's code.
The networks take a 1x32x32 mel-spectrogram as input. Due to the time limit of the competition, we trained most of the nets with SGD using ReduceLROnPlateau for 70 epochs.
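The training setup just described (SGD plus ReduceLROnPlateau over 70 epochs) can be wired up roughly as follows; the placeholder model and the simulated constant validation loss are assumptions for illustration:

```python
import torch
from torch.optim.lr_scheduler import ReduceLROnPlateau

model = torch.nn.Linear(32 * 32, 12)  # placeholder for the actual CNN
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
# Matches the flags above: --lr-scheduler=plateau --lr-scheduler-patience=5
scheduler = ReduceLROnPlateau(optimizer, mode="min", patience=5, factor=0.1)

for epoch in range(70):
    # ... train for one epoch, then evaluate; here we pretend the
    # validation loss has plateaued so the scheduler keeps cutting the LR.
    val_loss = 1.0
    scheduler.step(val_loss)

final_lr = optimizer.param_groups[0]["lr"]
```

Because the loss never improves, the scheduler reduces the learning rate by a factor of 10 every six epochs, which is exactly the behavior that makes plateau scheduling attractive for a fixed 70-epoch budget.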
For the training parameters and dependencies, see TRAINING.md. Stopping training earlier will sometimes produce a better Kaggle score.
After the competition, some of the networks were retrained using mixup: Beyond Empirical Risk Minimization by Hongyi Zhang, Moustapha Cisse, Yann N. Dauphin and David Lopez-Paz.
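For reference, mixup augments training by taking convex combinations of input pairs and their one-hot targets with a Beta-distributed mixing coefficient. A minimal sketch of that idea (the function name and toy batch are illustrative, not this repository's implementation):

```python
import numpy as np
import torch

def mixup_batch(x, y_onehot, alpha=0.2):
    """Blend each sample with a randomly chosen partner from the batch.

    Draws lambda ~ Beta(alpha, alpha) and takes the same convex
    combination of both the inputs and the one-hot targets.
    """
    lam = float(np.random.beta(alpha, alpha))
    perm = torch.randperm(x.size(0))
    x_mixed = lam * x + (1.0 - lam) * x[perm]
    y_mixed = lam * y_onehot + (1.0 - lam) * y_onehot[perm]
    return x_mixed, y_mixed

# Toy batch: four 1x32x32 "spectrograms" and three classes.
x = torch.rand(4, 1, 32, 32)
y = torch.eye(3)[torch.tensor([0, 1, 2, 0])]
x_mix, y_mix = mixup_batch(x, y, alpha=0.2)
```

The loss is then computed against the mixed soft targets (e.g. with a soft-label cross entropy) instead of the original hard labels.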