A minimal nmt example to serve as an seq2seq+attention reference.
MIT License
NEURAL MACHINE TRANSLATION BY JOINTLY LEARNING TO ALIGN AND TRANSLATE https://arxiv.org/pdf/1409.0473.pdf
Effective Approaches to Attention-based Neural Machine Translation https://arxiv.org/pdf/1508.04025.pdf
Massive Exploration of Neural Machine Translation Architectures https://arxiv.org/pdf/1703.03906.pdf
conda install pytorch -c pytorch=0.4.1
pip install -r requirements.txt
Training with a batch size of 32 takes ~3gb GPU ram.
If this is too much, lower the batch size or reduce network dimensionality in hyperparams.py
.
python train.py
view logs in Tensorboard decent alignments should be seen after 2-3 epochs.
tensorboard --logdir runs
(partially trained attention heatmap)