MIT License
Code for the paper "Attend, Copy, Parse - End-to-end information extraction from documents" (https://arxiv.org/abs/1812.07248) by Rasmus Berg Palm, Ole Winther and Florian Laws.

To train a parser:
Put the training and validation data in tasks/parsing/data/{amounts,dates}/{train,valid}.tsv, following the format in the sample files.
In tasks/parsing/parser.py, set the type variable to train either a dates or an amounts parser.
Run PYTHONPATH="$PWD" python tasks/parsing/train.py from the root of the repository.

To train Attend, Copy, Parse:
Put the documents in tasks/acp/data, one document per file, following the format in the sample file.
Put the split files in tasks/acp/splits, one document per line.
Set field in AttendCopyParse to choose the field to train on. Valid values are [number, order_id, date, total, tla, tta, tp].
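
Before starting a training run, the layout described above can be sanity-checked. The following is a minimal sketch and not part of the repository: the function name check_setup is hypothetical, and it only checks that the expected paths exist and that the field name is one of the listed values. It does not validate the file formats, which are defined by the sample files.

```python
from pathlib import Path

# Field names listed in this README as valid values for `field`.
VALID_FIELDS = ["number", "order_id", "date", "total", "tla", "tta", "tp"]


def check_setup(repo_root, field):
    """Return a list of problems; an empty list means the layout looks complete.

    Hypothetical helper: it checks paths and the field name only, not file contents.
    """
    root = Path(repo_root)
    problems = []

    # Parser data: tasks/parsing/data/{amounts,dates}/{train,valid}.tsv
    for kind in ("amounts", "dates"):
        for split in ("train", "valid"):
            rel = Path("tasks") / "parsing" / "data" / kind / f"{split}.tsv"
            if not (root / rel).is_file():
                problems.append(f"missing parser data file: {rel.as_posix()}")

    # ACP documents and split files: only check that each directory holds at least one file.
    for sub in ("data", "splits"):
        rel = Path("tasks") / "acp" / sub
        folder = root / rel
        if not folder.is_dir() or not any(p.is_file() for p in folder.iterdir()):
            problems.append(f"no files found in: {rel.as_posix()}")

    # Field name as it would be set in AttendCopyParse.
    if field not in VALID_FIELDS:
        problems.append(f"unknown field {field!r}; expected one of {VALID_FIELDS}")

    return problems
```

Calling check_setup(".", "total") from the root of the repository returns the list of problems found. The sketch expects data for both parser types, so when only one parser is being trained, a report about the other type's files can be ignored.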
Start training with PYTHONPATH="$PWD" python tasks/acp/train.py from the root of the repository.

To test a trained model:
Set restore_all_path in tasks/acp/acp.py to the saved model to restore weights from, e.g. ./snapshots/acp/best.
Run PYTHONPATH="$PWD" python tasks/acp/test.py from the root of the repository.

Snapshots are saved to ./snapshots.
TensorBoard logs are written to /tmp/tensorboard.

In order of difficulty
The following are logged: scalar values, text output samples, and the attention distribution.
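
The training and test commands above all set PYTHONPATH to the repository root and are run from that directory. For launching them from another Python program, the same setup might be reproduced as sketched below. The helper names build_run and run_script are hypothetical and not part of the repository, and the sketch uses the interpreter that is running the helper rather than whatever python resolves to on the shell's PATH.

```python
import os
import subprocess
import sys
from pathlib import Path


def build_run(script, repo_root="."):
    """Return (command, environment, working directory) for a repository script."""
    root = Path(repo_root).resolve()
    # Same effect as PYTHONPATH="$PWD": the repository root becomes importable,
    # replacing any PYTHONPATH inherited from the calling process.
    env = dict(os.environ, PYTHONPATH=str(root))
    # sys.executable is the interpreter running this helper.
    cmd = [sys.executable, str(root / script)]
    return cmd, env, root


def run_script(script, repo_root="."):
    """Run the script from the repository root; raise CalledProcessError if it fails."""
    cmd, env, root = build_run(script, repo_root)
    # cwd=root keeps relative paths such as ./snapshots pointing into the repository.
    return subprocess.run(cmd, env=env, cwd=root, check=True)
```

For example, run_script("tasks/acp/train.py") called with the repository root as the working directory corresponds to the training command above, and run_script("tasks/acp/test.py") to the test command.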