git clone https://github.com/izuna385/Zero-Shot-Entity-Linking.git
cd Zero-Shot-Entity-Linking
python -m spacy download en_core_web_sm
# Note: multiprocessing sentence boundary detection takes about 2 hours on an 8-core CPU.
sh preprocessing.sh
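The parallel sentence boundary detection step above can be sketched as follows. This is a minimal illustration of the multiprocessing pattern, not the repository's actual preprocessing code: a naive rule-based splitter stands in for the spaCy pipeline so the example is self-contained.

```python
# Minimal sketch of multiprocessing sentence boundary detection.
# A naive period/question/exclamation splitter stands in for spaCy.
import re
from multiprocessing import Pool

def split_sentences(document: str) -> list[str]:
    """Naive sentence boundary detection: split after '.', '!', or '?'."""
    parts = re.split(r"(?<=[.!?])\s+", document.strip())
    return [p for p in parts if p]

def parallel_split(documents: list[str], workers: int = 8) -> list[list[str]]:
    """Fan documents out across worker processes; one sentence list per doc."""
    with Pool(processes=workers) as pool:
        return pool.map(split_sentences, documents)

if __name__ == "__main__":
    docs = ["First sentence. Second sentence!", "Only one sentence?"]
    for sents in parallel_split(docs, workers=2):
        print(sents)
```

With one worker per core, documents are processed independently, which is why the wall-clock time scales with the number of CPU cores.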
python3 ./src/train.py -num_epochs 1
To run a faster end-to-end check of the entire script, run the following command.
python3 ./src/train.py -num_epochs 1 -debug True
Multi-GPU training is also supported.
CUDA_VISIBLE_DEVICES=0,1 python3 ./src/train.py -num_epochs 1 -cuda_devices 0,1
These experiments aim to confirm whether fine-tuning pretrained BERT (specifically, the mention and entity encoders) is effective even in unseen domains.
Following [Logeswaran et al., '19], the entity sets are disjoint between train and dev, and between train and test.
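The zero-shot split described above can be sanity-checked by confirming that no entity appears in both train and dev/test. A minimal sketch, assuming each split is a list of mention records carrying an entity id field (the hypothetical `entity_id` key and the toy records below are illustrative only; the repository's actual file format may differ):

```python
# Sanity-check the zero-shot setting: entities seen at training time
# must not reappear in dev or test. Records here are toy stand-ins.
def entity_ids(split: list[dict]) -> set:
    return {record["entity_id"] for record in split}

def is_zero_shot(train, dev, test) -> bool:
    train_ids = entity_ids(train)
    return (train_ids.isdisjoint(entity_ids(dev))
            and train_ids.isdisjoint(entity_ids(test)))

train = [{"mention": "jaguar", "entity_id": "e1"}]
dev   = [{"mention": "jaguar", "entity_id": "e7"}]
test  = [{"mention": "puma",   "entity_id": "e9"}]
print(is_zero_shot(train, dev, test))  # True: entity sets are disjoint
```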
If you are interested in what this repository does, see the original paper or the unofficial slides.
torch, allennlp, transformers, and faiss are required. See also requirements.txt.
~3 GB of CPU memory and ~1.1 GB of GPU memory are required to run the scripts.
Run sh preprocessing.sh in this directory.
python3 ./src/train.py
See ./src/experiment_logdir/.