Model for predicting entity categories from their mentions
This repo contains an AllenNLP model that predicts Named Entity categories from their mentions.
You can generate some fake data using this Notebook
Filtered OneShotWikilinks dataset with manually selected categories.
category_graph.pkl
dbpedia_2016-10.owl
people_categories.json
people_categories.json
category_graph.pkl
projects/categories_prediction/manual_categories.gsheet
people_all_categories.json
people_mentions.tsv
Prepare the split data with:
!split -n l/10 --verbose ../data/fake_data_train.tsv ../data/fake_data_train.tsv_
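The `-n l/10` form is specific to GNU `split`: it produces 10 chunks without breaking any line. On systems where that option is unavailable, something like the following Python sketch could stand in. The helper name `split_lines` is hypothetical and not part of this repo, and it balances chunks by line count rather than by byte size, so chunk boundaries may differ from what `split` produces.

```python
from pathlib import Path
from string import ascii_lowercase


def split_lines(path, n_chunks=10):
    """Split a text file into n_chunks contiguous parts without breaking lines.

    Output names mimic GNU split's default two-letter suffixes:
    <path>_aa, <path>_ab, ... (so at most 26 * 26 chunks are supported).
    Hypothetical helper: balances by line count, not by byte size.
    """
    src = Path(path)
    lines = src.read_text(encoding="utf-8").splitlines(keepends=True)
    base, extra = divmod(len(lines), n_chunks)
    outputs, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks receive one additional line.
        end = start + base + (1 if i < extra else 0)
        suffix = ascii_lowercase[i // 26] + ascii_lowercase[i % 26]
        out = src.with_name(f"{src.name}_{suffix}")
        out.write_text("".join(lines[start:end]), encoding="utf-8")
        outputs.append(out)
        start = end
    return outputs
```

Calling `split_lines("../data/fake_data_train.tsv")` would write `fake_data_train.tsv_aa` through `fake_data_train.tsv_aj` next to the input file.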
pip install -r requirements.txt
rm -rf ./data/vocabulary ; allennlp make-vocab -s ./data/ allen_conf_vocab.json --include-package category_prediction
allennlp train -f -s data/stats allen_conf.json --include-package category_prediction
allennlp train -f -s data/stats allen_conf.json --include-package category_prediction -o '{"trainer": {"cuda_device": 0}}'
rm -rf data/stats2/ # Clear new serialization dir
allennlp fine-tune -s data/stats2/ -c allen_conf.json -m ./data/stats/model.tar.gz --include-package category_prediction -o '{"trainer": {"cuda_device": 0}, "iterator": {"base_iterator": {"batch_size": 64}}}'
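The `-o` argument in the commands above is a JSON object whose nesting mirrors the structure of `allen_conf.json`. When launching runs from a script, building that string by hand is error-prone, so a small sketch like the one below may help. The function name `build_overrides_arg` is hypothetical; the override keys are the ones used in the fine-tune command above.

```python
import json
import shlex

# Same overrides as in the fine-tune command: GPU 0 and batch size 64.
overrides = {
    "trainer": {"cuda_device": 0},
    "iterator": {"base_iterator": {"batch_size": 64}},
}


def build_overrides_arg(overrides):
    """Return a shell-safe `-o '<json>'` fragment for the allennlp CLI."""
    return "-o " + shlex.quote(json.dumps(overrides))


if __name__ == "__main__":
    print(build_overrides_arg(overrides))
```

The printed fragment can be appended to an `allennlp train` or `allennlp fine-tune` command line.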
allennlp evaluate ./data/stats/model.tar.gz ./data/fake_data_test.tsv --include-package category_prediction
MODEL=./data/trained_models/6th_augmented/model.tar.gz python run_server.py
gunicorn -c gunicorn_config.py wsgi:application
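For orientation, a gunicorn config file is a plain Python module whose top-level variables are read as settings. The sketch below is purely illustrative and is not the contents of this repo's `gunicorn_config.py`: the values are assumptions, chosen so that the port matches the `8000:8000` mapping used with Docker and so that only one copy of the model is loaded.

```python
# Hypothetical gunicorn settings; the values here are assumptions,
# not taken from this repo's gunicorn_config.py.

# Listen on all interfaces, port 8000.
bind = "0.0.0.0:8000"

# One worker process, so the model is loaded into memory only once.
workers = 1

# Allow slow requests (e.g. model warm-up) before the worker is restarted.
timeout = 120
```

`bind`, `workers`, and `timeout` are standard gunicorn settings; the actual file may set others.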
Build
cd docker
docker build --tag mention .
Run, passing pyenv into the container
docker run --restart unless-stopped -v $HOME:$HOME -p 8000:8000 \
-v $HOME/.pyenv:/root/.pyenv \
-e ENV_PATH=$HOME/virtualenv/path \
-e APP_PATH=$HOME/project/root/path mention
Fix 100% GPU utilization by enabling persistence mode
sudo nvidia-smi -pm 1