Stanford CoreNLP NER addon for Apache Tika's NamedEntityParser
GPL-3.0 License
This project supplies the resources required by Apache Tika's NamedEntityParser
and demonstrates how to activate the NER implementation based on Stanford CoreNLP's CRF classifiers.
Usage of this addon is documented at http://wiki.apache.org/tika/TikaAndNER#Using_Stanford_CoreNLP_NER
To build a jar that can be dropped into Tika's classpath:
mvn clean compile assembly:single -PtikaAddon
To test:
mvn exec:java -Dexec.args=README.md
NOTE: README.md is a CLI argument
Take the assembled jar (target/tika-ner-corenlp-addon-1.0-SNAPSHOT-jar-with-dependencies.jar) and add it to Tika's classpath (requires Tika 1.12).
Alternatively, if you are using Maven:
- Build and install this project to the local Maven repo by running mvn install on this project
- Add this dependency to your project

      <dependency>
        <groupId>edu.usc.ir.tika</groupId>
        <artifactId>tika-ner-corenlp</artifactId>
        <version>1.0-SNAPSHOT</version>
      </dependency>
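Whichever route is used, it can help to confirm that the recogniser class is visible on the classpath before configuring Tika. The following is a minimal sketch using only the JDK. The class name ClasspathCheck and the helper isOnClasspath are hypothetical names chosen for illustration; the recogniser class name is the one this README refers to.

```java
public class ClasspathCheck {
    // Recogniser class name as given in this README; the rest is illustrative.
    static final String RECOGNISER =
            "org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser";

    /** Returns true if the named class can be located (without initialising it). */
    static boolean isOnClasspath(String className) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        try {
            // initialize=false: only locate and load the class, do not run static initialisers
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(RECOGNISER + " visible: " + isOnClasspath(RECOGNISER));
    }
}
```

If this reports that the class is not visible, the addon jar or Maven dependency is probably missing from the runtime classpath.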
Set system property ner.impl.class to org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser.
An example usage is shown in test case NamedEntityParserTest.java
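As a rough illustration of the property step, the value can also be set programmatically rather than with a -D flag. This is a sketch under assumptions: the class NerImplProperty and the method selectCoreNlp are hypothetical names, and it presumes the property is set before the parser is first used. Only the property key and the recogniser class name come from this README.

```java
public class NerImplProperty {
    // Key and value are taken from this README.
    static final String NER_IMPL_KEY = "ner.impl.class";
    static final String CORENLP_RECOGNISER =
            "org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser";

    /** Sets the NER implementation property; returns the previous value, or null if unset. */
    static String selectCoreNlp() {
        return System.setProperty(NER_IMPL_KEY, CORENLP_RECOGNISER);
    }

    public static void main(String[] args) {
        selectCoreNlp();
        // Echo the effective setting so it can be checked at startup.
        System.out.println(NER_IMPL_KEY + "=" + System.getProperty(NER_IMPL_KEY));
    }
}
```

Passing -Dner.impl.class=org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser on the JVM command line has the same effect without any code change.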
Activate org.apache.tika.parser.ner.NamedEntityParser. An example configuration is at src/main/resources/tika-config.xml
tgowdan at gmail.com