wiki-entity-summarization-preprocessor

Converts raw Wikidata and Wikipedia files into filterable formats, with a focus on marking Wikidata as summaries based on their Wikipedia abstracts.

CC-BY-4.0 License

wiki-entity-summarization-preprocessor - pg-05012024 Latest Release

Published by msorkhpar 5 months ago

Here is a dump of our preprocessor's Postgres database. This version uses Wikidata and Wikipedia article pages published on 05/01/2023.

After downloading the files wiki-es-pg.bck.xz{00-19}, you can load them into a Postgres instance by running the following commands.

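# Join the downloaded parts into a single compressed archive
cat wiki-es-pg.bck.xz* > /data/pg-data/wiki-es-pg.bck.xz
# Decompress in place; this produces /data/pg-data/wiki-es-pg.bck
xz -d /data/pg-data/wiki-es-pg.bck.xz
# Restore the custom-format dump into the public schema of $DB_NAME
PGCLIENTENCODING=SQL_ASCII pg_restore -v -Fc --no-owner --no-acl -U "$DB_USER" -n public -d "$DB_NAME" /data/pg-data/wiki-es-pg.bck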
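
Because the dump is split into 20 files, a missing part would leave the concatenated archive incomplete. The following is a minimal sketch of a pre-flight check, assuming the parts are named wiki-es-pg.bck.xz00 through wiki-es-pg.bck.xz19 as described above. The helper name check_parts is hypothetical and not part of the release.

```shell
#!/usr/bin/env bash
# Sketch: report which of the expected split files are absent from a directory.
# Assumption: the 20 parts are named wiki-es-pg.bck.xz00 ... wiki-es-pg.bck.xz19.
# check_parts is a hypothetical helper, not part of the release.
check_parts() {
  local dir="$1" i=0 missing=0 part
  while [ "$i" -lt 20 ]; do
    part=$(printf '%s/wiki-es-pg.bck.xz%02d' "$dir" "$i")
    if [ ! -f "$part" ]; then
      echo "missing: $part" >&2
      missing=$((missing + 1))
    fi
    i=$((i + 1))
  done
  echo "$missing"          # number of missing parts, on stdout
  [ "$missing" -eq 0 ]     # exit status 0 only when every part is present
}
```

Running check_parts on the download directory before the cat step prints the number of missing parts and lists each missing file on standard error.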

wiki-entity-summarization-preprocessor - neo-05012024

Published by msorkhpar 5 months ago

Here is a dump of our preprocessor's Neo4j database. This version uses Wikidata and Wikipedia article pages published on 05/01/2023.

After downloading the files, you can load them into a Neo4j instance by running the following commands.

cat neo4j.part* > neo4j-dump.rar
unrar x neo4j-dump.rar /data/neo4j-data/
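# Neo4j 5 syntax: --from-path is the directory holding neo4j.dump, and the database name is the last argument
neo4j-admin database load --from-path=/data/neo4j-data/ --overwrite-destination=true neo4j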
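
If the archive was joined or extracted incorrectly, the load step fails only after the earlier steps have run. The following is a small sketch of a guard that confirms the dump file exists and is non-empty before loading. The helper name dump_ready is hypothetical, and the default database name neo4j is an assumption based on the dump file name neo4j.dump.

```shell
#!/usr/bin/env bash
# Sketch: check that <dir>/<database>.dump exists and is non-empty before loading.
# dump_ready is a hypothetical helper; the default database name "neo4j" is an
# assumption based on the dump file name neo4j.dump.
dump_ready() {
  local dir="${1%/}" db="${2:-neo4j}"
  if [ -s "$dir/$db.dump" ]; then
    echo "ok: $dir/$db.dump"
  else
    echo "not found or empty: $dir/$db.dump" >&2
    return 1
  fi
}
```

Run it as dump_ready /data/neo4j-data after the unrar step, and continue to the load step only if it succeeds.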