a pretty-committed wikipedia markup parser
MIT License
Published by spencermountain almost 3 years ago
Published by spencermountain almost 3 years ago
thank you @wvanderp
Published by spencermountain about 3 years ago
Published by spencermountain over 3 years ago
Tldr:
.templates() now returns Template objects, instead of json
.fetch() (see detail)
detail:
[breaking] - .templates() now returns Template objects, like other methods (call .json())
[breaking] - change interpretation of reversed params in .fetch() method (thanks wouter!)
[breaking] - change params for custom templates
[breaking] - move .random() and .category() to plugin-api
[breaking] - always return an array for plural methods, even with number param, like .links(3)
[possibly-breaking] - cleanup null|undefined responses from methods
[possibly-breaking] - remove .dates() method (prev deprecated)
[possibly-breaking] - require node 10, ie > 11
[change] - normalize table rows
[change] - move wiktionary templates to wtf-plugin-wiktionary
[change] - Link.text() now returns page
[change] - improvements to 'soft' isDisambiguation detection
[change] - deprecate wtf-plugin-category (move to wtf-plugin-api)
[new] - api plugin
[new] - disambig plugin
[new] - person plugin
[new] - Table.get() method
[new] - set new infoboxes using .extend()
plugin-api 0.0.1
plugin-classify 1.0.0
plugin-disambig 0.0.1
plugin-image 0.3.0
plugin-person 0.2.0
plugin-summary 0.3.0
plugin-wikitext 1.1.0
plugin-wikinews 0.0.1
plugin-wikivoyage 0.0.1
plugin-wiktionary 0.0.1
Published by spencermountain about 4 years ago
fix reference json encoding for mongodb
Published by spencermountain about 4 years ago
Published by spencermountain over 4 years ago
wikidata() method
domain() method
Published by spencermountain over 4 years ago
Published by spencermountain over 4 years ago
fix #260 and #348
Published by spencermountain over 4 years ago
.extend()
.dates() from sentence class (didn't work)
ref-list template, keep otherwise empty ==References== sections
Published by spencermountain over 4 years ago
track changes to covid templates
Published by spencermountain over 4 years ago
bugfix for table parser
Published by spencermountain over 4 years ago
.json() result
.template('foo')
<noinclude>
.url() and .language() methods
Link.href() method
Published by spencermountain over 4 years ago
.html(), .latex(), and .markdown() to their respective plugins
.templates() and .links() return Template and Link objects, and not bare JSON (use .map(l => l.json()))
.fetch()
Image.exists() method to plugin
.extend()
.plaintext() in favour of .text()
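The note above about .templates() and .links() no longer returning bare JSON can be sketched as a small migration helper. This is an illustration only: toPlainJson is a hypothetical name, not part of the library, and the stand-in objects and their { page: ... } shape are assumptions made so the snippet runs without the package installed.

```javascript
// Hypothetical migration helper; not part of wtf_wikipedia itself.
// Takes a list of Template or Link objects (anything with a .json() method)
// and returns plain JSON for each, following the .map(l => l.json()) hint.
function toPlainJson(items) {
  return items.map((item) => item.json())
}

// Stand-in objects so the sketch runs on its own.
// The { page: ... } shape is an assumption, not the library's real output.
const fakeLinks = [
  { json: () => ({ page: 'Toronto' }) },
  { json: () => ({ page: 'Ontario' }) },
]

console.log(toPlainJson(fakeLinks))
```

With a parsed document, the same helper would presumably be called as toPlainJson(doc.links()) or toPlainJson(doc.templates()).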
Published by spencermountain over 5 years ago
.json()
Published by spencermountain almost 6 years ago
.paragraphs()
.json(). cleaning-up redundant data. ⚠️
templates data (found in section) - resume it with {templates:true}
coordinates data (found in templates) - resume it with {coordinates:true}
citations data (found in section) - resume it with {citations:true}
.json() again ¯_(:/)_ /¯
options.title for sections to options.headers
.citations() --> .citations().map(c => c.json());
.wikitext() and .reparse() methods - keeping wikitext stateful caused too many issues
Image.file into a function
interwiki() results in .links()
follow_redirects option to fetch
upload.wikimedia.org/wikipedia/commons to wikipedia.org/wiki/Special:Redirect/file/ via #86
{{foo start}}...{{foo end}} templates
[] for some more section properties in .json() response
Published by spencermountain about 6 years ago
last stable release before v6
from changelog:
5.1.0
improved support for gallery tag
more support for wiktionary grammar templates
tweak some regexes
5.2.0
make .json() results return proper json for tables
5.3.0
add infobox html back into html output (tentative)
redirect support in .json(), .html() output
remove empty [] properties in .json() results (saves disk space!)
keep # anchor data in .links()
show links default-on in latex output, like in md and html
render html/latex/json 'soft redirect', instead of blank pages
Published by spencermountain about 6 years ago
.parse() to main wtf() method
fetch() method (when available)
wtf_wikipedia Toronto --plaintext
.wikitext() method to Document, Section, Sentence (thanks @niebert)
.templates() are now an ordered array, instead of an object, and include infoboxes and citations
.get('key') method to Infobox class
date data from Sentence to Section object
.options param in parser (but keep options param for output methods)
.links() results
sentences.bolds(0), and the like
section(0).wikitext()
<gallery> tag support in .images()
Table class and List classes
col1 instead of col-0
options.verbose_template for debugging
Published by spencermountain over 7 years ago
breaking change with 0x, sections are now formatted as an array of objects, with depth information.
tables are parsed into an array of key-value pairs.
options object is removed.
all is refactored