Extract the main article from a given URL with Node.js
MIT License
Published by ndaidong 6 months ago
Related issues: #386, #320
Thanks to @martinrotter for the advice 🤝
Published by ndaidong 7 months ago
Related issue: #382
Published by ndaidong 8 months ago
Published by ndaidong 9 months ago
Published by ndaidong 11 months ago
Published by ndaidong about 1 year ago
Published by ndaidong about 1 year ago
Use childNodes instead of children to get the same behaviour as Deno DOM
Published by ndaidong about 1 year ago
Try to fix the issues: #359 #360
Published by ndaidong about 1 year ago
Related issues: #345 #357
Published by ndaidong over 1 year ago
Published by ndaidong over 1 year ago
signal
Example with signal
import { extract } from '@extractus/article-extractor'
const url = 'https://www.cnbc.com/2022/09/21/what-another-major-rate-hike-by-the-federal-reserve-means-to-you.html'
// Abort the request if it has not finished within 5 seconds
const article = await extract(url, null, {
  signal: AbortSignal.timeout(5000),
})
console.log(article)
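The example above relies on a fixed timeout. When the caller also needs to cancel by hand, one option is to build the signal from an AbortController. The following is a minimal sketch using only Node.js built-ins; timeoutWithCancel is a hypothetical helper, not part of article-extractor, and the commented extract call simply mirrors the example above.

```javascript
// Hypothetical helper (not part of article-extractor): returns a signal that
// aborts after `ms` milliseconds, plus a cancel() function to abort earlier.
function timeoutWithCancel(ms) {
  const controller = new AbortController()
  const timer = setTimeout(() => {
    controller.abort(new Error(`Timed out after ${ms} ms`))
  }, ms)
  // Whichever abort happens first, stop the pending timer
  controller.signal.addEventListener('abort', () => clearTimeout(timer), { once: true })
  return {
    signal: controller.signal,
    cancel: () => controller.abort(new Error('Cancelled by caller')),
  }
}

const { signal, cancel } = timeoutWithCancel(5000)

// Assumed usage, mirroring the example above:
//   const article = await extract(url, null, { signal })

cancel() // for instance, when the result is no longer needed
console.log(signal.aborted) // true
```

Clearing the timer on abort keeps the process from staying alive for the full timeout after a manual cancel.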
Published by ndaidong over 1 year ago
agent
Published by ndaidong over 1 year ago
Add agent to fetchOptions
Example article extraction via proxy server with agent
import { extract } from '@extractus/article-extractor'
import { HttpsProxyAgent } from 'https-proxy-agent'
// Placeholder proxy URL: substitute your own credentials and host
const proxy = 'http://abc:password@proxy.example.com:31113'
const url = 'https://www.cnbc.com/2022/09/21/what-another-major-rate-hike-by-the-federal-reserve-means-to-you.html'
// Route the request through the proxy by passing an agent in fetchOptions
const article = await extract(url, {}, {
  agent: new HttpsProxyAgent(proxy),
})
console.log('Run article-extractor with proxy:', proxy)
console.log(article)
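Proxy credentials often contain characters such as @ or : that break a hand-written proxy URL. As a rough sketch, the URL can be assembled with the built-in URL class, which percent-encodes the username and password. buildProxyUrl and all values below are hypothetical and only illustrate the idea; the resulting string would be passed to HttpsProxyAgent as in the example above.

```javascript
// Hypothetical helper (not part of article-extractor or https-proxy-agent):
// assembles a proxy URL and lets the URL class percent-encode the credentials.
function buildProxyUrl({ protocol = 'http', username, password, host, port }) {
  const url = new URL(`${protocol}://${host}:${port}`)
  url.username = username // the setter escapes reserved characters
  url.password = password
  return url.href
}

// Placeholder values for illustration only
const proxy = buildProxyUrl({
  username: 'abc',
  password: 'p@ss:word',
  host: 'proxy.example.com',
  port: 31113,
})

// Assumed usage, mirroring the example above:
//   const article = await extract(url, {}, { agent: new HttpsProxyAgent(proxy) })
console.log(proxy)
```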
Published by ndaidong over 1 year ago
Published by ndaidong over 1 year ago
string-similarity
Published by ndaidong over 1 year ago
Published by ndaidong over 1 year ago
Published by ndaidong over 1 year ago
parserOptions is null
Published by ndaidong over 1 year ago