Web scraping made simple.
yarn add tokio
const Tokio = require('tokio')

const tokio = new Tokio({
  url: 'https://some-website.com'
})

tokio.fetch().then(html => {
  console.log(html) //=> string
  // Query HTML with cheerio (server-side jQuery)
  // https://github.com/cheeriojs/cheerio
  const $ = tokio.query(html)
})
options.url
Type: string (required)
The URL to fetch.
options.wait
Type: number | string
Default: 50
Wait for a given time (in milliseconds) or for a DOM element to show up.
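As a rough illustration of the two forms options.wait can take, the sketch below builds one options object per form. The value '#app' is an assumption: the text above only says "dom element", so treating the string as a CSS selector is a guess, and both variable names are illustrative.

```javascript
// Wait a fixed 500 ms before the page is captured.
const waitForTime = {
  url: 'https://some-website.com',
  wait: 500
}

// Wait until an element shows up. '#app' is an assumed CSS selector.
const waitForElement = {
  url: 'https://some-website.com',
  wait: '#app'
}
```

Either object would then be passed to new Tokio(...) in place of the options shown in the usage example.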
Type: boolean | string
Instead of using options.wait, you can manually call window.__tokio_ready__() in your website to signal that the page is ready to be captured.
This option can also be a string such as i_am_ready, in which case you call window.i_am_ready() instead.
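On the website side, the manual signal could look something like the sketch below. signalReady is a hypothetical helper, not part of tokio, and it assumes the ready function only exists on window when the page is loaded by the scraper, so it checks before calling.

```javascript
// Page-side sketch: call the ready hook once the app has finished rendering.
// `signalReady` is a hypothetical helper, not part of tokio.
function signalReady(win, hookName = '__tokio_ready__') {
  const hook = win[hookName]
  // Assumption: the hook is absent in a regular browser session,
  // so only call it when it is actually a function.
  if (typeof hook === 'function') {
    hook()
    return true
  }
  return false
}

// In the browser you would pass the real `window`, for example:
// signalReady(window)               // calls window.__tokio_ready__()
// signalReady(window, 'i_am_ready') // calls window.i_am_ready()
```

Returning a boolean keeps the helper harmless outside the scraper: the page can call it unconditionally after rendering.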
Type: resource => boolean
Whether to load a given resource. See the resource type for details.
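A filter of the shape resource => boolean might be sketched as follows. The shape of the resource object is not described here, so resource.url.pathname is an assumption, and shouldLoad and SKIPPED_EXTENSIONS are illustrative names; check the resource type before relying on any of this.

```javascript
// Hypothetical filter: skip stylesheets, images, and fonts; load everything else.
// Assumes `resource.url.pathname` exists, which this document does not confirm.
const SKIPPED_EXTENSIONS = ['.css', '.png', '.jpg', '.gif', '.woff', '.woff2']

function shouldLoad(resource) {
  const pathname = (resource.url && resource.url.pathname) || ''
  // Return false for any path ending in a skipped extension.
  return !SKIPPED_EXTENSIONS.some(ext => pathname.toLowerCase().endsWith(ext))
}
```

Skipping assets that do not affect the rendered HTML is one plausible use of such a filter; resources without a recognizable path are loaded by default.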
proxy: string. A URL for an HTTP proxy to use for the requests.
agent: http(s).Agent instance to use.
agentOptions: The agent options; defaults to { keepAlive: true, keepAliveMsecs: 115000 }, see http api for more details.
strictSSL: If true, requires SSL certificates be valid; defaults to true, see request module for more details.
userAgent: The user agent string used in requests; defaults to Node.js (#process.platform#; U; rv:#process.version#)
headers: An object giving any headers that will be used while loading the HTML from options.url, if applicable.

Inject variables to the global scope window.
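To make the documented request defaults concrete, the sketch below spells them out as plain values. It assumes the #process.platform# and #process.version# placeholders stand for Node's process.platform and process.version; the Accept-Language header is an example only, requestOptions is an illustrative name, and where these keys sit inside the constructor options is not specified here.

```javascript
// Defaults as documented for agentOptions and strictSSL.
const defaultAgentOptions = { keepAlive: true, keepAliveMsecs: 115000 }

// Assumption: the placeholders map to process.platform and process.version.
const defaultUserAgent = `Node.js (${process.platform}; U; rv:${process.version})`

const requestOptions = {
  strictSSL: true, // require valid SSL certificates
  agentOptions: defaultAgentOptions,
  userAgent: defaultUserAgent,
  headers: { 'Accept-Language': 'en-US' } // example header, not a default
}
```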
tokio.fetch()
Type: () => Promise<string>
Fetch the URL and return the corresponding HTML. (JavaScript on the page will be evaluated.)
tokio.query is basically cheerio.load(html, opts).
git checkout -b my-new-feature
git commit -am 'Add some feature'
git push origin my-new-feature
tokio © egoist, Released under the MIT License. Authored and maintained by egoist with help from contributors (list).
github.com/egoist · GitHub @egoist · Twitter @_egoistlily