Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Both headful and headless mode. With proxy rotation.
APACHE-2.0 License
Published by apify-service-account 10 months ago
retryOnBlocked doesn't override the blocked HTTP codes (#2243) (81672c3)
--no-purge (#2244) (83f3179)
Published by apify-service-account 11 months ago
skipNavigation option to enqueueLinks (#2153) (118515d)
--no-sandbox flag for webkit launcher (#2148) (1eb2f08), closes #1797
RequestList.open() + improve docs (#2158) (c5a1b07)
Published by apify-service-account about 1 year ago
inProgress cache when delaying requests via sameDomainDelaySecs (#2045) (f63ccc0)
RequestQueue instance (845141d), closes #2043
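To illustrate what delaying requests per domain involves, here is a minimal sketch of the idea behind an option like sameDomainDelaySecs. It is not Crawlee's implementation: the class name SameDomainThrottle and its reserve method are hypothetical, and the sketch only computes how long a caller should wait, leaving the actual waiting and queueing out.

```typescript
// Hypothetical helper for illustration only; not part of Crawlee.
// It tracks, per hostname, the earliest time the next request may start.
class SameDomainThrottle {
    private readonly nextAllowedAt = new Map<string, number>();
    private readonly delayMs: number;

    constructor(delaySecs: number) {
        this.delayMs = delaySecs * 1000;
    }

    // Reserves a slot for `url` and returns how many milliseconds the caller
    // should wait before sending the request. `nowMs` is passed in explicitly
    // so the logic stays deterministic and easy to test.
    reserve(url: string, nowMs: number): number {
        const host = new URL(url).hostname;
        const allowedAt = this.nextAllowedAt.get(host) ?? nowMs;
        const startAt = Math.max(allowedAt, nowMs);
        // The next request to the same host must wait a full delay after this one.
        this.nextAllowedAt.set(host, startAt + this.delayMs);
        return startAt - nowMs;
    }
}

const throttle = new SameDomainThrottle(2);
const firstWait = throttle.reserve("https://example.com/a", 1000);   // 0: first request to this host
const secondWait = throttle.reserve("https://example.com/b", 1500);  // 1500: must wait until t = 3000
const otherWait = throttle.reserve("https://other.example/", 1500);  // 0: different host, no delay
console.log({ firstWait, secondWait, otherWait });
```

Requests that are being held back this way still need to be tracked somewhere so they are neither lost nor handed out twice, which is the kind of bookkeeping the inProgress cache entry above refers to.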
Published by apify-service-account about 1 year ago
vitest (#2004) (d2e098c), closes #1999
requestsFromUrl) to the queue in batches (418fbf8), closes #1995
enqueueLinks explicitly provided via urls option (#2014) (cbd9d08), closes #2005
closeCookieModals context helper for Playwright and Puppeteer (#1927) (98d93bb)
sameDomainDelaySecs (#2003) (e796883), closes #1993
RequestQueue.addBatchedRequests() in enqueueLinks helper (4d61ca9), closes #1995
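Two entries in this release mention adding requests to the queue in batches. As a rough sketch of that general technique, and not of how RequestQueue.addBatchedRequests() is actually implemented, the snippet below splits a list of URLs into fixed-size chunks. The function name toBatches and the batch size of 2 are assumptions chosen for illustration.

```typescript
// Hypothetical helper for illustration only; not part of Crawlee.
// Yields consecutive slices of `items`, each at most `batchSize` long.
function* toBatches<T>(items: readonly T[], batchSize: number): Generator<T[]> {
    if (!Number.isInteger(batchSize) || batchSize < 1) {
        throw new RangeError("batchSize must be a positive integer");
    }
    for (let i = 0; i < items.length; i += batchSize) {
        yield items.slice(i, i + batchSize);
    }
}

const urls = [
    "https://example.com/1",
    "https://example.com/2",
    "https://example.com/3",
    "https://example.com/4",
    "https://example.com/5",
];

// Each batch could be handed to a queue separately, so a crawler can start
// processing the first batch while later ones are still being added.
const batches = [...toBatches(urls, 2)];
console.log(batches.map((batch) => batch.length)); // [ 2, 2, 1 ]
```

Processing in chunks keeps any single queue operation small, at the cost of the full list not being available in the queue immediately.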
Published by apify-service-account over 1 year ago
<base> when enqueuing (#1936) (aeef572)
Published by B4nan over 1 year ago
RequestQueue (#1899) (063dcd1)
SessionPool (#1881) (db069df)
parseWithCheerio helper to HttpCrawler (#1906) (ff5f76f)
Published by B4nan over 1 year ago
runScripts are enabled (806de31)
enqueueLinks in http crawlers when parsing fails (fd35270)