scrapy-scraper

Web crawler and scraper based on Scrapy and Playwright's headless browser.

MIT License

Downloads: 665
Stars: 9
Committers: 1


scrapy-scraper - v2.1 Latest Release

Published by ivan-sincek about 1 month ago

v2.1 release notes:

  • minor code refactoring,
  • fixed the random sleep time between two consecutive requests by adding a new -rs / --random-sleep option,
  • fixed a few other minor bugs,
  • updated the user agents list.
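
The notes do not say how the -rs / --random-sleep option is implemented. As a rough sketch, assuming it maps onto Scrapy's own DOWNLOAD_DELAY and RANDOMIZE_DOWNLOAD_DELAY settings (with randomization enabled, Scrapy waits between 0.5 and 1.5 times DOWNLOAD_DELAY), the relevant settings could be built as below. The helper names build_sleep_settings and sleep_bounds are hypothetical and not part of scrapy-scraper:

```python
def build_sleep_settings(sleep_seconds: float, random_sleep: bool) -> dict:
    """Hypothetical helper: Scrapy settings for a delay between consecutive requests."""
    if sleep_seconds < 0:
        raise ValueError("sleep_seconds must be non-negative")
    return {
        # Real Scrapy setting: base delay (seconds) between requests to the same site.
        "DOWNLOAD_DELAY": float(sleep_seconds),
        # Real Scrapy setting: when True, the actual wait is randomized.
        "RANDOMIZE_DOWNLOAD_DELAY": bool(random_sleep),
    }


def sleep_bounds(settings: dict) -> tuple:
    """Return the (min, max) wait implied by the settings above."""
    delay = settings["DOWNLOAD_DELAY"]
    if settings["RANDOMIZE_DOWNLOAD_DELAY"]:
        # Scrapy draws the wait from 0.5 * DOWNLOAD_DELAY to 1.5 * DOWNLOAD_DELAY.
        return (0.5 * delay, 1.5 * delay)
    return (delay, delay)


settings = build_sleep_settings(2.0, random_sleep=True)
low, high = sleep_bounds(settings)
```

A dictionary like this could be passed to a Scrapy crawler as its settings; whether scrapy-scraper does so internally is an assumption.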

Web crawler and scraper written in Python and based on Scrapy and Playwright's headless browser.

scrapy-scraper - v1.7

Published by ivan-sincek 5 months ago

v1.7 release notes:

  • removed redundant argparse dependency,
  • updated the user agents list,
  • version bump.


scrapy-scraper - v1.6

Published by ivan-sincek 7 months ago

v1.6 release notes:

  • auto throttling bug fix and value changes for the -at option,
  • added a new -s option for sleeping between two consecutive requests to the same domain,
  • updated the user agents list,
  • version bump.
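
For context, Scrapy exposes auto throttling and a fixed per-domain delay through its settings. The sketch below is an assumption about how the -at and -s values might translate into those settings; the release notes do not specify the value semantics of -at, so treating it as a target concurrency is a guess, and build_throttle_settings is a hypothetical name:

```python
from typing import Optional


def build_throttle_settings(
    sleep_seconds: float = 0.0,
    target_concurrency: Optional[float] = None,
) -> dict:
    """Hypothetical helper: combine a fixed delay with Scrapy's AutoThrottle."""
    settings = {
        # Real Scrapy setting: AutoThrottle never goes below this delay.
        "DOWNLOAD_DELAY": float(sleep_seconds),
        # Real Scrapy setting: turns the AutoThrottle extension on or off.
        "AUTOTHROTTLE_ENABLED": target_concurrency is not None,
    }
    if target_concurrency is not None:
        if target_concurrency <= 0:
            raise ValueError("target_concurrency must be positive")
        # Real Scrapy setting: average number of parallel requests per remote site.
        settings["AUTOTHROTTLE_TARGET_CONCURRENCY"] = float(target_concurrency)
    return settings


throttle_settings = build_throttle_settings(sleep_seconds=1.5, target_concurrency=0.5)
```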


scrapy-scraper - v1.5

Published by ivan-sincek 7 months ago

v1.5 release notes:

  • install instructions update,
  • updated the user agents list,
  • version bump.


scrapy-scraper - v1.4

Published by ivan-sincek 10 months ago

v1.4 release notes:

  • rebased the argument parsing code and improved a few other things,
  • boolean options / arguments no longer require a yes or no value,
  • the directory option -d has been renamed to -dir.
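
To illustrate what a value-less boolean option looks like, here is a minimal sketch using Python's standard argparse module. It is not the project's actual parser: the long name --directory and the --debug flag are assumptions made for the example; only -dir comes from the notes above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser showing a value option and a value-less boolean flag."""
    parser = argparse.ArgumentParser(prog="scrapy-scraper")
    # -dir takes a value; the long form --directory is assumed for illustration.
    parser.add_argument("-dir", "--directory", default=".", help="output directory")
    # store_true makes the flag boolean: present means True, absent means False,
    # so no explicit yes / no value is needed. --debug is a hypothetical flag.
    parser.add_argument("--debug", action="store_true", help="enable debug output")
    return parser


args = build_parser().parse_args(["-dir", "results", "--debug"])
defaults = build_parser().parse_args([])
```

With this pattern, passing the flag alone sets it to True and omitting it leaves it False.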


scrapy-scraper - v1.1

Published by ivan-sincek 11 months ago
