Scrape

CLI utility to scrape emails from websites

Features

Asynchronous scraping
Recursive link follow
External link follow
Cloudflare email obfuscation decoding
Support for client-side rendered pages by waiting for them to load in headless Chromium
Simple, grepable output
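
To make the Cloudflare decoding feature concrete, here is a minimal Go sketch of the commonly documented scheme: the hex payload in a `data-cfemail` attribute starts with a one-byte XOR key, and each following byte is one character of the address XORed with that key. The function name `decodeCFEmail` and the sample payload are illustrative assumptions, not code taken from this tool.

```go
package main

import (
	"encoding/hex"
	"errors"
	"fmt"
)

// decodeCFEmail is a hypothetical helper that decodes a Cloudflare
// data-cfemail payload. The first byte is the XOR key; every following
// byte is one character of the address XORed with that key.
func decodeCFEmail(encoded string) (string, error) {
	raw, err := hex.DecodeString(encoded)
	if err != nil {
		return "", err
	}
	if len(raw) < 2 {
		return "", errors.New("payload too short")
	}
	key := raw[0]
	out := make([]byte, len(raw)-1)
	for i, b := range raw[1:] {
		out[i] = b ^ key
	}
	return string(out), nil
}

func main() {
	// Sample payload built by hand for this sketch: key 0x42, address "a@b.c".
	email, err := decodeCFEmail("422302206c21")
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(email) // prints "a@b.c"
}
```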

Install

macOS:

brew tap lawzava/scrape https://github.com/lawzava/scrape
brew install scrape

Linux:
```
sudo snap install scrape
```

Usage

Sample call:

scrape -w https://lawzava.com

If --js is used, chromium or google-chrome must be available in PATH.

Parameters:

      --async             Scrape website pages asynchronously (default true)
      --debug             Print debug logs
  -d, --depth int         Max depth to follow when scraping recursively (default 3)
      --follow-external   Follow external 3rd party links within website
  -h, --help              help for scrape
      --js                Enables JavaScript execution await
      --output string     Output type to use (supported: 'plain', 'csv', 'json') (default "plain")
      --output-with-url   Adds URL to output with each email
      --recursively       Scrape website recursively (default true)
      --timeout int       If > 0, specify a timeout (seconds) for js execution await
  -w, --website string    Website to scrape (default "https://lawzava.com")
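
To illustrate how --depth and --follow-external could interact, here is a small Go sketch of a depth-limited, breadth-first walk over an in-memory link graph. It is not the tool's implementation: the `crawl` and `sameHost` helpers, the sample URLs, and the choice to count the start page as depth 0 are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"net/url"
)

// sameHost reports whether link points at the same host as base.
func sameHost(base, link string) bool {
	b, err1 := url.Parse(base)
	l, err2 := url.Parse(link)
	return err1 == nil && err2 == nil && b.Host == l.Host
}

// crawl is a hypothetical helper. It visits pages breadth-first from start,
// going at most maxDepth links away (the start page counts as depth 0, an
// assumption). Links to other hosts are skipped unless followExternal is set.
func crawl(links map[string][]string, start string, maxDepth int, followExternal bool) []string {
	type item struct {
		url   string
		depth int
	}
	seen := map[string]bool{start: true}
	queue := []item{{start, 0}}
	var visited []string
	for len(queue) > 0 {
		cur := queue[0]
		queue = queue[1:]
		visited = append(visited, cur.url)
		if cur.depth == maxDepth {
			continue // do not expand links beyond the depth limit
		}
		for _, next := range links[cur.url] {
			if seen[next] || (!followExternal && !sameHost(start, next)) {
				continue
			}
			seen[next] = true
			queue = append(queue, item{next, cur.depth + 1})
		}
	}
	return visited
}

func main() {
	// Hypothetical in-memory site; a real crawler would fetch each page.
	links := map[string][]string{
		"https://example.com/":      {"https://example.com/about", "https://other.test/"},
		"https://example.com/about": {"https://example.com/team"},
		"https://example.com/team":  {"https://example.com/contact"},
	}
	for _, page := range crawl(links, "https://example.com/", 2, false) {
		fmt.Println(page)
	}
}
```

With a depth of 2 and external links disabled, the walk visits the start page, /about, and /team, and skips both the external host and /contact.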

Note about scraper package

For those looking for the scraper package: this repository was intended for CLI use only, so the scraper package was moved to lawzava/emailscraper. The scrape utility will be maintained as a CLI implementation of the emailscraper package.

Related Projects

urlgrab

A golang utility to spider through a website searching for additional links.

02 Jul 2020 327

emailscraper

Minimalistic library to scrape emails from websites with headless browser support.

06 Mar 2021 20

flyscrape

Flyscrape is a command-line web scraping tool designed for those without advanced programming ski...

28 Aug 2023 1,035

goscrape

Web scraper that can create an offline readable version of a website

13 Feb 2017 174