Description

RobotScraper is an open-source Python script that scrapes and analyzes the robots.txt file of a specified domain. It identifies the directories and pages that the file allows or disallows, and it can optionally save the results. It is useful for web security researchers, SEO analysts, and anyone interested in examining a website's structure and access rules.

Requirements

Python 3.x
requests package
beautifulsoup4 package

Installation

Clone the repository:

git clone https://github.com/robotshell/robotScraper
cd robotScraper

Install the required Python packages:
```
pip install requests beautifulsoup4
```

Usage

Run RobotScraper with the following command syntax, where domain is the target domain and the optional -s flag saves the results to the given file:

python robotScraper.py domain [-s output.txt]
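
To illustrate the kind of analysis described above, here is a minimal sketch of collecting Allow and Disallow paths from a robots.txt file. It is not RobotScraper's actual implementation: the function names `parse_robots` and `fetch_robots` are hypothetical, the sketch uses the standard library's `urllib` instead of requests, it assumes the site is served over HTTPS, and it ignores User-agent grouping for brevity.

```python
from urllib.request import urlopen


def parse_robots(text):
    """Collect Allow and Disallow paths from robots.txt content.

    Hypothetical helper for illustration; rules from all User-agent
    groups are merged into two flat lists.
    """
    rules = {"allow": [], "disallow": []}
    for raw_line in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        field, value = line.split(":", 1)
        field = field.strip().lower()
        value = value.strip()
        # Skip other fields (User-agent, Sitemap, ...) and empty values;
        # an empty Disallow means nothing is disallowed.
        if field in rules and value:
            rules[field].append(value)
    return rules


def fetch_robots(domain, timeout=10):
    """Download https://<domain>/robots.txt and return it as text.

    Hypothetical helper; assumes the domain is reachable over HTTPS.
    """
    with urlopen(f"https://{domain}/robots.txt", timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    # example.com is a placeholder; only scan domains you are allowed to test.
    found = parse_robots(fetch_robots("example.com"))
    for kind, paths in found.items():
        for path in paths:
            print(f"{kind}: {path}")
```

Keeping the parser separate from the download step means the parsing logic can be exercised on a saved robots.txt file without any network access.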

Disclaimer

This tool is intended for educational and research purposes only. The author and contributors are not responsible for any misuse of it. Use this tool responsibly and only on systems for which you have explicit permission. Unauthorized access to systems, networks, or data is illegal and unethical. Always obtain proper authorization before conducting any activity that could impact other users or systems.
