IMDB web scraper using Scrapy framework. Flask server for data visualization
Scrapy is a Python framework for scraping data and crawling websites. I created various crawlers to learn Scrapy and improve my Python skills.
This repository contains various Scrapy demo spiders.
It also contains a simple HTTP server for viewing the data scraped by the spiders.
The spiders save their data to an SQLite3 database. The website queries data from the database.
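As a rough illustration of how a spider's items could end up in SQLite, here is a minimal sketch of a Scrapy item pipeline. The class name, the movies.db file name, and the movies table with title and year columns are all assumptions made for this example; the actual schema in this repository may differ.

```python
import sqlite3


class SQLitePipeline:
    """Sketch of a Scrapy item pipeline that stores items in SQLite."""

    def __init__(self, db_path="movies.db"):  # hypothetical file name
        self.db_path = db_path
        self.conn = None

    def open_spider(self, spider):
        # Called once when the spider starts: open the database
        # and make sure the (assumed) table exists.
        self.conn = sqlite3.connect(self.db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS movies "
            "(title TEXT PRIMARY KEY, year INTEGER)"
        )

    def process_item(self, item, spider):
        # Called for every scraped item; re-scraping the same title
        # overwrites the old row instead of failing.
        self.conn.execute(
            "INSERT OR REPLACE INTO movies (title, year) VALUES (?, ?)",
            (item["title"], item["year"]),
        )
        self.conn.commit()
        return item

    def close_spider(self, spider):
        # Called once when the spider finishes.
        self.conn.close()
```

A pipeline like this only runs if it is listed under ITEM_PIPELINES in the project's settings.py.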
I recommend using virtualenv to isolate your project dependencies.
Install virtualenv
sudo pip3 install --user virtualenv
Use sudo -H with newer versions of pip
Create a new virtual environment with virtualenv
virtualenv env
Activate the virtual environment
source env/bin/activate
Install the package dependencies
pip install -r requirements.txt
Run a spider using the name defined within the class
scrapy crawl movies
movies
quotes
books
Run Scrapy interactively to test HTML selectors
scrapy shell [url]
response.css('div.summary::text').get()
Start the Flask server (on Linux or macOS, use export instead of set)
set FLASK_APP=server
set FLASK_ENV=development
flask run
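Since FLASK_APP=server points Flask at a module named server, a minimal version of that module might look like the sketch below. The /movies route, the movies.db file name, and the movies table with title and year columns are assumptions for illustration; the real server in this repository may be organised differently.

```python
import sqlite3

from flask import Flask, jsonify

app = Flask(__name__)
# Hypothetical database file; adjust to wherever the spiders write.
app.config["DB_PATH"] = "movies.db"


@app.route("/movies")
def list_movies():
    # Open a short-lived connection per request and return rows as JSON.
    conn = sqlite3.connect(app.config["DB_PATH"])
    conn.row_factory = sqlite3.Row  # lets each row be converted to a dict
    try:
        rows = conn.execute(
            "SELECT title, year FROM movies ORDER BY title"
        ).fetchall()
    finally:
        conn.close()
    return jsonify([dict(row) for row in rows])
```

With the server running, the route above would be reachable at http://127.0.0.1:5000/movies, Flask's default development address.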