======================
Anime Related Crawlers
======================

A collection of anime-related crawlers for personal use.

Supported sites:

- Image crawler for danbooru.donmai.us and deviantart.com
- File crawler for sakugabooru.com
- Torrent crawler for nyaa.si, share.dmhy.org, acg.rip, and bangumi.moe
- Anime information crawler for bangumi.tv

Development
===========

Structure
---------

.. code-block::

    .
    ├── Pipfile             # Python package management
    ├── README.rst
    ├── Pipfile.lock
    ├── scrapy.cfg          # scrapy config file
    ├── anime_spiders       # Spiders
    ├── manage.py           # Django manage.py
    ├── exhibition          # Django backend application
    ├── db.sqlite3
    ├── package.json        # Frontend package management
    ├── package-lock.json
    ├── node_modules        # Frontend dependencies
    ├── index.html          # index.html of frontend
    ├── src                 # Frontend application source
    ├── build               # Frontend build
    ├── config              # Frontend code build configs
    ├── dist                # Distribution code of frontend
    └── static              # Frontend related static files

Installation & Running
----------------------

- Run the frontend: ``npm run dev``
- Run the backend: ``./manage.py runserver``
- Run a spider:

  - Start an Elasticsearch server at ``192.168.2.10``.
  - Install the requirements: ``pipenv install && pipenv install --dev && pipenv shell``.
  - Run the spider: ``scrapy crawl [spider_name]``.
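
The spider steps above can be wrapped in a small helper that refuses to start a crawl when Elasticsearch is unreachable. This is only a sketch, not part of the project: the port ``9200`` is assumed because it is the Elasticsearch default HTTP port, and the function names ``is_port_open``, ``build_crawl_command``, and ``crawl`` are hypothetical.

.. code-block:: python

    import socket
    import subprocess

    ES_HOST = "192.168.2.10"  # address given in the steps above
    ES_PORT = 9200            # assumption: Elasticsearch default HTTP port


    def is_port_open(host, port, timeout=3.0):
        """Return True if a TCP connection to host:port succeeds."""
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            return False


    def build_crawl_command(spider_name):
        """Build the argument list for ``scrapy crawl <spider_name>``."""
        return ["scrapy", "crawl", spider_name]


    def crawl(spider_name):
        """Run the spider only if Elasticsearch accepts connections."""
        if not is_port_open(ES_HOST, ES_PORT):
            raise RuntimeError(
                f"Elasticsearch is not reachable at {ES_HOST}:{ES_PORT}"
            )
        # check=True raises CalledProcessError if scrapy exits with an error.
        subprocess.run(build_crawl_command(spider_name), check=True)

Calling ``crawl("nyaa")`` from the repository root, inside the pipenv shell, would then run ``scrapy crawl nyaa``; the spider name ``nyaa`` is a placeholder, so substitute one that the project actually defines.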

Clean code
----------

Use yapf to format Python code::

    yapf -irp -e "./.venv/**" -e "**/migrations/**" **/**.py

Usage
=====

Terminal
--------

Only ``scrapy`` commands are supported for now.

Library
-------

Each spider can also be used like any normal ``scrapy.Spider``.
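
One way to do this is to run a spider from a Python script with Scrapy's ``CrawlerProcess``. The sketch below assumes it is executed from the repository root, so that ``scrapy.cfg`` is found; the helper name ``run_spider`` is hypothetical and not part of this project.

.. code-block:: python

    def run_spider(spider_name, **spider_kwargs):
        """Run one project spider by name and block until it finishes."""
        # Imported inside the function so that this module can be loaded
        # even where Scrapy is not installed.
        from scrapy.crawler import CrawlerProcess
        from scrapy.utils.project import get_project_settings

        # get_project_settings() locates the project through scrapy.cfg.
        process = CrawlerProcess(get_project_settings())
        # A spider name is resolved through the project's spider loader;
        # extra keyword arguments are passed on to the spider.
        process.crawl(spider_name, **spider_kwargs)
        process.start()  # blocks until the crawl is done

For example, ``run_spider("nyaa")`` would run a spider registered under the name ``nyaa``; that name is a placeholder, so use one the project actually defines. Note that ``process.start()`` can only be called once per Python process, because the underlying Twisted reactor cannot be restarted.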

Scrapyd
-------

Not supported yet.
