A website crawler implementation written in PHP. Highly extensible, indexes PDFs, and is very memory efficient.
MIT License
smalot/pdfparser
Published by nadar almost 4 years ago
'maxSize' => false
or increase the limit 'maxSize' => 15000000
(which is 15 MB, for example). The value must be provided in bytes. The main goal is to ensure that the PDF parser does not run into very large memory consumption. This restriction won't stop the Crawler from downloading the URL (whether it is larger than the maxSize definition or not), but it prevents memory leaks when the parsers start to interact with the response content.
Published by nadar almost 4 years ago
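The maxSize behaviour described in this release note can be pictured as a size check that runs after the download and before parsing. The following is only a minimal sketch under stated assumptions: `isWithinMaxSize()` is a hypothetical helper, not part of the crawler's API, and whether the real check treats a body of exactly maxSize bytes as allowed is also an assumption.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not the crawler's actual API): decides whether an
 * already-downloaded response body is small enough to hand to a parser.
 *
 * @param string    $content Raw response body.
 * @param int|false $maxSize Limit in bytes, or false to disable the check.
 */
function isWithinMaxSize(string $content, $maxSize): bool
{
    if ($maxSize === false) {
        // 'maxSize' => false: the limit is switched off entirely.
        return true;
    }

    // strlen() counts bytes, which matches a limit expressed in bytes.
    // Assumption: a body of exactly $maxSize bytes is still accepted.
    return strlen($content) <= $maxSize;
}

// Stand-in for a 2,000,000-byte PDF response body.
$body = str_repeat('x', 2000000);

var_dump(isWithinMaxSize($body, 15000000)); // limit of 15,000,000 bytes
var_dump(isWithinMaxSize($body, 1000000));  // limit of 1,000,000 bytes
var_dump(isWithinMaxSize($body, false));    // limit disabled
```

In this sketch the download has already happened by the time the check runs, which mirrors the note above: the limit guards the parsing step, not the HTTP request.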
Published by nadar about 4 years ago