A proof of concept (PoC) for bulk-searching your PDF files using fuzzy text lookup.
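To illustrate what fuzzy text lookup means here, the sketch below scores a query against every same-length window of a document's text and keeps the documents whose best window is similar enough. It is not this project's implementation: it uses Python's standard `difflib.SequenceMatcher`, and the names `fuzzy_score`, `fuzzy_search`, the `threshold` default, and the sample texts are all assumptions made up for the example.

```python
from difflib import SequenceMatcher


def fuzzy_score(query: str, text: str) -> float:
    """Best similarity (0.0 to 1.0) between query and any same-length window of text."""
    query, text = query.lower(), text.lower()
    if not query or not text:
        return 0.0
    if len(text) <= len(query):
        return SequenceMatcher(None, query, text).ratio()
    window = len(query)
    # Slide a window of the query's length across the text and keep the best ratio.
    return max(
        SequenceMatcher(None, query, text[i:i + window]).ratio()
        for i in range(len(text) - window + 1)
    )


def fuzzy_search(query, documents, threshold=0.8):
    """Return (name, score) pairs scoring at or above threshold, best match first."""
    scored = [(name, fuzzy_score(query, body)) for name, body in documents.items()]
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical extracted text, keyed by file name (not real project data).
    docs = {
        "invoice.pdf": "Payment is due within thirty days of receipt.",
        "manual.pdf": "Press the power button to start the device.",
    }
    # The misspelled query still matches the document containing "receipt".
    print(fuzzy_search("recipt", docs))
```

The sliding window keeps the example short but is slow on long texts; a real index would typically tokenize the text or use a search engine's fuzzy query support instead.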
Requirements
Run this project
1. Clone the project and its submodules: git clone --recurse-submodules https://github.com/HazemBZ/pdf-fuzz
2. Drop a folder of PDF files into the pdf_fuzz_back/assets folder (fewer files means less processing time).
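3. Spin up the containers: docker-compose up
4. Index the database with the PDF contents. The exec command needs the containers to be running, so use a second terminal if docker-compose up is in the foreground: docker-compose exec backend bash -c "python manage.py reindex"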
Update your PDF files
After changing the contents of pdf_fuzz_back/assets, reindex with: docker-compose exec backend bash -c "python manage.py reindex"
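If it is not obvious whether the assets folder has changed since the last reindex, a small helper can tell you. The sketch below is not part of the project: `snapshot` and `needs_reindex` are hypothetical names, and it assumes the PDFs live under pdf_fuzz_back/assets with a lowercase .pdf extension. It only reports whether a reindex is due and leaves running the command above to you.

```python
import hashlib
from pathlib import Path

# Assumed location of the PDFs, taken from the path named in this README.
ASSETS_DIR = "pdf_fuzz_back/assets"


def snapshot(folder):
    """Map each PDF under folder (path relative to it) to a SHA-256 digest of its bytes."""
    root = Path(folder)
    return {
        str(path.relative_to(root)): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*.pdf"))
    }


def needs_reindex(before, after):
    """True when any PDF was added, removed, or modified between two snapshots."""
    return before != after


if __name__ == "__main__":
    before = snapshot(ASSETS_DIR)
    input("Change the PDFs, then press Enter... ")
    if needs_reindex(before, snapshot(ASSETS_DIR)):
        print("PDFs changed: run the reindex command.")
    else:
        print("No changes detected.")
```

Hashing file contents rather than comparing modification times avoids false positives when a file is copied over itself unchanged, at the cost of reading every file.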