Converts a PDF file into a text file while preserving the layout of the original PDF. This is useful, for instance, for extracting the contents of a table in a PDF file. It is implemented as a subclass of the PDFTextStripper class from the Apache PDFBox library.
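To make the idea of "keeping the layout" concrete, here is a minimal sketch in plain Java of how positioned text fragments can be mapped onto a grid of characters so that columns stay aligned. It is not the code of this project and does not use PDFBox: the `LayoutSketch` class, the `Fragment` type, and the fixed `charWidth` and `lineHeight` values are all hypothetical and exist only for illustration. A real stripper would obtain the positions from the PDF and handle fonts, overlaps, and blank lines with far more care.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

// Illustrative sketch only: NOT the PDFLayoutTextStripper implementation.
final class LayoutSketch {

    /** Hypothetical type: a piece of text and the coordinates of its left edge. */
    static final class Fragment {
        final String text;
        final double x; // horizontal position, in page units
        final double y; // vertical position, growing downward (an assumption)

        Fragment(String text, double x, double y) {
            this.text = text;
            this.x = x;
            this.y = y;
        }
    }

    /**
     * Places each fragment at column x / charWidth of row y / lineHeight and
     * pads with spaces so that fragments sharing an x position line up.
     * Rows that contain no fragment are not emitted.
     */
    static String render(List<Fragment> fragments, double charWidth, double lineHeight) {
        // Process fragments left to right so the padding within a row is correct.
        List<Fragment> sorted = new ArrayList<>(fragments);
        sorted.sort(Comparator.comparingDouble(f -> f.x));

        // TreeMap keeps the rows ordered from top to bottom.
        TreeMap<Integer, StringBuilder> rows = new TreeMap<>();
        for (Fragment f : sorted) {
            int row = (int) Math.round(f.y / lineHeight);
            int col = (int) Math.round(f.x / charWidth);
            StringBuilder line = rows.computeIfAbsent(row, k -> new StringBuilder());
            // If earlier text already reaches this column, separate with one space.
            if (line.length() > 0 && line.length() >= col) {
                line.append(' ');
            }
            // Otherwise pad with spaces up to the target column.
            while (line.length() < col) {
                line.append(' ');
            }
            line.append(f.text);
        }
        return String.join("\n", rows.values());
    }
}
```

With the assumed values `charWidth = 6` and `lineHeight = 12`, the cells of a two-column table whose second column starts at x = 60 all land at character column 10, which is what keeps the table readable as plain text.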
APACHE-2.0 License
Add license in source code
Published by JonathanLink about 3 years ago
Update license + Apache PDFBox dependency
Published by JonathanLink almost 6 years ago
Fix a bug related to whitespace (thanks to Dmytro Zelinskyy)
Published by JonathanLink about 7 years ago
Published by JonathanLink about 7 years ago
Published by JonathanLink about 7 years ago
Published by JonathanLink over 7 years ago
PDFLayoutTextStripper for PDFBox >= 2.0
Published by JonathanLink over 7 years ago
PDFLayoutTextStripper for PDFBox < 2.0