PDFLayoutTextStripper

Converts a PDF file into a text file while preserving the layout of the original document. This is useful, for instance, for extracting the contents of a table from a PDF. The class is a subclass of PDFTextStripper from the Apache PDFBox library.
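To illustrate what "keeping the layout" can mean in practice, here is a minimal, self-contained sketch that places text fragments on a fixed-width character grid according to their page coordinates. It is not the library's actual implementation: the names `LayoutSketch`, `Fragment`, and `render`, the `charWidth` and `lineHeight` parameters, and the assumption that `y` is measured downward from the top of the page are all hypothetical and chosen for illustration only.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

// Illustrative sketch only; not the code used by PDFLayoutTextStripper.
class LayoutSketch {

    /** Hypothetical text fragment: its text plus the page coordinates of its left edge. */
    public static final class Fragment {
        public final double x;   // horizontal position in page units
        public final double y;   // vertical position, assumed to grow downward
        public final String text;

        public Fragment(double x, double y, String text) {
            this.x = x;
            this.y = y;
            this.text = text;
        }
    }

    /**
     * Renders fragments as lines of text on a character grid.
     * charWidth and lineHeight are assumed average glyph dimensions in page units.
     */
    public static List<String> render(List<Fragment> fragments,
                                      double charWidth,
                                      double lineHeight) {
        // Sort by grid row first, then left to right within a row.
        List<Fragment> sorted = new ArrayList<>(fragments);
        sorted.sort(Comparator
                .comparingInt((Fragment f) -> (int) Math.round(f.y / lineHeight))
                .thenComparingDouble(f -> f.x));

        // TreeMap keeps rows ordered from top to bottom.
        TreeMap<Integer, StringBuilder> rows = new TreeMap<>();
        for (Fragment f : sorted) {
            int row = (int) Math.round(f.y / lineHeight);
            int col = (int) Math.round(f.x / charWidth);
            StringBuilder line = rows.computeIfAbsent(row, r -> new StringBuilder());
            // Pad with spaces so the fragment starts at its grid column.
            // If earlier text already extends past that column, just append.
            while (line.length() < col) {
                line.append(' ');
            }
            line.append(f.text);
        }

        List<String> lines = new ArrayList<>();
        for (StringBuilder sb : rows.values()) {
            lines.add(sb.toString());
        }
        return lines;
    }
}
```

With `charWidth = 6` and `lineHeight = 12`, the fragments "Name" at (0, 0), "Qty" at (60, 0), "Apple" at (0, 12), and "3" at (60, 12) come out as two lines in which "Qty" and "3" both start at column 10, so the table columns stay aligned in the plain-text output. The sketch skips empty rows and does not handle variable-width fonts; a real stripper has to deal with both.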

Apache-2.0 License

Stars
1.6K

PDFLayoutTextStripper - v2.2.5 Latest Release

Published by JonathanLink about 3 years ago

Add license to source code

PDFLayoutTextStripper - v2.2.4

Published by JonathanLink about 3 years ago

Update license and Apache PDFBox dependency

PDFLayoutTextStripper - v2.2.3

Published by JonathanLink almost 6 years ago

Fix a whitespace-related bug (thanks to Dmytro Zelinskyy)

PDFLayoutTextStripper - v.2.2.2

Published by JonathanLink about 7 years ago

PDFLayoutTextStripper - v.2.2.1

Published by JonathanLink about 7 years ago

PDFLayoutTextStripper - v2.2

Published by JonathanLink about 7 years ago

PDFLayoutTextStripper - v2.1

Published by JonathanLink over 7 years ago

PDFLayoutTextStripper for PDFBox >= 2.0

PDFLayoutTextStripper - v1.0

Published by JonathanLink over 7 years ago

PDFLayoutTextStripper for PDFBox < 2.0