PDFLayoutTextStripper

Converts a PDF file into a text file while preserving the layout of the original PDF. This is useful, for instance, for extracting the contents of a table from a PDF file. The class is a subclass of PDFTextStripper (from the Apache PDFBox library).
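To make "keeping the layout" concrete, here is a minimal, dependency-free sketch of the general idea: text fragments that carry page coordinates are mapped onto a fixed-width character grid, so columns in the PDF stay aligned as columns in the text. This is an illustration only, not the PDFLayoutTextStripper source. The names `LayoutGridSketch`, `Fragment`, `render`, `CHAR_WIDTH` and `LINE_HEIGHT` are hypothetical, and the fixed glyph width and line height are simplifying assumptions.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustration only: NOT the PDFLayoutTextStripper implementation.
// All names and constants here are hypothetical.
final class LayoutGridSketch {

    /** A run of text with the page position of its left edge, in points. */
    static final class Fragment {
        final double x;
        final double y;
        final String text;

        Fragment(double x, double y, String text) {
            this.x = x;
            this.y = y;
            this.text = text;
        }
    }

    // Assumed average glyph width and line height, in points.
    static final double CHAR_WIDTH = 6.0;
    static final double LINE_HEIGHT = 12.0;

    /** Places each fragment at the grid cell derived from its coordinates. */
    static String render(List<Fragment> fragments) {
        // TreeMap keeps rows ordered from top to bottom (y grows downward here).
        Map<Integer, StringBuilder> rows = new TreeMap<>();
        for (Fragment f : fragments) {
            int row = (int) Math.round(f.y / LINE_HEIGHT);
            int col = (int) Math.round(f.x / CHAR_WIDTH);
            StringBuilder line = rows.computeIfAbsent(row, r -> new StringBuilder());
            // Pad with spaces so the fragment starts at its column.
            while (line.length() < col) {
                line.append(' ');
            }
            // Overwrite (or append) the fragment text at that column;
            // this works regardless of the order fragments arrive in.
            line.replace(col, col + f.text.length(), f.text);
        }
        // Rows are joined in order; empty rows between them are not emitted.
        return String.join("\n", rows.values());
    }

    public static void main(String[] args) {
        List<Fragment> table = List.of(
                new Fragment(0, 0, "Item"),
                new Fragment(120, 0, "Price"),
                new Fragment(0, 12, "Tea"),
                new Fragment(120, 12, "3.50"));
        System.out.println(render(table));
    }
}
```

In real use no such grid code is needed. Because the class extends PDFTextStripper, it should be usable wherever a PDFTextStripper is expected, for example by calling the inherited `getText(PDDocument)` on a loaded document. How the document is loaded depends on the PDFBox version.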

Apache-2.0 License

Stars

1.6K

Ecosystems: Java

PDFLayoutTextStripper - v2.2.5 Latest Release

Published by JonathanLink about 3 years ago

Add license in source code

PDFLayoutTextStripper - v2.2.4

Published by JonathanLink about 3 years ago

Update license and Apache PDFBox dependency

PDFLayoutTextStripper - v2.2.3

Published by JonathanLink almost 6 years ago

Fix a whitespace-related bug (thanks to Dmytro Zelinskyy)

PDFLayoutTextStripper - v.2.2.2

Published by JonathanLink about 7 years ago

PDFLayoutTextStripper - v.2.2.1

Published by JonathanLink about 7 years ago

PDFLayoutTextStripper - v2.2

Published by JonathanLink about 7 years ago

PDFLayoutTextStripper - v2.1

Published by JonathanLink over 7 years ago

PDFLayoutTextStripper for PDFBox >= 2.0

PDFLayoutTextStripper - v1.0

Published by JonathanLink over 7 years ago

PDFLayoutTextStripper for PDFBox < 2.0

Package Rankings

Top 15.91% on Repo1.maven.org

Related Projects

docx4j

JAXB-based Java library for Word docx, Powerpoint pptx, and Excel xlsx files

11 May 2012 2,100

OpenPDF

OpenPDF is a free Java library for creating and editing PDF files, with a LGPL and MPL open sourc...

11 Jul 2016 3,303

paper2ebook

Utility to re-structure research papers published in US Letter or A4 format PDF files to typicall...

07 Nov 2010 53

openhtmltopdf

An HTML to PDF library for the JVM. Based on Flying Saucer and Apache PDF-BOX 2. With SVG image s...

04 Nov 2015 1,913

itext-java

iText for Java represents the next level of SDKs for developers that want to take advantage of th...

03 May 2016 1,839

tabula-java

Extract tables from PDF files

22 May 2014 1,731

pdfbox

Mirror of Apache PDFBox

26 Sep 2009 2,632

PDFData

26 Jun 2015 7

pdf-table

Java utility for parsing PDF tabular data using Apache PDFBox and OpenCV

19 Feb 2017 69