This project is a Proof of Concept (PoC) for a PDF Analyzer tool. It provides a graphical user interface for loading PDF files, analyzing their layout, and visualizing the results.
Clone this repository:
git clone https://github.com/yourusername/pdf_analyzer_poc.git
cd pdf_analyzer_poc
Create a virtual environment and activate it:
python -m venv venv
source venv/bin/activate # On Windows, use `venv\Scripts\activate`
Install the required dependencies:
pip install -r requirements.txt
To run the application, execute the following command from the project root directory:
python gui.py
This will launch the graphical user interface, where you can load PDF files, analyze their layout, and visualize the results.
The project consists of the following files:
gui.py: Main entry point of the application; contains the GUI implementation
pdf_processor.py: Handles loading and processing of PDF files
image_analyzer.py: Contains functions for image preprocessing and layout analysis
main.py: Currently unused; may be used for a future command-line interface
requirements.txt: Lists all Python dependencies for the project
test_pdfs/: Directory containing PDF files for testing (ignored in git except for sample.pdf)
This is a Proof of Concept project. For any major changes, please open an issue first to discuss what you would like to change.
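Since main.py is reserved for a possible command-line interface, the following is a minimal sketch of what such an entry point could look like. Everything in it is an assumption made for illustration: the `pdf_path` argument, the `--output-dir` option, and the exit codes are hypothetical, and the analysis step is left as a placeholder because the functions exposed by pdf_processor.py and image_analyzer.py are not documented here.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Build the argument parser for a hypothetical CLI (not part of the project yet)."""
    parser = argparse.ArgumentParser(
        prog="main.py",
        description="Analyze the layout of a PDF file (illustrative sketch).",
    )
    # Hypothetical positional argument: the PDF to analyze.
    parser.add_argument("pdf_path", type=Path, help="Path to the PDF file to analyze")
    # Hypothetical option: where results would be written.
    parser.add_argument(
        "--output-dir",
        type=Path,
        default=Path("output"),
        help="Directory for analysis results (default: output)",
    )
    return parser


def main(argv=None) -> int:
    """Validate the input path and return a process exit code."""
    args = build_parser().parse_args(argv)

    if args.pdf_path.suffix.lower() != ".pdf":
        print(f"error: {args.pdf_path} does not have a .pdf extension")
        return 2
    if not args.pdf_path.is_file():
        print(f"error: {args.pdf_path} does not exist")
        return 1

    # Placeholder: a real implementation would call into pdf_processor.py
    # and image_analyzer.py here; their function names are not assumed.
    print(f"Would analyze {args.pdf_path} and write results to {args.output_dir}")
    return 0
```

To use it as a script, main.py could end with `raise SystemExit(main())` under the usual `if __name__ == "__main__":` guard, and then be run as `python main.py test_pdfs/sample.pdf`.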