This allows you to clean a PDF and it's embedded images from any metadata.
GPL-2.0 License
This allows you to clean a PDF and it's embedded images from any metadata. Please leave any comments in the issues section, and I welcome a security and thorough code review. Thanks and enjoy.
If you are running OSX, you can try the install.sh (coming soon) file to install all the prerequisites. Otherwise manually find the following: QPDF, pdftk, pdfinfo.
the easy way with homebrew
brew update
brew install qpdf
as of writing the easiest way is just to download the latest package that works for El Capitan, when a homebrew version becomes available, I will hopefully update this. download https://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/pdftk_server-2.02-mac_osx-10.11-setup.pkg
the fastest way is to download the precombiled binary from foolabs for xpdf, which includes pdfinfo:
move to your proper downloads folder to stay organized
cd ~/Downloads
get precompiled binary of xpdf which included pdfinfo
curl ftp://ftp.foolabs.com/pub/xpdf/xpdfbin-mac-3.04.tar.gz
untar
tar -xzvf xpdfbin-mac-3.04.tar.gz
move the executables to the proper location
cd xpdfbin-mac-3.04
cd bin64
cp * /usr/local/bin
To run, make sure you have the prereqs installed then drop the .sh file into your bin folder somewhere in your path, I use /usr/local/bin
, so you can do a git clone https://github.com/jamesacampbell/pdf-cleaner.git
into any folder then do a ln -s pdf-cleaner/clean_pdf.sh /usr/local/bin
or cp pdf-cleaner/clean_pdf.sh /usr/local/bin/
Then run sudo chmod +x clean_pdf.sh
to make it executable (assuming you cd to properly location first).
Then simply anywhere in your path where your PDF exists, for example cd ~/Downloads
then run clean_pdf.sh filename.pdf
.
The script saves the original and saves a .clean.pdf version.
I found the original in a github gist and decided to make a project out of it for easy deployment.