xray compares media files by their perceptual hash and identifies dupes
MIT License
xray compares media files by their perceptual hash and identifies dupes.
This means files are not compared byte-for-byte, but by their visual content. It should find duplicates of videos even when they are encoded in different formats, bitrates or resolutions.
xray scans a given directory. Each video file it finds gets analyzed by taking several snapshots of the video content. Those images are then p-hashed, and the hashes are compared against those of every other video it found. If xray considers two videos similar enough, it reports a dupe and shows you a similarity score.
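The comparison step described above can be sketched roughly as follows. This is not xray's actual code, just a minimal illustration in Python. It assumes each snapshot yields a 64-bit hash (the size pHash's DCT image hash produces), that every video has the same number of snapshots, and that similarity is the fraction of matching bits. The function names and the 0.9 threshold are made up for the example.

```python
from typing import Dict, List, Tuple


def hamming_distance(a: int, b: int) -> int:
    """Count the bits that differ between two hashes."""
    return bin(a ^ b).count("1")


def similarity(hashes_a: List[int], hashes_b: List[int], bits: int = 64) -> float:
    """Fraction of matching bits across all snapshot pairs (1.0 = identical).

    Assumption: both videos were sampled at the same number of positions.
    """
    if not hashes_a or len(hashes_a) != len(hashes_b):
        raise ValueError("need the same non-zero number of snapshot hashes")
    differing = sum(hamming_distance(x, y) for x, y in zip(hashes_a, hashes_b))
    return 1.0 - differing / (bits * len(hashes_a))


def find_dupes(
    videos: Dict[str, List[int]], threshold: float = 0.9
) -> List[Tuple[str, str, float]]:
    """Compare every video with every other one and report likely dupes.

    `threshold` is a hypothetical cut-off, not a value taken from xray.
    """
    names = sorted(videos)
    dupes = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            score = similarity(videos[first], videos[second])
            if score >= threshold:
                dupes.append((first, second, score))
    return dupes
```

Comparing every video with every other one is quadratic in the number of files, which is fine for a sketch but worth keeping in mind for large collections.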
When xray thinks it has found a perfect match, it also calculates the sha1sum of both videos to reliably identify exact copies of a file.
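For illustration, the exact-copy check could look something like this in Python. It is only a sketch of the idea, not how xray itself does it. The helper names and the 1 MiB chunk size are assumptions. Files are read in chunks so large videos do not have to fit in memory.

```python
import hashlib


def sha1_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-1 digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"" (EOF).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_exact_copy(path_a: str, path_b: str) -> bool:
    """True if both files have the same SHA-1 digest."""
    return sha1_of_file(path_a) == sha1_of_file(path_b)
```

The digest from sha1_of_file should match what the sha1sum command line tool prints for the same file.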
xray depends on QtCore (Qt >= 5.2), ffmpeg and pHash. You should be able to find existing packages for your system: on Ubuntu install "libphash0-dev", on Arch install "phash" from the AUR, and on OS X run "brew install phash".
qmake xray.pro
make
./xray /media/videos
So far this is just a working proof-of-concept. Plans to enhance & improve xray:
Enjoy!