mail-deduplicate

📧 CLI to deduplicate mails from mail boxes.

GPL-2.0 License

What is Mail Deduplicate?

Provides the mdedup CLI, a utility to deduplicate mails from a set of boxes.

Features

  • Duplicate detection based on cherry-picked and normalized mail
    headers.
  • Fetch mails from multiple sources.
  • Read and write mbox, maildir, babyl, mh and mmdf formats.
  • Deduplication strategies based on size, content, timestamp, file path
    or random choice.
  • Copy, move or delete the resulting set of duplicates.
  • Dry-run mode.
  • Protection against false-positives with safety checks on size and content differences.
  • Supports macOS, Linux and Windows.
  • Standalone executables for Linux, macOS and Windows.
  • Shell auto-completion for Bash, Zsh and Fish.
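
To make the first few features more concrete, here is a minimal Python sketch of header-based duplicate detection, a "keep the biggest" strategy and a size safety check. It is not mdedup's actual implementation: the header list, the function names (`header_hash`, `group_duplicates`, `select_smaller_for_removal`) and the `max_size_diff` threshold are all assumptions made for illustration, and only the Python standard library is used.

```python
import hashlib
from email.message import Message

# Hypothetical header selection; the headers mdedup really uses may differ.
HEADERS = ("Date", "From", "To", "Subject", "Message-ID")


def normalize(value: str) -> str:
    """Collapse whitespace runs so folded or re-spaced headers compare equal."""
    return " ".join(value.split())


def header_hash(msg: Message, headers=HEADERS) -> str:
    """Hash the selected, normalized headers of a mail into a hex digest."""
    parts = [f"{name}: {normalize(str(msg.get(name, '')))}" for name in headers]
    return hashlib.sha224("\n".join(parts).encode("utf-8")).hexdigest()


def group_duplicates(messages):
    """Return the lists of mails that share the same header hash."""
    groups = {}
    for msg in messages:
        groups.setdefault(header_hash(msg), []).append(msg)
    return [group for group in groups.values() if len(group) > 1]


def select_smaller_for_removal(group, max_size_diff=512):
    """Keep the biggest mail of a duplicate group and return the others.

    Safety check (assumed threshold): if sizes differ by more than
    max_size_diff bytes, the group is treated as a likely false positive
    and nothing is selected.
    """
    sizes = [len(msg.as_bytes()) for msg in group]
    if max(sizes) - min(sizes) > max_size_diff:
        return []
    keep = max(range(len(group)), key=sizes.__getitem__)
    return [msg for index, msg in enumerate(group) if index != keep]
```

Feeding the messages of a `mailbox.mbox` or `mailbox.Maildir` object to `group_duplicates()` would yield the candidate duplicate sets; what happens to the selected mails (copy, move or delete) is a separate step. Note that this sketch keeps every message in memory.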

⚠️ Warning: Performance

The mdedup implementation is quite naive at the moment: everything resides in memory.

This is good enough for a volume of a couple of gigabytes, but the more emails mdedup tries to parse, the closer you get to the memory limits of your machine. At that point mdedup will exit abruptly, zapped by the OOM killer of your OS. Of course, your mileage may vary depending on your hardware.

You can influence work on this limitation with pull requests, or by purchasing business support 🤝 or sponsorship 🫶.

Installation

From sources

The easiest way is to install mdedup from sources with pipx:

$ pipx install mail-deduplicate

Alternative installation methods are available in the documentation.

Executables

Standalone executables of mdedup's latest version are available for several platforms and architectures:

Platform   x86_64                    arm64
Linux      mdedup-linux-x64.bin
macOS      mdedup-macos-x64.bin      mdedup-macos-arm64.bin
Windows    mdedup-windows-x64.exe