Native Julia implementation of CPMerge (SimString) algorithm
MIT License
A native Julia implementation of the CPMerge algorithm, which is designed for approximate string matching. This package is particularly useful for natural language processing tasks that require retrieving strings or texts from very large corpora. Currently, the package supports both character-based and word-based N-gram feature generation, and there are plans to open it up to custom, user-defined feature generation methods.
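To make the two feature types concrete, here is a minimal sketch in plain Julia of character and word N-gram extraction, together with a set-based cosine similarity as one possible measure for comparing two feature sets. The helper names (`char_ngrams`, `word_ngrams`, `cosine_similarity`), the padding character, and the use of plain sets are assumptions made for illustration. They are not this package's API, and the sketch does not implement the CPMerge search itself.

```julia
# Illustrative helpers only: hypothetical names, not SimString's API.

# Character N-grams. The string is padded on both sides with n - 1 copies of
# `pad` so that the first and last characters appear in n grams each.
function char_ngrams(s::AbstractString, n::Int; pad::Char = ' ')
    padding = repeat(string(pad), n - 1)
    chars = collect(padding * s * padding)
    return [String(chars[i:i+n-1]) for i in 1:length(chars)-n+1]
end

# Word N-grams. Words are split on whitespace and joined with a single space.
function word_ngrams(s::AbstractString, n::Int)
    words = split(s)
    return [join(words[i:i+n-1], " ") for i in 1:length(words)-n+1]
end

# Set-based cosine similarity: |X ∩ Y| / sqrt(|X| * |Y|).
# Simplification: duplicate N-grams are collapsed by the Set conversion.
function cosine_similarity(x, y)
    X, Y = Set(x), Set(y)
    (isempty(X) || isempty(Y)) && return 0.0
    return length(intersect(X, Y)) / sqrt(length(X) * length(Y))
end

char_ngrams("foo", 2)                    # [" f", "fo", "oo", "o "]
word_ngrams("the quick brown fox", 2)    # ["the quick", "quick brown", "brown fox"]
cosine_similarity(char_ngrams("foo", 2), char_ngrams("food", 2))
```

Approximate matching then amounts to finding every stored string whose similarity to the query meets a chosen threshold. CPMerge is the algorithm that performs this search without comparing the query against every entry.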
You can install the latest stable version of this package from the Julia registries.
Enter Julia's package manager by pressing ] at the REPL, then run:
pkg> add SimString
If you are feeling brave, you can try the current experimental features by adding the main branch to your environment. After entering the package manager with ], run:
pkg> add SimString#main
You are now good to go with bleeding-edge features (and breakages)!
To revert to a stable version, you can simply run:
pkg> free SimString