Collection of datasets used for Optical Music Recognition
MIT License
Published by apacha over 4 years ago
Added the download of the DeepScores V1 dataset (subset with extended vocabulary) to the available datasets.
Also slightly rewrote the downloader, so you can specify a custom URL to any zip file that will be downloaded and extracted, in case your dataset is hosted elsewhere.
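As a rough illustration of what "download and extract any zip file from a custom URL" involves, here is a minimal standard-library sketch. It is not the package's own downloader: the function name `download_and_extract` and its parameters are hypothetical and chosen only for this example.

```python
import io
import urllib.request
import zipfile
from pathlib import Path
from typing import List


def download_and_extract(url: str, destination: str) -> List[str]:
    """Download the zip archive at `url` and extract it into `destination`.

    Hypothetical helper for illustration; returns the archive member names.
    """
    target = Path(destination)
    target.mkdir(parents=True, exist_ok=True)

    # Read the whole archive into memory. This is fine for small files;
    # a multi-gigabyte dataset should be streamed to disk instead.
    with urllib.request.urlopen(url) as response:
        payload = response.read()

    # Open the downloaded bytes as a zip archive and unpack every member.
    with zipfile.ZipFile(io.BytesIO(payload)) as archive:
        archive.extractall(target)
        return archive.namelist()
```

A call such as `download_and_extract("https://example.com/my-dataset.zip", "data/my-dataset")` would then unpack the archive into `data/my-dataset`; the URL here is a placeholder, not a real dataset location.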
Published by apacha over 4 years ago
Fixed an error in the dependency management of setup.py which prevented normal installation.
Also changed to semantic versioning with three numbers: Major.Minor.Revision.
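For readers who pin or compare versions in their own scripts, a three-number version string can be parsed into a tuple and compared numerically. The sketch below is illustrative only; `parse_version` is a hypothetical helper, not something this package provides.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Split a 'Major.Minor.Revision' string into a tuple of three integers."""
    parts = version.split(".")
    if len(parts) != 3:
        raise ValueError(f"expected Major.Minor.Revision, got {version!r}")
    major, minor, revision = (int(part) for part in parts)
    return major, minor, revision
```

Comparing the resulting tuples orders versions numerically rather than alphabetically, so `parse_version("1.10.0") > parse_version("1.9.5")` holds, whereas a plain string comparison would get it wrong. A two-number string such as `"0.6"` is rejected with a `ValueError`.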
Published by apacha over 4 years ago
Updated MuscimaPlusPlusSymbolImageGenerator to work with MUSCIMA++ 2.0.
Added a quality-of-life improvement suggested by @yvan674 that makes importing common classes such as the downloader easier.
Published by apacha over 4 years ago
This release is a major new version that is incompatible with previous versions.
The project was dramatically simplified so that common tasks such as downloading datasets become much more user-friendly.
Removed mostly unused code and re-organized project structure and documentation.
Published by apacha about 5 years ago
Released MaskImageGenerator, which generates three types of masks from the MUSCIMA++ dataset, version 2.0.
Published by apacha about 5 years ago
This release includes support for the new MUSCIMA++ v2.0 dataset.
Published by apacha over 5 years ago
In this release, we added a dataset derived from the MUSCIMA++ dataset that contains annotations for staves, system measures and stave measures.
Further changes:
Published by apacha over 5 years ago
Moved all datasets to GitHub and mirrored existing datasets to improve performance and to ensure they are preserved even if the original sites shut down.
Mirroring:
Hosting:
Published by apacha almost 6 years ago
MuscimaPlusPlusDatasetDownloader now downloads the annotations and the annotated images together, which avoids downloading 2 GB when you only need the 7 MB of images that were actually annotated.
Published by apacha about 6 years ago
This release features a multi-condition version of the CVC-MUSCIMA dataset that consists of files from the Staff-Removal and the Writer-Identification datasets. The images from both datasets have been aligned so that they have exactly the same dimensions and overlap as well as possible. The dataset contains a total of 10,000 PNG files in 10 different conditions: grayscale, binary, interrupted, kanungo, staffline-thickness-variation-v1, staffline-thickness-variation-v2, staffline-y-variation-v1, staffline-y-variation-v2, typeset-emulation and whitespeckles.
Note that these augmented versions are simply taken from the Staff-Removal dataset and inverted so that, like all other images, they show black content on a white background.
Published by apacha about 7 years ago
Version 0.6 now includes the Capitan dataset with a rudimentary rendering script that can render both the image data and the strokes. Note that the dataset is inconclusive and therefore differs from the second version published by the original authors.
Published by apacha about 7 years ago
First actual release on PyPI that works