A general interface for clustering-based over-sampling algorithms.
MIT License
The project has been moved to imbalanced-learn-extra.
For user installation, cluster-over-sampling is currently available on PyPI, and you can install it via pip:
pip install cluster-over-sampling
Development installation requires cloning the repository and then using PDM to install the project along with its main and development dependencies:
git clone https://github.com/georgedouzas/cluster-over-sampling.git
cd cluster-over-sampling
pdm install
The SOM clusterer requires optional dependencies:
pip install cluster-over-sampling[som]
All the classes included in cluster-over-sampling follow the imbalanced-learn API, using the functionality of the base oversampler. Following the scikit-learn convention, the data are represented as follows:
X: 2D array-like or sparse matrix.
y: 1D array-like.

The clustering-based oversamplers implement a fit method to learn from X and y:
clustering_based_oversampler.fit(X, y)
They also implement a fit_resample method to resample X and y:
X_resampled, y_resampled = clustering_based_oversampler.fit_resample(X, y)
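To make the fit / fit_resample contract concrete, here is a minimal pure-Python sketch. ToyClusterOverSampler is a hypothetical stand-in written only for illustration; it is not a class from cluster-over-sampling or imbalanced-learn. Its "clustering" (cutting the sorted minority samples into chunks) and its interpolation step are assumptions chosen to keep the example short. It only mirrors the call pattern described above: fit learns from X and y, and fit_resample returns the resampled pair.

```python
import random
from collections import Counter


class ToyClusterOverSampler:
    """Hypothetical illustration of the fit / fit_resample contract.

    NOT part of cluster-over-sampling; the clustering and sample
    generation below are deliberately simplistic assumptions.
    """

    def __init__(self, n_clusters=2, random_state=0):
        self.n_clusters = n_clusters
        self.random_state = random_state

    def fit(self, X, y):
        # Learn how many samples each class needs to reach the majority size.
        counts = Counter(y)
        n_majority = max(counts.values())
        self.sampling_strategy_ = {
            label: n_majority - n for label, n in counts.items() if n < n_majority
        }
        return self

    def fit_resample(self, X, y):
        self.fit(X, y)
        rng = random.Random(self.random_state)
        X_resampled = [list(row) for row in X]
        y_resampled = list(y)
        for label, n_new in self.sampling_strategy_.items():
            # Crude "clustering": sort the class samples and cut them into
            # up to n_clusters contiguous, non-empty chunks.
            rows = sorted(list(row) for row, target in zip(X, y) if target == label)
            k = min(self.n_clusters, len(rows))
            clusters = [
                rows[i * len(rows) // k:(i + 1) * len(rows) // k] for i in range(k)
            ]
            for i in range(n_new):
                # Spread new samples over the clusters in round-robin order and
                # interpolate between two samples drawn from the same cluster.
                cluster = clusters[i % k]
                a, b = rng.choice(cluster), rng.choice(cluster)
                t = rng.random()
                X_resampled.append([u + t * (v - u) for u, v in zip(a, b)])
                y_resampled.append(label)
        return X_resampled, y_resampled


# Imbalanced toy data: six samples of class 0, two samples of class 1.
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2],
     [0.2, 0.2], [0.4, 0.1], [5.0, 5.1], [9.0, 9.2]]
y = [0, 0, 0, 0, 0, 0, 1, 1]

sampler = ToyClusterOverSampler(n_clusters=2, random_state=0)
X_resampled, y_resampled = sampler.fit_resample(X, y)
print(Counter(y_resampled))  # both classes now have 6 samples
```

The original samples are kept at the start of the output and the generated ones are appended, so new minority samples stay inside the cluster they were drawn from rather than spanning the gap between distant clusters.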
If you use cluster-over-sampling in a scientific publication, we would appreciate citations to any of the following papers: