A Python Package to Tackle the Curse of Imbalanced Datasets in Machine Learning
MIT License
Published by glemaitre 7 months ago
Published by glemaitre 9 months ago
Published by glemaitre over 1 year ago
- Fix a bug in classification_report_imbalanced where the parameter target_names was not taken into account when output_dict=True. #989 by AYY7.
- SMOTENC now handles mixed data types such as bool and pd.CategoricalDtype by delegating the conversion to the scikit-learn encoder. #1002 by Guillaume Lemaitre.
- Handle sparse matrices in SMOTEN and raise a warning, since it requires a conversion to dense matrices. #1003 by Guillaume Lemaitre.
- Remove a spurious warning raised when the minority class is over-sampled to more samples than the majority class contains. #1007 by Guillaume Lemaitre.
- The fitted attribute ohe_ in SMOTENC is deprecated and will be removed in version 0.13. Use categorical_encoder_ instead. #1000 by Guillaume Lemaitre.
- The defaults of the parameters sampling_strategy and replacement will change in BalancedRandomForestClassifier to follow the implementation of the original paper. This change will take effect in version 0.13. #1006 by Guillaume Lemaitre.
- SMOTENC now accepts a parameter categorical_encoder allowing a OneHotEncoder with custom parameters to be specified. #1000 by Guillaume Lemaitre.
- SMOTEN now accepts a parameter categorical_encoder allowing an OrdinalEncoder with custom parameters to be specified. A new fitted attribute categorical_encoder_ exposes the fitted encoder. #1001 by Guillaume Lemaitre.
- RandomUnderSampler and RandomOverSampler (when shrinkage is not None) now accept any data type and will not attempt any data conversion. #1004 by Guillaume Lemaitre.
- SMOTENC now supports passing an array-like of str for the categorical_features parameter. #1008 by Guillaume Lemaitre.
- SMOTENC now supports automatic categorical inference when categorical_features is set to "auto". #1009 by Guillaume Lemaitre.
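The "auto" mode flags which columns hold categorical data. A stdlib-only toy illustrating the idea (SMOTENC itself infers this from pandas dtypes; the function name below is hypothetical, not part of the imbalanced-learn API):

```python
# Toy sketch of automatic categorical-feature inference: columns whose
# values are not all numeric are flagged as categorical. SMOTENC relies
# on pandas dtypes instead; this only mimics the concept on plain rows.

def infer_categorical_columns(rows):
    """Return indices of columns containing any non-numeric value."""
    n_cols = len(rows[0])
    categorical = []
    for j in range(n_cols):
        if any(not isinstance(row[j], (int, float)) for row in rows):
            categorical.append(j)
    return categorical

rows = [
    [1.5, "red", 10],
    [2.0, "blue", 12],
    [0.3, "red", 9],
]
print(infer_categorical_columns(rows))  # column 1 holds strings -> [1]
```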
Published by glemaitre over 1 year ago
- Fix a bug where minority was rejected as an invalid sampling strategy. #964 by Prakhyath07.

Published by glemaitre almost 2 years ago

- Fix compatibility with python -OO, which replaces docstrings by None. #953 by Guillaume Lemaitre.
- Add feature_names_in_ as well as get_feature_names_out for all samplers. #959 by Guillaume Lemaitre.
- The parameter n_jobs has been deprecated from the classes ADASYN, BorderlineSMOTE, SMOTE, SMOTENC, SMOTEN, and SVMSMOTE. Instead, pass a nearest neighbors estimator where n_jobs is set. #887 by Guillaume Lemaitre.
- The parameter base_estimator is deprecated and will be removed in version 0.12. This impacts the following classes: BalancedBaggingClassifier, EasyEnsembleClassifier, RUSBoostClassifier. #946 by Guillaume Lemaitre.

Published by glemaitre over 2 years ago
Compatibility with scikit-learn 1.1.0
Published by glemaitre almost 3 years ago
Compatibility with scikit-learn 1.0.2
Published by glemaitre about 3 years ago
September 29, 2021
Make imbalanced-learn compatible with scikit-learn 1.0. #864 by Guillaume Lemaitre.
Published by glemaitre over 3 years ago
February 18, 2021
- Add imblearn.metrics.macro_averaged_mean_absolute_error returning the average across classes of the MAE. This metric is used in ordinal classification. #780 by Aurélien Massiot.
- Add imblearn.metrics.pairwise.ValueDifferenceMetric to compute pairwise distances between samples containing only categorical values. #796 by Guillaume Lemaitre.
- Add imblearn.over_sampling.SMOTEN to over-sample data containing only categorical features. #802 by Guillaume Lemaitre.
- Allow passing any type of sampler in imblearn.ensemble.BalancedBaggingClassifier, unlocking the implementation of methods based on resampled bagging. #808 by Guillaume Lemaitre.
- Add the parameter output_dict in imblearn.metrics.classification_report_imbalanced to return a dictionary instead of a string. #770 by Guillaume Lemaitre.
- Fix a bug in imblearn.under_sampling.ClusterCentroids where voting="hard" could have led to selecting a sample from any class instead of the targeted class. #769 by Guillaume Lemaitre.
- Fix a bug in imblearn.FunctionSampler where validation was performed even with validate=False when calling fit. #790 by Guillaume Lemaitre.
- Update extras_require within the setup.py file. #816 by Guillaume Lemaitre.
- The documentation now uses pydata-sphinx-theme. #801 by Guillaume Lemaitre.
- imblearn.utils.testing.warns is deprecated in 0.8 and will be removed in 1.0. #815 by Guillaume Lemaitre.
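The macro-averaged MAE added in 0.8 above computes the MAE separately on the samples of each true class and then averages across classes, so the majority class cannot dominate the score. A stdlib-only sketch (illustrative, not the imblearn implementation):

```python
# Macro-averaged mean absolute error: per-class MAE, averaged over classes.
# Useful for ordinal classification on imbalanced labels.

def macro_averaged_mae(y_true, y_pred):
    """Average, across classes, of the MAE computed on each true class."""
    classes = sorted(set(y_true))
    per_class = []
    for c in classes:
        errors = [abs(t - p) for t, p in zip(y_true, y_pred) if t == c]
        per_class.append(sum(errors) / len(errors))
    return sum(per_class) / len(per_class)

y_true = [0, 0, 0, 0, 1, 2]
y_pred = [0, 0, 0, 0, 1, 1]
# Class 0 MAE = 0, class 1 MAE = 0, class 2 MAE = 1 -> macro average = 1/3.
print(macro_averaged_mae(y_true, y_pred))
```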
A release to bump the minimum version of scikit-learn to 0.23 with a couple of bug fixes.
Check the what's new for more information.
Published by glemaitre over 4 years ago
This is a bug-fix release to resolve some issues regarding the handling of the input and output formats of the arrays.
Published by glemaitre almost 5 years ago
This is a bug-fix release to primarily resolve some packaging issues in version 0.6.0. It also includes minor documentation improvements and some bug fixes.
Fix a bug in imblearn.ensemble.BalancedRandomForestClassifier leading to a wrong number of samples used during fitting due to max_samples and, therefore, a bad computation of the OOB score. :pr:`656` by :user:`Guillaume Lemaitre <glemaitre>`.

Published by glemaitre almost 5 years ago
Changed models
..............
The following models might give different sampling results due to changes in
scikit-learn:
imblearn.under_sampling.ClusterCentroids
imblearn.under_sampling.InstanceHardnessThreshold
The following samplers will give different results due to changes linked to
the internal usage of the random state:
imblearn.over_sampling.SMOTENC
Bug fixes
.........
:class:`imblearn.under_sampling.InstanceHardnessThreshold` now takes into
account random_state and will give deterministic results. In addition,
cross_val_predict is used to take advantage of the parallelism.
:pr:`599` by :user:`Shihab Shahriar Khan <Shihab-Shahriar>`.
Fix a bug in :class:`imblearn.ensemble.BalancedRandomForestClassifier`
leading to a wrong computation of the OOB score.
:pr:`656` by :user:`Guillaume Lemaitre <glemaitre>`.
Maintenance
...........
Update imports from scikit-learn after some modules were made private.
The following imports have been changed:
:class:`sklearn.ensemble._base._set_random_states`,
:class:`sklearn.ensemble._forest._parallel_build_trees`,
:class:`sklearn.metrics._classification._check_targets`,
:class:`sklearn.metrics._classification._prf_divide`,
:class:`sklearn.utils.Bunch`,
:class:`sklearn.utils._safe_indexing`,
:class:`sklearn.utils._testing.assert_allclose`,
:class:`sklearn.utils._testing.assert_array_equal`,
:class:`sklearn.utils._testing.SkipTest`.
:pr:`617` by :user:`Guillaume Lemaitre <glemaitre>`.
Synchronize :mod:`imblearn.pipeline` with :mod:`sklearn.pipeline`.
:pr:`620` by :user:`Guillaume Lemaitre <glemaitre>`.
Synchronize :class:`imblearn.ensemble.BalancedRandomForestClassifier` and add
the parameters max_samples and ccp_alpha.
:pr:`621` by :user:`Guillaume Lemaitre <glemaitre>`.
Enhancement
...........
:class:`imblearn.under_sampling.RandomUnderSampler`,
:class:`imblearn.over_sampling.RandomOverSampler`, and
:class:`imblearn.datasets.make_imbalance` accept a pandas DataFrame as input
and will output a pandas DataFrame. Similarly, they accept a pandas Series as
input and will output a pandas Series.
:pr:`636` by :user:`Guillaume Lemaitre <glemaitre>`.
:class:`imblearn.FunctionSampler` accepts a parameter validate allowing the
input X and y to be validated or not.
:pr:`637` by :user:`Guillaume Lemaitre <glemaitre>`.
:class:`imblearn.under_sampling.RandomUnderSampler` and
:class:`imblearn.over_sampling.RandomOverSampler` can resample when
non-finite values are present in X.
:pr:`643` by :user:`Guillaume Lemaitre <glemaitre>`.
All samplers will output a pandas DataFrame if a pandas DataFrame was given
as input.
:pr:`644` by :user:`Guillaume Lemaitre <glemaitre>`.
The sample generation in
:class:`imblearn.over_sampling.SMOTE`,
:class:`imblearn.over_sampling.BorderlineSMOTE`,
:class:`imblearn.over_sampling.SVMSMOTE`,
:class:`imblearn.over_sampling.KMeansSMOTE`, and
:class:`imblearn.over_sampling.SMOTENC` is now vectorized, giving an
additional speed-up when X is sparse.
:pr:`596` by :user:`Matt Eding <MattEding>`.
Deprecation
...........
The following classes have been removed after 2 deprecation cycles:
ensemble.BalanceCascade and ensemble.EasyEnsemble.
:pr:`617` by :user:`Guillaume Lemaitre <glemaitre>`.
The following function has been removed after 2 deprecation cycles:
utils.check_ratio.
:pr:`617` by :user:`Guillaume Lemaitre <glemaitre>`.
The parameters ratio and return_indices have been removed from all samplers.
:pr:`617` by :user:`Guillaume Lemaitre <glemaitre>`.
The parameters m_neighbors, out_step, kind, and svm_estimator have been
removed from :class:`imblearn.over_sampling.SMOTE`.
:pr:`617` by :user:`Guillaume Lemaitre <glemaitre>`.
Published by glemaitre over 5 years ago
The following models or functions might give different results even if the
same data X and y are used:

- imblearn.ensemble.RUSBoostClassifier: the default estimator changed from
  sklearn.tree.DecisionTreeClassifier with full depth to a decision stump
  (max_depth=1).
- Correct the definition of the ratio when using a float as the sampling
  strategy for over-sampling and under-sampling.
  :issue:`525` by :user:`Ariel Rossanigo <arielrossanigo>`.
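The corrected float sampling strategy can be read as the desired ratio of minority to majority samples after resampling. A minimal stdlib-only sketch of that arithmetic (the function names here are illustrative, not part of the imbalanced-learn API):

```python
# Float sampling_strategy = N_minority / N_majority AFTER resampling.
# Over-sampling grows the minority class; under-sampling shrinks the
# majority class; the other class count stays fixed in each case.

def oversample_target(n_minority, n_majority, ratio):
    """Minority sample count after over-sampling to the given ratio."""
    return int(ratio * n_majority)

def undersample_target(n_minority, n_majority, ratio):
    """Majority sample count after under-sampling to the given ratio."""
    return int(n_minority / ratio)

print(oversample_target(10, 100, 0.5))   # minority grown to 50
print(undersample_target(10, 100, 0.5))  # majority cut to 20
```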
Add :class:`imblearn.over_sampling.BorderlineSMOTE` and
:class:`imblearn.over_sampling.SVMSMOTE` in the API documentation.
:issue:`530` by :user:`Guillaume Lemaitre <glemaitre>`.
Add parallelisation for SMOTEENN and SMOTETomek.
:pr:`547` by :user:`Michael Hsieh <Microsheep>`.
Add :class:`imblearn.utils._show_versions`. Updated the contribution guide
and issue template showing how to print system and dependency information
from the command line.
:pr:`557` by :user:`Alexander L. Hayes <batflyer>`.
Add :class:`imblearn.over_sampling.KMeansSMOTE`, an over-sampler clustering
points before applying SMOTE.
:pr:`435` by :user:`Stephan Heijl <StephanHeijl>`.
Make it possible to import imblearn and access its submodules.
:pr:`500` by :user:`Guillaume Lemaitre <glemaitre>`.
Remove support for Python 2 and remove deprecation warnings from
scikit-learn 0.21.
:pr:`576` by :user:`Guillaume Lemaitre <glemaitre>`.
Fix wrong usage of :class:`keras.layers.BatchNormalization` in the
porto_seguro_keras_under_sampling.py example. The batch normalization
was moved before the activation function and the bias was removed from the
dense layer.
:pr:`531` by :user:`Guillaume Lemaitre <glemaitre>`.
Fix a bug which converted sparse matrices to COO format when stacking them in
:class:`imblearn.over_sampling.SMOTENC`. This bug affected only old scipy
versions.
:pr:`539` by :user:`Guillaume Lemaitre <glemaitre>`.
Fix a bug in :class:`imblearn.pipeline.Pipeline` where None could be the
final estimator.
:pr:`554` by :user:`Oliver Rausch <orausch>`.
Fix a bug in :class:`imblearn.over_sampling.SVMSMOTE` and
:class:`imblearn.over_sampling.BorderlineSMOTE` where the default parameter
n_neighbors was not set properly.
:pr:`578` by :user:`Guillaume Lemaitre <glemaitre>`.
Fix a bug by changing the default depth in
:class:`imblearn.ensemble.RUSBoostClassifier` to get a decision stump as a
weak learner, as in the original paper.
:pr:`545` by :user:`Christos Aridas <chkoar>`.
Allow importing keras directly from tensorflow in :mod:`imblearn.keras`.
:pr:`531` by :user:`Guillaume Lemaitre <glemaitre>`.
Published by glemaitre almost 6 years ago
Mainly bug fixes in SMOTENC
Published by glemaitre almost 6 years ago
Version 0.4.2
Bug fixes
Published by glemaitre about 6 years ago
October, 2018
.. warning::
   Version 0.4 is the last version of imbalanced-learn to support Python 2.7
   and Python 3.4. Imbalanced-learn 0.5 will require Python 3.5 or higher.
This release brings its set of new features as well as some API changes to
strengthen the foundation of imbalanced-learn.
As a new feature, 2 new modules, imblearn.keras and imblearn.tensorflow,
have been added, in which imbalanced-learn samplers can be used to generate
balanced mini-batches.
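The balanced mini-batch idea behind these generators can be sketched without any deep-learning dependency: each batch draws the same number of samples (with replacement) from every class. The function below is an illustrative toy, not the actual imblearn.keras API:

```python
# Toy balanced mini-batch generator: each batch contains an equal number
# of indices per class, sampled with replacement, so a downstream model
# sees balanced batches even on heavily imbalanced labels.

import random
from collections import defaultdict

def balanced_batches(y, batch_size, n_batches, seed=0):
    """Yield lists of sample indices with equal class representation."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, label in enumerate(y):
        by_class[label].append(i)
    per_class = batch_size // len(by_class)
    for _ in range(n_batches):
        batch = []
        for indices in by_class.values():
            batch.extend(rng.choices(indices, k=per_class))
        yield batch

y = [0] * 90 + [1] * 10          # heavily imbalanced labels
batch = next(balanced_batches(y, batch_size=8, n_batches=1))
print([y[i] for i in batch])     # four samples from each class
```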
The module imblearn.ensemble has been consolidated with new classifiers:
imblearn.ensemble.BalancedRandomForestClassifier,
imblearn.ensemble.EasyEnsembleClassifier, and
imblearn.ensemble.RUSBoostClassifier.
Support for string data has been added in
imblearn.over_sampling.RandomOverSampler and
imblearn.under_sampling.RandomUnderSampler. In addition, a new class,
imblearn.over_sampling.SMOTENC, allows generating samples for datasets
containing both continuous and categorical features.
The imblearn.over_sampling.SMOTE has been simplified and broken down
into 2 additional classes:
imblearn.over_sampling.SVMSMOTE and
imblearn.over_sampling.BorderlineSMOTE.
There are also some changes regarding the API:
the parameter sampling_strategy has been introduced to replace the
ratio parameter. In addition, the return_indices argument has been
deprecated and all samplers expose a sample_indices_ attribute whenever
this is possible.