Probabilistic classification in PyTorch/TensorFlow/scikit-learn with Fenchel-Young losses
.. -- mode: rst --
This package implements loss functions useful for probabilistic classification. More specifically, it provides
The package is based on the Fenchel-Young loss framework [1,2,3].
.. image:: examples/tsallis.png :alt: Tsallis losses :align: center
Notice from the center plot that sparsemax and Tsallis are able to produce exactly zero (sparse) probabilities unlike the logistic (softmax) loss.
Sparse means that some classes have exactly zero probability, i.e., these classes are irrelevant.
Tsallis losses are a family of losses parametrized by a positive real value α. They recover the multinomial logistic loss with α=1 and the sparsemax loss with α=2. Values of α between 1 and 2 enable to interpolate between the two losses.
In all losses above, the ground-truth can either be a n_samples 1d-array of label integers (each label should be between 0 and n_classes-1) or a n_samples x n_classes 2d-array of label proportions (each row should sum to 1).
scikit-learn compatible classifier:
.. code-block:: python
import numpy as np from sklearn.datasets import make_classification from fyl_sklearn import FYClassifier
X, y = make_classification(n_samples=10, n_features=5, n_informative=3, n_classes=3, random_state=0) clf = FYClassifier(loss="sparsemax") clf.fit(X, y) print(clf.predict_proba(X[:3]))
Drop-in replacement for PyTorch losses:
.. code-block:: python
import torch from fyl_pytorch import SparsemaxLoss
y_true = torch.tensor([0, 2])
theta = torch.tensor([[-2.5, 1.2, 0.5], [2.2, 0.8, -1.5]]) loss = SparsemaxLoss()
print(loss(theta, y_true))
print(loss.y_pred)
print(loss.predict(theta))
y_true = torch.tensor([[0.8, 0.2, 0], [0.1, 0.2, 0.7]]) print(loss(theta, y_true))
Drop-in replacement for tensorflow losses:
.. code-block:: python
import tensorflow as tf from fyl_tensorflow import sparsemax_loss, sparsemax_predict
y_true = tf.constant([0, 2])
theta = tf.constant([[-2.5, 1.2, 0.5], [2.2, 0.8, -1.5]])
print(sparsemax_loss(y_true, theta))
print(sparsemax_predict(theta))
y_true = tf.constant([[0.8, 0.2, 0], [0.1, 0.2, 0.7]]) print(sparsemax_loss(y_true, theta))
Simply copy relevant files to your project.
The TensorFlow implementation requires the installation of TensorFlow Addons <https://github.com/tensorflow/addons>
_.
.. [1] SparseMAP: Differentiable Sparse Structured Inference.
Vlad Niculae, André F. T. Martins, Mathieu Blondel, Claire Cardie.
In Proc. of ICML 2018.
[arXiv <https://arxiv.org/abs/1802.04223>
_]
.. [2] Learning Classifiers with Fenchel-Young Losses: Generalized Entropies, Margins, and Algorithms.
Mathieu Blondel, André F. T. Martins, Vlad Niculae.
In Proc. of AISTATS 2019.
[arXiv <https://arxiv.org/abs/1805.09717>
_]
.. [3] Learning with Fenchel-Young Losses.
Mathieu Blondel, André F. T. Martins, Vlad Niculae.
Preprint.
[arXiv <https://arxiv.org/abs/1901.02324>
_]