Generalized Linear Regression Models (penalized regressions, robust regressions, ...)
MIT License
This is a package gathering functionalities to solve a number of generalised linear regression/classification problems which, inherently, correspond to an optimisation problem of the form
$$ L(y, X\theta) + P(\theta) $$
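where $L$ is a loss function, $X$ is the $n \times p$ matrix of training observations, $\theta$ is the length-$p$ vector of weights to be optimised, and $P$ is a penalty function.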
Additional regression/classification methods which do not directly correspond to this formulation may be added in the future.
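As an illustration of the formulation above, three familiar regressions can be written in the loss-plus-penalty form. This is a sketch under assumptions: the $\tfrac{1}{2}$ scaling and the symbols $\lambda, \gamma \ge 0$ for the penalty strengths are common conventions chosen here for illustration, and the constants used by the package may differ.

$$ \text{Ridge:} \quad \tfrac{1}{2}\|y - X\theta\|_2^2 + \tfrac{\lambda}{2}\|\theta\|_2^2 $$

$$ \text{Lasso:} \quad \tfrac{1}{2}\|y - X\theta\|_2^2 + \gamma\|\theta\|_1 $$

$$ \text{Elastic-Net:} \quad \tfrac{1}{2}\|y - X\theta\|_2^2 + \tfrac{\lambda}{2}\|\theta\|_2^2 + \gamma\|\theta\|_1 $$

In each case the first term plays the role of $L(y, X\theta)$ and the remaining terms that of $P(\theta)$.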
The core aims of this package include interfacing with MLJ.jl and building on packages such as Optim.jl and IterativeSolvers.jl.

Head to the quickstart section of the docs to see how to use this package.
This section is only useful if you're interested in implementation details or would like to help extend the library. For usage instructions, please head to the docs.
Regressors | Formulation¹ | Available solvers | Comments |
---|---|---|---|
OLS & Ridge | L2Loss + 0/L2 | Analytical² or CG³ | |
Lasso & Elastic-Net | L2Loss + 0/L2 + L1 | (F)ISTA⁴ | |
Robust 0/L2 | RobustLoss⁵ + 0/L2 | Newton, Newton-CG, LBFGS, IWLS-CG⁶ | no scale⁷ |
Robust L1/EN | RobustLoss + 0/L2 + L1 | (F)ISTA | |
Quantile⁸ + 0/L2 | RobustLoss + 0/L2 | LBFGS, IWLS-CG | |
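Quantile L1/EN | RobustLoss + 0/L2 + L1 | (F)ISTA | |

Notes:

- ¹ "0" stands for no penalty.
- ² "Analytical" means the solution is computed directly with the `\` solver.
- ³ CG: conjugate gradient.
- ⁴ (F)ISTA: (fast) iterative shrinkage-thresholding algorithm, i.e. (accelerated) proximal gradient descent.
- ⁶ IWLS-CG: iteratively reweighted least squares, with each linear system solved by CG.
- ⁸ Includes least absolute deviation (LAD) regression as the special case `δ=0.5`.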
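
To make the robust and quantile rows of the table above more concrete, here is a minimal sketch of two per-residual losses, written in plain Python for illustration rather than with this package's API. The function names, the choice of Huber as the example robust loss, and the sign convention of the residual `r` are all assumptions. The sketch shows that the quantile loss at `δ=0.5` reduces to half the absolute deviation.

```python
def huber(r, delta):
    """Huber loss (one common robust loss): quadratic for |r| <= delta, linear beyond."""
    a = abs(r)
    if a <= delta:
        return 0.5 * r * r
    return delta * (a - 0.5 * delta)


def pinball(r, delta):
    """Quantile ("check") loss at level delta in (0, 1) for a residual r."""
    indicator = 1.0 if r < 0 else 0.0
    return r * (delta - indicator)


residuals = [-2.0, -0.5, 0.0, 0.5, 2.0]

# At delta = 0.5 the quantile loss is symmetric: it equals 0.5 * |r|,
# i.e. least absolute deviation up to a constant factor.
lad = [pinball(r, 0.5) for r in residuals]
half_abs = [0.5 * abs(r) for r in residuals]
print(lad == half_abs)
```

For `δ` away from 0.5 the quantile loss penalises positive and negative residuals differently, which is what targets a quantile other than the median.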

Classifiers | Formulation | Available solvers | Comments |
---|---|---|---|
Logistic 0/L2 | LogisticLoss + 0/L2 | Newton, Newton-CG, LBFGS | yᵢ∈{±1} |
Logistic L1/EN | LogisticLoss + 0/L2 + L1 | (F)ISTA | yᵢ∈{±1} |
Multinomial 0/L2 | MultinomialLoss + 0/L2 | Newton-CG, LBFGS | yᵢ∈{1,...,c} |
Multinomial L1/EN | MultinomialLoss + 0/L2 + L1 | ISTA, FISTA | yᵢ∈{1,...,c} |
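
The `yᵢ∈{±1}` comment in the table above matters because the logistic loss is usually written in terms of the margin `yᵢ·zᵢ`, where `zᵢ` is the i-th entry of `Xθ`. The following is a minimal plain-Python sketch of that loss, not this package's implementation; the function name and the unnormalised sum are assumptions.

```python
import math


def logistic_loss(y, z):
    """Sum of log(1 + exp(-y_i * z_i)) for labels y_i in {-1, +1} and scores z_i."""
    total = 0.0
    for yi, zi in zip(y, z):
        m = yi * zi  # margin: large and positive means a confident, correct prediction
        # Numerically stable form: log(1 + exp(-m)) = max(-m, 0) + log1p(exp(-|m|))
        total += max(-m, 0.0) + math.log1p(math.exp(-abs(m)))
    return total


# A score of zero is maximally uncertain and costs log(2) per observation.
print(logistic_loss([1, -1], [0.0, 0.0]))
```

With labels coded as `{0, 1}` instead, the same formula would silently give the wrong loss for the `0` class, hence the explicit encoding in the table.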
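Unless otherwise specified:

- Newton-like solvers use the default line search of Optim.jl;
- ISTA and FISTA use a backtracking line search with a shrinkage factor of `β=0.8`.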
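
As a rough illustration of how ISTA with a backtracking line search works, here is a minimal plain-Python sketch for the lasso. It is not this package's solver: the function names, the $\tfrac{1}{2}$ scaling of the squared loss, the initial step of 1, the stopping rule, and the reading of `β=0.8` as the factor by which the step is shrunk are all assumptions.

```python
def soft_threshold(v, t):
    """Proximal operator of t * |.| applied to a scalar v."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def smooth_part(X, y, theta):
    """Return f(theta) = 0.5 * ||X theta - y||^2 and its gradient X^T (X theta - y)."""
    r = [sum(xij * tj for xij, tj in zip(row, theta)) - yi for row, yi in zip(X, y)]
    f = 0.5 * sum(ri * ri for ri in r)
    grad = [sum(row[j] * ri for row, ri in zip(X, r)) for j in range(len(theta))]
    return f, grad


def ista_lasso(X, y, lam, beta=0.8, iters=500, tol=1e-10):
    """Minimise 0.5 * ||X theta - y||^2 + lam * ||theta||_1 by proximal gradient steps."""
    theta = [0.0] * len(X[0])
    step = 1.0
    for _ in range(iters):
        f, g = smooth_part(X, y, theta)
        while True:
            # Gradient step on the smooth part, then soft-thresholding for the L1 part.
            cand = [soft_threshold(t - step * gj, step * lam) for t, gj in zip(theta, g)]
            d = [c - t for c, t in zip(cand, theta)]
            f_new, _ = smooth_part(X, y, cand)
            # Accept the step if the quadratic upper bound on f holds at the candidate.
            bound = (f + sum(gj * dj for gj, dj in zip(g, d))
                     + sum(dj * dj for dj in d) / (2.0 * step))
            if f_new <= bound + 1e-12:
                break
            step *= beta  # otherwise shrink the step and try again
        moved = max(abs(dj) for dj in d)
        theta = cand
        if moved < tol:
            break
    return theta


# With an identity design the lasso solution is the soft-thresholded response.
print(ista_lasso([[1.0, 0.0], [0.0, 1.0]], [3.0, 0.5], 1.0))  # → [2.0, 0.0]
```

FISTA adds a momentum (extrapolation) step on top of the same proximal update; it is omitted here to keep the sketch short.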
Note: these models were all tested for correctness whenever a direct comparison with another package was possible, usually by comparing the objective function at the coefficients returned (cf. the tests).
Systematic timing benchmarks have not been run yet, but they are planned (see this issue).
The models assume `n > p` (more observations than features); if this doesn't hold, tricks should be employed to speed up computations, but these have not been implemented yet.

Model | Formulation | Comments |
---|---|---|
Group Lasso | L2Loss + ∑L1 over groups | ⭒ |
Adaptive Lasso | L2Loss + weighted L1 | ⭒ A |
SCAD | L2Loss + SCAD | A, B, C |
MCP | L2Loss + MCP | A |
OMP | L2Loss + L0Loss | D |
SGD Classifiers | *Loss + No/L2/L1 and OVA | SkL |
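
For reference, the SCAD and MCP penalties listed in the table above are usually defined coordinate-wise as follows. These are the standard textbook definitions, given here as a sketch; the symbols $\lambda > 0$, $a > 2$ and $\gamma > 1$ are assumptions of this note, and a future implementation in this package may parametrise them differently.

$$ P_{\text{SCAD}}(t) = \begin{cases} \lambda |t| & |t| \le \lambda \\ \dfrac{2a\lambda |t| - t^2 - \lambda^2}{2(a-1)} & \lambda < |t| \le a\lambda \\ \dfrac{(a+1)\lambda^2}{2} & |t| > a\lambda \end{cases} $$

$$ P_{\text{MCP}}(t) = \begin{cases} \lambda |t| - \dfrac{t^2}{2\gamma} & |t| \le \gamma\lambda \\ \dfrac{\gamma\lambda^2}{2} & |t| > \gamma\lambda \end{cases} $$

Both behave like the L1 penalty near zero and flatten out for large $|t|$, which reduces the bias on large coefficients at the cost of making the penalty non-convex.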
There are a number of other regression models that may be included in this package in the longer term but may not directly correspond to the `Loss+Penalty` paradigm introduced earlier.
In some cases it will make more sense to just use GLM.jl.
Sklearn's list: https://scikit-learn.org/stable/supervised_learning.html#supervised-learning
Model | Note | Link(s) |
---|---|---|
LARS | -- | |
Quantile Regression | -- | Yang et al, 2013, QuantileRegression.jl |
L∞ approx (Logsumexp) | -- | slides |
Passive Aggressive | -- | Crammer et al, 2006, SkL |
Orthogonal Matching Pursuit | -- | SkL |
Least Median of Squares | -- | Rousseeuw, 1984 |
RANSAC, Theil-Sen | Robust reg | Overview RANSAC, SkL, SkL, More Ransac |
Ordinal regression | need to figure out how they work | E |
Count regression | need to figure out how they work | R |
Robust M estimators | -- | F |
Perceptron, MIRA classifier | Sklearn just does OVA with binary in SGDClassif | H |
Robust PTS and LTS | -- | PTS LTS |
While the functionalities in this package overlap with a number of existing packages, the hope is that this package will offer a general entry point for all of them in a way that won't require too much thinking from an end user (similar to how someone would use the tools from `sklearn.linear_model`).
If you're looking for specific functionalities/algorithms, it's probably a good idea to look at one of the packages below:
There's also GLM.jl, which is more geared towards statistical analysis for reasonably-sized datasets and (as far as I'm aware) lacks a few key functionalities for ML, such as penalised regressions or multinomial regression.