Fast and Accurate ML in 3 Lines of Code
APACHE-2.0 License
v0.8.3 is a patch release to address security vulnerabilities.
See the full commit change-log here: https://github.com/autogluon/autogluon/compare/0.8.2...0.8.3
This version supports Python versions 3.8, 3.9, and 3.10.
- `transformers` and other packages version upgrades + some fixes: @suzhoum (#4155)

Published by Innixma 6 months ago
We're happy to announce the AutoGluon 1.1 release.
AutoGluon 1.1 contains major improvements to the TimeSeries module, achieving a 60% win-rate vs AutoGluon 1.0 through the addition of Chronos, a pretrained model for time series forecasting, along with numerous other enhancements. The other modules have also been enhanced through new features such as Conv-LoRA support and improved performance for large tabular datasets between 5 and 30 GB in size. For a full breakdown of AutoGluon 1.1 features, please refer to the feature spotlights and the itemized enhancements below.
Join the community:
Get the latest updates:
This release supports Python versions 3.8, 3.9, 3.10, and 3.11. Loading models trained on older versions of AutoGluon is not supported. Please re-train models using AutoGluon 1.1.
This release contains 125 commits from 20 contributors!
Full Contributor List (ordered by # of commits):
@shchur @prateekdesai04 @Innixma @canerturkmen @zhiqiangdon @tonyhoo @AnirudhDagar @Harry-zzh @suzhoum @FANGAreNotGnu @nimasteryang @lostella @dassaswat @afmkt @npepin-hub @mglowacki100 @ddelange @LennartPurucker @taoyang1122 @gradientsky
Special thanks to @ddelange for their continued assistance with Python 3.11 support and Ray version upgrades!
AutoGluon has experienced widespread adoption on Kaggle since the AutoGluon 1.0 release.
AutoGluon has been used in over 130 Kaggle notebooks and mentioned in over 100 discussion threads in the past 90 days!
Most excitingly, AutoGluon has already been used to achieve top ranking placements in multiple competitions with thousands of competitors since the start of 2024:
Placement | Competition | Author | Date | AutoGluon Details | Notes |
---|---|---|---|---|---|
🥉 Rank 3/2303 (Top 0.1%) | Steel Plate Defect Prediction | Samvel Kocharyan | 2024/03/31 | v1.0, Tabular | Kaggle Playground Series S4E3 |
🥈 Rank 2/93 (Top 2%) | Prediction Interval Competition I: Birth Weight | Oleksandr Shchur | 2024/03/21 | v1.0, Tabular | |
🥈 Rank 2/1542 (Top 0.1%) | WiDS Datathon 2024 Challenge #1 | lazy_panda | 2024/03/01 | v1.0, Tabular | |
🥈 Rank 2/3746 (Top 0.1%) | Multi-Class Prediction of Obesity Risk | Kirderf | 2024/02/29 | v1.0, Tabular | Kaggle Playground Series S4E2 |
🥈 Rank 2/3777 (Top 0.1%) | Binary Classification with a Bank Churn Dataset | lukaszl | 2024/01/31 | v1.0, Tabular | Kaggle Playground Series S4E1 |
Rank 4/1718 (Top 0.2%) | Multi-Class Prediction of Cirrhosis Outcomes | Kirderf | 2024/01/01 | v1.0, Tabular | Kaggle Playground Series S3E26 |
We are thrilled that the data science community is leveraging AutoGluon as their go-to method to quickly and effectively achieve top-ranking ML solutions! For an up-to-date list of competition solutions using AutoGluon refer to our AWESOME.md, and don't hesitate to let us know if you use AutoGluon in a competition!
AutoGluon-TimeSeries now features Chronos, a family of forecasting models pretrained on large collections of open-source time series datasets that can generate accurate zero-shot predictions for new unseen data. Check out the new tutorial to learn how to use Chronos through the familiar `TimeSeriesPredictor` API.
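A minimal zero-shot sketch of that workflow; the preset name and dataset URL below are assumptions for illustration, so treat the tutorial as the authoritative reference:

```python
from autogluon.timeseries import TimeSeriesDataFrame, TimeSeriesPredictor

# Long-format data with (item_id, timestamp, target) columns; URL is a placeholder.
train_data = TimeSeriesDataFrame.from_path(
    "https://autogluon.s3.amazonaws.com/datasets/timeseries/m4_hourly/train.csv"
)

# Chronos is pretrained, so "fitting" this preset mostly wires the model up
# for zero-shot inference rather than training it on your data.
predictor = TimeSeriesPredictor(prediction_length=48).fit(
    train_data,
    presets="chronos_small",  # assumed preset name
)
predictions = predictor.predict(train_data)
```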
AutoGluon 1.1 comes with numerous new features and improvements to the time series module. These include highly requested functionality such as feature importance, support for categorical covariates, ability to visualize forecasts, and enhancements to logging. The new release also comes with considerable improvements to forecast accuracy, achieving 60% win rate and 3% average error reduction compared to the previous AutoGluon version. These improvements are mostly attributed to the addition of Chronos, improved preprocessing logic, and native handling of missing values.
- Added `TimeSeriesPredictor.feature_importance()`. @canerturkmen (#4033, #4087)
- Added `TimeSeriesPredictor.persist()`. @canerturkmen (#4005)
- Added `TimeSeriesPredictor.plot()`. @shchur (#3889)
- Added the `RMSLE` evaluation metric. @canerturkmen (#3938)
- Added the `keep_lightning_logs` hyperparameter. @shchur (#3937)

AutoMM 1.1 introduces the innovative Conv-LoRA, a parameter-efficient fine-tuning (PEFT) method stemming from our latest paper presented at ICLR 2024, titled "Convolution Meets LoRA: Parameter Efficient Finetuning for Segment Anything Model". Conv-LoRA is designed for fine-tuning the Segment Anything Model, exhibiting superior performance compared to previous PEFT approaches, such as LoRA and visual prompt tuning, across various semantic segmentation tasks in diverse domains including natural images, agriculture, remote sensing, and healthcare. Check out our Conv-LoRA example.
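A minimal sketch of what fine-tuning SAM looks like through `MultiModalPredictor`; the PEFT hyperparameter key/value below is an assumption, so treat the linked Conv-LoRA example as the authoritative reference:

```python
from autogluon.multimodal import MultiModalPredictor

# train_df/val_df are assumed pandas DataFrames whose "image" and "label"
# columns hold paths to images and ground-truth segmentation masks.
predictor = MultiModalPredictor(
    problem_type="semantic_segmentation",
    label="label",
    # Assumed hyperparameter enabling Conv-LoRA parameter-efficient fine-tuning.
    hyperparameters={"optimization.efficient_finetune": "conv_lora"},
)
predictor.fit(train_data=train_df, tuning_data=val_df)
masks = predictor.predict(test_df)
```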
AutoGluon-Tabular 1.1 primarily focuses on bug fixes and stability improvements. In particular, we have greatly improved the runtime performance for large datasets between 5 and 30 GB in size by subsampling data for decision threshold calibration and by fitting the weighted ensemble on at most 1 million rows, maintaining the same quality while being far faster to execute. We also adjusted the default weighted ensemble iterations from 100 to 25, which speeds up all weighted ensemble fit times by 4x. We heavily refactored the `fit_pseudolabel` logic, and it should now achieve noticeably stronger results.
- Fixed `predictor.fit_weighted_ensemble(refit_full=True)`. @Innixma (#1956)
- Refactored the `fit_pseudolabel` logic. @Innixma (#3930)

Published by Innixma 11 months ago
Today is finally the day... AutoGluon 1.0 has arrived!! After over four years of development and 2061 commits from 111 contributors, we are excited to share with you the culmination of our efforts to create and democratize the most powerful, easy to use, and feature rich automated machine learning system in the world. AutoGluon 1.0 comes with transformative enhancements to predictive quality resulting from the combination of multiple novel ensembling innovations, spotlighted below. Besides performance enhancements, many other improvements have been made that are detailed in the individual module sections.
This release supports Python versions 3.8, 3.9, 3.10, and 3.11. Loading models trained on older versions of AutoGluon is not supported. Please re-train models using AutoGluon 1.0.
This release contains 223 commits from 17 contributors!
Special thanks to @LennartPurucker for leading development of dynamic stacking, @geoalgo for co-authoring TabRepo to enable Zeroshot-HPO, @ddelange for helping to add Python 3.11 support, and @mglowacki100 for providing numerous feedback and suggestions.
Full Contributor List (ordered by # of commits):
@shchur, @zhiqiangdon, @Innixma, @prateekdesai04, @FANGAreNotGnu, @yinweisu, @taoyang1122, @LennartPurucker, @Harry-zzh, @AnirudhDagar, @jaheba, @gradientsky, @melopeo, @ddelange, @tonyhoo, @canerturkmen, @suzhoum
Join the community:
Get the latest updates:
AutoGluon 1.0 features major enhancements to predictive quality, establishing a new state-of-the-art in Tabular modeling. To the best of our knowledge, AutoGluon 1.0 marks the largest leap forward in the state-of-the-art for tabular data since the original AutoGluon paper from March 2020. The enhancements come primarily from two features: Dynamic stacking to mitigate stacked overfitting, and a new learned model hyperparameters portfolio via Zeroshot-HPO, obtained from the newly released TabRepo ensemble simulation library. Together, they lead to a 75% win-rate compared to AutoGluon 0.8 with faster inference speed, lower disk usage, and higher stability.
OpenML released the official 2023 AutoML Benchmark results on November 16th, 2023. Their results show AutoGluon 0.8 as the state-of-the-art in AutoML systems across a wide variety of tasks: "Overall, in terms of model performance, AutoGluon consistently has the highest average rank in our benchmark." We now showcase that AutoGluon 1.0 achieves far superior results even to AutoGluon 0.8!
Below is a comparison on the OpenML AutoML Benchmark across 1040 tasks. LightGBM, XGBoost, and CatBoost results were obtained via AutoGluon, and other methods are from the official AutoML Benchmark 2023 results. AutoGluon 1.0 has a 95%+ win-rate against traditional tabular models, including a 99% win-rate vs LightGBM and a 100% win-rate vs XGBoost. AutoGluon 1.0 has between an 82% and 94% win-rate against other AutoML systems. For all methods, AutoGluon is able to achieve >10% average loss improvement (Ex: Going from 90% accuracy to 91% accuracy is a 10% loss improvement). AutoGluon 1.0 achieves first place in 63% of tasks, with lightautoml having the second most at 12% (AutoGluon 0.8 previously took first place 48% of the time). AutoGluon 1.0 even achieves a 7.4% average loss improvement over AutoGluon 0.8!
Method | AG Winrate | AG Loss Improvement | Rescaled Loss | Rank | Champion |
---|---|---|---|---|---|
AutoGluon 1.0 (Best, 4h8c) | - | - | 0.04 | 1.95 | 63% |
lightautoml (2023, 4h8c) | 84% | 12.0% | 0.2 | 4.78 | 12% |
H2OAutoML (2023, 4h8c) | 94% | 10.8% | 0.17 | 4.98 | 1% |
FLAML (2023, 4h8c) | 86% | 16.7% | 0.23 | 5.29 | 5% |
MLJAR (2023, 4h8c) | 82% | 23.0% | 0.33 | 5.53 | 6% |
autosklearn (2023, 4h8c) | 91% | 12.5% | 0.22 | 6.07 | 4% |
GAMA (2023, 4h8c) | 86% | 15.4% | 0.28 | 6.13 | 5% |
CatBoost (2023, 4h8c) | 95% | 18.2% | 0.28 | 6.89 | 3% |
TPOT (2023, 4h8c) | 91% | 23.1% | 0.4 | 8.15 | 1% |
LightGBM (2023, 4h8c) | 99% | 23.6% | 0.4 | 8.95 | 0% |
XGBoost (2023, 4h8c) | 100% | 24.1% | 0.43 | 9.5 | 0% |
RandomForest (2023, 4h8c) | 97% | 25.1% | 0.53 | 9.78 | 1% |
Not only is AutoGluon more accurate in 1.0, it is also more stable thanks to our new usage of Ray subprocesses during low-memory training, resulting in 0 task failures on the AutoML Benchmark.
AutoGluon 1.0 is capable of achieving the fastest inference throughput of any AutoML system while still obtaining state-of-the-art results. By specifying the `infer_limit` fit argument, users can trade off between accuracy and inference speed to meet their needs (see the sketch below).
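For example, a sketch of constraining end-to-end inference latency at fit time (the dataset path and label column are placeholders):

```python
from autogluon.tabular import TabularDataset, TabularPredictor

train_data = TabularDataset("train.csv")  # placeholder dataset

predictor = TabularPredictor(label="class").fit(
    train_data,
    # Require roughly 20,000 rows/second end-to-end (0.00005 s per row),
    # measured in batches of 10,000 rows.
    infer_limit=0.00005,
    infer_limit_batch_size=10000,
)
```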
As seen in the below plot, AutoGluon 1.0 sets the Pareto Frontier for quality and inference throughput, achieving Pareto Dominance compared to all other AutoML systems. AutoGluon 1.0 High achieves superior performance to AutoGluon 0.8 Best with 8x faster inference and 8x less disk usage!
You can get more details on the results here.
We would like to conclude this spotlight by thanking Pieter Gijsbers, Sébastien Poirier, Erin LeDell, Joaquin Vanschoren, and the rest of the AutoML Benchmark authors for their key role in providing a shared and extensive benchmark to monitor the progress of the AutoML field. Their support has been invaluable to the AutoGluon project's continued growth.
We would also like to thank Frank Hutter, who continues to be a leader within the AutoML field, for organizing the AutoML Conference in 2022 and 2023 to bring the community together to share ideas and align on a compelling vision.
We are excited to see what our users can accomplish with AutoGluon 1.0's enhanced performance. As always, we will continue to improve AutoGluon in future releases to push the boundaries of AutoML forward for all.
We have published a paper on AutoGluon-TimeSeries at AutoML Conference 2023 (YouTube Video). In the paper, we benchmarked AutoGluon and popular open-source forecasting frameworks (including DeepAR, TFT, AutoARIMA, AutoETS, AutoPyTorch). AutoGluon produces SOTA results in point and probabilistic forecasting, and even achieves 65% win rate against the best-in-hindsight combination of models.
We have published a paper on Tabular Zeroshot-HPO ensembling simulation to arXiv (Paper Link, GitHub). This paper is key to achieving the performance improvements seen in AutoGluon 1.0, and we plan to continue to develop the code-base to support future enhancements.
We have published a paper on tabular Transformer pre-training at ICML 2023 (Paper Link, GitHub). In the paper we demonstrate state-of-the-art performance for tabular deep learning models, including being able to match the performance of XGBoost and LightGBM models. While the pre-trained transformer is not yet incorporated into AutoGluon, we plan to integrate it in a future release.
Our paper on learning multimodal data augmentation was accepted at ICLR 2023 (Paper Link, GitHub). This paper introduces a plug-and-play module to learn multimodal data augmentation in feature space, with no constraints on the identities of the modalities or the relationship between modalities. We show that it can (1) improve the performance of multimodal deep learning architectures, (2) apply to combinations of modalities that have not been previously considered, and (3) achieve state-of-the-art results on a wide range of applications comprised of image, text, and tabular data. This work is not yet incorporated into AutoGluon, but we plan to integrate it in a future release.
Our paper on generative object detection data augmentation has been accepted at WACV 2024 (Paper and Github link will be available soon). This paper proposes a data augmentation pipeline based on controllable diffusion models and CLIP, with visual prior generation to guide the generation and post-filtering by category-calibrated CLIP scores to control its quality. We demonstrate that the performance improves across various tasks and settings when using our augmentation pipeline with different detectors. Although diffusion models are currently not integrated into AutoGluon, we plan to incorporate the data augmentation techniques in a future release.
We have published a paper on how to efficiently adapt image foundation models for video understanding at ICLR 2023 (Paper Link, GitHub). This paper introduces spatial adaptation, temporal adaptation and joint adaptation to gradually equip a frozen image model with spatiotemporal reasoning capability. The proposed method achieves competitive or even better performance than traditional full finetuning while largely saving the training cost of large foundation models.
Dependency version upgrades:
- `>=2.0,<2.2` @zhiqiangdon @yinweisu @shchur (#3404, #3587, #3588)
- `>=1.21,<1.29` @prateekdesai04 (#3709)
- `>=2.0,<2.2` @yinweisu @tonyhoo @shchur (#3498)
- `>=1.3,<1.5` @yinweisu @tonyhoo @shchur (#3498)
- `>=10.0.1,<11` @jaheba (#3688)
- `>=1.5.4,<1.13` @prateekdesai04 (#3709)
- `>=3.3,<4.2` @mglowacki100 @prateekdesai04 @Innixma (#3427, #3709, #3733)
- `>=1.6,<2.1` @Innixma (#3768)

AutoGluon 1.0 features major enhancements to predictive quality, establishing a new state-of-the-art in Tabular modeling. Refer to the spotlight section above for more details!
- Added the `dynamic_stacking` predictor fit argument to mitigate stacked overfitting @LennartPurucker @Innixma (#3616)
- New Zeroshot-HPO learned portfolio for the `best_quality` and `high_quality` presets. @Innixma @geoalgo (#3750)
- Added experimental scikit-learn compatible wrappers: `from autogluon.tabular.experimental import TabularClassifier, TabularRegressor`. @Innixma (#3769)
- Added `predictor.model_failures()` @Innixma (#3421)
- Added `predictor.simulation_artifact()` to support integration with TabRepo @Innixma (#3555)
- Fixed `infer_limit` being used incorrectly when bagging @Innixma (#3467)

AutoGluon MultiModal (AutoMM) is designed to simplify the fine-tuning of foundation models for downstream applications with just three lines of code (see the sketch below). It seamlessly integrates with popular model zoos such as HuggingFace Transformers, TIMM, and MMDetection, providing support for a diverse range of data modalities, including image, text, tabular, and document data, whether used individually or in combination.
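The "three lines of code" workflow in its simplest form (column name and dataframes are placeholders):

```python
from autogluon.multimodal import MultiModalPredictor

# train_df/test_df are assumed pandas DataFrames with a "label" column plus
# any mix of text, image-path, and tabular feature columns.
predictor = MultiModalPredictor(label="label")
predictor.fit(train_df)
scores = predictor.evaluate(test_df)
```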
- Added new problem type `semantic_segmentation`, for fine-tuning Segment Anything Model (SAM) with three lines of code. @Harry-zzh @zhiqiangdon (#3645, #3677, #3697, #3711, #3722, #3728)
- Added new `few_shot_classification` problem type for training few shot classifiers on images or texts. @zhiqiangdon (#3662, #3681, #3695)
- `eval_metric` argument support. @taoyang1122 (#3548)
- Added `hf_text.use_fast` for customizing fast tokenizer usage in `hf_text` models. @zhiqiangdon (#3379)
- Added `f1_macro`, `f1_micro`, and `f1_weighted`. @FANGAreNotGnu (#3696)

AutoGluon 1.0 features numerous usability and performance improvements to the TimeSeries module. These include automatic handling of missing data and irregular time series, new forecasting metrics (including custom metric support), advanced time series cross-validation options, and new forecasting models.
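A toy sketch of the missing-data handling described above (all names and the preset choice are illustrative):

```python
import numpy as np
import pandas as pd
from autogluon.timeseries import TimeSeriesDataFrame, TimeSeriesPredictor

# Long-format toy data: one item, daily timestamps, some missing target values.
timestamps = pd.date_range("2024-01-01", periods=100, freq="D")
df = pd.DataFrame({
    "item_id": "A",
    "timestamp": timestamps,
    "target": np.where(np.arange(100) % 7 == 0, np.nan, np.arange(100, dtype=float)),
})
data = TimeSeriesDataFrame.from_data_frame(df)

# NaN targets no longer need manual imputation before fitting.
predictor = TimeSeriesPredictor(prediction_length=7).fit(data, presets="fast_training")
```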
- New forecasting metrics `WAPE`, `RMSSE`, `SQL` + improved documentation for metrics @melopeo @shchur (#3747, #3632, #3510, #3490)
- `TimeSeriesPredictor` can now handle data with all pandas frequencies, irregular timestamps, or missing values represented by `NaN` @shchur (#3563, #3454)
- New models: intermittent demand forecasting models from StatsForecast (`ADIDA`, `CrostonClassic`, `CrostonOptimized`, `CrostonSBA`, `IMAPA`); `WaveNet` and `NPTS` from GluonTS; new baseline models (`Average`, `SeasonalAverage`, `Zero`) @canerturkmen @shchur (#3706, #3742, #3606, #3459)
- Advanced cross-validation options: avoid refitting models for each validation window with `refit_every_n_windows`, or adjust the step size between validation windows with the `val_step_size` argument to `TimeSeriesPredictor.fit` @shchur (#3704, #3537)
- `TimeSeriesPredictor.evaluate` @shchur (#3646)
- New `TimeSeriesDataFrame.from_path` and `TimeSeriesDataFrame.from_data_frame` constructors @shchur (#3635)
- Updates to the `DirectTabular` and `RecursiveTabular` models (#3740, #3620, #3559)
- Faster import of `autogluon.timeseries` by moving import statements inside model classes (#3514)
- Aligned the API of `TimeSeriesPredictor` with `TabularPredictor`, removed deprecated methods @shchur (#3714, #3655, #3396)

The EDA module will be released at a later time, as it requires additional development effort before it is ready for 1.0. We will make an announcement when EDA is ready for release. For now, please continue to use `autogluon.eda==0.8.2`.
- `autogluon.core.spaces` has been deprecated. Please use `autogluon.common.spaces` instead @Innixma (#3701)

Tabular will log warnings if using the deprecated methods. Deprecated methods are planned to be removed in AutoGluon 1.2 @Innixma (#3701)
autogluon.tabular.TabularPredictor
- `predictor.get_model_names()` -> `predictor.model_names()`
- `predictor.get_model_names_persisted()` -> `predictor.model_names(persisted=True)`
- `predictor.compile_models()` -> `predictor.compile()`
- `predictor.persist_models()` -> `predictor.persist()`
- `predictor.unpersist_models()` -> `predictor.unpersist()`
- `predictor.get_model_best()` -> `predictor.model_best`
- `predictor.get_pred_from_proba()` -> `predictor.predict_from_proba()`
- `predictor.get_oof_pred_proba()` -> `predictor.predict_proba_oof()`
- `predictor.get_oof_pred()` -> `predictor.predict_oof()`
- `predictor.get_model_full_dict()` -> `predictor.model_refit_map()`
- `predictor.get_size_disk()` -> `predictor.disk_usage()`
- `predictor.get_size_disk_per_file()` -> `predictor.disk_usage_per_file()`
- `predictor.leaderboard()`: `silent` argument deprecated, replaced by `display`, defaults to `False`
  - Same for `predictor.evaluate()` and `predictor.evaluate_predictions()`
- Deprecated `FewShotSVMPredictor` in favor of the new `few_shot_classification` problem type @zhiqiangdon (#3699)
- Deprecated `AutoMMPredictor` in favor of `MultiModalPredictor` @zhiqiangdon (#3650)

autogluon.multimodal.MultiModalPredictor
- Deprecated the `config` argument in the fit API. @zhiqiangdon (#3679)
- Deprecated the `init_scratch` and `pipeline` arguments in the init API @zhiqiangdon (#3668)

autogluon.timeseries.TimeSeriesPredictor
- Deprecated `TimeSeriesPredictor(ignore_time_index: bool)`. Now, if the data contains irregular timestamps, either convert it to regular frequency with `data = data.convert_frequency(freq)` or provide the frequency when creating the predictor as `TimeSeriesPredictor(freq=freq)` (see the sketch at the end of this section).
- `predictor.evaluate()` now returns a dictionary (previously returned a float)
- `predictor.score()` -> `predictor.evaluate()`
- `predictor.get_model_names()` -> `predictor.model_names()`
- `predictor.get_model_best()` -> `predictor.model_best`
- Metric `"mean_wQuantileLoss"` has been renamed to `"WQL"`
- `predictor.leaderboard()`: `silent` argument deprecated, replaced by `display`, defaults to `False`
- When setting `hyperparameters` to a string in `predictor.fit()`, supported values are now `"default"`, `"light"` and `"very_light"`
autogluon.timeseries.TimeSeriesDataFrame
- `df.to_regular_index()` -> `df.convert_frequency()`
- Deprecated `df.get_reindexed_view()`. Please see the deprecation notes for `ignore_time_index` under `TimeSeriesPredictor` above for information on how to deal with irregular timestamps
- MXNet models (`DeepARMXNet`, `MQCNNMXNet`, `MQRNNMXNet`, `SimpleFeedForwardMXNet`, `TemporalFusionTransformerMXNet`, `TransformerMXNet`) have been removed
- Statistical models (`ARIMA`, `Theta`, `ETS`) have been replaced by their counterparts from StatsForecast (#3513). Note that these models now have different hyperparameter names.
- `DirectTabular` is now implemented using the `mlforecast` backend (same as `RecursiveTabular`); most hyperparameter names for the model have changed.
- `autogluon.timeseries.TimeSeriesEvaluator` has been deprecated. Please use the metrics available in `autogluon.timeseries.metrics` instead.
- `autogluon.timeseries.splitter.MultiWindowSplitter` and `autogluon.timeseries.splitter.LastWindowSplitter` have been deprecated. Please use the `num_val_windows` and `val_step_size` arguments to `TimeSeriesPredictor.fit` instead (alternatively, use `autogluon.timeseries.splitter.ExpandingWindowSplitter`).
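A short sketch of the migration for irregular data, following the deprecation note above (the file path is a placeholder):

```python
from autogluon.timeseries import TimeSeriesDataFrame, TimeSeriesPredictor

data = TimeSeriesDataFrame.from_path("train.csv")  # placeholder path

# Option 1: regularize the data itself to a daily frequency.
regular_data = data.convert_frequency(freq="D")

# Option 2: declare the frequency on the predictor and let it handle the rest.
predictor = TimeSeriesPredictor(freq="D", prediction_length=7).fit(data)
```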
Published by yinweisu over 1 year ago
v0.8.2 is a hotfix release that pins the `pydantic` version to avoid crashing during HPO.
As always, only load previously trained models using the same version of AutoGluon that they were originally trained on.
Loading models trained in different versions of AutoGluon is not supported.
See the full commit change-log here: https://github.com/autogluon/autogluon/compare/0.8.1...0.8.2
This version supports Python versions 3.8, 3.9, and 3.10.
Published by yinweisu over 1 year ago
v0.8.1 is a bug fix release.
As always, only load previously trained models using the same version of AutoGluon that they were originally trained on.
Loading models trained in different versions of AutoGluon is not supported.
See the full commit change-log here: https://github.com/autogluon/autogluon/compare/0.8.0...0.8.1
This version supports Python versions 3.8, 3.9, and 3.10.
- Fixed the `DirectTabular` model failing for some metrics; hide warnings produced by `AutoARIMA` @shchur (#3350)

Published by yinweisu over 1 year ago
We're happy to announce the AutoGluon 0.8 release.
NEW: Join our official community discord server to ask questions and get involved!
Note: Loading models trained in different versions of AutoGluon is not supported.
This release contains 196 commits from 20 contributors!
See the full commit change-log here: https://github.com/autogluon/autogluon/compare/0.7.0...0.8.0
Special thanks to @geoalgo for the joint work in generating the experimental tabular Zeroshot-HPO portfolio this release!
Full Contributor List (ordered by # of commits):
@shchur, @Innixma, @yinweisu, @gradientsky, @FANGAreNotGnu, @zhiqiangdon, @gidler, @liangfu, @tonyhoo, @cheungdaven, @cnpgs, @giswqs, @suzhoum, @yongxinw, @isunli, @jjaeyeon, @xiaochenbin9527, @yzhliu, @jsharpna, @sxjscience
AutoGluon 0.8 supports Python versions 3.8, 3.9, and 3.10.
- Calibrating the decision threshold can dramatically improve metrics such as `f1` and `balanced_accuracy`. It is not uncommon to see `f1` scores improve from 0.70 to 0.73 as an example. We strongly encourage all users who are using these metrics to try out the new decision threshold calibration logic (see the sketch after this list).
- Upgraded object detection presets, now offering `medium_quality`, `high_quality`, and `best_quality` options. The empirical results demonstrate significant ~20% relative improvements in the mAP (mean Average Precision) metric, using the same preset.
- New experimental Zeroshot-HPO configuration, a hybrid with the `best_quality` preset. To try it out, specify `presets="experimental_zeroshot_hpo_hybrid"` when calling `fit()`.
- New experimental TabPFN model: install via `pip install autogluon.tabular[all,tabpfn]` (hyperparameter key is `"TABPFN"`)! You can also try it out via specifying `presets="experimental_extreme_quality"`.
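A minimal sketch of the calibration workflow; it assumes a `set_decision_threshold` setter alongside the `calibrate_decision_threshold` method described in the Tabular section below, and the dataset paths are placeholders:

```python
from autogluon.tabular import TabularDataset, TabularPredictor

train_data = TabularDataset("train.csv")  # placeholder
test_data = TabularDataset("test.csv")    # placeholder

predictor = TabularPredictor(label="class", eval_metric="f1").fit(train_data)

# Find the decision threshold that maximizes the eval_metric (f1 here)
# and apply it to subsequent predict() calls.
threshold = predictor.calibrate_decision_threshold()
predictor.set_decision_threshold(threshold)  # assumed setter method
predictions = predictor.predict(test_data)
```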
AutoGluon MultiModal (also known as AutoMM) introduces two new features: 1) PDF document classification, and 2) Open Vocabulary Object Detection. Additionally, we have upgraded the presets for object detection, now offering `medium_quality`, `high_quality`, and `best_quality` options. The empirical results demonstrate significant ~20% relative improvements in the mAP (mean Average Precision) metric, using the same preset.
- `medium_quality`: yolo-s -> yolox-l
- `high_quality`: yolox-l -> DINO-Res50
- `best_quality`: yolox-x -> DINO-Swin_l
- `trainable_parameters` returns the number of trainable parameters.
- `total_parameters` returns the number of total parameters.
- `model_size` returns the model size measured in megabytes.

- Added `calibrate_decision_threshold` (tutorial), which allows optimizing a given metric's decision threshold for predictions to strongly enhance the metric score. @Innixma (#3298)
- New experimental Zeroshot-HPO configuration: specify `presets="experimental_zeroshot_hpo_hybrid"` when calling `fit()` @Innixma @geoalgo (#3312)
- Experimental TabPFN model: `pip install autogluon.tabular[all,tabpfn]`! @Innixma (#3270)
- `included_model_types` @yinweisu (#3239)
- `pred_proba` @Innixma (#3240)

In v0.8 we introduce several major improvements to the Time Series module, including new models, upgraded presets that lead to better forecast accuracy, and optimizations that speed up training & inference.
- New models: `PatchTST` and `DLinear` from GluonTS, and `RecursiveTabular` based on integration with the `mlforecast` library @shchur (#3177, #3184, #3230)
- Improvements to the `AutoARIMA`, `AutoETS`, `Theta`, `DirectTabular`, `WeightedEnsemble` models @shchur (#3062, #3214, #3252)
- Much faster repeated calls to `predict()`, `leaderboard()` and `evaluate()` thanks to prediction caching @shchur (#3237)
- Use multiple validation windows with the `num_val_windows` argument to `fit()` @shchur (#3080)
- Exclude models with the `excluded_model_types` argument to `fit()` @shchur (#3231)
- New method `refit_full()` that refits models on combined train and validation data @shchur (#3157)
- Train multiple configurations of the same model by providing lists in the `hyperparameters` argument @shchur (#3183)
- `time_limit` is now respected by all models @shchur (#3214)
- Improved the `DirectTabular` model (previously called `AutoGluonTabular`): faster featurization, trained as a quantile regression model if `eval_metric` is set to `"mean_wQuantileLoss"` @shchur (#2973, #3211)
- `TimeSeriesPredictor` loading from disk @shchur (#3233)

In 0.8 we introduce a few new tools to help with data exploration and feature engineering:
- Updated `quick_fit` to use residuals plot @gradientsky (#3039)
- Added `explain_rows` method to `autogluon.eda.auto` - Kernel SHAP visualization @gradientsky (#3014)

Published by tonyhoo over 1 year ago
We're happy to announce the AutoGluon 0.7 release. This release contains a new experimental module `autogluon.eda` for exploratory data analysis. AutoGluon 0.7 offers conda-forge support, enhancements to the Tabular, MultiModal, and Time Series modules, and many quality of life improvements and fixes.
As always, only load previously trained models using the same version of AutoGluon that they were originally trained on.
Loading models trained in different versions of AutoGluon is not supported.
This release contains 170 commits from 19 contributors!
See the full commit change-log here: https://github.com/autogluon/autogluon/compare/v0.6.2...v0.7.0
Special thanks to @MountPOTATO, who is a first-time contributor to AutoGluon this release!
Full Contributor List (ordered by # of commits):
@Innixma, @zhiqiangdon, @yinweisu, @gradientsky, @shchur, @sxjscience, @FANGAreNotGnu, @yongxinw, @cheungdaven,
@liangfu, @tonyhoo, @bryanyzhu, @suzhoum, @canerturkmen, @giswqs, @gidler, @yzhliu, @Linuxdex and @MountPOTATO
AutoGluon 0.7 supports Python versions 3.8, 3.9, and 3.10. Python 3.7 is no longer supported as of this release.
As of AutoGluon 0.7 release, AutoGluon is now available on conda-forge (#612)!
Kudos to the following individuals for making this happen:
autogluon.eda (Exploratory Data Analysis)

We are happy to announce the AutoGluon Exploratory Data Analysis (EDA) toolkit. Starting with v0.7, AutoGluon can now analyze and visualize different aspects of data and models. We invite you to explore the following tutorials: Quick Fit, Dataset Overview, Target Variable Analysis, Covariate Shift Analysis. Other materials can be found in the EDA section of the website.
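A couple of one-liners from the new toolkit to give a feel for the API (the dataframe and label column are placeholders; see the tutorials for the full options):

```python
import autogluon.eda.auto as auto

# High-level overview of the dataset: types, missing values, distributions.
auto.dataset_overview(train_data=train_df, label="class")

# Fit a quick model and render diagnostics such as feature importance.
auto.quick_fit(train_df, label="class")
```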
- Removed `dask` and `distributed` dependencies. @Innixma (#2691)
- Deprecated the `autogluon.text` and `autogluon.vision` modules. We recommend using `autogluon.multimodal` for text and vision tasks going forward.

AutoGluon MultiModal (a.k.a AutoMM) supports three new features: 1) document classification; 2) named entity recognition for the Chinese language; 3) few-shot learning with SVM. Meanwhile, we removed `autogluon.text` and `autogluon.vision` as these features are supported in `autogluon.multimodal`.
- Added `FocalLoss`. @yongxinw (#2860)
- The `autogluon.vision` namespace is deprecated. @bryanyzhu (#2790, #2819, #2832)
- The `autogluon.text` namespace is deprecated. @sxjscience @Innixma (#2695, #2847)

- Faster inference via `infer_limit`, and the `high_quality` preset can satisfy <100 ms end-to-end latency on many datasets by default.
- The `"multimodal"` hyperparameter preset now leverages the full capabilities of MultiModalPredictor, resulting in stronger performance on datasets containing a mix of tabular, image, and text features.
- Refactored the `NN_TORCH` model to be dataset iterable, leading to a 100% inference speedup. @liangfu (#2395)
- MultiModalPredictor is now leveraged when `TabularPredictor.fit` is passed `hyperparameters="multimodal"`. @Innixma (#2890)
- Added `predict_multi` and `predict_proba_multi` methods to `TabularPredictor` to efficiently get predictions from multiple models. @Innixma (#2727)
- Faster `leaderboard` calls when scoring is disabled. @Innixma (#2912)
- Deprecated `predict_proba` with `problem_type="regression"`. This will raise an exception in a future release. @Innixma (#2684)
- Updates to the `NN_TORCH` model. @Innixma (#2909)
- Fixed `calibrate=True, use_bag_holdout=True` in `TabularPredictor.fit`. @Innixma (#2715)
- Fixed `n_estimators` handling with RandomForest / ExtraTrees models. @Innixma (#2735)
- Model compilation via `skl2onnx`. @liangfu (#2923)
- `refit_full` updates. @Innixma (#2913)
- Added `compile_models` to the deployment tutorial. @liangfu (#2717)

- `TimeSeriesPredictor` now supports past covariates (a.k.a. dynamic features or related time series which are not known for the time steps to be predicted). @shchur (#2665, #2680)
- Improved forecast accuracy of `TimeSeriesPredictor` for various presets (`medium_quality`, `high_quality` and `best_quality`). @shchur (#2758)

Published by tonyhoo almost 2 years ago
v0.6.2 is a security and bug fix release.
As always, only load previously trained models using the same version of AutoGluon that they were originally trained on.
Loading models trained in different versions of AutoGluon is not supported.
See the full commit change-log here: https://github.com/autogluon/autogluon/compare/v0.6.1...v0.6.2
Special thanks to @daikikatsuragawa and @yzhliu who were first time contributors to AutoGluon this release!
This version supports Python versions 3.7 to 3.9. 0.6.x are the last releases that will support Python 3.7.
Published by gradientsky almost 2 years ago
v0.6.1 is a security fix / bug fix release.
As always, only load previously trained models using the same version of AutoGluon that they were originally trained on.
Loading models trained in different versions of AutoGluon is not supported.
See the full commit change-log here: https://github.com/autogluon/autogluon/compare/v0.6.0...v0.6.1
Special thanks to @lvwerra, who is a first-time contributor to AutoGluon this release!
This version supports Python versions 3.7 to 3.9. 0.6.x are the last releases that will support Python 3.7.
Published by gradientsky almost 2 years ago
v0.5.3 is a security hotfix release.
This release is non-breaking when upgrading from v0.5.0. As always, only load previously trained models using the same version of AutoGluon that they were originally trained on. Loading models trained in different versions of AutoGluon is not supported.
See the full commit change-log here: https://github.com/awslabs/autogluon/compare/v0.5.2...v0.5.3
This version supports Python versions 3.7 to 3.9.
Published by gradientsky almost 2 years ago
We're happy to announce the AutoGluon 0.6 release. 0.6 contains major enhancements to the Tabular, MultiModal, and Time Series modules, along with many quality of life improvements and fixes.
As always, only load previously trained models using the same version of AutoGluon that they were originally trained on.
Loading models trained in different versions of AutoGluon is not supported.
This release contains 263 commits from 25 contributors!
See the full commit change-log here: https://github.com/awslabs/autogluon/compare/v0.5.2...v0.6.0
Special thanks to @cheungdaven, @suzhoum, @BingzhaoZhu, @liangfu, @Harry-zzh, @gidler, @yongxinw, @martinschaef,
@giswqs, @Jalagarto, @geoalgo, @lujiaying and @leloykun who were first time contributors to AutoGluon this release!
Full Contributor List (ordered by # of commits):
@shchur, @yinweisu, @zhiqiangdon, @Innixma, @FANGAreNotGnu, @canerturkmen, @sxjscience, @gradientsky, @cheungdaven,
@bryanyzhu, @suzhoum, @BingzhaoZhu, @yongxinw, @tonyhoo, @liangfu, @Harry-zzh, @Raldir, @gidler, @martinschaef,
@giswqs, @Jalagarto, @geoalgo, @lujiaying, @leloykun, @yiqings
This version supports Python versions 3.7 to 3.9. This is the last release that will support Python 3.7.
AutoGluon Multimodal (a.k.a AutoMM) supports three new features: 1) object detection, 2) named entity recognition, and 3) multimodal matching. In addition, the HPO backend of AutoGluon Multimodal has been upgraded to ray 2.0. It also supports fine-tuning billion-scale FLAN-T5-XL model on a single AWS g4.2x-large instance with improved parameter-efficient finetuning. Starting from 0.6, we recommend using autogluon.multimodal rather than autogluon.text or autogluon.vision and added deprecation warnings.
- Object Detection: new problem type `"object_detection"`.
- Named Entity Recognition: new problem type `"ner"`.
- Multimodal Matching: new problem types `"text_similarity"`, `"image_similarity"`, and `"image_text_similarity"`.
- Miscellaneous minor fixes. @cheungdaven @FANGAreNotGnu @geoalgo @zhiqiangdon (#2402, #2409, #2026, #2401, #2418)
New experimental model `FT_TRANSFORMER`. @bingzhaozhu, @innixma (#2085, #2379, #2389, #2410)
- You can access it via the `FT_TRANSFORMER` key in the `hyperparameters` dictionary or via `presets="experimental_best_quality"`.

New experimental model compilation support via `predictor.compile_models()`. @liangfu, @innixma (#2225, #2260, #2300)
- To enable compilation: `pip install autogluon.tabular[all,skl2onnx]`.
- It is recommended that `.compile_models` is called only at the very end.

Added `predictor.clone(...)` method to allow perfectly cloning a predictor object to a new directory. This is useful to preserve the state of a predictor prior to altering it (such as prior to calling `.save_space`, `.distill`, `.compile_models`, or `.refit_full`); see the sketch below. @innixma (#2071)

Added simplified `num_gpus` and `num_cpus` arguments to `predictor.fit` to control total resources. @yinweisu, @innixma (#2263)
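A minimal sketch of the cloning workflow above, assuming an existing predictor saved under `ag_models/` (all paths here are placeholders):

```python
from autogluon.tabular import TabularPredictor

predictor = TabularPredictor.load("ag_models/")  # placeholder path

# Snapshot the predictor before a destructive operation such as refit_full,
# so the original state can always be recovered from the clone.
predictor.clone(path="ag_models_backup/")
predictor.refit_full()

# Recover the pre-refit state if needed.
backup = TabularPredictor.load("ag_models_backup/")
```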
Improved stability and effectiveness of HPO functionality via various refactors regarding our usage of ray. @yinweisu, @innixma (#1974, #1990, #2094, #2121, #2133, #2195, #2253, #2263, #2330)
Upgraded dependency versions: XGBoost 1.7, CatBoost 1.1, Scikit-learn 1.1, Pandas 1.5, Scipy 1.9, Numpy 1.23. @innixma (#2373)
Added a Python version compatibility check when loading a fitted TabularPredictor. Will now error if Python versions are incompatible. @innixma (#2054)
Added the `fit_weighted_ensemble` argument to `predictor.fit`. This allows the user to disable the weighted ensemble. @innixma (#2145)
Added cascade ensemble foundation logic. @innixma (#1929)

`infer_limit` improvements. @innixma (#2014)

Refactored `Scorer` classes to be easier to use, plus added comprehensive unit tests for all metrics. @innixma (#2242)

`TimeSeriesPredictor` now supports static features (a.k.a. time series metadata, static covariates) and several new models:
- `DeepAR` and `SimpleFeedForward` (implemented in PyTorch, removing the dependency on MXNet)
- `AutoGluonTabular` relies on XGBoost, LightGBM and CatBoost under the hood via the `autogluon.tabular` module
- `Naive` and `SeasonalNaive` forecasters are simple methods that provide strong baselines with no increase in training time
- `TemporalFusionTransformerMXNet` brings the TFT transformer architecture to AutoGluon. @shchur (#2106)
, ARIMA
Theta
, as well as WeightedEnsemble
. @shchur @canerturkmen (#2001, #2033, #2040, #2067, #2072, #2073, #2180,TimeSeriesPredictor
now handles irregularly sampled time series with ignore_index
. @canerturkmen, @shchur (#1993,TimeSeriesEvaluator
@shchur (#2147, #2150)Improved documentation and new tutorials:
@shchur (#2120, #2127, #2146, #2174, #2187, #2354)
@shchur
@canerturkmen
Published by gradientsky about 2 years ago
v0.5.2 is a security hotfix release.
This release is non-breaking when upgrading from v0.5.0. As always, only load previously trained models using the same version of AutoGluon that they were originally trained on. Loading models trained in different versions of AutoGluon is not supported.
See the full commit change-log here: https://github.com/awslabs/autogluon/compare/v0.5.1...v0.5.2
This version supports Python versions 3.7 to 3.9.
Published by gradientsky about 2 years ago
v0.4.3 is a security hotfix release.
This release is non-breaking when upgrading from v0.4.0. As always, only load previously trained models using the same version of AutoGluon that they were originally trained on. Loading models trained in different versions of AutoGluon is not supported.
See the full commit change-log here: https://github.com/awslabs/autogluon/compare/v0.4.2...v0.4.3
This version supports Python versions 3.7 to 3.9.
Published by gradientsky over 2 years ago
We're happy to announce the AutoGluon 0.5.1 release. This release contains major optimizations and bug fixes to the `autogluon.multimodal` and `autogluon.timeseries` modules, as well as inference speed improvements to `autogluon.tabular`.
This release is non-breaking when upgrading from v0.5.0. As always, only load previously trained models using the same version of AutoGluon that they were originally trained on. Loading models trained in different versions of AutoGluon is not supported.
This release contains 58 commits from 14 contributors!
Full Contributor List (ordered by # of commits):
This version supports Python versions 3.7 to 3.9.
See the full commit change-log here: https://github.com/awslabs/autogluon/compare/v0.5.0...v0.5.1
Changed to a new namespace `autogluon.multimodal` (AutoMM), which is a deep learning "model zoo" of model zoos. On one hand, AutoMM can automatically train deep models for unimodal (image-only, text-only or tabular-only) problems. On the other hand, AutoMM can automatically solve multimodal (any combinations of image, text, and tabular) problems by fusing multiple deep learning models. In addition, AutoMM can be used as a base model in AutoGluon Tabular and participate in the model ensemble.
Supported zero-shot learning with CLIP (#1922) @zhiqiangdon
Improved efficient finetuning
Added more data augmentation techniques
Enhanced teacher-student model distillation
Beginner tutorials of applying AutoMM to image, text, or multimodal (including tabular) data. (#1861, #1908, #1858, #1869) @bryanyzhu @sxjscience @zhiqiangdon
A zero-shot image classification tutorial with the CLIP model. (#1942) @bryanyzhu
A tutorial of using CLIP model to extract embeddings for image-text retrieval. (#1957) @bryanyzhu
A tutorial to introduce comprehensive AutoMM configurations (#1861). @zhiqiangdon
AutoMM for tabular data examples (#1752, #1893, #1903). @yiqings
AutoMM distillation example (#1846). @FANGAreNotGnu
A Kaggle notebook about how to use AutoMM to predict pet adoption: https://www.kaggle.com/code/linuxdex/use-autogluon-to-predict-pet-adoption. The model achieves the score equivalent to top 1% (20th/3537) in this kernel-only competition (test data is only available in the kernel without internet access) (#1796, #1847, #1894, #1943). @Linuxdex
Published by Innixma over 2 years ago
We're happy to announce the AutoGluon 0.5 release. This release contains major new modules `autogluon.timeseries` and `autogluon.multimodal`. In collaboration with the Yu Group of Statistics and EECS from UC Berkeley, we have added interpretable models (imodels) to `autogluon.tabular`.
This release is non-breaking when upgrading from v0.4.2. As always, only load previously trained models using the same version of AutoGluon that they were originally trained on. Loading models trained in different versions of AutoGluon is not supported.
This release contains 91 commits from 13 contributors!
Full Contributor List (ordered by # of commits):
The imodels integration is based on the following work,
Singh, C., Nasseri, K., Tan, Y.S., Tang, T. and Yu, B., 2021. imodels: a python package for fitting interpretable models. Journal of Open Source Software, 6(61), p.3192.
This version supports Python versions 3.7 to 3.9.
See the full commit change-log here: https://github.com/awslabs/autogluon/compare/v0.4.1...v0.5.0
Full release notes will be available shortly.
Published by gradientsky over 2 years ago
v0.4.2 is a hotfix release to fix a breaking change in protobuf.
This release is non-breaking when upgrading from v0.4.0. As always, only load previously trained models using the same version of AutoGluon that they were originally trained on. Loading models trained in different versions of AutoGluon is not supported.
See the full commit change-log here: https://github.com/awslabs/autogluon/compare/v0.4.1...v0.4.2
This version supports Python versions 3.7 to 3.9.
Published by gradientsky over 2 years ago
We're happy to announce the AutoGluon 0.4.1 release. 0.4.1 contains minor enhancements to Tabular, Text, Image, and Multimodal modules, along with many quality of life improvements and fixes.
This release is non-breaking when upgrading from v0.4.0. As always, only load previously trained models using the same version of AutoGluon that they were originally trained on. Loading models trained in different versions of AutoGluon is not supported.
This release contains 55 commits from 10 contributors!
See the full commit change-log here: https://github.com/awslabs/autogluon/compare/v0.4.0...v0.4.1
Special thanks to @yiqings, @leandroimail, @huibinshen who were first time contributors to AutoGluon this release!
Full Contributor List (ordered by # of commits):
This version supports Python versions 3.7 to 3.9.
Added the `optimization.efficient_finetune` flag to support multiple efficient finetuning algorithms. (#1666) @sxjscience
- `bit_fit`: "BitFit: Simple Parameter-efficient Fine-tuning for Transformer-based Masked Language-models"
- `norm_fit`: An extension of the algorithm in "Training BatchNorm and Only BatchNorm: On the Expressive Power of Random Features in CNNs" and BitFit. We finetune both the parameters in the norm layers as well as the biases.

Enabled knowledge distillation for AutoMM (#1670) @zhiqiangdon
- `AutoMMPredictor` reuses the `.fit()` function:

```python
from autogluon.text.automm import AutoMMPredictor

teacher_predictor = AutoMMPredictor(label="label_column").fit(train_data)
student_predictor = AutoMMPredictor(label="label_column").fit(
    train_data,
    hyperparameters=student_and_distiller_hparams,
    teacher_predictor=teacher_predictor,
)
```
Option to turn on returning feature column information (#1711) @zhiqiangdon
- Added a `requires_column_info` flag in data processors and a utility function to turn this flag on or off.

FT-Transformer implementation for tabular data in AutoMM (#1646) @yiqings
Make CLIP support multiple images per sample (#1606) @zhiqiangdon
Avoid using `eos` as the sep token for CLIP. (#1710) @zhiqiangdon
Update fusion transformer in AutoMM (#1712) @yiqings
- Support the `polynomial_decay` scheduler.
- Support the `[CLS]` token in numerical/categorical transformer.

Added more image augmentations: `verticalflip`, `colorjitter`, `randomaffine` (#1719) @Linuxdex, @sxjscience
Added prompts for the percentage of missing images during image column detection. (#1623) @zhiqiangdon
Support `average_precision` in AutoMM (#1697) @sxjscience
Convert `roc_auc` / `average_precision` to `log_loss` for torchmetrics (#1715) @zhiqiangdon
- `torchmetrics.AUROC` requires that both positive and negative examples be available in a mini-batch. When training a large model, the per-GPU batch size is probably small, leading to an incorrect `roc_auc` score. Conversion from `roc_auc` to `log_loss` improves training stability.

Added `pytorch-lightning` 1.6 support (#1716) @sxjscience
Updated the names of top-k checkpoint average methods and support customizing model names for terminal input (#1668) @zhiqiangdon
- Renamed `union_soup` -> `uniform_soup` and `best_soup` -> `best`.
- Renamed `customize_config_names` -> `customize_model_names` and `verify_config_names` -> `verify_model_names` to make them easier to understand.

Implemented the GreedySoup algorithm proposed in the paper. Added `union_soup`, `greedy_soup`, `best_soup` flags and changed the default value correspondingly. (#1613) @sxjscience
Updated the `standalone` flag in `automm.predictor.save()` to save the pretrained model for offline deployment (#1575) @yiqings
Simplified checkpoint template (#1636) @zhiqiangdon
- Simplified `AutoMMPredictor`'s final model checkpoint.
- Pass the `ckpt_path` argument to PyTorch Lightning's trainer only when `resume=True`.

Unified AutoMM's model output format and support customizing model names (#1643) @zhiqiangdon
- Uses model names (`timm_image`, `hf_text`, `clip`, `numerical_mlp`, `categorical_mlp`, and `fusion_mlp`) as prefixes. This is helpful when users want to simultaneously use two models of the same type, e.g., `hf_text`. They can just use the names `hf_text_0` and `hf_text_1`.

Support the `standalone` feature in `TextPredictor` (#1651) @yiqings
Fixed saving and loading tokenizers and text processors (#1656) @zhiqiangdon
- Applies to models saved with `0.4.0`.

Changed `load` from a classmethod to a staticmethod to avoid incorrect usage. (#1697) @sxjscience
Added `AutoMMModelCheckpoint` to avoid evaluating the models to obtain the scores (#1716) @sxjscience

Extract column features from AutoMM's model outputs (#1718) @zhiqiangdon
- Supported for `timm_image`, `hf_text`, and `clip`.

Make AutoMM dataloader return feature column information (#1710) @zhiqiangdon
Fixed calling `save_pretrained_configs` in `AutoMMPredictor.save(standalone=True)` when no fusion model exists (#1651) @yiqings
Fixed error raising for setting key that does not exist in the configuration (#1613) @sxjscience
Fixed warning message about bf16. (#1625) @sxjscience
Fixed the corner case of calculating the gradient accumulation step (#1633) @sxjscience
Fixes for top-k averaging in the multi-gpu setting (#1707) @zhiqiangdon
Limited RF `max_leaf_nodes` to 15000 (previously uncapped) (#1717) @Innixma
- Primarily impacts the `high_quality` preset.

Limit KNN to 32 CPUs to avoid OpenBLAS error (#1722) @Innixma

```
BLAS : Program is Terminated. Because you tried to allocate too many memory regions.
Segmentation fault: 11
```

This error occurred when the machine had many CPU cores (>64 vCPUs) due to too many threads being created at once. By limiting to 32 cores used, the error is avoided.
Improved memory warning thresholds (#1626) @Innixma
Added `get_results` and `model_base_kwargs` (#1618) @Innixma
- Added `get_results` to searchers, useful for debugging and for future extensions to HPO functionality.
- Added `model_base_kwargs` to `BaggedEnsembleModel` that avoids having to init the base model prior to initing the bagged ensemble model.

Update resource logic in models (#1689) @Innixma
- Fixed a crash when using `auto` for resources.
- Added `get_minimum_resources` to explicitly define minimum resource requirements within a method.

Updated feature importance defaults: `subsample_size` 1000 -> 5000, `num_shuffle_sets` 3 -> 5 (#1708) @Innixma
Added notice to ensure serializable custom metrics (#1705) @Innixma
Fixed `evaluate` when `weight_evaluation=True` (#1612) @Innixma
- Previously errored when calling `predictor.evaluate(...)` or `predictor.evaluate_predictions(...)` when `self.weight_evaluation==True`.

Fixed RuntimeError: dictionary changed size during iteration (#1684, #1685) @leandroimail
Fixed CatBoost custom metric & F1 support (#1690) @Innixma
Fixed HPO not working for bagged models if the bagged model is loaded from disk (#1702) @Innixma
Fixed feature importance erroring if `self.model_best` is `None` (can happen if no weighted ensemble is fit) (#1702) @Innixma
Updated the text tutorial on customizing hyperparameters (#1620) @zhiqiangdon
Improved implementations and docstrings of `save_pretrained_models` and `convert_checkpoint_name`. (#1656) @zhiqiangdon
Added cheat sheet to website (#1605) @yinweisu
Doc fix to use correct predictor when calling leaderboard (#1652) @Innixma
[security] Updated `pillow` to `9.0.1+` (#1615) @gradientsky

[security] Updated `ray` to `1.10.0+` (#1616) @yinweisu
Tabular regression tests improvements (#1555) @willsmithorg
- Tests `TabularPredictor` on both regression and classification tasks, multiple presets, etc.

Disabled image/text predictor when GPU is not available in `TabularPredictor` (#1676) @yinweisu
Use class property to set keys in model classes. In this way, if we customize the prefix key, other keys are automatically updated. (#1669) @zhiqiangdon
Published by Innixma over 2 years ago
We're happy to announce the AutoGluon 0.4 release. 0.4 contains major enhancements to Tabular and Text modules, along with many quality of life improvements and fixes.
This release is non-breaking when upgrading from v0.3.1. As always, only load previously trained models using the same version of AutoGluon that they were originally trained on. Loading models trained in different versions of AutoGluon is not supported.
This release contains 151 commits from 14 contributors!
See the full commit change-log here: https://github.com/awslabs/autogluon/compare/v0.3.1...v0.4.0
Special thanks to @zhiqiangdon, @willsmithorg, @DolanTheMFWizard, @truebluejason, @killerSwitch, and @Xilorole who were first time contributors to AutoGluon this release!
Full Contributor List (ordered by # of commits):
This version supports Python versions 3.7 to 3.9.
- `pip install autogluon.text` will error on import if installed standalone due to missing `autogluon.features` as a dependency. To fix: `pip install autogluon.features`. This will be resolved in the v0.4.1 release.

AutoGluon-Text is refactored with PyTorch Lightning. It now supports backbones in huggingface/transformers. The new version has better performance, faster training time, and faster inference speed. In addition, AutoGluon-Text now supports solving multilingual problems and a new `AutoMMPredictor` has been implemented for automatically building multimodal DL models.
presets="high_quality"
, the win-rate increased to 77.8% thanks to the DeBERTa-v3 backbone.presets="multilingual"
. You can now train a model on the English dataset and directly apply the model on datasets in other languages such as German, Japanese, Italian, etc..fit()
again after a previous trained model has been loaded.Thanks to @zhiqiangdon and @sxjscience for contributing the AutoGluon-Text refactors! (#1537, #1547, #1557, #1565, #1571, #1574, #1578, #1579, #1581, #1585, #1586)
AutoGluon-Tabular has been majorly enhanced by numerous optimizations in 0.4.
Specific updates:
- Added `infer_limit` and `infer_limit_batch_size` as new fit-time constraints (Tutorial). This allows users to specify the desired end-to-end inference latency of the final model.
- Added `TabularPredictor.fit_pseudolabel(...)`! @DolanTheMFWizard (#1323, #1382)
- Added feature pruning via `TabularPredictor.fit(..., feature_prune_kwargs={})`! @truebluejason (#1274, #1305)
- Added the `calibrate` fit argument. @DolanTheMFWizard (#1336, #1374, #1502)
- Refactored `refit_full` logic to majorly simplify user model contributions and improve multimodal support with advanced presets. @Innixma (#1567)

As part of the migration from MXNet to Torch, we have created a Torch-based counterpart
to the prior MXNet tabular neural network model. This model has several major advantages. It has replaced the MXNet tabular neural network model in the default hyperparameters configuration, and is enabled by default.
Thanks to @jwmueller and @Innixma for contributing TabularNeuralNetTorchModel to AutoGluon! (#1489)
VowpalWabbit has been added as a new model in AutoGluon. VowpalWabbit is not installed by default, and must be installed separately.
VowpalWabbit is used in the `hyperparameters='multimodal'` preset, and the model is a great option to use for datasets containing text features (see the sketch below). To install VowpalWabbit, specify it via `pip install autogluon.tabular[all,vowpalwabbit]` or `pip install "vowpalwabbit>=8.10,<8.11"`.
Thanks to @killerSwitch for contributing VowpalWabbitModel to AutoGluon! (#1422)
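A sketch of opting into the preset that includes VowpalWabbit (the dataset path and label column are placeholders):

```python
from autogluon.tabular import TabularDataset, TabularPredictor

train_data = TabularDataset("train.csv")  # placeholder dataset with text columns

predictor = TabularPredictor(label="class").fit(
    train_data,
    hyperparameters="multimodal",  # preset that includes the VowpalWabbit model
)
```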
Linear models have been accelerated by 20x in training and 20x in inference thanks to a variety of optimizations.
To get the accelerated training speeds, please install scikit-learn-intelex via `pip install "scikit-learn-intelex>=2021.5,<2021.6"`. Note that currently LinearModel is not enabled by default in AutoGluon, and must be specified in `hyperparameters` via the key `'LR'` (see the sketch below).
Further testing is planned to incorporate LinearModel as a default model in future releases.
Thanks to the `scikit-learn-intelex` team and @Innixma for the LinearModel optimizations! (#1378)
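Since LinearModel is opt-in, a sketch of enabling it explicitly (placeholders as before):

```python
from autogluon.tabular import TabularDataset, TabularPredictor

train_data = TabularDataset("train.csv")  # placeholder dataset

# LinearModel is not part of the default hyperparameters; opt in via 'LR'.
predictor = TabularPredictor(label="class").fit(
    train_data,
    hyperparameters={"LR": {}},
)
```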
- Added `autogluon.common` to simplify dependency management for future submodules. @Innixma (#1386)
- Removed the `autogluon.mxnet` and `autogluon.extra` submodules as part of code cleanup. @Innixma (#1397, #1411, #1414)

Published by Innixma about 3 years ago
v0.3.1 is a hotfix release which fixes several major bugs as well as including several model quality improvements.
This release is non-breaking when upgrading from v0.3.0. As always, only load previously trained models using the same version of AutoGluon that they were originally trained on. Loading models trained in different versions of AutoGluon is not supported.
This release contains 9 commits from 4 contributors.
See the full commit change-log here: https://github.com/awslabs/autogluon/compare/v0.3.0...v0.3.1
Thanks to the 4 contributors that contributed to the v0.3.1 release!
Special thanks to @yinweisu who is a first time contributor to AutoGluon and fixed a major bug in ImagePredictor HPO!
Full Contributor List (ordered by # of commits):
@Innixma, @gradientsky, @yinweisu, @sackoh
- Model quality improvements for the `best_quality` preset.
- Use `-1` as the `n_jobs` value instead of using `os.cpu_count()`. @sackoh (#1289)

Published by Innixma about 3 years ago
v0.3.0 introduces multi-modal image, text, tabular support to AutoGluon. In just a few lines of code, you can train a multi-layer stack ensemble using text, image, and tabular data! To our knowledge this is the first publicly available implementation of a model that handles all 3 modalities at once. Check it out in our brand new multimodal tutorial! v0.3.0 also features a major model quality improvement for Tabular, with a 57.6% win-rate vs v0.2.0 on the AutoMLBenchmark, along with an up to 10x online inference speedup due to low-level numpy and pandas optimizations throughout the codebase! This inference optimization enables AutoGluon to have sub-30-millisecond end-to-end latency for real-time deployment scenarios when paired with model distillation. Finally, AutoGluon can now train PyTorch image models via integration with TIMM. Specify any TIMM model to `ImagePredictor` or `TabularPredictor` to train them with AutoGluon!
This release is non-breaking when upgrading from v0.2.0. As always, only load previously trained models using the same version of AutoGluon that they were originally trained on. Loading models trained in different versions of AutoGluon is not supported.
This release contains 70 commits from 10 contributors.
See the full commit change-log here: https://github.com/awslabs/autogluon/compare/v0.2.0...v0.3.0
Thanks to the 10 contributors that contributed to the v0.3.0 release!
Special thanks to the 3 first-time contributors! @rxjx, @sallypannn, @sarahyurick
Special thanks to @talhaanwarch who opened 21 GitHub issues (!) and participated in numerous discussions during v0.3.0 development. His feedback was incredibly valuable when diagnosing issues and improving the user experience throughout AutoGluon!
Full Contributor List (ordered by # of commits):
@Innixma, @zhreshold, @jwmueller, @gradientsky, @sxjscience, @ValerioPerrone, @taesup-aws, @sallypannn, @rxjx, @sarahyurick
- XGBoost now uses `tree_method='hist'` for improved performance. @Innixma (#1239)
- Added the `groups` parameter. Now users can specify the exact split indices in a `groups` column when performing model bagging. This solution leverages sklearn's LeaveOneGroupOut cross-validator (see the sketch below). @Innixma (#1224)
- Added the `use_bag_holdout` argument. @Innixma (#1105)
- Added `predictor.features()` to get the original feature names used during training. @Innixma (#1257)
- Added `problem_type` support to ImagePredictor. @sallypannn (#1165)
- `predict_proba`. @Innixma (#1206)
- Added `eval_metric='average_precision'`. @rxjx (#1092)
- Added `__version__`. @Innixma (#1122)
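A sketch of the `groups` bagging workflow from the list above (the dataset path and column names are placeholders):

```python
import pandas as pd
from autogluon.tabular import TabularPredictor

# "fold" is an assumed column assigning each row to a bagging split.
train_df = pd.read_csv("train.csv")  # placeholder dataset

predictor = TabularPredictor(label="class", groups="fold").fit(train_df)
```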