ludwig

Low-code framework for building custom LLMs, neural networks, and other AI models

Apache-2.0 License

Downloads: 4.6K
Stars: 10.9K
Committers: 158
ludwig - v0.7.1

Published by tgaddair over 1 year ago

What's Changed

Full Changelog: https://github.com/ludwig-ai/ludwig/compare/v0.7...v0.7.1

ludwig - v0.7

Published by tgaddair over 1 year ago

Key Highlights

  • Pretrained Vision Models: we’ve added 20 additional TorchVision pretrained models as image encoders, including AlexNet, EfficientNet, MobileNet v3, and GoogLeNet.
  • Image Augmentation: Ludwig v0.7 also introduces image augmentation, artificially increasing the size of the training dataset by applying a randomized set of transformations to each batch of images during training.
  • 50x Faster Fine-Tuning via Automatic Mixed Precision (AMP) Training, Cached Encoder Embeddings, Approximate Training Set Evaluation, and automatic batch sizing by default to maximize throughput.
  • New Distributed Training Strategies: Distributed Data Parallel (DDP) and Fully Sharded Data Parallel (FSDP)
  • Ray 2.0, 2.1, 2.2 and 2.3 support
  • A new Ludwig profiler for benchmarking various CPU/GPU performance metrics, as well as comparing different Ludwig model runs.
  • Revamped Ludwig datasets API with an even larger number of datasets out of the box.
  • API annotations within Ludwig for contributors and Python users
  • Schemification of the entire Ludwig Config object for better validation and checks upfront.

What's Changed

New Contributors

Full Changelog: https://github.com/ludwig-ai/ludwig/compare/v0.6.4...v0.7

ludwig - v0.7.beta

Published by justinxzhao over 1 year ago

What's Changed

New Contributors

Full Changelog: https://github.com/ludwig-ai/ludwig/compare/v0.5.3...v0.7.beta

ludwig - v0.6.4

Published by arnavgarg1 almost 2 years ago

ludwig - v0.6.3

Published by justinxzhao about 2 years ago

What's Changed

Full Changelog: https://github.com/ludwig-ai/ludwig/compare/v0.6.2...v0.6.3

ludwig - v0.6.2

Published by justinxzhao about 2 years ago

What's Changed

Full Changelog: https://github.com/ludwig-ai/ludwig/compare/v0.6.1...v0.6.2

ludwig - v0.6.1

Published by justinxzhao about 2 years ago

What's Changed

Full Changelog: https://github.com/ludwig-ai/ludwig/compare/v0.6...v0.6.1

Overview

Ludwig 0.6 introduces several exciting features focused on modeling, deployment, and testing that make it more flexible, reliable, and easy to use in production.

  • Gradient boosted models: Historically, Ludwig has been built around a single, flexible neural network architecture called ECD (for Encoder-Combiner-Decoder). With the release of 0.6 we are adding support for a different model architecture: gradient-boosted tree models (GBMs).
  • Richer configuration schema and validation: We formalized the schema of Ludwig configurations and now validate it before initialization, which can help you avoid mistakes like typos and syntax errors.
  • Probability calibration for binary and multi-class classification: With deep neural networks, the probabilities given by models often don't match the true likelihood of the data. Ludwig now supports temperature scaling calibration (On Calibration of Modern Neural Networks), which brings class probabilities closer to their true likelihoods in the validation set.
  • Pipelined TorchScript: We improved the TorchScript model export functionality, making it easier than ever to train and deploy models for high performance inference.
  • Model parameter update unit tests: The code that updates the parameters of deep neural networks can be complex enough that it is hard for developers to verify that model parameters are actually updated during training. To address this difficulty and improve the robustness of our models, we implemented a reusable utility to ensure parameters are updated during one cycle of a forward-pass / backward-pass / optimizer step.

Additional improvements include a new global configuration section, time-based dataset splitting and more flexible hyperparameter optimization configurations. Read more about each specific feature below.

If you are learning about Ludwig for the first time, or if these new features are relevant and exciting to your research or application, we'd love to hear from you. Join our Ludwig Slack Community here.

Gradient Boosted Models (@jppgks)

Historically, Ludwig has been built around a single, flexible neural network architecture called ECD (for Encoder-Combiner-Decoder). With the release of 0.6, we are adding support for a different model architecture: gradient-boosted tree models (GBMs).

This is motivated by the fact that tree models still outperform neural networks on some tabular datasets, and that tree models are generally less compute-intensive, which makes them a better choice for some applications. In Ludwig, users can now experiment with both neural and tree-based architectures within the same framework, taking advantage of all of the additional functionality and conveniences that Ludwig offers, such as preprocessing, hyperparameter optimization, integration with different backends (local, ray, horovod), and interoperability with different data sources (pandas, dask, modin).

How to use it

Install the tree extra package with pip install ludwig[tree]. After the installation, you can use the new gbm model type in the configuration. Ludwig will default to using the ECD architecture, which can be overridden as follows to use GBM:

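For illustration, a minimal config sketch might look like the following; the feature names and types are hypothetical, and the model_type line is what switches the architecture from ECD to GBM:

model_type: gbm
input_features:
    - name: age             # hypothetical numeric input feature
      type: number
    - name: occupation      # hypothetical categorical input feature
      type: category
output_features:
    - name: income          # hypothetical binary target
      type: binary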

In some initial benchmarking we found that GBMs are particularly performant on smaller tabular datasets and can sometimes deal better with class imbalance compared to neural networks. Stay tuned for a more in-depth blogpost on the topic. Like the ECD neural networks, GBMs can be sensitive to hyperparameter values, and hyperparameter tuning is important to get a well-performing model.

Under the hood, Ludwig uses LightGBM for training gradient-boosted tree models, and the LightGBM trainer parameters can be configured in the trainer section of the configuration. For serving, the LightGBM model is converted to a PyTorch graph using Hummingbird for efficient evaluation and inference.
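
As a rough sketch, a few LightGBM-style trainer parameters might be set like this (the parameter names mirror LightGBM's conventions and the values are illustrative assumptions, not recommendations):

model_type: gbm
trainer:
    num_boost_round: 100    # number of boosting iterations (LightGBM-style parameter)
    learning_rate: 0.1
    num_leaves: 31          # maximum number of leaves per tree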

Limitations

Ludwig's initial support for GBM is limited to tabular data (binary, categorical and numeric features) with a single output feature target.

Calibrating probabilities for category and binary output features (@dantreiman)

Suppose your model outputs a class probability of 90%. Is there a 90% chance that the model prediction is correct? Do the probabilities given by your model match the true likelihood of the data? With deep neural networks, they often don't.

Drawing on the methods described in On Calibration of Modern Neural Networks (Chuan Guo, Geoff Pleiss, Yu Sun, Kilian Q. Weinberger), Ludwig now supports temperature scaling for binary and category output features. Temperature scaling brings a model's output probabilities closer to the true likelihood while preserving the same accuracy and top k predictions.

How to use Calibration

To enable calibration, add calibration: true to any binary or category output feature configuration:

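For example (the output feature name here is hypothetical):

output_features:
    - name: label           # hypothetical category output feature
      type: category
      calibration: true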

With calibration enabled, Ludwig will find a scale factor (temperature) which will bring the class probabilities closer to their true likelihoods in the validation set. The calibration scale factor is determined in a short phase after training is complete. If no validation split is provided, the training set is used instead.

To visualize the effects of calibration in Ludwig, you can use Calibration Plots, which bin the data based on model probability and plot the predicted probability (x-axis) versus the observed frequency (y-axis) for each bin (see code examples).

(Calibration plot on the forest cover dataset: predicted probability versus observed probability for the uncalibrated and temperature-scaled models.)

In a perfectly calibrated model, the observed probability equals the predicted probability, and all predictions will land on the dotted line y=x. In this example using the forest cover dataset, the uncalibrated model in blue gives over-confident predictions near the left and right edges close to probability values of 0 or 1. Temperature scaling learns a scale factor of 0.51 which improves the calibration curve in orange, moving it closer to y=x.

Limitations

Calibration is currently limited to models with binary and category output features.

Richer configuration schema and validation (@connor-mccorm @ksbrar @justinxzhao )

Ludwig configurations are flexible by design, as they internally map to Python function signatures. This allows for expressive configurations with many parameters for users to play with, but we have found that users could too easily introduce typos into their configs, such as incorrect value types or other syntactic inconsistencies, that were not easy to catch.

We have now formalized the Ludwig config with a strongly typed schema, serving as a centralized source of truth for parameter documentation and config validation. Ludwig validation now explicitly restricts each parameter's values to valid ones, decreasing the chance of syntactical and logical errors and signaling immediately to the user where the issues lie, before processing data or starting training. Schemas also provide many future benefits including autocompletion.

Nested encoder and decoder parameters (@connor-mccorm )

We have also restructured the way that encoders and decoders are configured to now use a nested structure, consistent with other modules in Ludwig such as combiners and loss.

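A sketch of the new nested layout, with hypothetical feature names and illustrative encoder/decoder parameters:

input_features:
    - name: review          # hypothetical text input feature
      type: text
      encoder:
          type: parallel_cnn
          dropout: 0.1
output_features:
    - name: sentiment       # hypothetical category output feature
      type: category
      decoder:
          type: classifier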

As these changes impact what constitutes a valid Ludwig config, we also introduced a mechanism for ensuring backward compatibility that invisibly and automatically upgrades older configs to the current config structure.

We hope that, with the new Ludwig schema and the improved encoder/decoder nesting structure, you find using Ludwig to be a much more robust and user-friendly experience!

New Defaults Ludwig Section (@arnavgarg1 )

In Ludwig 0.5, users could specify global preprocessing parameters on a per-feature-type basis through the preprocessing section in Ludwig configs. This is useful if users know they always want to apply certain transformations to their data for every feature of the same type. However, there was no equivalent mechanism for global encoder, decoder or loss related parameters.

For example, say we have a mammography dataset to predict breast cancer that contains many categorical features. In Ludwig 0.5, we might define our input features with encoder parameters in the following way:

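Based on the feature names and encoder parameters discussed in this section, such a Ludwig 0.5 style config might have looked roughly like this (the dropout and embedding_size values are illustrative):

input_features:
    - name: tumor_size
      type: category
      encoder: dense
      dropout: 0.1
      embedding_size: 32
    - name: inv_nodes
      type: category
      encoder: dense
      dropout: 0.1
      embedding_size: 32
    - name: breast_quadrant
      type: category
      encoder: dense
      dropout: 0.1
      embedding_size: 32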

Here, the problem is that we have to redefine the same encoder parameters (type, dropout, and embedding_size) for each of the input features if we want to override the default value across all categorical features.

In Ludwig 0.6, we are introducing a new defaults section within the Ludwig config to define feature-type defaults for preprocessing, encoders, decoders, and loss. Default preprocessing and encoder configurations will be applied to all input_features of that feature type, while decoder and loss configurations will be applied to all output_features of that feature type.

Note that you can still specify feature specific parameters as usual, and these will override any default parameter values that come from the global defaults section.

The same mammography config above could be defined in the following, much more concise way in Ludwig 0.6:

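A sketch of the equivalent config using the new defaults section (values are illustrative, and the exact name of the embedding initializer parameter should be checked against the category encoder schema):

defaults:
    category:
        encoder:
            type: dense
            dropout: 0.1
            embedding_size: 32
            embedding_initializer: he_normal
input_features:
    - name: tumor_size
      type: category
    - name: inv_nodes
      type: category
    - name: breast_quadrant
      type: category
      encoder:
          embedding_initializer: glorot_normal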

Here, the encoder defaults for type, dropout and embedding_size are applied to all three categorical features. The he_normal embedding initializer is only applied to tumor_size and inv_nodes since we didn't specify this parameter in their feature definitions, but breast_quadrant will use the glorot_normal initializer since it will override the value from the defaults section.

Additionally, in Ludwig 0.6, we have moved all global feature-type preprocessing out of the preprocessing section and into this new defaults section.

The defaults section enables the same fine-grained control with the benefit of making your config easier to define and read.

Global Defaults In Hyperopt (@arnavgarg1 )

The defaults section has also been added to hyperopt, so that users can define feature-type level parameters for individual trials. This makes the definition of the hyperopt search space more convenient, without the need to define individual parameters for each of the features in instances where the dataset has a large number of input or output features.

For example, if you want to hyperopt over different encoders for all text features for each of the trials, one can do so by defining a parameter this way:

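A sketch of such a hyperopt parameter, assuming a dot-path into the defaults section and three illustrative text encoders:

hyperopt:
    parameters:
        defaults.text.encoder.type:
            space: choice
            categories: [parallel_cnn, stacked_cnn, rnn]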

This will sample one of the three encoders for text features and apply it to all the text features for that particular trial.

Nested Configs In Hyperopt (@tgaddair )

We have extended the range of hyperopt parameters to support parameter choices that consist of partial or complete blocks of nested Ludwig config sections. This allows users to search over a set of Ludwig configs, as opposed to needing to specify config params individually and search over all combinations.

To provide a parameter that represents a full top-level Ludwig config, the . key name can be used.

For example, we can define a hyperopt search space where we sample partial Ludwig configs, which would create hyperopt samples that look like the following:

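A sketch of such a search space, using the "." key to sample between two illustrative partial configs (the combiner types and learning rates shown are assumptions, not recommendations):

hyperopt:
    parameters:
        ".":
            space: choice
            categories:
                - combiner:
                      type: concat
                  trainer:
                      learning_rate: 0.001
                - combiner:
                      type: tabnet
                  trainer:
                      learning_rate: 0.0001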

Pipelined TorchScript (@geoffreyangus @brightsparc )

In Ludwig v0.6, we improved the TorchScript model export functionality, making it easier than ever to train and deploy models for high performance inference.

At the core of our implementation is a pipeline-based approach to exporting models. After training a Ludwig model, users can run the export_torchscript command in the CLI, or call LudwigModel.save_torchscript. If model training was performed on a GPU device, doing so produces three new TorchScript artifacts:

(The export produces three TorchScript modules: a preprocessing module, a prediction module, and a postprocessing module.)

These artifacts represent a single LudwigModel as three modules, each separated by stage: preprocessing, prediction, and postprocessing. These artifacts can be pipelined together using the InferenceModule class method InferenceModule.from_directory, or with some tools such as NVIDIA Triton.

One of the most significant benefits is that TorchScripted models are backend and environment independent and different parts can run on different hardware to maximize throughput. They can be loaded up in either a C++ or Python backend, and in either, minimal dependencies are required to run model inference. Such characteristics ensure that the model itself is both highly portable and backward compatible.

Time-based Dataset Splitting (@tgaddair )

In Ludwig v0.6, we have added the ability to split based on a date column such that the data is ordered by date (ascending) and then split into train-validation-test along the time dimension. To make this possible, we have reworked the way splitting is handled in the Ludwig configuration to support a dedicated split section:

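A sketch of the new split section configured for datetime splitting (the column name is hypothetical):

preprocessing:
    split:
        type: datetime
        column: created_at              # hypothetical date column to order and split by
        probabilities: [0.7, 0.1, 0.2]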

In this example, by setting probabilities: [0.7, 0.1, 0.2], the earliest 70% of the data will be used for training, the middle 10% used for validation, and the last 20% used for testing.

This feature is important to support backtesting strategies where the user needs to know if a model trained on historical data would have performed well on unseen future data. If we were to use a uniformly random split strategy in these cases, then the model performance may not reflect the model's ability to generalize well if the data distribution is subject to change over time. For example, imagine a model that is predicting housing prices. If we both train and test on data from around the same time, we may fool ourselves into believing our model has learned something fundamental about housing valuations when in reality it might just be basing its predictions on recent trends in the market (trends that will likely change once the model is put into production). Splitting the training from the test data along the time dimension is one way to avoid this false sense of confidence, by showing how well the model should do on unseen data from the future.

Prior to Ludwig v0.6, the preprocessing configuration supported splitting based on a split column, split probabilities (train-val-test), or stratified splitting based on a category, all of which were flattened into the top-level of the preprocessing section:

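For reference, the old flattened layout looked roughly like this (a sketch of the pre-0.6 parameters; exact names may differ):

preprocessing:
    force_split: false
    split_probabilities: [0.7, 0.1, 0.2]
    stratify: null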

This approach was limiting in that every new split type required reconciling all of the above params and determining how they should interact with the new type. To resolve this complexity, all of the existing split types have been similarly reworked to follow the new structure supported for datetime splitting.

Examples

Splitting by row at random (default):

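A sketch, with illustrative probabilities:

preprocessing:
    split:
        type: random
        probabilities: [0.7, 0.1, 0.2]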

Splitting based on a fixed column:

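A sketch, assuming a dataset column named split that assigns each row to a subset:

preprocessing:
    split:
        type: fixed
        column: split                   # hypothetical column holding the train/validation/test assignment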

Stratified splits using a chosen stratification category column:

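A sketch, with a hypothetical stratification column:

preprocessing:
    split:
        type: stratify
        column: label                   # hypothetical category column whose distribution is preserved across splits
        probabilities: [0.7, 0.1, 0.2]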

Be on the lookout as we continue to add additional split strategies in the future to support advanced usage such as bucketed backtesting. If you are interested in these kinds of scenarios, please reach out!

Parameter Update Unit Tests (@jimthompson5802 )

A significant step was taken in this release to improve the code quality of Ludwig components, e.g., encoders, combiners, and decoders. Deep neural networks have many layers composed of a large number of parameters that must be updated to converge to a solution. Depending on the particular algorithm, the code for updating parameters during training can be quite complex. As a result, it is nearly impossible for a developer to confirm through inspection alone that model parameters are actually updated.

To address this difficulty, we implemented a reusable utility that performs a quick sanity check to ensure parameters, such as tensor weights and biases, are updated during one cycle of a forward-pass / backward-pass / optimizer step. This work was inspired by these earlier blog posts: How to unit test machine learning code and Testing Your PyTorch Models with Torcheck.

This utility was added to unit tests for existing Ludwig components. With this addition, unit tests for Ludwig now ensure the following:

  • No run-time exceptions are raised
  • Generated outputs have the correct data type and shape
  • (New capability) Model parameters are updated as expected


A typical unit test of this kind first sets the random number seed to ensure repeatability. Next, the test instantiates the Ludwig component and processes synthetic data to ensure the component does not raise an error and that the output has the expected shape. Finally, the unit test checks that the parameters are updated under different combinations of configuration settings.

In addition to the new parameter update check utility, Ludwig's Developer Guide contains instructions for using the utility. This allows an advanced user or a contributor, who is developing custom encoders, combiners, or decoders, to ensure the quality of their custom component.

Stay in the loop

Ludwig's thriving open source community gathers on Slack; join it to get involved!

If you are interested in adopting Ludwig in the enterprise, check out Predibase, the declarative ML platform that connects with your data, manages the training, iteration, and deployment of your models, and makes them available for querying, reducing time to value of machine learning projects.

Full Changelog

New Contributors

Congratulations to our new contributors!

Full Changelog: https://github.com/ludwig-ai/ludwig/compare/v0.5.3...v0.6

ludwig - v0.6rc1

Published by justinxzhao about 2 years ago

What's Changed

Full Changelog: https://github.com/ludwig-ai/ludwig/compare/v0.6.beta...v0.6rc1

ludwig - v0.6.beta

Published by justinxzhao about 2 years ago

What's Changed

New Contributors

Full Changelog: https://github.com/ludwig-ai/ludwig/compare/v0.5.3...v0.6.beta

ludwig - v0.5.5

Published by arnavgarg1 about 2 years ago

What's Changed

  • Bump Ludwig From v0.5.4 -> v0.5.5 by @arnavgarg1 in https://github.com/ludwig-ai/ludwig/pull/2340
    • Bug fix: Use safe rename which works across filesystems when writing checkpoints
    • Fixed default eval_batch_size when setting batch_size=auto
    • Update R2 score to handle single sample computation

Full Changelog: https://github.com/ludwig-ai/ludwig/compare/v0.5.4...v0.5.5

ludwig - v0.5.4

Published by justinxzhao over 2 years ago

What's Changed

Full Changelog: https://github.com/ludwig-ai/ludwig/compare/v0.5.3...v0.5.4

ludwig - v0.5.3

Published by justinxzhao over 2 years ago

What's Changed

New Contributors

Full Changelog: https://github.com/ludwig-ai/ludwig/compare/v0.5.2...v0.5.3

ludwig - v0.5.2

Published by justinxzhao over 2 years ago

What's Changed

New Contributors

Full Changelog: https://github.com/ludwig-ai/ludwig/compare/v0.5.1...v0.5.2

ludwig - v0.5.1

Published by justinxzhao over 2 years ago

What's Changed

Full Changelog: https://github.com/ludwig-ai/ludwig/compare/v0.5...v0.5.1

ludwig - v0.5: Declarative Machine Learning, now on PyTorch

Published by justinxzhao over 2 years ago

Ludwig v0.5 is a complete renovation of Ludwig from the ground up with a focus on parity, scalability, deployment, reliability, and documentation. Ludwig v0.5 migrates our entire backend from TensorFlow to PyTorch and introduces several new features and technical improvements, including:

  • Step-based training and evaluation to enable frequent sub-epoch monitoring of model health and evaluation metrics. This is particularly useful for large datasets that may be trained using large models.
  • Data balancing: upsampling and downsampling during preprocessing to produce better-proportioned datasets.
  • End-to-end torchscript to support low-level optimized model deployment, including preprocessing and post-processing, to go directly from example to predictions.
  • Ludwig on Ray with RayDatasets enabling significant training speed boosts for reading large datasets while training Ludwig models on a Ray cluster.
  • The addition of MLPMixer and ViTEncoder as image encoders for state-of-the-art deep learning on image data.
  • AutoML for tabular and text classification, integrated with distributed hyperparameter search using RayTune.
  • Scalability optimizations with Dask, Modin, and Ray, enabling Ludwig to preprocess, train, and evaluate over datasets hundreds of gigabytes in size in tens of minutes.
  • Config validation using marshmallow schemas revealing configuration typos or bad values early and increasing reliability.
  • More tests. We've quadrupled the number of unit tests and end-to-end integration tests and we've expanded our CI testing to run in distributed and GPU settings. This strengthens Ludwig's stability and helps build confidence in new changes going forward.

Our team is thoroughly invested in improving the declarative ML experience, and, as part of the v0.5 release, we've revamped the getting started guide, user guide, and developer documentation. We've also published a handful of end-to-end tutorials with thoroughly documented notebooks on text, tabular, image, and multimodal classification that provide a deep walkthrough of Ludwig's functionality.

Migrating to PyTorch

Ludwig's migration to PyTorch was a substantial 6-month undertaking involving 230+ commits, changes to 70k+ lines of code, and contributions from 40+ people.

PyTorch's pythonic design and emphasis on developer experience are well-aligned with Ludwig's principles of simplicity, modularity, and extensibility. Switching to use PyTorch as Ludwig’s backend of choice was strongly motivated by the increase in productivity in development, debugging, and iteration that the more pythonic PyTorch API affords us as well as the great ecosystem the PyTorch community has built around it. With Ludwig on PyTorch, we're thrilled to see what developers, researchers, and data scientists in the PyTorch and broader deep learning community can bring to Ludwig.

Feature and Performance Parity

Over the last several months, we've moved all Ludwig encoders, combiners, decoders, and metrics for every data modality that Ludwig supports, as well as all of the backend infrastructure on Horovod and Ray, to PyTorch.

At the same time, we wanted to make sure that the experience of Ludwig users continues to be performant and delightful. We've run extensive comparisons between Ludwig v0.5 (PyTorch-based) and Ludwig v0.4 on text, image, and tabular datasets, evaluating training speed, inference throughput, and model performance, to verify that there's been no degradation.

Our results reveal roughly the same high GPU utilization (~90%) on several datasets, with significant improvements in distributed training speed and memory usage, without impacting model accuracy or time to convergence. We'll be publishing a blog with more details on benchmarking soon.

New Features

In addition to the PyTorch migration, Ludwig v0.5 is packed with new functionality, features, and additional changes that make v0.5 the most feature-rich and robust release of Ludwig yet.

Step-based training and evaluation

Ludwig's train loop is epoch-based by default, with one round of evaluation per epoch (one pass through the dataset).

for epoch in num_epochs:
    for batch in training_data.batches:
        train(batch)
    save_model(model_dir)
    evaluation(training_data)
    evaluation(validation_data)
    evaluation(test_data)
    print_results()

This is an appropriate fit for tabular datasets, which are small, fit in memory, and train quickly. However, this can be awkward for unstructured datasets, which tend to be much larger, and train more slowly due to larger models. Now, with step-based training and evaluation, users can configure a more frequent sub-epoch evaluation cadence to more regularly monitor metrics and model health.

Use steps_per_checkpoint to run evaluation every N training steps, or checkpoints_per_epoch to run evaluation N times per epoch.

trainer:
    steps_per_checkpoint: 1000

trainer:
    checkpoints_per_epoch: 2

Note that it is invalid to specify both checkpoints_per_epoch and steps_per_checkpoint simultaneously.

To further speed up evaluation, users can skip evaluation on the training set by setting evaluate_training_set to False.

trainer:
    evaluate_training_set: false

Data balancing

Users working with imbalanced datasets can specify an oversampling or undersampling parameter which will balance the data during preprocessing.

In this example, Ludwig will oversample the minority class to achieve a 50% representation in the overall dataset.

preprocessing:
    oversample_minority: 0.5

In this example, Ludwig will undersample the majority class to achieve a 70% representation in the overall dataset.

preprocessing:
    undersample_majority: 0.7

Data balancing is only supported for binary output features. Specifying both parameters at the same time is also not supported.

When developing models, it can be useful to iterate quickly with a smaller portion of the dataset. Ludwig supports this with a new preprocessing parameter, sample_ratio, which subsamples the dataset.

preprocessing:
    sample_ratio: 0.7

End-to-end torchscript

Users can export trained Ludwig models to TorchScript with ludwig export_torchscript.

ludwig export_torchscript --model=/path/to/model

Models that use number, category, and binary features now support torchscript-compatible preprocessing, enabling end-to-end torchscript compilation.

import torch

# Raw, unpreprocessed inputs keyed by feature name
inputs = {
    'cat_feature': ['foo', 'bar'],
    'num_feature': torch.tensor([42, 7]),
    'bin_feature1': torch.tensor([True, False]),
    'bin_feature2': ['No', 'Yes'],
}

scripted_model = model.to_torchscript()
output = scripted_model(inputs)

End-to-end torchscript compilation is also supported for text features that use torchscript-enabled torchtext tokenizers. We are actively working on adding support for other data types.

AutoML for Text Classification

In v0.4, we introduced experimental AutoML functionalities into Ludwig.

Ludwig AutoML automatically creates deep learning models given a dataset, its label column, and a time budget. Ludwig AutoML infers the input and output feature types, chooses the model architecture, and specifies the parameters and ranges across which to perform hyperparameter search.

import ludwig.automl

auto_train_results = ludwig.automl.auto_train(
    dataset=my_dataset_df,
    target=target_column_name,
    time_limit_s=7200,
    tune_for_memory=False,
)

Our initial AutoML work focused on tabular datasets, since good performance on such datasets is a current area of interest in the DL community. In v0.5, we expand on this work to develop and validate Ludwig AutoML for text classification.

Config validation against Marshmallow Schemas

The combiner and trainer sections of Ludwig configurations are now validated against official Marshmallow schemas. This centralizes documentation, flags configuration typos or bad values, and helps catch regressions.

Better Test Coverage

We've quadrupled the number of unit and integration tests and we've established new testing guidelines for well-tested contributions going forward. This strengthens Ludwig's stability and iterability, and helps build confidence in new changes.

Backward Compatibility

Despite all of the code changes, we've worked hard to ensure that Ludwig’s simple interface remains consistent and compatible with earlier releases as much as possible. A few minor parameter naming changes in the Ludwig configuration to be aware of:

  • training -> trainer
  • numeric -> number
  • fc_size -> output_size
  • tied_weights -> tied
  • deleted {weight/bias/activation}_regularizer -> a global regularization_lambda and regularization_type are now used to control regularization across the entire model
  • deleted dropout: True/False -> dropout is now a float in [0, 1]

Finally, we've dropped support for Python 3.6. Please use Python 3.7 going forward.

New Contributors

ludwig - v0.5rc2

Published by justinxzhao over 2 years ago

Fixes loss reporting consistency issues, and shape-based metric calculation errors with SET output features.

ludwig - v0.5rc1

Published by ShreyaR over 2 years ago

Migration to PyTorch.

ludwig - v0.4.1

Summary

This release features experimental AutoML with auto config generation and auto-training integrated with hyperopt on RayTune, as well as integrations with Ray training and Ray datasets. We're still working on a comprehensive overhaul of the documentation, and all the new functionality will also be available in the upcoming v0.5.

Aside from critical bug fixes and new datasets, v0.4.1 will be the last release of Ludwig using TensorFlow. Starting with v0.5+ (release coming soon), Ludwig will use PyTorch as the backend for tensor computation. We will release a blogpost detailing the rationale and impact of this decision, but we wanted to do one last TensorFlow release so that everyone committed to a TensorFlow ecosystem who has used Ludwig so far can enjoy the benefits of the many bug fixes and improvements we made to the codebase that were not specific to PyTorch.

The next version v0.5 will also have several additional improvements that we’ll be excited to share in the coming weeks.

Additions

Improvements

Bug fixes

Other changes and things to note

New Contributors

Full Changelog: https://github.com/ludwig-ai/ludwig/compare/v0.4...v0.4.1

ludwig - v0.4

Changelog

Additions

  • Integrate ray tune into hyperopt (#1001)
  • Added Ames Housing Kaggle dataset (#1098)
  • Added functionality to obtain subtrees in the SST dataset (#1108)
  • Added comparator combiner (#1113)
  • Additional Text Classification Datasets (#1121)
  • Added Ray remote backend and Dask distributed preprocessing (#1090)
  • Added TabNet combiner and needed modules (#1062)
  • Added Higgs Boson dataset (#1157)
  • Added GitHub workflow to push to Docker Hub (#1160)
  • Added more tagging schemes for Docker images (#1161)
  • Added Docker build matrix (#1162)
  • Added category feature > 1 dim to TabNet (#1150)
  • Added timeseries datasets (#1149)
  • Add TabNet Datasets (#1153)
  • Forest Cover Type, Adult Census Income and Rossmann Store Sales datasets (#1165)
  • Added KDD Cup 2009 datasets (#1167)
  • Added Ray GPU image (#1170)
  • Added support for cloud object storage (S3, GCS, ADLS, etc.) (#1164)
  • Perform inference with Dask when using the Ray backend (#1128)
  • Added schema validation to config files (#1186)
  • Added MLflow experiment tracking support (#1191)
  • Added export to MLflow pyfunc model format (#1192)
  • Added MLP-Mixer image encoder (#1178)
  • Added TransformerCombiner (#1177)
  • Added TFRecord support as a preprocessing cache format (#1194)
  • Added higgs boson tabnet examples (#1209)

Improvements

  • Abstracted Horovod params into the Backend API (#1080)
  • Added allowed_origins to serving to allow cross-origin requests (#1091)
  • Added callbacks to hook into the training loop programmatically (#1094)
  • Added scheduler support to Ray Tune hyperopt and fixed GPU usage (#1088)
  • Ray Tune: enforced that epochs equals max_t and early stopping is disabled (#1109)
  • Added register_trainable logic to RayTuneExecutor (#1117)
  • Replaced Travis CI with GitHub Actions (#1120)
  • Split distributed tests into separate test suite (#1126)
  • Removed unused regularizer parameter from training defaults
  • Restrict Docker build GitHub Action to only ludwig-ai repos (#1166)
  • Harmonize return object for categorical, sequence generator and sequence tagger (#1171)
  • Sourcing images from either file path or in-memory ndarrays (#1174)
  • Refactored hyperopt results into object structure for easier programmatic usage (#1184)
  • Refactored all contrib classes to use the Callback interface (#1187)
  • Improved performance of Dask preprocessing by adding parallelism (#1193)
  • Improved TabNetCombiner and Concat combiner (#1177)
  • Added additional backend configuration options (#1195)
  • Made should_shuffle configurable in Trainer (#1198)

Bugfixes

  • Fix SST parentheses issue
  • Fix serve.py adding a try around the form parsing (#1111)
  • Fix #1104: add lengths to text encoder output with updated unit test (#1105)
  • Fix sst2 substree logic to match glue sst2 dataset (#1112)
  • Fix #1078: Avoid recreating cache when using image preproc (#1114)
  • Fix checking if dask exists in figure_data_format_dataset
  • Fixed bug in EthosBinary dataset class and model directory copying logic in RayTuneReportCallback (#1129)
  • Fix #1070: error when saving model with image feature (#1119)
  • Fixed IterableBatcher incompatibility with ParquetDataset and remote model serialization (#1138)
  • Fix: passing backend and TF config parameters to model load path in experiment
  • Fix: improved TabNet numerical stability + refactoring
  • Fix #1147: passing bn_epsilon to AttentiveTransformer initialization in TabNet
  • Fix #1093: loss value mismatch (#1103)
  • Fixed CacheManager to correctly handle test_set and validation_set (#1189)
  • Fixing TabNet sparsity loss issue (#1199)

Breaking changes

Most models trained with v0.3.3 should keep working in v0.4.
The main changes in v0.4 are additional options, so what worked previously should not be broken now.
One exception is that there is now a much stricter check of the validity of the model configuration.
This is great, as it allows errors to be caught earlier, although configurations that worked in the past despite errors may not work anymore.
The checks should help identify the issues in the configurations, so errors should be easily fixable.

Contributors

@tgaddair @jimthompson5802 @ANarayan @kaushikb11 @mejackreed @ronaldyang @zhisbug @nimz @kanishk16

Package Rankings
Top 4.38% on Proxy.golang.org
Top 2.29% on Pypi.org