zenml

ZenML 🙏: Build portable, production-ready MLOps pipelines. https://zenml.io.

APACHE-2.0 License

zenml - 0.3.4

Published by htahir1 over 3 years ago

This release is a big design change and refactor. It involves a significant change in the Configuration file structure, meaning this is a breaking upgrade.

If you are upgrading from an older version of ZenML, please delete your old pipelines directory and .zenml folder and start afresh with zenml init.

If only working locally, this is as simple as:

cd zenml_enabled_repo
rm -rf pipelines/
rm -rf .zenml/

Then upgrade and re-initialize ZenML:

pip install --upgrade zenml
cd zenml_enabled_repo
zenml init
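
For repos where deleting outright feels risky, the cleanup step can also be scripted with a backup. The following is a minimal sketch, not part of ZenML: the function name reset_zenml_repo and the backup naming scheme are assumptions made for illustration, and zenml init still has to be run afterwards.

```python
import shutil
import time
from pathlib import Path


def reset_zenml_repo(repo_root, backup=True):
    """Remove pipelines/ and .zenml/ under repo_root, optionally keeping a backup.

    Returns the names of the directories that were found and handled.
    """
    root = Path(repo_root)
    handled = []
    stamp = time.strftime("%Y%m%d-%H%M%S")
    for name in ("pipelines", ".zenml"):
        target = root / name
        if not target.is_dir():
            continue  # nothing to clean up for this entry
        if backup:
            # Hypothetical backup name, e.g. pipelines.bak-20210101-120000
            shutil.move(str(target), str(root / f"{name}.bak-{stamp}"))
        else:
            shutil.rmtree(target)
        handled.append(name)
    return handled
```

Calling reset_zenml_repo("zenml_enabled_repo") moves both directories aside; passing backup=False mirrors the rm -rf commands above.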

New Features

  • Introduced another higher-level pipeline: the NLPPipeline. This is a generic
    NLP pipeline for a text-datasource-based training task. A full example of how to use the NLPPipeline can be found here.
  • Introduced a BaseTokenizerStep as a simple mechanism to define how to train and encode using any generic
    tokenizer (again for NLP-based tasks).
  • Introduced a new HuggingFace integration, with the first concrete implementation of the BaseTokenizerStep, i.e., the HuggingFaceTokenizer.
  • Showcased how to use HuggingFace with the ZenML TrainerStep in the NLP Example.

Bug Fixes + Refactor

  • Significant change to imports: imports are now simpler and more user-friendly. For example, instead of:

from zenml.core.pipelines.training_pipeline import TrainingPipeline

a user can simply write:

from zenml.pipelines import TrainingPipeline

The caveat is of course that this might involve a re-write of older ZenML code imports.
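
For larger codebases, the import rewrite can be automated. The sketch below is an illustration under assumptions, not a ZenML tool: rewrite_imports and IMPORT_MAP are hypothetical names, and the single mapping entry is the one example given in these notes. Any further entries would need to be checked against the new package layout.

```python
import re

# Old module path -> new module path. The single entry is the example from
# these release notes; add further entries only after verifying them.
IMPORT_MAP = {
    "zenml.core.pipelines.training_pipeline": "zenml.pipelines",
}


def rewrite_imports(source: str) -> str:
    """Rewrite `from <old module> import ...` lines to use the new module paths."""
    for old, new in IMPORT_MAP.items():
        pattern = re.compile(
            r"^([ \t]*from[ \t]+)" + re.escape(old) + r"([ \t]+import[ \t])",
            re.MULTILINE,
        )
        # A function replacement avoids any escaping issues in the new path.
        source = pattern.sub(lambda m, new=new: m.group(1) + new + m.group(2), source)
    return source
```

Applying rewrite_imports to the text of each Python file in a repo leaves unrelated lines untouched, since only lines that match an old module path exactly are changed.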

Note: Future releases are also expected to be breaking. Until announced otherwise, please expect that upgrading ZenML may cause pipelines generated by older ZenML versions to behave unexpectedly.

Special shout-out to @nicholasmaiot for major contributions to this release!

zenml - 0.3.3

Published by bcdurak over 3 years ago

This release is a significant one as it includes the first version of the AWS integration. It allows you to use ZenML to launch an EC2 instance as an orchestrator and execute a ZenML pipeline, optionally coupled with an S3 artifact store and an RDS metadata store.

It is a new feature and it does not include any breaking changes.

To install ZenML with the AWS integration, run:

pip install --upgrade zenml[aws]
zenml init

New Features

  • OrchestratorAWSBackend implemented to launch an EC2 instance as the orchestrator.
  • When using the new orchestrator backend, you can also use an S3 artifact store and an RDS metadata store.
  • Implemented an example that covers the basic process, for those who would like to start testing right away.

Bug Fixes + Refactor

  • For more advanced use-cases, more examples will follow in the future.
  • Numerous small bug fixes and refinements.

zenml - 0.3.2

Published by htahir1 over 3 years ago

An early release to get the PostgreSQL datasource out sooner.

To upgrade:

pip install --upgrade zenml

New Features

Bug Fixes + Refactor

  • Slight change to telemetry utils: opting out now also sends a signal.

zenml - 0.3.1

Published by htahir1 over 3 years ago

This release is a big design change and refactor. It involves a significant change in the Configuration file structure, meaning this is a breaking upgrade. If you are upgrading from 0.2.0, please delete your old pipelines directory and .zenml folder and start afresh with zenml init.

If only working locally, this is as simple as:

cd zenml_enabled_repo
rm -rf pipelines/
rm -rf .zenml/

Then upgrade and re-initialize:

pip install --upgrade zenml
zenml init

New Features

Bug Fixes + Refactor

  • Now you can run pipelines from within any subdirectory in the repo.
  • Relaxed restriction on custom steps having sub-directories with their module.
  • Relationship between Datasource and Data Step refined.
  • Numerous small bug fixes and refinements to facilitate flexible API design.

Note: Future releases are also expected to be breaking. Until announced otherwise, please expect that upgrading ZenML may cause pipelines generated by older ZenML versions to behave unexpectedly.

zenml - 0.2.0

Published by htahir1 over 3 years ago

This new release is a major one. It's the first to introduce our new integrations system, which is meant to make it easy to extend ZenML with various other ML/MLOps libraries. The first big advantage one gets is 🚀 PyTorch Support 🚀!

pip install --upgrade zenml

And to enable the PyTorch extension:

pip install zenml[pytorch]

New Features

  • Introduced integrations for ZenML with the extras_require setuptools paradigm.
  • Added PyTorchTrainer support with an easily extendable TorchBaseTrainer example.
  • Restructured trainer steps to be more intuitive to extend from TensorFlow and PyTorch. Now, we have a TrainerStep, followed by TFBaseTrainerStep and TorchBaseTrainerStep.
  • The input_fn of the TorchTrainer is implemented so that it can ingest from a TFRecords file. This makes ZenML one of the few projects out there
    that have native support for ingesting the TFRecords format into PyTorch directly.
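
To illustrate the extras paradigm behind pip install zenml[pytorch], here is a minimal sketch. It is not taken from ZenML's setup.py: the requirement list, the EXTRA_MODULES lookup and the integration_available helper are all assumptions chosen for illustration.

```python
from importlib.util import find_spec

# Shape of the mapping a package hands to setuptools as
# setup(..., extras_require=EXTRAS_REQUIRE). The requirement list is an
# illustrative assumption, not ZenML's actual dependency pin.
EXTRAS_REQUIRE = {
    "pytorch": ["torch"],
}

# Hypothetical lookup from an extra's name to the module it should make importable.
EXTRA_MODULES = {"pytorch": "torch"}


def integration_available(extra, modules=EXTRA_MODULES):
    """Return True if the module behind an optional extra can be imported."""
    return find_spec(modules[extra]) is not None
```

With this layout the base install stays lightweight, and code that depends on an optional library can check integration_available("pytorch") before importing it.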

Bug Fixes

  • Fixed an issue with Repository.get_zenml_dir() that caused any pipeline created below the root level to fail on creation.
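
A common way to make such a lookup work from any subdirectory is to walk up the directory tree until a marker folder is found. The sketch below shows that general technique only. It is not the actual implementation of Repository.get_zenml_dir(), and the function name find_zenml_root is hypothetical.

```python
from pathlib import Path
from typing import Optional


def find_zenml_root(start, marker: str = ".zenml") -> Optional[Path]:
    """Return the nearest directory at or above `start` that contains `marker`."""
    start = Path(start).resolve()
    # Check the starting directory first, then each parent up to the filesystem root.
    for candidate in (start, *start.parents):
        if (candidate / marker).is_dir():
            return candidate
    return None  # no enclosing ZenML repo was found
```

Called as find_zenml_root(Path.cwd()), this returns the same root whether the working directory is the repo root or a nested folder inside it.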

Documentation Announcement

The docs are almost complete! We are at 80% completion. Keep an eye out as we update them with more details on how to use and extend ZenML, and let us know via Slack if something is missing!

zenml - 0.1.5

Published by htahir1 over 3 years ago

New Features

  • Added Kubernetes Orchestrator to run pipelines on a Kubernetes cluster.
  • Added timeseries support with StandardSequencerStep.
  • Added more CLI groups such as step, datasource and pipelines. For example, zenml pipeline list lists the pipelines in the current repo.
  • Completed a significant portion of the Docs.
  • Refactored Step Interfaces for easier integrations into other libraries.
  • Added a GAN Example to showcase ImageDatasource.
  • Set up base for more Trainer Interfaces like PyTorch, scikit etc.
  • Added ability to see historical steps.

Bug Fixes

  • All files except YAML files picked up while parsing pipelines_dir, in reference to concerns raised in #13.

Upcoming changes

  • The next release will be a major one and will involve a refactoring of design decisions that might cause backward-incompatible changes to existing ZenML repos.

zenml - 0.1.4

Published by htahir1 almost 4 years ago

New Features

  • Ability to add a custom image to Dataflow ProcessingBackend.

Bug Fixes

  • Fixed requirements.txt and setup.py to enable local build.
  • Pip package should install without any requirement conflicts now.
  • Added custom docs made by Jupyter Book in the docs/book folder.

zenml - 0.1.3

Published by hamzamaiot almost 4 years ago

New Features

  • Launch GCP preemptible VM instances to orchestrate pipelines with OrchestratorGCPBackend. See full example here.
  • Train using Google Cloud AI Platform with SingleGPUTrainingGCAIPBackend. See full example here.
  • Use Dataflow for distributed preprocessing. See full example here.
  • Run pipelines locally with SQLite Metadata Store, local Artifact Store, and local Pipelines Directory.
  • Native Git integration: each step is pinned to the Git SHA of its code when a pipeline that uses it is run. See details here.
  • All pipeline runs are reproducible given a unique combination of the Metadata Store, Artifact Store and the Pipelines Directory.

Bug Fixes

  • The Metadata Store and Artifact Store specified in pipelines are now disassociated from the default .zenml_config file.
  • Fixed typo in default docker images constants.