metaflow

Build and manage real-life ML, AI, and data science projects with ease!

License: Apache-2.0
Downloads: 903.6K
Stars: 7.5K
Committers: 79


metaflow - 2.10.5

Published by saikonen 12 months ago

What's Changed

Full Changelog: https://github.com/Netflix/metaflow/compare/2.10.4...2.10.5

metaflow - 2.10.4

Published by saikonen 12 months ago

Features

Support for tracing

With this release, it is possible to gather telemetry data using an OpenTelemetry endpoint.

Specifying an endpoint in one of the environment variables

  • METAFLOW_OTEL_ENDPOINT
  • METAFLOW_ZIPKIN_ENDPOINT

will enable the corresponding tracing provider.

Some additional dependencies are required for the tracing functionality in the execution environment. These can be installed in the base Docker image, or supplied through a conda environment. The relevant packages are

opentelemetry-sdk, opentelemetry-api, opentelemetry-instrumentation, opentelemetry-instrumentation-requests

and, depending on your endpoint, either opentelemetry-exporter-otlp or opentelemetry-exporter-zipkin.
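As a minimal sketch, enabling the OTLP provider could look like the following (the endpoint URL is a placeholder, not a documented default):

```shell
# Placeholder collector endpoint; point this at your own OpenTelemetry
# collector. Setting METAFLOW_ZIPKIN_ENDPOINT instead selects the
# Zipkin provider.
export METAFLOW_OTEL_ENDPOINT="http://localhost:4317"

# The tracing dependencies go into the execution environment, e.g. the
# base Docker image or a conda environment (commented out here so the
# snippet has no side effects):
# pip install opentelemetry-sdk opentelemetry-api \
#     opentelemetry-instrumentation opentelemetry-instrumentation-requests \
#     opentelemetry-exporter-otlp
```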

Custom index support for the pypi decorator

The pypi decorator now supports using a custom index set in the user's pip configuration under global.index-url.
This enables using private indices, even ones that require authentication.

For example, the following would set up one authenticated and two extra non-authenticated indices for package resolution:

pip config set global.index-url "https://user:[email protected]"
pip config set global.extra-index-url "https://extra.example.com https://extra2.example.com"

Specify Kubernetes job ephemeral storage size through resources decorator

It is now possible to specify the ephemeral storage size (in MB) for Kubernetes jobs by using the disk= attribute of the resources decorator, for example

@resources(disk=10240)

Introduce argo-workflows status command

Adds a command for easily checking the current status of a workflow on Argo Workflows.

python flow.py argo-workflows status [run-id]

Improvements

Add more randomness to Kubernetes pod names to avoid collisions

There was an issue where relying solely on the Kubernetes API server to generate random pod names resulted in significant collisions with a sufficiently large number of executions.

This release adds more randomness to the pod names besides what is generated by Kubernetes.
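The idea can be sketched with stdlib Python (the function name and suffix length are illustrative, not Metaflow's actual implementation):

```python
import random
import string

def randomize_pod_name(base_name: str, suffix_len: int = 8) -> str:
    # Kubernetes object names are limited to lowercase alphanumerics
    # and '-', so draw the suffix from that alphabet. With 36^8 possible
    # suffixes, collisions become vanishingly unlikely even across
    # large numbers of concurrent executions.
    alphabet = string.ascii_lowercase + string.digits
    suffix = "".join(random.choices(alphabet, k=suffix_len))
    return f"{base_name}-{suffix}"
```

For example, randomize_pod_name("metaflow-job") yields names like metaflow-job-k2x9a3f1.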

Fix issues with resources decorator in combination with step functions

This release fixes an issue where deploying flows on AWS Step Functions failed in the following cases:

  • @resources(shared_memory=) with any value
  • combining @resources and @batch(use_tmpfs=True)

What's Changed

New Contributors

Full Changelog: https://github.com/Netflix/metaflow/compare/2.10.3...2.10.4

metaflow - 2.10.3

Published by saikonen about 1 year ago

What's Changed

New Contributors

Full Changelog: https://github.com/Netflix/metaflow/compare/2.10.2...2.10.3

metaflow - 2.10.2

Published by oavdeev about 1 year ago

Features

Full Changelog: https://github.com/Netflix/metaflow/compare/2.10.0...2.10.2

metaflow - 2.10.0

Published by savingoyal about 1 year ago

metaflow - 2.9.15

Published by romain-intel about 1 year ago

Improvements

Improve the performance of parallel_map

We now check for processes in the order in which they complete rather than the order in which they were launched. This also increases the likelihood of failing fast.
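The pattern can be illustrated with concurrent.futures (a sketch using threads for portability; Metaflow's parallel_map uses processes):

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def completion_order_map(func, items):
    # Submit all work up front, then consume results in completion
    # order via as_completed(). A worker that raises surfaces its
    # exception as soon as it finishes, rather than waiting behind
    # slower tasks that were submitted earlier; that is what failing
    # fast refers to above.
    results = {}
    with ThreadPoolExecutor() as pool:
        futures = {pool.submit(func, item): idx for idx, item in enumerate(items)}
        for future in as_completed(futures):
            results[futures[future]] = future.result()
    # Reassemble the results in the original input order.
    return [results[i] for i in range(len(items))]
```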

Fix issues with the environment escape mechanism

Deadlocks and errors could occur when using the environment escape mechanism in two cases: (a) GC would occur at an inopportune moment or (b) subprocesses were involved. Both issues were fixed.

What's Changed

New Contributors

Full Changelog: https://github.com/Netflix/metaflow/compare/2.9.14...2.9.15

metaflow - 2.9.14

Published by saikonen about 1 year ago

Improvements

Fixes merging of log lines

This release fixes an issue with merging broken log lines.

Fix issue with using LD_LIBRARY_PATH with Conda environments

In a Conda environment, it is sometimes necessary to set LD_LIBRARY_PATH so that the Conda environment's libraries come before anything else. Prior to this release, this caused issues with the escape hatch.

What's Changed

Full Changelog: https://github.com/Netflix/metaflow/compare/2.9.13...2.9.14

metaflow - 2.9.13

Published by savingoyal about 1 year ago

Bug fix

Revert annotations changes to fix a regression

The recent annotations feature introduced an issue where project, flow_name, or user annotations were not being populated for Kubernetes. This release reverts the change.

What's Changed

Full Changelog: https://github.com/Netflix/metaflow/compare/2.9.12...2.9.13

metaflow - 2.9.12

Published by saikonen about 1 year ago

Known issues

The annotations feature introduced in this release has an issue where project, flow_name, or user annotations are not populated for Kubernetes. This has been reverted in the next release.

Features

Custom annotations for K8S and Argo Workflows

This release enables users to add custom annotations to the Kubernetes resources that flows create. Annotations can be configured in much the same way as custom labels:

  1. Globally, with an environment variable. For example:
export METAFLOW_KUBERNETES_ANNOTATIONS="first=A,second=B"
  2. At the step level, by passing a dictionary to the Kubernetes decorator:
@kubernetes(annotations={"first": "A", "second": "B"})

What's Changed

Full Changelog: https://github.com/Netflix/metaflow/compare/2.9.11...2.9.12

metaflow - 2.9.11

Published by savingoyal over 1 year ago

Bug Fix

Fix regression for @batch decorator introduced by v2.9.10

This release reverts a validation fix introduced in 2.9.10 that prevented execution of Metaflow tasks on AWS Batch.

What's Changed

Full Changelog: https://github.com/Netflix/metaflow/compare/2.9.10...2.9.11

metaflow - 2.9.10

Published by saikonen over 1 year ago

Features

Introduce PagerDuty support for workflows running on Argo Workflows

With this release, Metaflow users can get events on PagerDuty when their workflows succeed or fail on Argo Workflows.
Setting up the notifications is similar to the existing Slack notifications support

  1. Follow the instructions on PagerDuty to set up an Events API V2 integration for your PagerDuty service.
  2. You should be able to view the required integration key in the Events API V2 dropdown.
  3. To enable notifications on PagerDuty when your Metaflow flow running on Argo Workflows succeeds or fails, deploy it using the --notify-on-error or --notify-on-success flags:
python flow.py argo-workflows create --notify-on-error --notify-on-success --notify-pager-duty-integration-key <pager-duty-integration-key>
  4. You can also set the following environment variable instead of specifying --notify-pager-duty-integration-key on the CLI every time:
METAFLOW_ARGO_WORKFLOWS_CREATE_NOTIFY_PAGER_DUTY_INTEGRATION_KEY=<pager-duty-integration-key>
  5. The next time the flow fails or succeeds, you should receive a new event on PagerDuty under Incidents (Flow failed) or Changes (Flow succeeded).

What's Changed

Full Changelog: https://github.com/Netflix/metaflow/compare/2.9.9...2.9.10

metaflow - 2.9.9

Published by saikonen over 1 year ago

Improvements

Fixes a bug with the S3 operations affecting @conda with some S3 providers

This release fixes a bug in the @conda bootstrapping process. An issue with the ServerSideEncryption support affected some S3 operations when using S3 providers that do not implement the encryption headers (for example, MinIO).
The affected operations were all those that handle multiple files at once:

get_many / get_all / get_recursive / put_many / info_many

which are used as part of bootstrapping a @conda environment when executing remotely.

What's Changed

Full Changelog: https://github.com/Netflix/metaflow/compare/2.9.8...2.9.9

metaflow - 2.9.8

Published by saikonen over 1 year ago

Improvements

Fixes bug with Argo events parameters

This release fixes an issue with mapping values with spaces from the Argo events payload to flow parameters.

What's Changed

Full Changelog: https://github.com/Netflix/metaflow/compare/2.9.7...2.9.8

metaflow - 2.9.7

Published by saikonen over 1 year ago

Features

New commands for managing Argo Workflows through the CLI

This release includes new commands for managing workflows on Argo Workflows.
When needed, commands can be authorized by supplying a production token with --authorize.

argo-workflows delete

A deployed workflow can be deleted through the CLI with

python flow.py argo-workflows delete

argo-workflows terminate

A run can be terminated mid-execution through the CLI with

python flow.py argo-workflows terminate RUN_ID

argo-workflows suspend/unsuspend

A run can be suspended temporarily with

python flow.py argo-workflows suspend RUN_ID

Note that the suspended flow will show up as failed in the Metaflow UI after a while, because suspending also pauses the heartbeat process. Unsuspending will resume the flow, and its status will show as running again. This can be done with

python flow.py argo-workflows unsuspend RUN_ID

Improvements

Faster Job completion checks for Kubernetes

Previously, the status of tasks running on Kubernetes was determined through the pod status, which can take a while to update after the last container finishes. This release changes the status checks to use container statuses directly.

What's Changed

Full Changelog: https://github.com/Netflix/metaflow/compare/2.9.6...2.9.7

metaflow - 2.9.6

Published by saikonen over 1 year ago

Features

AWS Step Function state machines can now be deleted through the CLI

This release introduces the command step-functions delete for deleting state machines through the CLI.

For a regular flow

python flow.py step-functions delete

For another user's project branch

Comment out the @project decorator in the flow file, as we do not allow using --name with projects.

python project_flow.py step-functions --name project_a.user.saikonen.ProjectFlow delete

For a production or custom branch flow

python project_flow.py --production step-functions delete
# or
python project_flow.py --branch custom step-functions delete

Add --authorize PRODUCTION_TOKEN to the command if you do not have the correct production token locally.

Improvements

Fixes a bug in the S3 server-side encryption feature with some S3-compatible providers

This release fixes an issue with the S3 server-side encryption support, where some S3-compatible providers do not respond with the expected encryption method in the payload. This bug specifically affected regular operation when using MinIO.

Fixes support for --with environment in Airflow

Fixes a bug in the Airflow support for environment variables, where values set in the environment decorator could be overwritten.

What's Changed

Full Changelog: https://github.com/Netflix/metaflow/compare/2.9.5...2.9.6

metaflow - 2.9.5

Published by saikonen over 1 year ago

Features

Ability to choose server side encryption method for S3 uploads

You can now choose which server-side encryption method to use for S3 uploads by setting the METAFLOW_S3_SERVER_SIDE_ENCRYPTION environment variable to an appropriate value, for example aws:kms or AES256.
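For example, to default all S3 uploads to KMS-managed encryption:

```shell
# Server-side encryption method for Metaflow's S3 uploads; aws:kms and
# AES256 are the values mentioned in the release notes.
export METAFLOW_S3_SERVER_SIDE_ENCRYPTION="aws:kms"
```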

Improvements

Fixes double quotes with Parameters on Argo Workflows

This release fixes an issue where using parameters on Argo Workflows caused the values to be unnecessarily quoted.

In case you need any assistance or have feedback for us, ping us at chat.metaflow.org or open a GitHub issue.

What's Changed

New Contributors

Full Changelog: https://github.com/Netflix/metaflow/compare/2.9.4...2.9.5

metaflow - 2.9.4

Published by saikonen over 1 year ago

Improvements

Fix using email addresses as usernames for Argo Workflows

Using an email address as the username when deploying with a @project decorator to Argo Workflows is now possible. This release fixes an issue with some generated resources containing characters that are not permitted in names of Argo Workflow resources.

The secrets decorator now supports assuming roles

This release adds the capability to assume specific roles when accessing secrets with the @secrets decorator. The role for accessing a secret can be defined in the following ways:

As a global default

Set the METAFLOW_DEFAULT_SECRET_ROLE environment variable; this role will be assumed when accessing any secret specified in the decorator.
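For example (the role ARN below is a placeholder, not a real role):

```shell
# Default role assumed for every secret accessed via @secrets.
export METAFLOW_DEFAULT_SECRET_ROLE="arn:aws:iam::123456789012:role/metaflow-secret-reader"
```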

As a global option in the decorator

This will assume the role secret-iam-role for accessing all of the secrets in the sources list.

@secrets(
  sources=["first-secret-source", "second-secret-source"],
  role="secret-iam-role"
)

Or on a per-secret basis

A different role can also be assumed based on the secret in question:

@secrets(
  sources=[
    {"type": "aws-secrets-manager", "id": "first-secret-source", "role": "first-secret-role"},
    {"type": "aws-secrets-manager", "id": "second-secret-source", "role": "second-secret-role"}
  ]
)

In case you need any assistance or have feedback for us, ping us at chat.metaflow.org or open a GitHub issue.

What's Changed

Full Changelog: https://github.com/Netflix/metaflow/compare/2.9.3...2.9.4

metaflow - 2.9.3

Published by romain-intel over 1 year ago

Improvements

Ignore duplicate Metaflow Extensions packages

Duplicate Metaflow Extensions packages were not properly ignored in all cases. This release fixes that and allows extensions to load even when they appear more than once in your sys.path.

Fix package leaks for the environment escape

In some cases, packages from the outside (non-Conda) environment could leak into the Conda environment when using the environment escape functionality. This release addresses the issue and ensures that no spurious packages are imported into the Conda environment.

In case you need any assistance or have feedback for us, ping us at chat.metaflow.org or open a GitHub issue.

What's Changed

Full Changelog: https://github.com/Netflix/metaflow/compare/2.9.2...2.9.3

metaflow - 2.9.2

Published by savingoyal over 1 year ago

Features

Introduce support for image pull policy for @kubernetes

With this release, Metaflow users can specify image pull policy for their workloads through the @kubernetes decorator for Metaflow tasks.

@kubernetes(image='foo:tag', image_pull_policy='Always') # Allowed values are Always, IfNotPresent, Never
@step
def train(self):
    ... 
    ...

If an image pull policy is not specified and the container image tag is :latest, or no tag is specified, the image pull policy is automatically set to Always.

If an image pull policy is not specified and the container image tag is anything other than :latest, the image pull policy is automatically set to IfNotPresent.
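The defaulting rules above can be expressed as a small sketch (this mirrors the described behavior, not Metaflow's actual source):

```python
def default_image_pull_policy(image: str) -> str:
    # Only the last path component of an image reference can carry a
    # tag; the registry host may itself contain ':' (e.g. registry:5000/foo).
    last = image.rsplit("/", 1)[-1]
    tag = last.rsplit(":", 1)[1] if ":" in last else None
    # Untagged or ':latest' images always pull; pinned tags pull only
    # when the image is absent on the node.
    return "Always" if tag in (None, "latest") else "IfNotPresent"
```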

In case you need any assistance or have feedback for us, ping us at chat.metaflow.org or open a GitHub issue.


What's Changed

Full Changelog: https://github.com/Netflix/metaflow/compare/2.9.1...2.9.2