prefect

Prefect is a workflow orchestration tool empowering developers to build, observe, and react to data pipelines

APACHE-2.0 License

Downloads: 8.4M
Stars: 14.5K
Committers: 148

prefect - Release 1.2.3

Published by zanieb over 2 years ago

Changelog

Enhancements

  • Add support for ExtraArgs on S3 result upload - #5887
  • Add configuration options for the client's rate limit backoff - #5823

Fixes

  • Allow untracked flows to be run during a tracked flow run - #5782
  • Fix bug with infinite loop when parsing DST cron schedules - #5957
  • Fix issue where complex Python dependencies could break Docker storage builds - #5860
  • Fix issue where Git storage could not be serialized without a repo - #5877
  • Fix issues with grpcio builds on the Prefect base image with Python 3.10 - #5832

Task library

  • Instantiate task kwargs during DbtCloudRunTask.__init__ - #5831
  • Add boto_kwargs support to S3List - #5907
  • Fix DatabricksSubmitMultitaskRun inputs such as access_control_list - #5836
  • Add single_user_name to Databricks NewCluster - #5903
  • Allow extra arguments to Databricks NewCluster - #5949
  • Add git_source argument to DatabricksSubmitMultitaskRun - #5958
  • Add tasks for the Toloka API - #5865
  • Add tasks for Azure datafactory - #5921
  • Update the DbtShellTask return type to match ShellTask - #5872
  • Fix handling for Airbyte schedule keys - #5878

prefect - Release 1.2.2

Published by zanieb over 2 years ago

Changes

Enhancements

  • Add inference of Docker network mode for "host" and "none" networks - #5748
  • Add Python 3.10 support - #5770
  • Raise error on task initialization when positional-only parameters are present in function signature - #5789
  • Add flag to prevent printing ASCII welcome message - #5619
  • Allow the Prefect client to retry connections for HTTP targets - #5825

Task Library

  • Add SFTP server tasks SftpUpload and SftpDownload - #1234
  • Configure logging output for AirbyteConnectionTask - #5794
  • Make artifacts optional in StartFlowRun - #5795
  • Use json instead of dict for DatabricksSubmitMultitaskRun - #5728
  • Fix defect in serialization of Great Expectation's results in LocalResult - #5724
  • Add an optional data_security_mode to Databricks cluster configuration - #5778

Fixes

  • Fix bug where Prefect signals in tasks were not re-raised by the process-based timeout handler - #5804
  • Update flow builds to be deterministic when upstream and downstream slug are same - #5785

prefect - Release 1.2.1

Published by zanieb over 2 years ago

Changes

Enhancements

  • Add ability to set a max_duration timeout in wait_for_flow_run task - #5669
  • Add pipe support for EdgeAnnotation types, e.g. map - #5674
  • Add 'gs' as a valid filesystem schema for reading specifications - #5705
  • Add REPL mode for CLI - #5615

Fixes

  • Fix bug where setting the backend to "server" would not prevent client from requesting secrets from the API - #5637
  • Fix docker-in-docker issue in DockerAgent on Windows - #5657
  • Fix graphviz syntax error when visualizing a flow with a task which is a mapped lambda - #5662
  • Allow prefect run parameters to include equals ("=") signs - #5716
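
The last fix above concerns parameter values that themselves contain "=". A minimal sketch of one way to parse such input is to split on the first "=" only. The parse_params helper below is a hypothetical name for illustration and is not the prefect CLI's actual parsing code.

```python
from typing import Dict, List


def parse_params(pairs: List[str]) -> Dict[str, str]:
    """Parse KEY=VALUE strings, splitting on the first '=' only.

    Hypothetical helper for illustration; not taken from Prefect.
    """
    params: Dict[str, str] = {}
    for pair in pairs:
        # partition() splits at the first '=' and leaves later ones in value.
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected KEY=VALUE, got {pair!r}")
        params[key] = value
    return params


parsed = parse_params(["query=a=b", "limit=10"])
```

Splitting on every "=" would instead break "query=a=b" into three pieces, which is the kind of input this fix is about.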

Task library

  • Add HightouchRunSync task - #5672
  • Fix DbtCloudRunJob task failing with nested input for additional_args - #5706
  • Fix Databricks new cluster API params: autoscale and policy_id - #5681

prefect - Release 1.2.0

Published by zanieb over 2 years ago

Changes

Features

  • Add retry_on to allow tasks to retry on a subset of exception types - #5634
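
To illustrate what retrying on a subset of exception types means, here is a minimal plain-Python sketch. It does not use Prefect and is not Prefect's implementation. The run_with_retries helper and its parameters are hypothetical names chosen for this example.

```python
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def run_with_retries(
    fn: Callable[[], T],
    max_retries: int,
    retry_on: Tuple[Type[BaseException], ...],
) -> T:
    """Call fn, retrying only when it raises one of the retry_on types.

    Hypothetical helper for illustration; this is not Prefect's API.
    """
    attempt = 0
    while True:
        try:
            return fn()
        except retry_on:
            # Retryable error: try again until the retry budget is spent.
            if attempt >= max_retries:
                raise
            attempt += 1


calls = {"count": 0}


def flaky() -> str:
    # Fails with a retryable error on the first two calls, then succeeds.
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


result = run_with_retries(flaky, max_retries=3, retry_on=(ConnectionError,))
```

Any exception type not listed in retry_on propagates on the first failure, so programming errors are not retried.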

Enhancements

  • Add ability to add capacity provider for ECS flow runs - #4356
  • Add support for default values to DateTimeParameter - #5519
  • Calling flow.run within a flow definition context will raise a RuntimeError - #5588
  • Add support for service principal and managed identities for storage on Azure - #5612

Task Library

  • The azureml-sdk dependency has been moved from the azure extra into azureml - #5632
  • Add task to create materializations with Transform - #5518
  • Add create_bucket to GCSCopy - #5618

Fixes

  • Fix issue where the FlowRunView could fail to initialize when the backend has no state data - #5554
  • Fix issue where adaptive Dask clusters failed to replace workers - #5549
  • Fix issue where logging in to Cloud via the CLI could fail - #5643

prefect - Release 1.1.0

Published by zanieb over 2 years ago

Changes

Thanks to our many contributors!

Features

  • Add .pipe operator to prefect.Task for functional chaining - #5507
  • Add Kubernetes authentication support to VaultSecret - #5412
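
The functional chaining that a pipe method enables can be sketched in plain Python. The Step class below is a hypothetical stand-in, not prefect.Task. It assumes the piped value becomes the first argument of the next function, so check the Prefect documentation for the exact signature and behavior inside a flow.

```python
from typing import Any, Callable


class Step:
    """Tiny value wrapper with a pipe method (hypothetical, not prefect.Task)."""

    def __init__(self, value: Any) -> None:
        self.value = value

    def pipe(self, fn: Callable[..., Any], *args: Any, **kwargs: Any) -> "Step":
        # Pass the wrapped value as the first argument of fn and wrap the
        # result, so calls can be chained left to right.
        return Step(fn(self.value, *args, **kwargs))


def add(x: int, y: int) -> int:
    return x + y


def double(x: int) -> int:
    return x * 2


# Chained form reads in execution order ...
chained = Step(1).pipe(add, 2).pipe(double)
# ... and is equivalent to the nested call form.
nested = double(add(1, 2))
```

The chained form keeps the steps in the order they run, which is the main readability benefit over nested calls.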

Enhancements

  • Allow tasks to consume self as an argument - #5508
  • Improve the default idempotency key for create_flow_run task when mapping during a local flow run - #5443

Fixes

  • Fix the broken URL displayed in entrypoint.sh - #5490
  • Fix zombie processes created by Hasura container during prefect server start - #5476

Task Library

  • Add Airbyte configuration export task - #5410
  • Update Glob task to accept a string path - #5499
  • Fix pod logging while using RunNamespacedJob - #5514
  • Add include_generated_sql option to CubeJSQueryTask - #5471

prefect - Release 1.0.0

Published by zanieb over 2 years ago

🎉

See the latest documentation and our release blog post.

Highlights

  • Authentication with tokens has been removed; use API keys instead. - #4643
  • Python 3.6 is no longer supported; use Python 3.7+ instead. - #5136
  • Flow Environments have been removed; use RunConfigs instead. - #5072, docs
  • We have a new Discourse community to encourage lasting discussions.

Breaking Changes

  • The AWS Fargate agent has been removed; use the ECS agent instead. - #3812
  • DockerAgent(docker_interface=...) will now raise an exception if passed. - #4446
  • Agents will no longer check for authentication at the prefect.cloud.agent.auth_token config key. - #5140
  • Executors can no longer be imported from prefect.engine.executors; use prefect.executors instead. - #3798
  • Parameter is not importable from prefect.core.tasks anymore; use prefect.Parameter instead.
  • Exceptions are no longer importable from prefect.utilities.exceptions; use prefect.exceptions instead. - #4664
  • Client.login_to_tenant has been renamed to Client.switch_tenant.
  • The prefect register flow command has been removed; use prefect register instead. - #4256
  • The prefect run flow command has been removed; use prefect run instead. - #4463
  • Authentication token CLI commands create-token, revoke-token, list-tokens have been removed; use API keys instead. - #4643
  • prefect auth login no longer accepts authentication tokens. - #5140
  • prefect auth purge-tokens has been added to delete the Prefect-managed tokens directory. - #5140
  • The log_to_cloud setting is now ignored; use send_flow_run_logs instead. - #4487

Enhancements

  • Update LocalDaskExecutor to use new Python futures feature. - #5046
  • Add a State.__sizeof__ implementation to include the size of its result for better scheduling. - #5304
  • Allow the cancellation event check to be disabled in the DaskExecutor. - #5443
  • Update Flow.visualize() to allow change in orientation. - #5472
  • Allow ECS task definition role ARNs to override ECS agent defaults. - #5366

Task Library

  • Add DatabricksGetJobID to retrieve Databricks job IDs with a given name. - #5438
  • Add AWSParametersManager task to retrieve value from AWS Systems Manager Parameter Store. - #5439
  • Update SpacyNLP task to support spacy version >= 3.0. - #5358
  • Add exclude parameter to SpacyNLP task. - #5402
  • Update the AWSSecretsManager task to parse non key-value type secrets. - #5451
  • Update the DatabricksRunNow task to use the Databricks 2.1 jobs API. - #5395
  • Add ge_checkpoint and checkpoint_kwargs parameters to RunGreatExpectationsValidation to allow runtime configuration of checkpoint runs. - #5404
  • Add support for overwriting existing blobs when using Azure BlobStorageUpload task. - #5437
  • Add Neo4jRunCypherQueryTask task for running Cypher queries against Neo4j databases. - #5418
  • Add DatabricksSubmitMultitaskRun task to run Databricks jobs with multiple Databricks tasks. - #5395

Fixes

  • Add support to prefect.flatten for non-iterable upstreams, including exceptions and signals. - #4084
  • While building Docker images for storage, rm=True is used as default, which deletes intermediate containers. - #5384
  • Use __all__ to declare Prefect's public API for Pyright. - #5293
  • Fix usage of sys.getsizeof to restore support for PyPy. - #5390
  • Fix issues with log size estimates from #5316. - #5390

prefect - 1.0 Release Candidate 1

Published by zanieb over 2 years ago

See the list of changes in the changelog.

prefect - Release 0.15.13

Published by zanieb over 2 years ago

Changes

Enhancements

  • Ensure that maximum log payload sizes are not exceeded - #5316

Server

  • Upgrade Hasura to v2.1.1 which includes support for Apple M1 - #5335

Fixes

  • Fix bug where logout was required before logging in with a new key if the new key does not have access to the old tenant - #5355

Task Library

  • Fix bug where the Airbyte sync job failure would not be reflected in the task state - #5362

prefect - Release 0.15.12

Published by zanieb almost 3 years ago

Changes

Enhancements

  • Allow passing timedeltas to create_flow_run to schedule subflows at runtime - #5303
  • Upgrade Prefect Server Hasura image to 2.0.9 - #5173
  • Allow client retries on failed requests to Prefect Server - #5292

Task Library

  • Add authentication parameter for Snowflake query tasks - #5173
  • Add Mixpanel tasks - #5276
  • Add Zendesk Tickets Incremental Export task - #5278
  • Add Cube.js Query task - #5280
  • Add Monte Carlo lineage tasks - #5256
  • Add Firebolt task - #5265
  • Add custom domain support to dbt Cloud tasks for enterprise customers - #5273
  • Fix response key in Airbyte task health check - #5314
  • Allow all Postgres task parameters to be configured at runtime - #4377
  • Fix AirbyteConnectionTask requiring optional parameters - #5260
  • Allow StepActivate task to receive runtime parameters - #5231

Fixes

  • Fix bug where null run_config field caused deserialization errors in backend views - #1234

prefect - Release 0.15.11

Published by zanieb almost 3 years ago

Changes

Released on December 22, 2021.

Enhancements

  • Allow passing kwargs to Merge task constructor via merge() function - #5233
  • Allow passing proxies to slack_notifier - #5237

Fixes

  • Update RunGreatExpectationsValidation task to work with latest version of great_expectations - #5172
  • Allow unsetting kubernetes imagePullSecrets with an empty string - #5001
  • Improve agent handling of kubernetes jobs for flow runs that have been deleted - #5190
  • Remove beta1 from kubernetes agent template - #5194
  • Documentation improvements - #5220, #5232, #5288

prefect - Release 0.15.10

Published by zanieb almost 3 years ago

Changelog

Released on November 30, 2021.

Enhancements

  • Add end_time to FlowRunView.get_logs - #5138
  • Update watch_flow_run to stream logs immediately instead of waiting for flow run state changes - #5138
  • Allow setting container ports for DockerRun - #5130
  • Clarify ECSRun documentation, especially the ambiguities in setting IAM roles - #5110
  • Fix deprecated usage of marshmallow.fields.Dict in RRule schedules - #4540, #4903

Fixes

  • Fix connection to local server instances when using DockerAgent on linux - #5182

Task Library

  • Add support for triggering Airbyte connection sync jobs using AirbyteConnectionTask - #5078
  • Add artifact publishing to DbtCloudRunJob task - #5135
  • Add support for running data quality checks on Spark DataFrames using soda-spark - #4901

prefect - Release 0.15.9

Published by zanieb almost 3 years ago

This hotfix release fixes an issue where the kubernetes agent would attempt to load a secret value and fail if it was not present.

See the PR for details.

Don't miss all the exciting changes from 0.15.8 released today as well.

prefect - Release 0.15.8

Published by zanieb almost 3 years ago

Features

  • Add support for rich iCal style scheduling via RRules - #4901
  • Add Google Cloud Vertex agent and run configuration - #4989

Enhancements

  • Allow Azure flow storage to overwrite existing blobs - #5103
  • Provide option to specify a dockerignore when using Docker storage - #4980
  • Add keep-alive connections for kubernetes client API connections - #5066
  • Add idempotency_key to create_flow_run task - #5125
  • Add raise_final_state to wait_for_flow_run task to reflect child flow run state - #5129

Task Library

  • Bump maximum google-cloud-bigquery version to support 2.x - #5084
  • Add Glob task for collecting files in directories - #5077
  • Add DbtCloudRunJob task for triggering dbt cloud run jobs - #5085
  • Add Kafka Tasks entry to website docs - #5094

Fixes

  • Update the FlowView to be more robust to serialized flow changes in the backend - #5116

Deprecations

  • Move artifacts functions to prefect.backend.artifacts - #5117

Server

This release includes a Prefect Server update that updates an upstream dependency to fix a security vulnerability. See the release changelog for more details.

prefect - Release 0.15.7

Published by zanieb almost 3 years ago

Enhancements

  • Add flatten support to apply_map - #4996
  • Add dask performance report to DaskExecutor - #5032
  • Update git storage repo parameter to be optional if specifying git_clone_url_secret_name - #5033
  • Add task_run_name to prefect.context - #5055

Fixes

  • Reduce rate limit related failures with the ECS agent - #5059

Task Library

  • Add data parameter to SQLiteQuery task - #4981
  • Allow EmailTask to use insecure internal SMTP servers with smtp_type="INSECURE" - #5012
  • Fix Databricks run_id mutation during task runs - #4958
  • Add manual setting to FivetranSyncTask allowing retention of Fivetran scheduling - #5065

prefect - Release 0.15.6

Published by zanieb about 3 years ago

Enhancements

  • Improve setting the Azure storage connection string - #4955
  • Allow disabling retries for a task with max_retries=0 when retries are globally configured - #4971
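
The max_retries=0 entry depends on telling an explicit 0 apart from "not set". A minimal sketch of that distinction follows. It is an assumption about the shape of such logic, not a description of Prefect's code, and GLOBAL_MAX_RETRIES and effective_max_retries are hypothetical names.

```python
from typing import Optional

# Stand-in for a globally configured default (hypothetical value).
GLOBAL_MAX_RETRIES = 3


def effective_max_retries(task_max_retries: Optional[int]) -> int:
    """Return the retry count a task should use.

    None means "not set on the task", so the global default applies.
    An explicit 0 is respected and disables retries for that task.
    """
    # Compare against None rather than using `or`, because 0 is falsy and
    # `task_max_retries or GLOBAL_MAX_RETRIES` would discard an explicit 0.
    if task_max_retries is None:
        return GLOBAL_MAX_RETRIES
    return task_max_retries


unset = effective_max_retries(None)
disabled = effective_max_retries(0)
custom = effective_max_retries(5)
```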

Task Library

  • Allow Exasol Tasks to handle Prefect Secrets directly - #4436
  • Add Census Syncs to the task library - #4935

Fixes

  • Fix bug where LocalDaskExecutor did not respond to a PrefectSignal - #4924
  • Fix PostgresFetch with headers for one row - #4968
  • Fix bug where apply_map could create cyclic flows - #4970

prefect - Release 0.15.5

Published by zanieb about 3 years ago

Released on September 2, 2021.

Features

  • Python 3.9 docker images are now published - #4896

Enhancements

  • Add --expose flag to prefect server cli to make the Core server and UI listen to all interfaces - #4821
  • Pass existing/local environment variables to agentless flow runs - #4917
  • Add --idempotency-key to prefect run - #4928
  • Add support for larger flow registration calls - #4930
  • Ignore schedules by default for CLI flow runs and add flag to run based on schedule for local only runs - #4817

Task Library

  • Feature: Add SnowflakeQueryFromFile task - #3744
  • Enhancement: Log boto exceptions encountered in the AWS BatchSubmit task - #4771
  • Breaking: Legacy Dremio authentication has been updated to the new pattern in DremioFetch - #4872
  • Fix: Use runtime arguments over init arguments instead of ignoring them for MySQL Tasks - #4907

Fixes

  • Adjust log limits to match backend logic for better UX - #4900
  • Fix use of marshmallow.fields.Dict to use keys as a kwarg rather than key. - #4903
  • API server settings are passed correctly to task workers when using Prefect Server - #4914
  • Do not attempt to set host_gateway if using an unsupported Docker Engine version - #4809
  • Ignore jobs without a flow_run_id label in KubernetesAgent.manage_jobs - #4934

Breaking Changes

  • Services run by prefect server cli are now local by default (listen to localhost instead of 0.0.0.0); use --expose if you want to connect from a remote location - #4821
  • The changes in flow registration require Prefect Server 2021.09.02. Prefect Server will need to be upgraded before flows can be registered from this version - #4930

prefect - Release 0.15.4

Published by zanieb about 3 years ago

Changelog

Released on August 17, 2021.

Docs

  • Add a getting started section with a quick start guide for both core and orchestration sections - #4734

Enhancements

  • Expose Snowflake cursor type to SnowflakeQuery task arguments - #4786
  • Add ability to use threaded flow heartbeats - #4844
  • Improve behavior when API rate limits are encountered - #4852
  • Allow custom git clone url for Git storage - #4870
  • Add on_worker_status_changed callback to the DaskExecutor - #4874
  • Add --agent-config-id to prefect agent <kubernetes|local> install - #4876

Task Library

  • Add new prometheus task to push to gateway - #4623

Fixes

  • Fix binding of named volumes to flow containers with Docker agent - #4800
  • Fix ImportError typo in dropbox module - #4855
  • Fix default safe char for gitlab storage repo path - #4828

prefect - Release 0.15.3

Published by zanieb about 3 years ago

Enhancements

  • Add new evaluation_parameters parameter to RunGreatExpectationsValidation task - #4798

Fixes

  • Fix create_flow_run compatibility with auth tokens - #4801
  • Fix auto-quoting for strings that begin with numeric characters - #4802

prefect - Release 0.15.2

Published by zanieb about 3 years ago

Enhancements

  • Allow CLI registration of flows without starting their schedule, using prefect register --no-schedule - #4752
  • Add host_config to DockerRun to expose deeper settings for Docker flow runs - #4733
  • Enable loading additional repository files with Git storage - #4767
  • Update flow run heartbeats to be robust to exceptions - #4736
  • Allow prefect build/register paths to contain globs for recursion - #4761

Fixes

  • Fix duplicate task runs in FlowRunView.get_all_task_runs - #4774
  • Fix zombie processes from exited heartbeats - #4733
  • Missing auth_file directory is created when saving credentials - #4792

Task Library

  • Add VaultSecret task for retrieving secrets from private Vault instances - #4656

prefect - Release 0.15.1

Published by zanieb over 3 years ago

Enhancements

  • Add documentation for querying role and membership info - #4721
  • Checkpoint task results when SUCCESS is raised - #4744

Fixes

  • Fix loading of PREFECT__CLOUD__API_KEY environment variable when starting agents - #4751
  • Fix bug where the tenant could not be inferred during flow runs while using token auth - #4758
  • Fix bug where an agent using token auth could clear the tenant from disk - #4759