prefect

Prefect is a workflow orchestration tool that lets developers build, observe, and react to data pipelines.

APACHE-2.0 License

Downloads: 8.4M
Stars: 14.5K
Committers: 148

prefect - Contrib Work Makes the Dream Work

Published by joshmeek over 4 years ago

Changelog

0.12.3

Released on July 8, 2020.

Enhancements

  • Update flow.slugs during flow.replace - #2919
  • flow.update accepts the optional kwarg merge_parameters that allows flows to be updated with common Parameters - #2501
  • Added poke handler to notify agent process of available flow runs - #2914
  • Add Cancelling state for indicating a flow-run that is being cancelled, but may still have tasks running - #2923

Task Library

  • Add ReadAirtableRow task - #2843
  • Add container_name kwarg to CreateContainer Docker task - #2904
  • Adds an extra_docker_kwargs argument to CreateContainer Docker task - #2915

Fixes

  • Fix issue with short-interval IntervalClocks that had a start_date far in the past - #2906
  • When terminating early, executors ensure all pending work is cancelled/completed before returning, ensuring no lingering background processing - #2920
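The IntervalClock fix above concerns clocks with a short interval and a start_date far in the past. One general way to handle that case is to compute the next event arithmetically instead of stepping through every elapsed interval. The following is a minimal sketch of that idea only; `next_event` is a hypothetical helper, not Prefect's API, and this is not necessarily how #2906 was implemented.

```python
from datetime import datetime, timedelta


def next_event(start_date: datetime, interval: timedelta, after: datetime) -> datetime:
    """Return the first scheduled time strictly after `after`.

    Hypothetical helper for illustration; not part of Prefect.
    """
    if after < start_date:
        return start_date
    # Count the whole intervals that have elapsed, then move one past them.
    # This is constant-time no matter how old start_date is.
    steps = (after - start_date) // interval + 1
    return start_date + steps * interval


start = datetime(2000, 1, 1)
upcoming = next_event(start, timedelta(seconds=1), datetime(2020, 7, 8, 0, 0, 0, 500000))
```

With a one-second interval and a start date twenty years back, a naive loop would step through hundreds of millions of events, while the arithmetic version does a single division.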

prefect - Maps? Maps.

Published by joshmeek over 4 years ago

Changelog

0.12.2

Released on June 30, 2020.

Features

  • Add apply_map, a function to simplify creating complex mapped pipelines - #2846

Enhancements

  • Make storage location inside Docker storage configurable - #2865
  • Send heartbeats on each iteration of the Cloud task runner's retry loop - #2893

Task Library

  • Add option to BigQueryTask to return query as dataframe - #2862

Fixes

  • Add more context keys when running locally so that templating is consistent between local and Cloud runs - #2662
  • Fix Fargate agent not parsing string provided containerDefinitions - #2875
  • Fix Fargate agent providing empty parameters if not set - #2878
  • Fix issue with Queued task runs flooding agents with work - #2884
  • Add missing prefect register flow to CLI help text - #2895

prefect - Hot Reloading Flows

Published by joshmeek over 4 years ago

0.12.1

Released on June 25, 2020.

Features

  • Task slugs are now stable across rebuilds of the same Flow - #2531
  • Support configuring executors for LocalEnvironment, KubernetesJobEnvironment, and FargateTaskEnvironment - #2805
  • Flows can now be stored and executed using file-based storage - #2840

Enhancements

  • Add option to set repositoryCredentials on Fargate Agent containerDefinitions - #2822
  • Update GraphQL endpoint to /graphql - #2669
  • Allow Cloud Flow Runners to interact properly with Queued runs - #2741
  • Add Result serializers - #2755
  • Simplify DaskExecutor internals - #2817
  • Set task names in LocalDaskExecutor - #2819
  • Flows registered without an image set will default to all_extras - #2828
  • Improve error message when sending unauthorized requests to Cloud - #2810
  • Forward state change status back to core - #2839
  • Add GitHub storage for storing flows as files in a GitHub repo - #2840
  • Add prefect register flow CLI command for registering flows from files - #2840
  • Add default GITHUB_ACCESS_TOKEN secret - #2840
  • Create utility function for getting Kubernetes client - #2845

Task Library

  • Adds a MySQL task using pymysql driver - #2124
  • Add some tasks for working with Google Sheets - #2614
  • Add support for HTML content in the EmailTask - #2811

Server

  • Failing to set a state raises errors more aggressively - #2708

Fixes

  • Fix all_extras tag not being set during CI job to build image - #2801
  • Quiet no candidate Cached states were valid debug logging - #2815
  • Fix LocalEnvironment execute function's use of the flow object - #2804
  • Properly set task names when using DaskExecutor - #2814
  • Fix the LocalDaskExecutor to only compute tasks once, not multiple times - #2819
  • Generate key names for mapped tasks that work better with Dask's dashboard - #2831
  • Fix FlowRunTask when running against locally deployed Server - #2832
  • Make sure image from Docker storage is always used with KubernetesJobEnvironment - #2838
  • Change Environment.run_flow() to prefer executor from flow's environment - #2849

Deprecations

  • Deprecate RemoteEnvironment in favor of LocalEnvironment - #2805
  • Deprecate RemoteDaskEnvironment in favor of LocalEnvironment with a DaskExecutor - #2805
  • Deprecate executor_kwargs in KubernetesJobEnvironment and FargateTaskEnvironment in favor of executor - #2805

Breaking Changes

  • Remove previously deprecated SynchronousExecutor - #2826

prefect - Into the Depths

Published by joshmeek over 4 years ago

Changelog

0.12.0

Released on June 17, 2020.

Features

  • Depth First Execution with Mapping on Dask - #2646
  • Support use of cloud storage with containerized environments - #2517,#2796

Enhancements

  • Add flag to include hostname on local storage - #2653
  • Add option to set image_pull_secret directly on DaskKubernetesEnvironment - #2657
  • Allow for custom callables for Result locations - #2577
  • Ensure all Parameter values, including non-required defaults, are present in context - #2698
  • Use absolute path for LocalResult location for disambiguation - #2698
  • Retry client requests when receiving an API_ERROR code in the response - #2705
  • Reduce size of serialized tasks when running on Dask - #2707
  • Extend run state signatures for future development - #2718
  • Update set_flow_run_state for future meta state use - #2725
  • Add an optional flow argument to merge to support using it when not inside a flow context - #2727
  • Add option to set service account name on Prefect jobs created by Kubernetes agent - #2547
  • Add option to set imagePullPolicy on Prefect jobs created by Kubernetes agent - #2721
  • Add option to set API url on agent start CLI command - #2633
  • Add CI step to build prefecthq/prefect:all_extras Docker image for bundling all Prefect dependencies - #2745
  • Move Parameter to a standalone module - #2758
  • Validate Cached states based on hashed inputs - #2763
  • Add validate_configuration utility to Fargate Agent for verifying it can manage tasks properly - #2768
  • Add option to specify task targets as callables - #2769
  • Improve State.__repr__ when there is no message - #2773
  • Add support for db argument at run time in the SQLiteQuery and SQLiteScript - #2782
  • Add support for mapped argument in control flows - #2784
  • Use pagination in kubernetes resource manager to reduce memory usage - #2794
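The enhancement "Validate Cached states based on hashed inputs" can be pictured as comparing a digest of the current inputs against a digest stored with the cached result. The sketch below shows that general technique with the standard library; `hash_inputs` and `is_cache_valid` are hypothetical names, and the hashing scheme is an assumption rather than the one Prefect uses.

```python
import hashlib
import json
from typing import Any, Dict


def hash_inputs(inputs: Dict[str, Any]) -> str:
    """Return a stable digest for a dictionary of task inputs (hypothetical helper)."""
    # sort_keys makes the digest independent of key order;
    # default=repr is a fallback for values JSON cannot encode.
    payload = json.dumps(inputs, sort_keys=True, default=repr)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def is_cache_valid(cached_hash: str, current_inputs: Dict[str, Any]) -> bool:
    """A cached result is reusable only if the inputs hash to the same digest."""
    return cached_hash == hash_inputs(current_inputs)


stored = hash_inputs({"x": 1, "y": [1, 2]})
```

Comparing digests rather than the inputs themselves means only a short string has to be stored next to the cached state.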

Task Library

  • Adds a task to expose Great Expectations checkpoints as a node in a Prefect pipeline - #2489

Fixes

  • Fix flow.visualize cleanup of source files when using filename - #2726
  • Fix S3Result handling of AWS credentials provided through kwargs - #2747
  • Fix DaskKubernetesEnvironment requiring that an env block is set when using custom specs - #2657
  • Fix PostgresExecute task auto commit when commit is set to False - #2658
  • Remove need for {filename} in mapped templates - #2640
  • Fix issue with Results erroring out on multi-level mapped pipelines - #2716
  • Fix issue with dask resource tags not being respected - #2735
  • Ensure state deserialization works even when another StateSchema exists - #2738
  • Remove implicit payload size restriction from Apollo - #2764
  • Fix issue with declared storage secrets in K8s job environment and Dask K8s environment - #2780
  • Fix context handling for Cloud when working with in-process retries - #2783

Deprecations

  • Accessing prefect.core.task.Parameter is deprecated in favor of prefect.core.parameter.Parameter - #2758

Breaking Changes

  • Environment setup and execute function signatures now accept Flow objects - #2796
  • create_flow_run_job logic has been moved into execute for DaskKubernetesEnvironment and KubernetesJobEnvironment - #2796

prefect - Please Sign Here

Published by joshmeek over 4 years ago

Changelog

0.11.5

Released on June 2, 2020.

Enhancements

  • Allow for manual approval of locally Paused tasks - #2693
  • Task instances define a __signature__ attribute, for improved introspection and tab-completion - #2602
  • Tasks created with @task forward the wrapped function's docstring - #2602
  • Support creating temporary dask clusters from within a DaskExecutor - #2667
  • Add option for setting any build kwargs on Docker storage - #2668
  • Add flow run ID option to get logs CLI command - #2671
  • Add ID to output of get command for flows and flow-runs - #2671
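The two #2602 items above (a `__signature__` attribute on Task instances and docstring forwarding for `@task`) rely on standard Python introspection hooks. The sketch below shows how a callable wrapper object can expose the wrapped function's signature and docstring; `TaskLike` is a hypothetical stand-in and does not reproduce Prefect's Task class.

```python
import inspect


class TaskLike:
    """Hypothetical wrapper object, used only to illustrate the introspection hooks."""

    def __init__(self, fn):
        self.fn = fn
        # Forward the wrapped function's docstring to the instance.
        self.__doc__ = fn.__doc__
        # inspect.signature() honours a __signature__ attribute when present,
        # so tools see the wrapped function's parameters, not (*args, **kwargs).
        self.__signature__ = inspect.signature(fn)

    def __call__(self, *args, **kwargs):
        return self.fn(*args, **kwargs)


def add(x, y=1):
    """Add two numbers."""
    return x + y


wrapped = TaskLike(add)
signature_text = str(inspect.signature(wrapped))
```

Interactive shells and IDEs read the same attributes, which is where the improved tab-completion comes from.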

Fixes

  • Fix issue with Google imports being tied together - #2661
  • Don't warn about unused tasks defined inline and copied - #2677
  • Remove unnecessary volume mount from dev infrastructure Docker compose - #2676
  • Fix issue with instantiating LocalResult on Windows with dir from other drive - #2683
  • Fix invalid IP address error when running server start on Ubuntu using rootless Docker - #2691

Deprecations

  • Deprecate local_processes and **kwargs arguments for DaskExecutor - #2667
  • Deprecate address='local' for DaskExecutor - #2667

prefect - Revert GraphQL Endpoint Change

Published by joshmeek over 4 years ago

Changelog

0.11.4

Released on May 27, 2020.

Fixes

  • Revert GraphQL endpoint change - #2660

prefect - It Is A Hodgepodge My Dudes

Published by joshmeek over 4 years ago

Changelog

0.11.3

Released on May 27, 2020.

Enhancements

  • Add option to set volumes on server start CLI command - #2560
  • Add case to top-level namespace - #2609
  • Use host IP for hostname label in cases where LocalAgent is in container using host network - #2618
  • Add option to set TLS configuration on client created by Docker storage - #2626
  • The start_time of a Paused state defaults to None - #2617
  • Raise more informative error when Cloud Secret doesn't exist - #2620
  • Update GraphQL endpoint to /graphql - #2651

Fixes

  • Kubernetes agent resource manager is more strict about what resources it manages - #2641
  • Fix error when adding Parameter to flow under case statement - #2608
  • Fix S3Result attempting to load data when checking existence - #2623

Deprecations

  • Deprecate private_registry and docker_secret options on DaskKubernetesEnvironment - #2630

Breaking Changes

  • Kubernetes labels associated with Prefect flow runs now have a prefect.io/ prefix (e.g. prefect.io/identifier) - #2641

prefect - A Fix in the Changelog is Worth Two in the Code

Published by joshmeek over 4 years ago

Changelog

0.11.2

Released on May 19, 2020.

Enhancements

  • Allow log configuration in Fargate Agent - #2589
  • Reuse prefect.context for opening Flow contexts - #2581
  • Show a warning when tasks are created in a flow context but not added to a flow - #2584

Server

  • Add API healthcheck tile to the UI - #2395

Fixes

  • Fix type for Dask Security in RemoteDaskEnvironment - #2571
  • Fix issue with log_stdout not correctly storing returned data on the task run state - #2585
  • Ensure result locations are updated from targets when copying tasks with task_args - #2590
  • Fix S3Result exists function handling of NoSuchKey error - #2585
  • Fix confusing language in Telemetry documentation - #2593
  • Fix LocalAgent not registering with Cloud using default labels - #2587
  • Fix flow's run_agent function passing a set of labels to Agent instead of a list - #2600

prefect - Fix Duplicate Agent Label Parsing

Published by joshmeek over 4 years ago

Changelog

0.11.1

Released on May 15, 2020.

Fixes

  • Fix duplicate agent label literal eval parsing - #2569

prefect - Improved APIs, Who Dis?

Published by joshmeek over 4 years ago

Changelog

0.11.0

Released on May 14, 2020.

Features

  • Introducing new Results interface for working with task results - #2507

Enhancements

  • Allow slack_task to accept a dictionary for the message parameter to build a specially-structured JSON Block - #2541
  • Support using case for control flow with the imperative api - #2546
  • flow.visualize is now able to accept a format argument to specify the output file type - #2447
  • Docker storage now writes flows to /opt dir to remove need for root permissions - #2025
  • Add option to set secrets on Storage objects - #2507
  • Add reserved default Secret names and formats for working with cloud platforms - #2507
  • Add unique naming option to the jobs created by the KubernetesJobEnvironment - #2553
  • Use ast.literal_eval for configuration values - #2536
  • Prevent local cycles even if flow validation is deferred - #2565
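The enhancement "Use ast.literal_eval for configuration values" refers to a standard-library function that parses Python literals from strings without executing code. Here is a minimal sketch of how string configuration values might be interpreted with it; `parse_value` is a hypothetical helper, and Prefect's actual configuration loader may apply additional rules.

```python
import ast
from typing import Any


def parse_value(raw: str) -> Any:
    """Interpret a raw config string as a Python literal where possible.

    Hypothetical helper; falls back to the original string otherwise.
    """
    try:
        # literal_eval accepts only literals (numbers, strings, lists, dicts,
        # tuples, sets, booleans, None); it never calls functions.
        return ast.literal_eval(raw)
    except (ValueError, SyntaxError):
        return raw


port = parse_value("4200")
labels = parse_value("['dev', 'gpu']")
name = parse_value("my-flow")
```

Unlike `eval`, an input such as `__import__('os')` is rejected and simply comes back as the original string.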

Server

  • Add "cancellation-lite" semantic by preventing task runs from running if the flow run isn't running - #2535
  • Add minimal telemetry to Prefect Server - #2467

Task Library

  • Add tasks to create issues for Jira and Jira Service Desk - #2431
  • Add DbtShellTask, an extension of ShellTask for working with data build tool (dbt) - #2526
  • Add prefect.tasks.gcp.bigquery.BigQueryLoadFile - #2423

Fixes

  • Fix bug in Kubernetes agent deployment.yaml with a misconfigured liveness probe - #2519
  • Fix checkpointing feature not being able to be disabled when using server backend - #2438

Deprecations

  • Result Handlers are now deprecated in favor of the new Result interface - #2507

Breaking Changes

  • Allow for setting docker daemon at build time using DOCKER_HOST env var to override base_url in docker storage - #2482
  • Ensure all calls to flow.run() use the same execution logic - #1994
  • Moved prefect.tasks.cloud to prefect.tasks.prefect - #2404
  • Trigger signature now accepts a dictionary of [Edge, State] to allow for more customizable trigger behavior - #2298
  • Remove all uses of credentials_secret from task library in favor of PrefectSecret tasks - #2507
  • Remove Bytes and Memory storage objects - #2507

prefect - Checkpointing Core Before Big Checkpointing Change

Published by joshmeek over 4 years ago

Changelog

0.10.7

Released on May 6, 2020.

Features

  • None

Enhancements

  • Agents now support an optional HTTP health check, for use by their backing orchestration layer (e.g. k8s, docker, supervisord, ...) - #2406
  • Enhance agent verbose logs to include provided kwargs at start - #2486
  • Add no_cloud_logs option to all Agent classes for an easier way to disable sending logs to backend - #2484
  • Add option to set flow run environment variables on Kubernetes agent install - #2424
  • Sets dask scheduler default to "threads" on LocalDaskExecutor to provide parallelism - #2494

Task Library

  • Add new case control-flow construct, for nicer management of conditional tasks - #2443

Fixes

  • Give a better error for non-serializable callables when registering with cloud/server - #2491
  • Fix runners retrieving invalid context.caches on runs started directly from a flow runner - #2403

Deprecations

  • None

Breaking Changes

  • Remove the Nomad agent - #2492

Contributors

  • None

prefect - ⁰.₁₀.⁶

Published by joshmeek over 4 years ago

Changelog

0.10.6

Released on May 5, 2020.

Features

  • Add DaskCloudProviderEnvironment to dynamically launch Dask clusters, e.g. on AWS Fargate - #2360

Enhancements

  • Add botocore_config option to Fargate agent for setting botocore configuration when interacting with boto3 client - #2170
  • Don't create a None task for a null condition when using ifelse - #2449
  • Add support for EC2 launch type in Fargate Agent and FargateTaskEnvironment - #2421
  • Add flow_id to context for Flow runs - #2461
  • Allow users to inject custom context variables into their logger formats - #2462
  • Add option to set backend on agent install CLI command - #2478
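The enhancement "Allow users to inject custom context variables into their logger formats" can be illustrated with the standard `logging` module, where a filter copies values onto each record so the format string can reference them. This is a sketch of the general mechanism only; the `context` dictionary and `ContextFilter` class are hypothetical stand-ins, and Prefect's own configuration keys for this feature are not shown.

```python
import io
import logging

# Hypothetical stand-in for a run-time context; not prefect.context.
context = {"flow_name": "etl", "flow_run_id": "abc123"}


class ContextFilter(logging.Filter):
    """Attach selected context values to every log record."""

    def __init__(self, keys):
        super().__init__()
        self.keys = keys

    def filter(self, record):
        for key in self.keys:
            setattr(record, key, context.get(key, ""))
        return True  # never drop the record, only enrich it


stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s [%(flow_name)s] %(message)s"))
handler.addFilter(ContextFilter(["flow_name"]))

logger = logging.getLogger("sketch.context")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(handler)
logger.info("starting")
```

Because the filter runs before formatting, any key it sets can be used as a `%(name)s` placeholder in the format string.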

Task Library

  • None

Fixes

  • Fix start_server.sh script when an env var is undefined - #2450
  • Fix server start CLI command not respecting version kwarg on tagged releases - #2435
  • Fix issue with non-JSON serializable args being used to format log messages preventing them from shipping to Cloud - #2407
  • Fix issue where ordered Prefect collections use lexical sorting, not numerical sorting, which can result in unexpected ordering - #2452
  • Fix issue where Resource Manager was failing due to non-JSON timestamp in log writing - #2474
  • Fix periodic error in local agent process management loop - #2419
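The ordering fix above (#2452) comes down to the difference between sorting index-like keys as strings and sorting them as numbers. A short illustration in plain Python, using made-up keys:

```python
# Index-like keys stored as strings, as they might appear after serialization.
keys = ["0", "1", "2", "10"]

# Lexical sorting compares character by character, so "10" lands before "2".
lexical = sorted(keys)

# Numerical sorting converts each key before comparing.
numerical = sorted(keys, key=int)
```

Here `lexical` is `["0", "1", "10", "2"]` while `numerical` is `["0", "1", "2", "10"]`, which is the kind of unexpected ordering the fix addresses for collections with ten or more elements.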

Deprecations

  • None

Breaking Changes

  • None

prefect - It Just Keeps Getting Better

Published by joshmeek over 4 years ago

Changelog

0.10.5

Released on Apr 28, 2020.

Features

  • None

Enhancements

  • Added serializer for RemoteDaskEnvironment - #2369
  • server start CLI command now defaults to image build based on current Prefect installation version - #2375
  • Add option to set executor_kwargs on KubernetesJobEnvironment and FargateTaskEnvironment - #2258
  • Add map index to task logs for mapped task runs - #2402
  • Agents can now register themselves with Cloud for better management - #2312
  • Adding support for environment, secrets, and mountPoints via configurable containerDefinitions to the Fargate Agent - #2397
  • Add flag for disabling Docker agent interface check on Linux - #2361

Task Library

  • Add Pushbullet notification task to send notifications to mobile - #2366
  • Add support for Docker volumes and filtering in prefect.tasks.docker - #2384

Fixes

  • Fix Docker storage path issue when registering flows on Windows machines - #2332
  • Fix issue with refreshing Prefect Cloud tokens - #2409
  • Resolve invalid escape sequence deprecation warnings - #2414

Deprecations

  • None

Breaking Changes

  • None

prefect - Unmerge Previous Merge to Fix Extraneous Merge

Published by joshmeek over 4 years ago

Changelog

0.10.4

Released on Apr 21, 2020.

Enhancements

  • Agent connection step shows which endpoint it is connected to and checks API connectivity - #2372

Breaking Changes

  • Revert changes to ifelse & switch (added in #2310), removing implicit creation of merge tasks - #2379

prefect - Easy-Bake Endpoint

Published by joshmeek over 4 years ago

Changelog

0.10.3

Released on Apr 21, 2020.

Features

  • None

Enhancements

  • Allow GraphQL endpoint configuration via config.toml for remote deployments of the UI - #2338
  • Add option to connect containers created by Docker agent to an existing Docker network - #2334
  • Expose datefmt as a configurable logging option in Prefect configuration - #2340
  • The Docker agent configures containers to auto-remove on completion - #2347
  • Use YAML's safe load and dump commands for the server start CLI command - #2352
  • New RemoteDaskEnvironment specifically for running Flows on an existing Dask cluster - #2367

Task Library

  • None

Fixes

  • Fix auth create-token CLI command specifying deprecated role instead of scope - #2336
  • Fix local schedules not continuing to schedule on errors outside of runner's control - #2133
  • Fix get_latest_cached_states pulling incorrect upstream cached states when using Core server as the backend - #2343

Deprecations

  • None

Breaking Changes

  • None

prefect - Can We Fix It? Yes We Can!

Published by joshmeek over 4 years ago

Changelog

0.10.2

Released on Apr 14, 2020.

Features

  • None

Enhancements

  • Task logical operators (e.g. And, Or, ...) no longer implicitly cast to bool - #2303
  • Allow for dynamically changing secret names at runtime - #2302
  • Update ifelse and switch to return tasks representing the output of the run branch - #2310

Task Library

  • Rename the base secret tasks for clarity - #2302

Fixes

  • Fix possible subprocess deadlocks when sending stdout to subprocess.PIPE - #2293, #2295
  • Fix issue with Flow registration to non-standard Cloud backends - #2292
  • Fix issue with registering Flows with Server that have required scheduled Parameters - #2296
  • Fix interpolation of config for dev services CLI for Apollo - #2299
  • Fix pytest Cloud and Core server backend fixtures - #2319
  • Fix AzureResultHandler choosing an empty Secret over provided connection string - #2316
  • Fix containers created by Docker agent on Linux not being able to reach out to host API - #2324
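The first fix above mentions subprocess deadlocks when stdout is sent to subprocess.PIPE. The usual cause is a child that fills the pipe buffer while the parent waits for it to exit without reading. A common remedy, sketched below with the standard library, is to drain the pipe with `communicate()`; `run_and_capture` is a hypothetical helper, and this is not necessarily the change made in #2293 and #2295.

```python
import subprocess
import sys


def run_and_capture(args):
    """Run a command and return (returncode, combined output).

    Hypothetical helper for illustration.
    """
    proc = subprocess.Popen(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    # communicate() reads the pipe until EOF while waiting for exit, so the
    # child can never block on a full buffer. Calling proc.wait() first and
    # reading afterwards is the pattern that can deadlock.
    output, _ = proc.communicate()
    return proc.returncode, output


# The child writes more than a typical pipe buffer holds.
code, output = run_and_capture([sys.executable, "-c", "print('x' * 200000)"])
```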

Deprecations

  • None

Breaking Changes

  • Remove env_var initialization from EnvVarSecret in favor of name - #2302

prefect - Port of Call

Published by joshmeek over 4 years ago

Changelog

0.10.1

Released on Apr 7, 2020.

Features

  • CI build for prefect server images - #2229, #2275
  • Allow kwargs to boto3 in S3ResultHandler - #2240

Enhancements

  • Add flags to prefect server start for disabling service port mapping - #2228
  • Add options to prefect server start for mapping to host ports - #2228
  • Return flow_run_id from CLI run methods for programmatic use - #2242
  • Add JSON output option to describe CLI commands - #1813
  • Add ConstantResult for eventually replacing ConstantResultHandler - #2145
  • Add new diagnostics mode for timing requests made to Cloud - #2283

Task Library

  • Make project_name optional for FlowRunTask to allow for use with Prefect Core's server - #2266
  • Adds prefect.tasks.docker.container.RemoveContainer

Fixes

  • Fix S3ResultHandler safe retrieval of _client attribute - #2232
  • Change default log timestamp value in database to be identical to other tables instead of a hard coded value - #2230

Deprecations

  • None

Breaking Changes

  • None

prefect - Open Source Database Backend, GraphQL API and UI

Published by joshmeek over 4 years ago

Changelog

0.10.0

Released on Mar 29, 2020.

Features

  • Open source database backend, GraphQL API and UI - #2218
  • Add prefect server start CLI command for spinning up database and UI - #2214

Enhancements

  • Add ValidationFailed state and signal in anticipation of validating task outputs - #2143
  • Add max polling option to all agents - #2037
  • Add GCSResult type - #2141
  • Add Result.validate method that runs validator functions initialized on Result - #2144
  • Convert all GraphQL calls to have consistent casing - #2185 #2198
  • Add prefect backend CLI command for switching between Prefect Core server and Prefect Cloud - #2203
  • Add prefect run server CLI command for starting flow runs without use of project name - #2203
  • Make project_name optional during flow registration to support Prefect Core's server - #2203
  • Send flow run and task run heartbeat at beginning of run time - #2203

Task Library

  • None

Fixes

  • Fix issue with heartbeat failing if any Cloud config var is not present - #2190
  • Fix issue where run cloud CLI command would pull final state before last batch of logs - #2192
  • Fix issue where the S3ResultHandler would attempt to access uninitialized attribute - #2204

Deprecations

  • None

Breaking Changes

  • Drop support for Python 3.5 - #2191
  • Remove Client.write_run_log - #2184
  • Remove Client.deploy and flow.deploy - #2183

Contributors

  • None

prefect - WFH Release

Published by joshmeek over 4 years ago

Changelog

0.9.8

Released on Mar 18, 2020.

Features

  • None

Enhancements

  • Update Cloud config name for heartbeat settings - #2081
  • Add examples to Interactive API Docs - #2122
  • Allow users to skip Docker healthchecks - #2150
  • Add exists, read, and write interfaces to Result - #2139
  • Add Cloud UI links to Slack Notifications - #2112

Task Library

  • None

Fixes

  • Fix S3ResultHandler use of a new boto3 session per thread - #2108
  • Fix issue with stateful function reference deserialization logic mutating state - #2159
  • Fix issue with DateClock serializer - #2166
  • Fix issue with scheduling required parameters - #2166

Deprecations

  • Deprecate cache_* and result_handler options on Task and Flow objects - #2140

Breaking Changes

  • None

prefect - Update Task Attribute Retrieval

Published by joshmeek over 4 years ago

Changelog

0.9.7

Released on Mar 4, 2020.

Fixes

  • Change task.log_stdout retrieval from task runner to getattr in order to preserve running flows of older 0.9.x versions - #2120