prefect

Prefect is a workflow orchestration tool empowering developers to build, observe, and react to data pipelines

APACHE-2.0 License

Downloads: 8.4M
Stars: 14.5K
Committers: 148

prefect - Remember when Python 2 was a thing?

Published by cicdw almost 4 years ago

Changelog

Released on November 17, 2020.

Enhancements

  • Experimental support for Python 3.9 - #3411

Fixes

  • Fixes Flow.replace freezing reference tasks - #3655
  • Fixed bug where flow.serialized_hash() could return inconsistent values across new python instances - #3654

prefect - Why dig up artifacts when you can create them?

Published by joshmeek almost 4 years ago

Changelog

0.13.15

Released on November 11, 2020.

Features

  • Add API for storing task run artifacts in the backend - #3581

Enhancements

  • Allow for setting Client headers before loading tenant when running with Prefect Server - #3515
  • Checkpoint all iterations of Looped tasks - #3619
  • Add ref option to GitHub storage for specifying branches other than master - #3638
  • Added ExecuteNotebook task for running Jupyter notebooks - #3599
  • Pass day_or croniter argument to CronClock and CronSchedule - #3612
  • Client.create_project and prefect create project will skip creating the project if the project already exists - #3630
  • Update deployments extension to AppsV1Api - #3637
  • PrefectSecret and EnvVarSecret tasks no longer require secret names be provided at flow creation time - #3641

Fixes

  • Fix issue with retrying mapped pipelines on dask - #3519
  • Task arguments take precedence when generating task_run_name - #3605
  • Fix breaking change in flow registration with old server versions - #3642
  • Task arguments take precedence when generating templated targets and locations - #3627

Breaking Changes

  • Environment variable config values now parse without requiring escaping backslashes - #3603

prefect - Now with twice as much Fargate

Published by jcrist almost 4 years ago

Changelog

0.13.14

Released on November 5, 2020.

Features

  • flow.register accepts an idempotency key to prevent excessive flow versions from being created - #3590
  • Added flow.serialized_hash() for easy generation of hash keys from the serialized flow - #3590
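
The two entries above work together: a deterministic hash of the serialized flow can serve as the idempotency key, so re-registering unchanged content does not create a new version. The sketch below uses only the standard library and is not Prefect's implementation. `stable_hash` and `InMemoryRegistry` are hypothetical names, and the registry is an in-memory stand-in for a backend.

```python
import hashlib
import json


def stable_hash(serialized: dict) -> str:
    """Hash a JSON-serializable dict deterministically (hypothetical helper)."""
    # sort_keys yields the same byte string regardless of dict insertion order
    payload = json.dumps(serialized, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class InMemoryRegistry:
    """Stand-in for a backend that tracks flow versions (illustration only)."""

    def __init__(self) -> None:
        self.version = 0
        self._last_key = None

    def register(self, serialized: dict, idempotency_key: str) -> int:
        # Bump the version only when the key differs from the last registration
        if idempotency_key != self._last_key:
            self.version += 1
            self._last_key = idempotency_key
        return self.version


registry = InMemoryRegistry()
flow_a = {"name": "etl", "tasks": ["extract", "load"]}
flow_b = {"tasks": ["extract", "load"], "name": "etl"}  # same content, new key order

v1 = registry.register(flow_a, stable_hash(flow_a))
v2 = registry.register(flow_b, stable_hash(flow_b))  # skipped: key is unchanged
```

Sorting keys before hashing is one way to keep the hash stable across interpreter sessions; whether Prefect does exactly this is an assumption here.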

Enhancements

  • Add option to select cursor_type for MySQLFetch task - #3574
  • Add new ECSAgent and ECSRun run config - #3585
  • Display exception information on prefect create project failure - #3589
  • prefect diagnostics no longer displays keys that have values matching the default config - #3593
  • Allow use of multiple image pull secrets in KubernetesAgent, DaskKubernetesEnvironment - #3596
  • Added from to explicitly chain exceptions in src/prefect/tasks/twitter - #3602
  • Add UTC offset to default logging.datefmt; logging timestamp converter now follows Python default behavior - #3607
  • Improve error message when API responds with 400 status code - #3615
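
As a rough illustration of the logging.datefmt change above, the following standard-library sketch formats a log record with a UTC offset appended to the timestamp. The format strings are assumptions chosen for the example, not values copied from Prefect's configuration. Python's default converter for `logging.Formatter` is `time.localtime`, so the offset reflects the local time zone.

```python
import logging

# Assumed format strings for illustration; not copied from Prefect's config
formatter = logging.Formatter(
    fmt="[%(asctime)s] %(levelname)s - %(name)s | %(message)s",
    datefmt="%Y-%m-%d %H:%M:%S%z",  # %z appends the UTC offset, e.g. +0000
)

record = logging.LogRecord(
    name="example",
    level=logging.INFO,
    pathname="example.py",
    lineno=1,
    msg="hello",
    args=(),
    exc_info=None,
)

line = formatter.format(record)
stamp = formatter.formatTime(record, formatter.datefmt)
```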

Deprecations

  • Deprecate prefect agent start <kind> in favor of prefect agent <kind> start - #3610
  • Deprecate prefect agent install <kind> in favor of prefect agent <kind> install - #3610

Changelog

0.13.13

Released on October 27, 2020.

Enhancements

  • Don't stop execution if the task runner fails to load a cached result - #3378
  • Add option to specify networkMode for tasks created by the Fargate Agent - #3546
  • Allow scheduling flow runs at an arbitrary time with StartFlowRun - #3573

Fixes

  • Use BlobServiceClient instead of BlockBlobService to connect to azure blob in azure tasks - #3562
  • Tasks with log_stdout=True work with non-utf8 output - #3563

prefect - There's so much in here, I don't have a name

Published by joshmeek almost 4 years ago

Changelog

0.13.12

Released on October 20, 2020.

Enhancements

  • Agents now submit flow runs in order of scheduled start times - #3165
  • Updating k8s tutorial docs to include instructions on how to provide access to S3 from kubernetes deployments on AWS - #3200
  • Adds option to specify default values for GetItem and GetAttr tasks - #3489
  • Allow disabling default storage labels for the LocalAgent - #3503
  • Improve overall functionality of docs search - #3504
  • Add LocalRun implementation for run_config based flows - #3527
  • Add DockerRun implementation for run_config based flows - #3537
  • Raise a better error message when trying to register a flow with a schedule using custom filter functions - #3450
  • RenameFlowRunTask: use default flow_run_id value from context - #3548
  • Raise a better error message when trying to register a flow with parameters with JSON-incompatible defaults - #3549

Task Library

  • Extended GCSUpload task to allow uploading of bytes/gzip data - #3507
  • Allow setting runtime webhook_secret on SlackTask and kwarg for webhook_secret retrieved from PrefectSecret task - #3522

Fixes

  • Fix get flow-runs and describe flow-runs CLI commands querying of removed duration field - #3517
  • Fix multiprocess based timeout handler on linux - #3526
  • Fix API doc generation incorrectly compiling mocked imports - #3504
  • Fix multiprocessing scheduler failure while running tasks with timeouts - #3511
  • Update Fargate task definition validation - #3514
  • Fix bug in k8s where the resource-manager would sometimes silently crash on errors - #3521
  • Add labels from flow.storage for run_config based flows - #3527
  • Fix LocalAgent PYTHONPATH construction on Windows - #3551

Deprecations

  • FlowRunTask, RenameFlowRunTask, and CancelFlowRunTask have been renamed to StartFlowRun, RenameFlowRun, and CancelFlowRun respectively - #3539

prefect - More external contributors than Prefect has engineers

Published by jcrist about 4 years ago

Changelog

0.13.11

Released on October 14, 2020.

Features

  • Allow for schedules that emit custom Flow Run labels - #3483

Enhancements

  • Use explicit exception chaining - #3306
  • S3List filtering using the LastModified value - #3460
  • Add Gitlab storage - #3461
  • Extend module storage capabilities - #3463
  • Support adding additional flow labels in prefect register flow - #3465
  • Strict Type for default value of a Parameter - #3466
  • Enable automatic script upload for file-based storage when using S3 and GCS - #3482
  • Allow for passing labels to client.create_flow_run - #3483
  • Display flow group ID in registration output URL instead of flow ID to avoid redirect in UI - #3500
  • Add informative error log when local storage fails to load flow - #3475

Task Library

  • Add cancel flow run task - #3484
  • Add new BatchSubmit task for submitting jobs to AWS batch - #3366
  • Add new AWSClientWait task for waiting on long-running AWS jobs - #3366
  • Add GetAttr task - #3481

Fixes

  • Fix default profile directory creation behavior - #3037
  • Fix DaskKubernetesEnvironment overwriting log attributes for custom specs - #3231
  • Fix default behavior for dbt_kwargs in the dbt task to provide an empty string - #3280
  • Fix containerDefinitions environment validation - #3452
  • Raise a better error when calling flow.register() from within a Flow context - #3467
  • Fix task cancellation on Python 3.8 to properly interrupt long blocking calls - #3474

prefect - What's in a name?

Published by cicdw about 4 years ago

Changelog

0.13.10

Released on October 6, 2020.

Enhancements

  • Add option to template task run name at runtime when using backend API - #2100
  • Add set_task_run_name Client function - #2100
  • Use from to explicitly chain exceptions - #3306
  • Update error message when registering flow to non-existent project - #3418
  • Add flow.run_config, an experimental design for configuring deployed flows - #3333
  • Allow python path in Local storage - #3351
  • Enable agent registration for server users - #3385
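
For readers unfamiliar with the explicit exception chaining mentioned in #3306, here is a minimal standard-library sketch. `ConfigError` and `load_port` are hypothetical names used only for illustration; they are not part of Prefect.

```python
class ConfigError(Exception):
    """Hypothetical error type used only for this illustration."""


def load_port(settings: dict) -> int:
    try:
        return int(settings["port"])
    except KeyError as exc:
        # "from exc" records the original error as __cause__, so the
        # traceback shows it as the direct cause of the new exception
        raise ConfigError("missing 'port' setting") from exc


try:
    load_port({})
except ConfigError as err:
    cause = err.__cause__  # the original KeyError
```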

Task Library

  • Add keypair auth for snowflake - #3404
  • Add new RenameFlowRunTask for renaming a currently running flow - #3285

Fixes

  • Fix mypy typing for target kwarg on base Task class - #2100
  • Fix Fargate Agent not parsing cpu and memory provided as integers - #3423
  • Fix MySQL Tasks breaking on opening a context - #3426

prefect - Small But Mighty

Published by joshmeek about 4 years ago

Changelog

0.13.9

Released on September 29, 2020.

Features

  • Allow for scheduling the same flow at the same time with multiple parameter values - #2510

Enhancements

  • Adopt explicit exception chaining in more places - #3306
  • Add DateTimeParameter - #3327

Task Library

  • New task for the task library to create an item in Monday - #3387
  • Add option to specify run_name for FlowRunTask - #3393

prefect - Practice makes prefect

Published by jcrist about 4 years ago

0.13.8

Released on September 22, 2020.

Enhancements

  • Allow passing context values as JSON string from CLI - #3347
  • Allow copying of directories into Docker image - #3299
  • Adds schedule filters for month end or month start and specific day - #3330
  • Support configuring executor on flow, not on environment - #3338
  • Support configuring additional docker build commands on Docker storage - #3342
  • Support submission retries within the k8s agent - #3344
  • Expose flow_run_name to flow.run() for local runs - #3364

Task Library

  • Add contributing documentation for task library - #3360
  • Remove duplicate task library documentation in favor of API reference docs - #3360

Fixes

  • Fix issue with constants when copying Flows - #3319
  • Fix DockerAgent with --show-flow-logs to work on windows/osx (with python >= 3.8) - #3339
  • Fix mypy type checking for tasks created with prefect.task - #3346
  • Fix bug in flow.visualize() where no output would be generated when running with PYTHONOPTIMIZE=1 - #3352
  • Fix typo in DaskCloudProviderEnvironment logs - #3354

Deprecations

  • Deprecate the use of the /contrib directory - #3360
  • Deprecate importing Databricks and MySQL tasks from prefect.contrib.tasks, should use prefect.tasks instead - #3360

prefect - How many devs does it take to fix some bugs?

Published by cicdw about 4 years ago

0.13.7

Released on September 16, 2020.

Enhancements

  • Use explicit exception chaining - #3306
  • Quiet Hasura logs with prefect server start - #3296

Fixes

  • Fix issue with result configuration not being respected by autogenerated tasks - #2989
  • Fix issue with result templating that failed on task arguments named 'value' - #3034
  • Fix issue restarting Mapped pipelines with no result - #3246
  • Fix handling of Prefect Signals when Task state handlers are called - #3258
  • Allow using apply_map under a case or resource_manager block - #3293
  • Fix bug with interaction between case blocks and Constant tasks which resulted in some tasks never skipping - #3293
  • Fix bug in DaskExecutor where not all client timeouts could be configured via setting distributed.comm.timeouts.connect - #3317

Task Library

  • Adds a compression argument to both S3Upload and S3Download, allowing for compression of data upon upload and decompression of data upon download - #3259
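
The round trip described above (compress on upload, decompress on download) can be sketched with the standard library. This is not the S3Upload or S3Download code, and it assumes gzip as the compression scheme; no S3 calls are made.

```python
import gzip

# Sample payload standing in for the data a task would upload
data = b"col_a,col_b\n1,2\n" * 100

compressed = gzip.compress(data)        # what would be stored in the bucket
restored = gzip.decompress(compressed)  # what a download step would return
```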

prefect - I <3 Contributors

Published by joshmeek about 4 years ago

Changelog

0.13.6

Released on September 9, 2020.

Enhancements

  • Adds logger to global context to remove friction on running task unit tests - #3256
  • Expand FunctionTask AttributeError Message - #3248
  • Add backend info to diagnostics - #3265
  • Ellipsis Support for GraphQL DSL - #3268

Task Library

  • Add DatabricksRunNow task for running Spark jobs on Databricks - #3247
  • Add GitHub CreateIssueComment task - #3269
  • Add S3List task for listing keys in an S3 bucket - #3282
  • Add boto_kwargs to AWS tasks - #3275

Fixes

  • Make identifier optional in KubernetesAgent.replace_job_spec_yaml() - #3251
  • Change https://localhost to http://localhost in the welcome message - #3271

prefect - Incrementally better than 0.13.4

Published by jcrist about 4 years ago

Changelog

0.13.5

Released on September 1, 2020.

Enhancements

  • Begin storing the width of mapped pipelines on the parent Mapped state - #3233
  • Kubernetes agent now manages lifecycle of prefect jobs in its namespace - #3158
  • Move agent heartbeat to background thread - #3158
  • Handles ModuleNotFound errors in the storage healthcheck - #3225
  • Raises the warnings.warn stack level to 2 to reduce duplicate warning messages - #3225
  • Add some extra output to the client.register print output for visibility - #3225
  • CLI help text docstrings are now auto documented using the API documentation parser - #3225
  • DaskExecutor now logs dask worker add/removal events - #3227
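
The warnings.warn stack level change in #3225 can be illustrated with a small standard-library sketch. `old_api` and `caller` are hypothetical functions; the point is only that `stacklevel=2` attributes the warning to the calling line rather than to the line inside the library function.

```python
import warnings


def old_api():
    # stacklevel=2 attributes the warning to the caller of old_api(),
    # not to this line inside the library function
    warnings.warn("old_api is deprecated", DeprecationWarning, stacklevel=2)


def caller():
    old_api()  # the warning is reported against this line


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    caller()

reported_line = caught[0].lineno
```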

Fixes

  • Fix issue with passing --env-vars flag to K8s Agent Install manifest - #3239
  • Fix edge case with add_edge method - #3230

Deprecations

  • Kubernetes resource manager is now deprecated and the functionality is moved into the Kubernetes agent - #3158

prefect - 5 Enhancements, 1 new Task, 3 Bug Fixes

Published by cicdw about 4 years ago

Changelog

0.13.4

Released on August 25, 2020.

Enhancements

  • Allow for setting path to a custom job YAML spec on the Kubernetes Agent - #3046
  • Use better coupled versioning scheme for Core / Server / UI images - #3204
  • Added option to mount volumes with KubernetesAgent - #1234
  • Add more kwargs to State.children and State.parents for common access patterns - #3212
  • Reduce size of prefecthq/prefect Docker image - #3215

Task Library

  • Add DatabricksSubmitRun task for submitting Spark jobs on Databricks - #3166

Fixes

  • Fix Apollo service error output while waiting for GraphQL service with prefect server start - #3150
  • Fix --api CLI option not being respected by agent Client - #3186
  • Fix state message when using targets - #3216

prefect - A Log Log Time Ago

Published by joshmeek about 4 years ago

Changelog

0.13.3

Released on August 18, 2020.

Enhancements

  • Make use of kubernetes extra logger in the DaskKubernetesEnvironment optional - #2988
  • Make Client robust to simplejson - #3151
  • Raise Warning instead of Exception during storage healthcheck when Result type is not provided - #3146
  • Add server create-tenant for creating a tenant on the server - #3147
  • Cloud logger now responds to logging level - #3179

Task Library

  • Add support for host_config and arbitrary keyword arguments in Docker tasks - #3173

Fixes

  • Fix empty string imagePullSecrets issue on AKS by removing if not set - #3142
  • Fix querying for cached states with no cache_key - #3168
  • Fix access to core_version in Agent's get_flow_run_command() - #3177

Breaking Changes

  • DaskKubernetesEnvironment no longer logs Kubernetes errors by default - #2988
  • Logging level in Cloud now defaults to INFO - #3179

prefect - Pandas don't even eat cereal

Published by jcrist about 4 years ago

Changelog

0.13.2

Released on August 11, 2020.

Enhancements

  • Agents set flow run execution command based on flow's core version - #3113
  • Clean up extra labels on jobs created by Kubernetes agent - #3129

Task Library

  • Return LoadJob object in BigQueryLoad tasks - #3086

Fixes

  • Fix bug with LocalDaskExecutor('processes') that allowed tasks to be run multiple times in certain cases - #3127
  • Add toggle to bypass bug in slack_notifier that attempted to connect to backend even if the backend didn't exist - #3136

prefect - Fix Default Agent Execute Command

Published by joshmeek about 4 years ago

Changelog

0.13.1

Released on August 6, 2020.

Fixes

  • Fix issue with 0.13.0 agents not able to run Flows registered with older Core versions - #3111

prefect - We should take Prefect Server... and push it somewhere else!

Published by joshmeek about 4 years ago

Changelog

0.13.0

Released on August 6, 2020.

Features

  • Support cancellation of active flow runs - #2942
  • Add Webhook storage - #3000

Enhancements

  • Only supply versions when setting SUBMITTED and RUNNING states - #2730
  • Gracefully recover from version lock errors - #2731
  • Add --ui-version server start CLI option to run a specific UI image - #3087
  • Agent querying of flow runs now passes active tenant ID - #3087
  • Ignore calls to flow.register when parsing a flow using file based storage - #3051

Task Library

  • Allow idempotency keys in FlowRunTask when using server backend - #3006
  • Require project name in FlowRunTask when using server backend - #3006

Fixes

  • Fix use of absolute path in Docker storage on Windows - #3044
  • Determine if checkpointing is enabled from config set in the flow-runner process - #3085
  • Fix --no-ui server start CLI option still attempting to pull UI image - #3087

Deprecations

  • Deprecate execute cloud-flow CLI command in favor of execute flow-run - #3087
  • Deprecate run server/cloud CLI commands in favor of run flow - #3087

Breaking Changes

  • Move server and UI code out into separate repositories - #3087
  • Project names are now required when managing flows with the core server - #3087

prefect - Got some new code for y'all

Published by cicdw about 4 years ago

Changelog

0.12.6

Released on July 28, 2020.

Features

  • Add flatten operator for unnesting and flat-maps - #2898
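
Conceptually, a flat-map unnests one level of a nested collection and then maps over the resulting items. The plain-Python sketch below shows that idea only; it does not use Prefect's flatten operator, and the variable names are illustrative.

```python
from itertools import chain

nested = [[1, 2], [3], [4, 5, 6]]

# Unnest exactly one level, then map a function over the flat items
flat = list(chain.from_iterable(nested))
doubled = [x * 2 for x in flat]
```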

Enhancements

  • Add retry_on_api_error flag to client methods - #3012
  • Add reg_allow_list option for Docker Agent - #3026
  • Update FargateTaskEnvironment to throw if task definition is inconsistent with existing task definition - #3031

Fixes

  • Cleanup to ShellTask to close open stdout file which was observable in some cases - #3002
  • Fix check of flow existence in storage object get_flow to only occur when provided - #3027
  • Use full name and tag when Docker Storage determines if build was successful - #3029
  • Prevent duplicated agent labels - #3029

Deprecations

  • prefect.utilities.tasks.unmapped moved to prefect.utilities.edges.unmapped - #2898

Breaking Changes

  • Remove dbt extra from dependencies - #3018

prefect - Go with the FlowRunTask

Published by lauralorenz about 4 years ago

Changelog

Released on July 21, 2020.

Features

  • Add resource_manager api for cleaner setup/cleanup of temporary resources - #2913

Enhancements

  • Add new_flow_context to FlowRunTask for configurable context - #2941
  • All storage types now support file-based storage - #2944
  • Turn work stealing ON by default on Dask K8s environment - #2973
  • Send regular heartbeats while waiting to retry / dequeue - #2977
  • Cached states now validate based on hashed_inputs for more efficient storage - #2984
  • Simplify creation of optional parameters with default of None - #2995

Task Library

  • Implement AWSSecretsManager task - #2069
  • Update return value and config for DbtShellTask - #2980

Fixes

  • Don't send idempotency key when running against a local backend - #3001
  • Fix bug in DaskExecutor when running with external cluster where dask clients could potentially be leaked - #3009

Deprecations

  • All states have deprecated the usage of cached_inputs - #2984

Breaking Changes

  • Remove password from Postgres tasks' initialization methods for security - #1345

prefect - "See namespaced job run, RunNamespacedJob run"

Published by jcrist over 4 years ago

Changelog

0.12.4

Released on July 14, 2020.

Enhancements

  • Improve output formatting of prefect describe CLI - #2934
  • Add new wait kwarg to Flow Run Task for reflecting the flow run state in the task - #2935
  • Separate build-time and run-time job spec details in KubernetesJobEnvironment - #2950

Task Library

  • Implement RunNamespacedJob task for Kubernetes - #2916
  • Add log_stderr option to ShellTask and DbtShellTask for logging the full output from stderr - #2961

Fixes

  • Ensure is_serializable always uses same executable for subprocess - #1262
  • Fix issue with Mapped tasks not always reloading child state results on reruns - #2656
  • Fix FargateTaskEnvironment attempting to retrieve authorization token when not present - #2940
  • Fix issue with Metastates compounding - #2965
