Prefect is a workflow orchestration tool empowering developers to build, observe, and react to data pipelines.
Apache-2.0 License
Published by cicdw over 5 years ago
Released July 16, 2019
- `execute cloud-flow` CLI immediately sets the flow run state to `Failed` if the environment fails - #1122
- `auto_generated` property to Tasks for convenient filtering - #1135
- `Client.login()` with API tokens - #1240
- `prefect run cloud` command - #1241
- `marshmallow==3.0.0rc7` - #1151
- `flow.update` not preserving mapped edges - #1164
- `prefect execute-flow` and `prefect execute-cloud-flow` no longer exist - #1059
- The `slack_notifier` state handler now uses a `webhook_secret` kwarg to pull the URL from a Secret - #1075
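The `slack_notifier` change above swaps a hard-coded URL for a named Secret resolved at call time. A minimal sketch of that pattern (the secret store, names, and handler signature below are illustrative, not Prefect's internals):

```python
# Minimal sketch of a notifier-style state handler with a configurable
# secret name. SECRETS and make_notifier are illustrative stand-ins,
# not Prefect's actual secret machinery.
SECRETS = {"SLACK_WEBHOOK_URL": "https://hooks.example.com/demo"}

def make_notifier(webhook_secret="SLACK_WEBHOOK_URL"):
    """Return a state handler that looks up its webhook URL by secret name."""
    def handler(task, old_state, new_state):
        # The URL is resolved when the handler fires, so rotating the
        # secret does not require rebuilding the handler.
        url = SECRETS[webhook_secret]
        print(f"POST {url}: {task} moved {old_state} -> {new_state}")
        return new_state
    return handler

notify = make_notifier()
notify("my-task", "Pending", "Failed")
```

Resolving the secret inside the handler, rather than at construction time, is what lets the webhook URL live outside the flow's serialized definition.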
- `CloudResultHandler` is no longer the default result handler - #1198
- `LocalStorage` renamed to `Local` - #1236
Published by cicdw over 5 years ago
This release fixes a versioning issue caused by a new release of `marshmallow-oneofschema` that is incompatible with the `marshmallow` version pin in 0.5.4.
Published by cicdw over 5 years ago
Released May 28, 2019
- `UnionSchedule` for combining multiple schedules, allowing for complex schedule specifications - #428
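A union of schedules can be thought of as a sorted, de-duplicated merge of each member's upcoming event times. A minimal sketch of that idea (the `union_next` helper and toy schedule callables are hypothetical, not Prefect's `UnionSchedule` API):

```python
from datetime import datetime, timedelta

# Illustrative sketch: merge the upcoming event times of several
# "schedules" (here, plain callables returning datetimes) into one
# sorted, de-duplicated stream -- the core idea behind a schedule union.
def union_next(schedules, n, after):
    """Return the next n datetimes across all schedules, sorted and deduped."""
    times = sorted({t for s in schedules for t in s(after) if t > after})
    return times[:n]

# Two toy schedules: one hourly for a day, one daily for a week.
hourly = lambda after: [after + timedelta(hours=i) for i in range(1, 25)]
daily = lambda after: [after + timedelta(days=i) for i in range(1, 8)]

start = datetime(2019, 5, 28)
print(union_next([hourly, daily], 3, start))
```

Note the set comprehension: when two member schedules fire at the same instant (here, hour 24 and day 1 coincide), the union emits a single event rather than two.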
- `prefect_version` kwarg to `Docker` storage for controlling the version of prefect installed into your containers - #1010, #533
- `SlackTask` can pull the Slack webhook URL from a custom-named Secret - #1023
- `__repr__`s for various classes, to remove inconsistencies - #617
- `all_extras` - #1057
- `configuration.load_configuration()` - #1037
- `ValueError`s when not found in context - #1047
Published by cicdw over 5 years ago
Released May 7, 2019
- `storage` keyword - #936
- `environment` argument now defaults to a `CloudEnvironment` - #936
- `Queued` states accept `start_time` arguments - #955
- `Bytes` and `Memory` storage classes for local testing - #956, #961
- `LocalEnvironment` execution environment for local testing - #957
- `Aborted` state for Flow runs which are cancelled by users - #959
- `execute-cloud-flow` CLI command for working with cloud deployed flows - #971
- `flows.run_on_schedule` configuration option for affecting the behavior of `flow.run` - #972
- `manual_only` triggers can be root tasks - #667
- `filename` keyword to `flow.visualize` for automatically saving visualizations - #1001
- `LocalStorage` option for storing Flows locally - #1006
- `run_flow` loading now decodes properly by using cloudpickle - #978
- `flow.id` and `task.id` attributes - #940
- `flow.deploy` and `client.deploy` use a `set_schedule_active` kwarg to match Cloud - #991
- `Flow.generate_local_task_ids()` - #992
Published by cicdw over 5 years ago
Released April 19, 2019
- `DaskExecutor(local_processes=True)` supports timeouts - #886
- `Secret.get()` from within a Flow context raises an informative error - #927
- `Task.set_upstream` and `Task.set_downstream` for handling keyed and mapped dependencies - #823
- `is_submitted` added to states - #944
- `ClientFailed` state - #938
- `all_failed` triggers running if an upstream Client call fails in Cloud - #938
- `prefect make user config` removed from CLI commands - #904
- `set_schedule_active` keyword in Flow deployments changed to `set_schedule_inactive` to match Cloud - #941
Published by cicdw over 5 years ago
Released April 4, 2019
- `S3ResultHandler` for handling results to / from S3 buckets - #879
- `Cached` states across flow runs in Cloud - #885
- `pytest` (4.3) - #814
- `Client.deploy` accepts an optional `build` kwarg for avoiding building the Flow environment - #876
- `distributed` updated to 1.26.1 for enhanced security features - #878
- `ParseRSSFeed` for parsing a remote RSS feed - #856
- Issue preventing `flow.run` from properly using cached tasks - #861
- `flow.visualize` fixed so that it runs on Windows machines - #858
- `GCSResultHandler` was not pickleable - #879
- `Dict` task signature changed from `run(**task_results)` to `run(keys, values)` - #894
Published by cicdw over 5 years ago
Released March 24, 2019
- `checkpoint` option for individual `Task`s, as well as a global `checkpoint` config setting for storing the results of Tasks using their result handlers - #649
- `defaults_from_attrs` decorator to easily construct `Task`s whose attributes serve as defaults for `Task.run` - #293
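The pattern behind `defaults_from_attrs` is a decorator that falls back to same-named instance attributes whenever a `run` kwarg is omitted. A minimal sketch of that pattern (illustrative, not Prefect's implementation; `GreetTask` is a made-up example class):

```python
import functools

# Minimal sketch of a defaults_from_attrs-style decorator: if a kwarg
# is missing or None at call time, fall back to the attribute of the
# same name on the instance. Illustrative only, not Prefect's code.
def defaults_from_attrs(*attrs):
    def decorator(run):
        @functools.wraps(run)
        def wrapper(self, **kwargs):
            for name in attrs:
                if kwargs.get(name) is None:
                    kwargs[name] = getattr(self, name)
            return run(self, **kwargs)
        return wrapper
    return decorator

class GreetTask:
    def __init__(self, greeting="hello"):
        self.greeting = greeting

    @defaults_from_attrs("greeting")
    def run(self, greeting=None):
        return greeting

task = GreetTask(greeting="hi")
print(task.run())                 # falls back to the attribute -> "hi"
print(task.run(greeting="hey"))   # an explicit kwarg wins -> "hey"
```

The attribute set at construction time acts as the default, while an explicit kwarg at call time always takes precedence.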
- `OneTimeSchedule` for one-time execution at a specified time - #680
- `flow.run` is now a blocking call which will run the Flow, on its schedule, and execute full state-based execution (including retries) - #690
- `prefect.context` populated with various formatted date strings during execution - #704
- `State` objects store fully hydrated `Result` objects which track information about how results should be handled - #612, #616
- `google.cloud.storage` is an optional extra requirement so that the `GCSResultHandler` can be exposed better - #626
- `start_time` check for `Scheduled` flow runs, similar to the one for Task runs - #605
- `createProject` mutation function added to the client - #633
- `Result` interface split into `Result` and `SafeResult` - #649
- `manual_only` trigger will pass if `resume=True` is found in context, which indicates that a `Resume` state was passed - #664
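The `manual_only` / `resume` interaction above can be sketched as a trigger that blocks unless a resume flag appears in context. The `PauseSignal` class and trigger function below are illustrative stand-ins, not Prefect's internals:

```python
# Illustrative sketch of manual_only trigger semantics: the trigger
# raises a pause signal unless resume=True is present in context
# (which a Resume state would have injected). Not Prefect's code.
class PauseSignal(Exception):
    """Raised to halt execution until a human resumes the task."""

def manual_only(upstream_states, context):
    if context.get("resume") is True:
        return True  # a Resume state put resume=True into context
    raise PauseSignal("manual approval required")

assert manual_only({}, {"resume": True}) is True
try:
    manual_only({}, {})
except PauseSignal:
    print("paused")
```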
- `defaults_from_attrs` now accepts a splatted list of arguments - #676
- `flow.run(on_schedule=True)` for local execution - #680
- `helper_fns` keyword to `ShellTask` for pre-populating helper functions to commands - #681
- `context` pulled from Cloud when running flows - #699
- `Queued` state - #705
- `flow.serialize()` will always serialize its environment, regardless of `build` - #696
- `flow.deploy()` now raises an informative error if your container cannot deserialize the Flow - #711
- `_MetaState` as a parent class for states that modify other states - #726
- `flow` keyword argument to `Task.set_upstream()` and `Task.set_downstream()` - #749
- `is_retrying()` helper method to all `State` objects - #753
- `None` - #753
- `CronSchedule` - #729
- `idempotency_key` and `context` arguments to `Client.create_flow_run` - #757
- `EmailTask` made more secure by pulling credentials from secrets - #706
- `GCSUpload` and `GCSDownload` for uploading / retrieving string data to / from Google Cloud Storage - #673
- `BigQueryTask` and `BigQueryInsertTask` for executing queries against BigQuery tables and inserting data - #678, #685
- `FilterTask` for filtering out lists of results - #637
- `S3Download` and `S3Upload` for interacting with data stored on AWS S3 - #692
- `AirflowTask` and `AirflowTriggerDAG` tasks to the task library for running individual Airflow tasks / DAGs - #735
- `OpenGitHubIssue` and `CreateGitHubPR` tasks for interacting with GitHub repositories - #771
- `GetRepoInfo` for pulling GitHub repository information - #816
- `Exception`s' call signature could not be inspected - #513
- `**kwargs` - #658
- `JinjaTemplate` not being pickleable - #710
- `json.loads` - #716
- `IntervalSchedules` didn't respect daylight saving time after serialization - #729
- `BokehRunner` and associated webapp - #609
- `ResultHandler` methods renamed from `serialize` / `deserialize` to `write` / `read` - #612
- `State` objects store fully hydrated `Result` objects which track information about how results should be handled - #612, #616
- `Client.create_flow_run` now returns a string instead of a `GraphQLResult` object to match the API of `deploy` - #630
- `flow.deploy` and `client.deploy` require a `project_name` instead of an ID - #633
- `cached_inputs` - #591
- `Match` task (used inside control flow) renamed to `CompareValue` - #638
- `Client.graphql()` now returns a response with up to two keys (`data` and `errors`). Previously the `data` key was automatically selected - #642
- `ContainerEnvironment` was changed to `DockerEnvironment` - #670
- `from_file` was moved to `utilities.environments` - #670
- `start_tasks` argument removed from `FlowRunner.run()` and `check_upstream` argument from `TaskRunner.run()` - #672
- `flow.run` is now a blocking call which will run the Flow, on its schedule, and execute full state-based execution (including retries) - #690
- `make_return_failed_handler` removed, as `flow.run` now returns all task states - #693
- `AirflowTask` in the task library for running individual Airflow tasks - #735
- `name` is now required on all Flow objects - #732
- `Flow.parameters()` always returns a set of parameters - #756
Published by jlowin over 5 years ago
- `on_schedule` kwarg in `flow.run()` - #519
- `ContainerEnvironment`s can be built locally without pushing to a registry - #514
- `cached_inputs` preferred over upstream states, if available - #546
- `FlowRunner.initialize_run()` for manipulating task states and contexts - #548
- `on_failure` kwarg to Tasks and Flows for user-friendly failure callbacks - #551
- `scheduled_start_time` in context for Flow runs - #524
- `metadata` attribute to States for managing user-generated results - #573
- `JSONResultHandler` for all Parameter caching - #590
- `flow.deploy()` attempting to access a nonexistent string attribute - #503
- `Paused` tasks would be treated as `Pending` and run - #535
- `cached_inputs` - #594
- `prefect.client.result_handlers` moved to `prefect.engine.result_handlers` - #512
- `inputs` kwarg removed from `TaskRunner.run()` - #546
- `start_task_ids` argument moved from `FlowRunner.run()` to `Environment.run()` - #544, #545
- `timeout` kwarg changed from `timedelta` to `integer` - #540
- `timeout` kwarg removed from `executor.wait` - #569
- `VersionedSchema` dropped in favor of implicit versioning: serializers will ignore unknown fields and the `create_object` method is responsible for recreating missing ones - #583
- `CachedState` renamed to a successful state named `Cached`, and the superfluous `cached_result` attribute removed - #586
Published by jlowin almost 6 years ago
- `Flow`, `Task`, `Parameter`, `Edge`, `State`, `Schedule`, and `Environment` objects - #310, #318, #319, #340
- `ResultHandler`s for storing private result data - #391, #394, #430
- `Mapped` states - #485
- `TimedOut` state for task execution timeouts - #255
- `description` and `tags` arguments to `Parameters` - #318
- `key` checks can be skipped in order to create "dummy" flows from metadata - #319
- `names_only` keyword to `flow.parameters` - #337
- `to_dict` convenience method for the `DotDict` class - #341
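For context, a `DotDict` is a dictionary that also allows attribute-style access, and `to_dict` converts one back into plain nested dicts. A minimal sketch of both (illustrative; Prefect's actual class lives in `prefect.utilities.collections`):

```python
# Minimal sketch of a DotDict with a to_dict convenience method.
# Illustrative only -- not Prefect's implementation.
class DotDict(dict):
    """A dict whose string keys are also readable as attributes."""

    def __getattr__(self, key):
        # Called only when normal attribute lookup fails.
        try:
            return self[key]
        except KeyError:
            raise AttributeError(key)

    def to_dict(self):
        """Recursively convert back to plain dicts."""
        return {
            k: v.to_dict() if isinstance(v, DotDict) else v
            for k, v in self.items()
        }

d = DotDict(a=1, nested=DotDict(b=2))
print(d.nested.b)   # -> 2
print(d.to_dict())  # -> {'a': 1, 'nested': {'b': 2}}
```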
- `ini` file specification - #347
- `toml` file - #361
- `map_index` into the `TaskRunner`s - #373
- `start_date` and `end_date` parameters - #375
- `DateTime` marshmallow field for timezone-aware serialization - #378
- `client.deploy` method for adding new flows to the Prefect Cloud - #388
- `id` attribute to `Task` class - #416
- `Resume` state for resuming from `Paused` tasks - #435
- `Submitted` state for signaling that `Scheduled` tasks have been handled - #445
- `ContainerEnvironment`s - #453
- `set_secret` method to Client for creating and setting the values of user secrets - #452
- `CloudTaskRunner` and `CloudFlowRunner` classes - #431
- `engine` classes from config - #477
- `GraphQLResult` reprs - #374
- `CronSchedule` produces expected results across daylight savings time transitions - #375
- `utilities.serialization.Nested` properly respects `marshmallow.missing` values - #398
- `flow.visualize()` so that mapped flow states can be passed and colored - #387
- `IntervalSchedule` was serialized at "second" resolution, not lower - #427
- `SKIP` signals were preventing multiple layers of mapping - #455
- `flow.visualize()` - #454
- `cached_inputs` weren't being used locally - #434
- `Config.set_nested` would have an error if the provided key was nested deeper than an existing terminal key - #479
- `state_handlers` were not called for certain signals - #494
- `NoSchedule` and `DateSchedule` schedule classes - #324
- `serialize()` method to use schemas rather than a custom dict - #318
- `timestamp` property removed from `State` classes - #305
- `prefect.utilities.json` - #336
- `flow.parameters` now returns a set of parameters instead of a dictionary - #337
- `to_dotdict` -> `as_nested_dict` - #339
- `prefect.utilities.collections.GraphQLResult` moved to `prefect.utilities.graphql.GraphQLResult` - #371
- `SynchronousExecutor` now does not do depth-first execution for mapped tasks - #373
- `prefect.utilities.serialization.JSONField` -> `JSONCompatible`; removed its `max_size` feature, and payloads are no longer automatically serialized as strings - #376
- `prefect.utilities.serialization.NestedField` -> `Nested` - #376
- `prefect.utilities.serialization.NestedField.dump_fn` -> `NestedField.value_selection_fn` for clarity - #377
- `secrets` in context instead of `_secrets` - #382
- `Schedule` parameter renamed from `on_or_after` to `after` - #396
- `dict` keys instead of `str`; some arguments for `ContainerEnvironment` are removed - #398
- `environment.run()` and `environment.build()`; removed the `flows` CLI and replaced it with a top-level CLI command, `prefect run` - #400
- `set_temporary_config` utility now accepts a single dict of multiple config values, instead of just a key/value pair, and is located in `utilities.configuration` - #401
- `click` requirement updated to 7.0, which changes underscores to hyphens at the CLI - #409
- `IntervalSchedule` rejects intervals of less than one minute - #427
- `FlowRunner` returns a `Running` state, not a `Pending` state, when flows do not finish - #433
- `task_contexts` argument removed from `FlowRunner.run()` - #440
- `start_tasks` will not run before their state's `start_time` (if the state is `Scheduled`) - #474
- `DaskExecutor`'s "processes" keyword argument was renamed "local_processes" - #477
- `mapped` and `map_index` kwargs removed from `TaskRunner.run()`; these values are now inferred automatically - #485
- The `upstream_states` dictionary used by the Runners only includes `State` values, not lists of `States`. The use case that required lists of `States` is now covered by the `Mapped` state. - #485
Published by cicdw almost 6 years ago
- `FlowRunner` and `TaskRunner` refactored into modular `Runner` pipelines - #260, #267
- `state_handlers` for `FlowRunners`, `Flows`, `TaskRunners`, and `Tasks` - #264, #267
- `flow.get_tasks()` for easily filtering flow tasks by attribute - #242
- `JinjaTemplateTask` for easily rendering jinja templates - #200
- `PAUSE` signal for halting task execution - #246
- `Paused` state corresponding to the `PAUSE` signal, and a new `pause_task` utility - #251
- `DaskExecutor(processes=True)` - #240
- `is_skipped()` and `is_scheduled()` methods for `State` objects - #266, #278
- `now()` as a default `start_time` for `Scheduled` states - #278
- `Signal` classes now pass arguments to underlying `State` objects - #279
- `Retrying` states - #281
Published by jlowin about 6 years ago
- `DaskExecutor` - #151, #186
- `tags` - #158, #186
- `Task.map` for mapping tasks - #186
- `AirFlow` utility for importing Airflow DAGs as Prefect Flows - #232
- `ShellTask` - #150
- The `Task` class can now be run as a dummy task - #191
- `return_failed` keyword to `flow.run()` for returning failed tasks - #205
- `flow.visualize()` for visualizing mapped tasks and coloring nodes by state - #202
- `flow.replace()` method for swapping out tasks within flows - #230
- `debug` kwarg to `DaskExecutor` for optionally silencing dask logs - #209
- `BokehRunner` for visualizing mapped tasks - #220
- `map` functionality for the `LocalExecutor` - #233
- `DotDicts` can have non-string keys - #193
- `edges` - #225
Published by jlowin about 6 years ago
Published by jlowin about 6 years ago
- `ifelse`, `switch`, and `merge` - #92
- `reference_tasks` - #95, #137
- `Registry` - #90
- `Flow` definition
- `State` classes
- `Signals` to transmit `State`
- `bind()` method for tasks to call without copying - #132
- `TriggerFail` state - #67