An orchestration platform for the development, production, and observation of data assets.
APACHE-2.0 License
Published by elementl-devtools about 1 year ago
@partitioned_config decorator has been added for defining configuration for partitioned jobs. Thanks @danielgafni!
ConfigurablePickledObjectS3IOManager has been renamed S3PickleIOManager for simplicity. The ConfigurablePickledObjectS3IOManager will continue to be available but is considered deprecated in favor of S3PickleIOManager. There is no change in the functionality of the I/O manager.
ConfigurablePickledObjectADLS2IOManager has been renamed ADLS2PickleIOManager for simplicity. The ConfigurablePickledObjectADLS2IOManager will continue to be available but is considered deprecated in favor of ADLS2PickleIOManager. There is no change in the functionality of the I/O manager.
DbtCliResource, the exception message now includes a link to the dbt.log produced. This log file can be inspected for debugging.
ConfigurablePickledObjectGCSIOManager has been renamed GCSPickleIOManager for simplicity. The ConfigurablePickledObjectGCSIOManager will continue to be available but is considered deprecated in favor of GCSPickleIOManager. There is no change in the functionality of the I/O manager.
DagsterInvariantViolationError when executing a multi-asset where both assets have self-dependencies on earlier partitions.
--debug flag raised an exception in the Dagster framework.
stderr which aligns with Python’s logging defaults.
BackfillPolicy, thanks @ruizh22!
timeout argument when constructing a DagsterGraphQLClient.
DbtCliResource and DbtCliResource.cli(...).
InputContext and OutputContext have been fixed. Thanks @Sergey Mezentsev!
Published by elementl-devtools about 1 year ago
Added a respect_materialization_data_versions option to auto materialization. It can be enabled in dagster.yaml with:
auto_materialize:
  respect_materialization_data_versions: True
This flag may be changed or removed in the near future.
Published by elementl-devtools about 1 year ago
None / Nothing will interpret an explicitly or implicitly returned value None to indicate that all outputs were successful.
skip_reason argument to the constructor of SensorResult now accepts a string in addition to a SkipReason.
step_k8s_config field to k8s_job_executor that allows you to customize the raw Kubernetes config for each step in a job. See the docs for more information.
dagster-dbt project scaffold now creates the scaffold in multiple files:
  constants.py contains a reference to your manifest and dbt project directory
  assets.py contains your initial dbt assets definitions
  definitions.py contains the code to load your asset definitions into the Dagster UI
  schedules.py contains an optional schedule to add for your dbt assets
get_auto_materialize_policy and get_freshness_policy to DagsterDbtTranslator.
load_assets_from_fivetran_instance.
SSHResource would warn when allow_host_key_change was set. Now known hosts are always loaded from the system hosts file, and the allow_host_key_change parameter is ignored.
@graph_multi_asset now has an API docs entry.
GCSComputeLogManager example in the Dagster Instance reference is now correct.
@dbt_assets for the following use-cases:
  --vars
Published by elementl-devtools about 1 year ago
@graph_asset now takes a config parameter equivalent to the parameter on @graph.
dynamic_partitions_store argument to DynamicPartitionsDefinition for multi-partition run properly with dynamic partitions (Thanks @elzzz!).
partitionsByAssets to backfillParams for ranged partition backfill (Thanks @ruizh22!).
dbt-core==1.6 has been added.
DbtCliResource now supports configuring profiles_dir.
restart_policy on k8s_job_op (Thanks @Taadas!).
authenticator to SnowflakePandasIOManager, which allows specifying the authentication mechanism to use (Thanks @pengw0048!).
AutoMaterializePolicy with assets that had at least one source asset parent and at least one non-source asset parent. This has been fixed.
AutoMaterializePolicy to a time-partitioned asset downstream of an unpartitioned asset, the latest partition would only ever be materialized a single time, rather than updating in response to any parent updates. This has been fixed.
StaticPartitionsDefinition containing many thousands of partitions could take a significant amount of time.
Unexpected exception error would be raised when scaffolding a pull request on a repository with no profiles.yml. This behavior has been updated to raise a more descriptive error message on the repo selection page.
agentReplicas config setting on the helm chart has been renamed to isolatedAgents. In order to use this config setting, your user code dagster version needs to be 1.4.3 or greater.
Published by elementl-devtools about 1 year ago
dagster-dbt project scaffold didn’t create a project directory with all the scaffolded files.
SpecificPartitionsPartitionMapping with auto-materialization.
max_materializations_per_minute on an AutoMaterializePolicy to a non-positive number. This will now result in an error.
upath_io_manager from @harrylojames; thank you!
dagster project scaffold and the new dagster-dbt APIs.
Published by elementl-devtools about 1 year ago
dagster-dbt project scaffold on a dbt project directory, if a profiles.yml exists in the root of the directory, its contents are used to add dbt adapter packages to the scaffolded setup.py.
max_concurrent field has been changed from 0 to None to more clearly signal its intent. A value of 0 is still interpreted as the sentinel value which dynamically allocates max_concurrent based on detected CPU count.
Definitions, so that jobs are able to override the IO manager used.
EnvVars in a FivetranResource would not be evaluated when loading assets from the Fivetran instance.
EnvVars in an AirbyteResource would not be evaluated when loading assets from the Airbyte resource.
DbtCliResource, DbtCliInvocation, @dbt_assets, DagsterDbtTranslator, dagster-dbt project scaffold
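The max_concurrent semantics described in this release (None as the new default, with 0 still treated as a sentinel meaning "derive from CPU count") can be sketched as follows. This is only an illustration of the described behavior, not Dagster's implementation; the function name resolve_max_concurrent is hypothetical, and the fallback to 1 when the CPU count is unknown is an assumption.

```python
import os
from typing import Optional


def resolve_max_concurrent(max_concurrent: Optional[int]) -> int:
    """Hypothetical helper mirroring the semantics described in the notes."""
    # None (the new default) and 0 (the legacy sentinel) both mean
    # "pick a value from the detected CPU count".
    if max_concurrent is None or max_concurrent == 0:
        # os.cpu_count() may return None; falling back to 1 is an assumption.
        return os.cpu_count() or 1
    if max_concurrent < 0:
        raise ValueError("max_concurrent must be a non-negative integer or None")
    return max_concurrent


if __name__ == "__main__":
    print(resolve_max_concurrent(None))  # machine dependent
    print(resolve_max_concurrent(4))
```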
Published by elementl-devtools about 1 year ago
dagster-dbt that was preventing it from correctly materializing subselections of dbt assets.
Published by elementl-devtools about 1 year ago
dagster-dbt that was preventing it from efficiently loading dbt projects from a manifest.
Published by elementl-devtools about 1 year ago
AutoMaterializePolicy. It’s located under Assets → Select an asset with an AutoMaterializePolicy → Auto-materialize history tab.
non_argument_deps parameter of @asset and @multi_asset in favor of a new deps parameter. The new parameter makes it clear that this is a first-class way of defining dependencies, makes code more concise, and accepts AssetsDefinition and SourceAsset objects, in addition to the strs and AssetKeys that the previous parameter accepted.
DynamicPartitionsDefinition and SensorResult are no longer marked experimental.
@observable_source_asset decorator now accepts an auto_observe_interval_minutes parameter. If the asset daemon is turned on, then the observation function will automatically be run at this interval. Downstream assets with eager auto-materialize policies will automatically run if the observation function indicates that the source asset has changed. [docs]
dagit package is deprecated in favor of the dagster-webserver package.
@dbt_assets decorator allows much more control over how Dagster runs your dbt project. [docs]
dagster-dbt project scaffold command line interface makes it easy to create files and directories for a Dagster project that wraps an existing dbt project.
get_asset_key_for_model and get_asset_key_for_source utilities make it easy to specify dependencies between upstream dbt assets and downstream non-dbt assets. And you can now more easily specify dependencies between dbt models and upstream non-dbt assets by specifying Dagster asset keys in the dbt metadata for dbt sources.
non_argument_deps parameter of @asset and @multi_asset in favor of a new deps parameter. The new parameter makes it clear that this is a first-class way of defining dependencies, makes code more concise, and accepts AssetsDefinition and SourceAsset objects, in addition to the strs and AssetKeys that the previous parameter accepted.
UPathIOManager can now be extended to load multiple partitions asynchronously (Thanks Daniel Gafni!).
dagster-user-deployments.deployments.[...].readinessProbe.enabled=false.
non_argument_deps in favor of deps, build_airbyte_assets now accepts a deps parameter.
non_argument_deps in favor of deps, define_dagstermill_asset now accepts a deps parameter.
StaticPartitionsDefinition will now raise an error.
AutoMaterializePolicy's to not materialize missing assets.
--path-prefix.
build_asset_reconciliation_sensor (Experimental) has been removed. It was deprecated in 1.3 in favor of AutoMaterializePolicy.
asset_key(s) properties on AssetIn and AssetDefinition have been removed in favor of key(s). These APIs were deprecated in 1.0.
root_input_manager and RootInputManagerDefinition have been removed in favor of input_manager and InputManagerDefinition. These APIs were deprecated in 1.0.
event_metadata_fn parameter on create_dagster_pandas_dataframe_type has been removed in favor of metadata_fn.
@dbt_assets and DbtCliResource. See the migration guide for details.
dbt-rpc has been removed.
DbtCloudResourceV2 has been removed.
DbtCli has been renamed to DbtCliResource. Previously, DbtCliResource was a class alias for DbtCliClientResource.
load_assets_from_dbt_project and load_assets_from_dbt_manifest now default to use_build=True.
load_assets_from_dbt_project and load_assets_from_dbt_manifest has changed. Rather than assigning a group name using the model’s subdirectory, a group name will be assigned using the dbt model’s dbt group.
node_info_to_definition_metadata_fn for load_assets_from_dbt_project and load_assets_from_dbt_manifest now overrides metadata instead of adding to it.
load_assets_from_dbt_project and load_assets_from_dbt_manifest now must be specified using keyword arguments.
DbtCliResource with load_assets_from_dbt_project and load_assets_from_dbt_manifest, stdout logs from the dbt process will now appear in the compute logs instead of the event logs.
dagit python package is deprecated and will be removed in 2.0 in favor of dagster-webserver. See the migration guide for details.
  dagit → dagsterWebserver
  ingress.dagit → ingress.dagsterWebserver
  ingress.readOnlyDagit → ingress.readOnlyDagsterWebserver
tag:DescribeResources. Without this policy, the ECS Agent will log a deprecation warning and fall back to its old behavior (listing all ECS services in the cluster and then listing each service's tags).
DbtCliClientResource, dbt_cli_resource and DbtCliOutput are now being deprecated in favor of DbtCliResource.
load_assets_from_dbt_project and load_assets_from_dbt_manifest are now deprecated in favor of other options. See the migration guide for details.
Published by elementl-devtools over 1 year ago
DynamicPartitionsDefinition and SensorResult are no longer marked experimental.
DagsterInstance now has a get_status_by_partition method, which returns the status of each partition for a given asset. Thanks renzhe-brian!
DagsterInstance now has a get_latest_materialization_code_versions method, which returns the code version of the latest materialization for each of the provided (non-partitioned) assets.
build_asset_context has been added as an asset focused replacement for build_op_context.
build_op_context now accepts a partition_key_range parameter.
AssetSelection.upstream_source_assets method allows selecting source assets upstream of the current selection.
AssetSelection.key_prefixes and AssetSelection.groups now accept an optional include_sources parameter.
DbtCli resource is no longer marked experimental.
global_config parameter of the DbtCli resource has been renamed to global_config_flags.
load_assets_from_dbt_project and load_assets_from_dbt_manifest now work with the DbtCli resource.
manifest argument of the @dbt_assets decorator now additionally can accept a Path argument representing a path to the manifest file or dictionary argument representing the raw manifest blob.
DbtCli.cli from inside a @dbt_assets-decorated function, you no longer need to supply the manifest argument as long as you provide the context argument.
DbtManifest object can now generate schedules using dbt selection syntax.
dbt_manifest.build_schedule(
    job_name="materialize_dbt_models",
    cron_schedule="0 0 * * *",
    dbt_select="fqn:*"
)
DbtCli.cli and the underlying command fails, an exception will now be raised. To suppress the exception, call DbtCli.cli(..., raise_on_error=False).
OutputContext when using with_attributes or AssetsDefinition.from_graph.
max_materializations_per_minute parameter, those older partitions would not be properly discarded from consideration on subsequent ticks. This has been fixed.
from __future__ import annotations) would cause errors in most cases when used with Dagster definitions. This has been fixed for the vast majority of cases.
AssetExecutionContext has returned to being a type alias for OpExecutionContext.
time_window_partition_scope_minutes parameter of the AutoMaterializePolicy class has been removed. Instead, max_materializations_per_minute should be used to limit the number of runs that may be kicked off for a partitioned asset.
DbtCliResource has been deprecated in favor of DbtCli.
dagit has been deprecated in favor of a new package dagster-webserver.
OpExecutionContext.asset_partition_key_range has been deprecated in favor of partition_key_range.
databricks_pyspark_step_launcher will no longer error when executing steps that target a single partition of a DynamicPartitionsDefinition (thanks @weberdavid!).
@observable_source_asset decorator now accepts an auto_observe_interval_minutes parameter. If the asset daemon is turned on, then the observation function will automatically be run at this interval.
DbtCliTask has been renamed to DbtCliInvocation.
get_asset_key_by_output_name and get_node_info_by_output_name methods of DbtManifest have been renamed to get_asset_key_for_output_name and get_node_info_for_output_name, respectively.
DagsterInstance, *MetadataValue, DagsterType, and others.
dagster-pandera now has an API docs page.
Published by elementl-devtools over 1 year ago
dagster project from-example that was preventing it from downloading examples correctly.
Published by elementl-devtools over 1 year ago
--name argument is now optional when running dagster project from-example.
@asset(key=...).
AssetKey now has a with_prefix method.
AutoMaterializePolicies with large numbers of partitions.
dagster instance migrate now prints information about changes to the instance database schema.
dagster-cloud-agent helm chart now supports setting K8s labels on the agent deployment.
[ui] Fixed an issue that prevented filtering by date on the job-specific runs tab.
[ui] “F” key with modifiers (alt, ctrl, cmd, shift) no longer toggles the filter menu on pages that support filtering.
[ui] Fix empty states on Runs table view for individual jobs, to provide links to materialize an asset or launch a run for the specific job, instead of linking to global pages.
[ui] When a run is launched from the Launchpad editor while an editor hint popover is open, the popover remained on the page even after navigation. This has been fixed.
[ui] Fixed an issue where clicking on the zoom controls on a DAG view would close the right detail panel for selected nodes.
[ui] Fixed an issue with shift-selecting assets with multi-component asset keys.
[ui] Fixed an issue with the truncation of the asset stale causes popover.
When using a TimeWindowPartitionMapping with a start_offset or end_offset specified, requesting the downstream partitions of a given upstream partition would yield incorrect results. This has been fixed.
When using AutoMaterializePolicies with observable source assets, in rare cases, a second run could be launched in response to the same version being observed twice. This has been fixed.
When passing in hook_defs to define_asset_job, if any of those hooks had required resource keys, a missing resource error would surface when the hook was executed. This has been fixed.
Fixed a typo in a documentation URL in dagster-duckdb-polars tests. The URL now works correctly.
DbtManifest to fetch asset keys of sources and models: DbtManifest.get_asset_key_for_model, DbtManifest.get_asset_key_for_source. These methods are utilities for defining python assets as dependencies of dbt assets via @asset(key=manifest.get_asset_key_for_model(...)).
state_path parameter with DbtManifestAssetSelection has been deprecated, and will be removed in the next minor release.
grpcio package (for dagster) has been removed.
PartitionMapping have been removed. Defining custom partition mappings has been unsupported since 1.1.7.
build_airbyte_assets. Thanks @guy-rvvup!
Published by elementl-devtools over 1 year ago
meta.dagster.asset_key. This field takes in a list of strings that are used as the components of the generated AssetKey.
version: 2
models:
  - name: users
    config:
      meta:
        dagster:
          asset_key: ["my", "custom", "asset_key"]
meta.dagster.group. This field takes in a string that is used as the Dagster group for the generated software-defined asset corresponding to the dbt model.
version: 2
models:
  - name: users
    config:
      meta:
        dagster:
          group: "my_group"
dagster-msteams and dagster-mlflow packages could be installed with incompatible versions of the dagster package due to a missing pin.
dagster-daemon run command sometimes kept code server subprocesses open longer than it needed to, making the process use more memory.
@observable_source_assets with AutoMaterializePolicies, it was possible for downstream assets to get “stuck”, not getting materialized when other upstream assets changed, or for multiple downstream materializations to be kicked off in response to the same version being observed multiple times. This has been fixed.
@dbt_assets
project_dir and target_path in DbtCliTask are converted from type str to type pathlib.Path.
stdout.
Published by gibsondan over 1 year ago
dagster field under +meta configuration. The following are equivalent:
Before:
version: 2
models:
  - name: users
    config:
      dagster_freshness_policy:
        maximum_lag_minutes: 60
        cron_schedule: '0 9 * * *'
      dagster_auto_materialize_policy:
        type: 'lazy'
After:
version: 2
models:
  - name: users
    config:
      meta:
        dagster:
          freshness_policy:
            maximum_lag_minutes: 60
            cron_schedule: '0 9 * * *'
          auto_materialize_policy:
            type: 'lazy'
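The Before/After equivalence above can be expressed as a small migration sketch that operates on an already-parsed model config dictionary. It is not part of dagster-dbt; the function name migrate_model_config is hypothetical, and only the two legacy keys shown above are handled.

```python
from typing import Any, Dict

# Legacy top-level config keys and their names under meta.dagster.
# Only the two keys shown in the Before/After example are covered.
LEGACY_TO_META_KEY = {
    "dagster_freshness_policy": "freshness_policy",
    "dagster_auto_materialize_policy": "auto_materialize_policy",
}


def migrate_model_config(config: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a dbt model config with legacy keys moved under meta.dagster."""
    migrated = {k: v for k, v in config.items() if k not in LEGACY_TO_META_KEY}
    moved = {LEGACY_TO_META_KEY[k]: v for k, v in config.items() if k in LEGACY_TO_META_KEY}
    if moved:
        # Preserve any meta / meta.dagster entries that already exist.
        meta = dict(migrated.get("meta", {}))
        dagster_meta = dict(meta.get("dagster", {}))
        dagster_meta.update(moved)
        meta["dagster"] = dagster_meta
        migrated["meta"] = meta
    return migrated


if __name__ == "__main__":
    before = {
        "dagster_freshness_policy": {"maximum_lag_minutes": 60, "cron_schedule": "0 9 * * *"},
        "dagster_auto_materialize_policy": {"type": "lazy"},
    }
    print(migrate_model_config(before))
```

The input dictionary is left unchanged, so the result can be compared against the original before a schema file is rewritten.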
Added support for Pythonic Config classes to the @configured API, which makes reusing op and asset definitions easier:
from dagster import Config, configured, op

class GreetingConfig(Config):
    message: str

@op
def greeting_op(config: GreetingConfig):
    print(config.message)

class HelloConfig(Config):
    name: str

@configured(greeting_op)
def hello_op(config: HelloConfig):
    return GreetingConfig(message=f"Hello, {config.name}!")
Added AssetExecutionContext to replace OpExecutionContext as the context object passed in to @asset functions.
TimeWindowPartitionMapping now contains an allow_nonexistent_upstream_partitions argument that, when set to True, allows a downstream partition subset to have nonexistent upstream parents.
Unpinned the alembic dependency in the dagster package.
[ui] A new “Assets” tab is available from the Overview page.
[ui] The Backfills table now includes links to the assets that were targeted by the backfill.
croniter==1.4.0. Users of earlier versions of Dagster can pin croniter<1.4.
@run_failure_sensor.
py.typed was missing in the dagster-graphql package. Thanks @Tanguy-LeFloch!
AutoMaterializePolicies will now be cleared after 1 week.
@dbt_assets:
profile and target can now be customized on the DbtCli resource.
partial_parse.msgpack is detected in the target directory of your dbt project, it is now copied into the target directories created by DbtCli to take advantage of partial parsing.
@dbt_assets can now be customized by overriding DbtManifest.node_info_to_metadata.
AssetMaterializations.
serverK8sConfig.containerConfig.name did not actually change the container name.
Published by elementl-devtools over 1 year ago
1.3.8 release where the Dagster Cloud agent would sometimes fail to start up with an import error.
Published by elementl-devtools over 1 year ago
define_asset_job now accepts a hooks argument.
sqlalchemy==2.x
additionalInstanceConfig key that allows you to supply additional configuration to the Dagster instance.
EcsRunLauncher now uses a different task definition family for each job, instead of registering a new task definition revision each time a different job is launched.
EcsRunLauncher now includes a run_ecs_tags config key that lets you configure tags on the launched ECS task for each run.
SkipReason, the SkipReason would be ignored. This has been fixed.
dagster code-server start CLI, instead of only reloading your code when you initiate a reload from the Dagster UI.
metadata parameter to define_asset_job (Thanks Elliot2718!)
poll_sync_run to handle the “preparing” status from the Census API (Thanks ldnicolasmay!)
@observable_source_asset-decorated functions can now return a DataVersionsByPartition to record versions for partitions.
@dbt_assets
DbtCliTask's created by invoking DbtCli.cli(...) now have a method .is_successful(), which returns a boolean representing whether the underlying CLI process executed the dbt command successfully.
@dbt_assets can now be customized by overriding DbtManifest.node_info_to_description.
@dbt_assets.
server_ecs_tags and run_ecs_tags that apply to each service or task created by the agent. See the docs for more information.
instance.get_run_partition_data in Dagster Cloud.
Published by elementl-devtools over 1 year ago
.env file in the working directory when running dagster dev can now be used for Dagster system variables like DAGSTER_HOME or environment variables referenced in your dagster.yaml file using an env: key. Previously, setting a .env file only worked for environment variables referenced in your Dagster code.
submit_job_execution can now take in a RunConfig object. Previously, it could only take a Python dictionary with the run configuration.
jobs argument to Definitions.
/instance/runs path instead of the new /runs.
create_databricks_run_now_op, thanks @srggrs!
Published by elementl-devtools over 1 year ago
dagster code-server start command that can be used to launch a code server, much like dagster api grpc. Unlike dagster api grpc, however, dagster code-server start runs the code in a subprocess, so it can reload code from the Dagster UI without needing to restart the command. This can be useful for jobs that load code from some external source and may want to reload job definitions without restarting the process.
sensors.num_submit_workers key to dagster.yaml that can be used to decrease latency when a sensor emits multiple run requests within a single tick. See the docs for more information.
k8s_job_executor can now be used to launch each step of a job in its own Kubernetes pod, even if the Dagster deployment is not using the K8sRunLauncher to launch each run in its own Kubernetes pod.
io_manager_def param on an asset.
maxResumeRunAttempts to null in the helm chart would cause it to be set to a default value of 3 instead of disabling run retries.
k8s_job_executor would sometimes fail with a 409 Conflict error after retrying the creation of a Kubernetes pod for a step, due to the job having already been created during a previous attempt despite raising an error.
op_name was passed to load_assets_from_dbt_manifest, and a select parameter was specified, a suffix would be appended to the desired op name.
dagit would lead to JavaScript bundle loading errors.
DbtCli, in dagster_dbt.cli. This new resource only supports dbt-core>=1.4.0.
@dbt_assets in dagster_dbt.asset_decorator that allows you to specify a compute function for a selected set of dbt assets that are loaded as an AssetsDefinition.