An orchestration platform for the development, production, and observation of data assets.
APACHE-2.0 License
Published by elementl-devtools about 1 month ago
CapturedLogManager
and ComputeLogManager
APIs into a single base class.
PipesECSClient
to allow Dagster to interface with ECS tasks.
dbt
subprocess to wait 25 seconds for the subprocess to cleanly terminate. Previously, it would only wait 2 seconds.
sdf
subprocess to wait 25 seconds for the subprocess to cleanly terminate. Previously, it would only wait 2 seconds.
AutomationCondition
using DagsterDltTranslator.get_automation_condition()
(Thanks, aksestok!)dagsterDaemon.runRetries.retryOnAssetOrOpFailure
to False in the Dagster Helm chart to prevent op retries and run retries from simultaneously firing on the same failure.
recursive
parameter (Thanks, chrishiste!)dagster/image
label when image is provided from user_defined_k8s_config
. (Thanks, @HynekBlaha!)custom_user_agent
, was provided by defaultPipesGlueClient.run
have been deprecated and will be removed in 1.9.0
. The new params
argument should be used instead.
Published by elementl-devtools about 2 months ago
allow_missing_partitions
configuration option.
Published by elementl-devtools about 2 months ago
PartitionsDefinition
s, there will no longer be an implicit asset job __ASSET_JOB_...
for each PartitionsDefinition
; there will just be one with all the assets. This reduces the time it takes to load code locations with assets with many different PartitionsDefinition
s.
Published by elementl-devtools about 2 months ago
OpenAIResource
(Thanks, @chasleslr!)MultiPartitionMapping
with @dbt_assets
(Thanks, @arookieds!)AzureBlobComputeLogManager
without a secret_key
(Thanks, @ion-elgreco and @HynekBlaha!)AutomationCondition
and associated static constructors.PipesCloudWatchMessageReader
Published by elementl-devtools 2 months ago
build_op_context
and build_asset_context
now accepts a run_tags
argument.Definitions
.RUN_CANCELED
events now display relevant error messages.PipesCloudWatchMessageReader
can consume logs from CloudWatch as pipes messages.AzureBlobComputeLogManager
now supports service principals, thanks @ion-elgreco!dagster-databricks
now supports databricks-sdk<=0.17.0
.dagster-datahub
now allows pydantic versions below 3.0.0, thanks @kevin-longe-unmind!DagsterDbtTranslator
class now supports modifying the AutomationCondition
for dbt models by overriding get_automation_condition
.dagster-pandera
now supports polars
.replicate(...).fetch_column_metadata()
method.OpenAIResource
now supports organization
, project
and base_url
for configuring the OpenAI client, thanks @chasleslr!
numpy<2
, thanks @judahrand!AirbyteCloudResource
now supports client_id
and client_secret
for authentication - the api_key
approach is no longer supported. This is motivated by the deprecation of portal.airbyte.com on August 15, 2024.databricks-cli
and databricks_api
SlingSourceConnection
, SlingTargetConnection
SlingSourceConnection
, SlingTargetConnection
build_sling_assets
, and sync
Published by elementl-devtools 2 months ago
AssetSpec
objects to the assets
argument of Definitions
, to let Dagster know about assets without associated materialization functions. This replaces the experimental external_assets_from_specs
API, as well as SourceAsset
s, which are now deprecated. Unlike SourceAsset
s, AssetSpec
s can be used for non-materializable assets with dependencies on Dagster assets, such as BI dashboards that live downstream of warehouse tables that are orchestrated by Dagster. [docs].Definitions
objects together into a single larger Definitions
object, using the new Definitions.merge
API (doc). This makes it easier to structure large Dagster projects, as you can construct a Definitions
object for each sub-domain and then merge them together at the top level.BackfillPolicy
s assigned to assets are now respected for backfills launched from jobs that target those assets.AutomationCondition
s to your assets to have them automatically executed in response to specific conditions (docs). These serve as a drop-in replacement and improvement over the AutoMaterializePolicy
system, which is being marked as deprecated.target
parameter, instead of needing to construct a job.
Experimental navigation
feature flag in user settings.build_metadata_bounds_checks
API [doc] enables easily defining asset checks that fail if a numeric asset metadata value falls outside given bounds.PipesSubprocessClient
) and its integrations with Lambda (PipesLambdaClient
), Kubernetes (PipesK8sClient
), and Databricks (PipesDatabricksClient
) are no longer experimental.DbtProject
class (docs) makes it simpler to define dbt assets that can be constructed in both development and production. DbtProject.prepare_if_dev()
eliminates boilerplate for local development, and the dagster-dbt project prepare-and-package
CLI can help pull deps and generate the manifest at build time.
dagster-looker
package can be used to define a set of Dagster assets from a Looker project that is defined in LookML and is backed by git. See the GitHub discussion for more details.
The target of both schedules and sensors can now be set using an experimental target
parameter that accepts an AssetSelection
or list of assets. Any assets passed this way will also be included automatically in the assets
list of the containing Definitions
object.
ScheduleDefinition
and SensorDefinition
now have a target
argument that can accept an AssetSelection
.
You can now wipe materializations for individual asset partitions.
AssetSpec
now has a partitions_def
attribute. All the AssetSpec
s provided to a @multi_asset
must have the same partitions_def
.
The assets
argument on materialize
now accepts AssetSpec
s.
The assets
argument on Definitions
now accepts AssetSpec
s.
The new merge
method on Definitions
enables combining multiple Definitions
objects into a single larger Definitions
object with their combined contents.
Runs requested through the Declarative Automation system now have a dagster/from_automation_condition: true
tag applied to them.
Changed the run tags query to be more performant. Thanks @egordm!
Dagster Pipes and its integrations with Lambda, Kubernetes, and Databricks are no longer experimental.
The Definitions
constructor will no longer raise errors when the provided definitions aren’t mutually resolve-able – e.g. when there are conflicting definitions with the same name, unsatisfied resource dependencies, etc. These errors will still be raised at code location load time. The new Definitions.validate_loadable
static method also allows performing the validation steps that used to occur in the constructor.
AssetsDefinitions
object provided to a Definitions
object will now be deduped by reference equality. That is, the following will now work:
from dagster import asset, Definitions
@asset
def my_asset(): ...
defs = Definitions(assets=[my_asset, my_asset]) # Deduped into just one AssetsDefinition.
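For readers unfamiliar with the term, "deduped by reference equality" means that only repeated references to the very same object are collapsed, while distinct objects that merely compare equal are kept. The helper below is a hypothetical pure-Python illustration of that idea, not Dagster's implementation.

```python
from typing import Any, Iterable, List


def dedupe_by_identity(items: Iterable[Any]) -> List[Any]:
    """Keep the first occurrence of each object, comparing with `is` semantics."""
    seen_ids = set()
    unique = []
    for item in items:
        # id() identifies the object itself, regardless of how == is defined.
        if id(item) not in seen_ids:
            seen_ids.add(id(item))
            unique.append(item)
    return unique


first = [1]
second = [1]  # equal to `first`, but a different object
result = dedupe_by_identity([first, first, second])
```

Here result holds two entries: the repeated reference to first is dropped, while second survives even though it compares equal to first.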
[dagster-embedded-elt] Adds translator options for dlt integration to override auto materialize policy, group name, owners, and tags
[dagster-sdf] Introducing the dagster-sdf integration for data modeling and transformations powered by sdf.
[dagster-dbt] Added a new with_insights()
method which can be used to more easily attach Dagster+ Insights metrics to dbt executions: dbt.cli(...).stream().with_insights()
build_asset_with_blocking_check
has been removed. Use the blocking
argument on @asset_check
instead.mypy
and pydantic
1 may now experience a “metaclass conflict” error when using Config
. Previously this would occur when using pydantic 2.AutoMaterializeSensorDefinition
has been renamed AutomationConditionSensorDefinition
.ComputeLogManager
have been removed. Custom ComputeLogManager
implementations must also implement the CapturedLogManager
interface. This will not affect any of the core implementations available in the core dagster
package or the library packages.AutomationConditionSensorDefinition
with the name “default_automation_condition_sensor”
will be constructed for each code location, and will handle evaluating and launching runs for all AutomationConditions
and AutoMaterializePolicies
within that code location. You can restore the previous behavior by setting:
auto_materialize:
  use_sensors: False
in your dagster.yaml file.dbt-core==1.6.*
has been removed because the version is now end-of-life.KeyPrefixDagsterDbtTranslator
has been removed. To modify the asset keys for a set of dbt assets, implement DagsterDbtTranslator.get_asset_key()
instead.+meta.dagster_freshness_policy
has been removed. Use +meta.dagster.freshness_policy
instead.+meta.dagster_auto_materialize_policy
has been removed. Use +meta.dagster.auto_materialize_policy
instead.load_assets_from_dbt_project
, load_assets_from_dbt_manifest
, and dbt_cli_resource
have been removed. Use @dbt_assets
, DbtCliResource
, and DbtProject
instead to define how to load dbt assets from a dbt project and to execute them.dbt_run_op
, dbt_compile_op
, etc. have been removed. Use @op
and DbtCliResource
directly to execute dbt commands in an op.AssetExecutionContext
, OpExecutionContext
, and ScheduleExecutionContext
that include datetime
s now return standard Python datetime
objects instead of Pendulum datetimes. The types in the public API for these properties have always been datetime
and this change should not be breaking in the majority of cases, but Pendulum datetimes include some additional methods that are not present on standard Python datetime
s, and any code that was using those methods will need to be updated to either no longer use those methods or transform the datetime
into a Pendulum datetime. See the 1.8 migration guide for more information and examples.MemoizableIOManager
, VersionStrategy
, SourceHashVersionStrategy
, OpVersionContext
, ResourceVersionContext
, and MEMOIZED_RUN_TAG
, which have been deprecated and experimental since pre-1.0, have been removed.external_assets_from_specs
API has been deprecated. Instead, you can directly pass AssetSpec
objects to the assets
argument of the Definitions
constructor.AutoMaterializePolicy
has been marked as deprecated in favor of AutomationCondition
, which provides a significantly more flexible and customizable interface for expressing when an asset should be executed. More details on how to migrate your AutoMaterializePolicies
can be found in the Migration Guide.SourceAsset
has been deprecated. See the major changes section and migration guide for more details.asset_partition_key_for_output
, asset_partition_keys_for_output
, asset_partition_key_range_for_output
, and asset_partitions_time_window_for_output
methods on OpExecutionContext
have been deprecated. Instead, use the corresponding property: partition_key
, partition_keys
, partition_key_range
, or partition_time_window
.partitions_def
parameter on define_asset_job
is now deprecated. The partitions_def
for an asset job is determined from the partitions_def
attributes on the assets it targets, so this parameter is redundant.create_shell_command_op
and create_shell_script_op
have been marked as deprecated in favor of PipesSubprocessClient
(see details in the Dagster Pipes subprocess reference).
load_assets_from_airbyte_project
is now deprecated, because the Octavia CLI that it relies on is an experimental feature that is no longer supported. Use build_airbyte_assets
or load_assets_from_airbyte_instance
instead.MonthlyPartitionsDefinition
. Thanks @zero_stroke!
Published by elementl-devtools 3 months ago
dagster_aws
.
Published by elementl-devtools 3 months ago
per_step_k8s_config
configuration option to the celery_k8s_job_executor
, allowing the k8s configuration of individual steps to be configured at run launch time. Thanks @alekseik1!log_column_level_metadata
macro in favor of the new with_column_metadata
API.load_assets_from_airbyte_project
as the Octavia CLI has been deprecated.RunRequest(...)
as None
EcsRunLauncher
would sometimes fail to launch runs when the include_sidecars
option was set to True
.total_byte_billed
or total_slot_ms
in the BigQuery INFORMATION_SCHEMA.JOBS
table.
Published by elementl-devtools 3 months ago
ShellCommandBlueprint
, you can now use slashes as a delimiter to generate an AssetKey
with multiple path components.mlflow_run_id
attribute (Thanks Joe Percivall!)dagster dev
was logging unexpectedly without the grpcio<1.65.0
pin.ContextVar was created in a different context
error was raised when executing an async asset.multi_asset
type-checker fix from @aksestok, thanks!
Published by elementl-devtools 3 months ago
InputContext
passed to an IOManager
’s load_input
function when invoking the output_value
or output_for_node
methods on JobExecutionResult
now has the name "dummy_input_name"
instead of None
.DbtProject
is adopted and no longer experimental. Using DbtProject
helps achieve a setup where the dbt manifest file and dbt dependencies are available and up-to-date, during development and in production. Check out the API docs for more: https://docs.dagster.io/_apidocs/libraries/dagster-dbt#dagster_dbt.DbtProject.
--use-dbt-project
flag was introduced for the cli command dagster-dbt project scaffold
. Creating a Dagster project wrapping a dbt project using that flag will include a DbtProject
.DAGSTER_UI_EVENT_LOAD_CHUNK_SIZE
environment variable on the Dagster webserver.RunFailureReason.START_TIMEOUT
run monitoring failure reason. Thanks @jobicarter!ObserveResult
objects to not be stored with the produced AssetObservation
event.metadata
defined on SourceAssets
to be unavailable when accessed in an IOManager.@graph_asset
decorator overload missing an owners
argument, thanks @askvinni!execute_k8s_job
when there was a transient failure while loading logs from the launched job. Thanks @piotrmarczydlo!dateutil
package being installed in the default EMR Python environment.
AutoMaterializeRule.skip_on_parent_missing
rule when a parent asset had its PartitionsDefinition
changed.AutomationConditions
.AutomationCondition.newly_updated()
would trigger on any ASSET_OBSERVATION
event. Now, it only triggers when the data version on that event changes.dagster-dbt project prepare-for-deployment
has been replaced by dagster-dbt project prepare-and-package
.DbtProject
no longer prepares the dbt manifest file and dbt dependencies in its constructor during initialization. This process has been moved to prepare_if_dev()
, that can be called on the DbtProject
instance after initialization. Check out the API docs for more: https://docs.dagster.io/_apidocs/libraries/dagster-dbt#dagster_dbt.DbtProject.prepare_if_dev.GraphDefinition
as the job
argument to schedules and sensors is deprecated. Derive a job from the GraphDefinition
using graph_def.to_job()
and pass this instead.dagster-plus CLI
in the sidenav to correctly be dagster-cloud CLI
.
Published by elementl-devtools 4 months ago
AssetsDefinition
construction, enforce single key per output name
dagster/last_updated_timestamp
.AutoMaterializeRule.skip_on_not_all_parents_updated_since_cron()
rule gained a new dependency with a different PartitionsDefinition.source_key_prefix
arg of load_assets_from_modules
. (thanks @drjlin)!load_assets_from_airflow_dag
no longer allows multiple tasks to materialize the same asset.dagster-cloud ci init
CLI will now use the --deployment
argument as the base deployment when creating a branch deployment. This base deployment will be used for Change Tracking.dbt_with_bigquery_insights
now respects CLI arguments for profile configuration and also selects location / dataset from the profile when available.
Published by elementl-devtools 4 months ago
build_freshness_checks_for_dbt_assets
which allows users to parameterize freshness checks entirely within dbt. Check out the API docs for more: https://docs.dagster.io/_apidocs/libraries/dagster-dbt#dbt-dagster-dbt.max_partitions_per_run
from the job’s constituent assets.asset_tags
can now be specified when building dagstermill assets
DagsterSlingTranslator
dagster/storage_kind
tags attached
tags
passed to outs
in graph_multi_asset
now get correctly propagated to the resulting assets.build_metadata_bounds_checks
now no longer errors when targeting metadata keys that have special characters.--read-only
flag to the dagster-cloud ci branch-deployment
CLI command, which returns the current branch deployment name for the current code repository branch without updating the status of the branch deployment.
Published by elementl-devtools 4 months ago
dagster/storage_kind
tag.dbt retry
within a try/except block to avoid unnecessary, duplicate work.AssetExecutionContext
now exposes a has_partition_key_range
property.owners
, metadata
, tags
, and deps
properties on AssetSpec
are no longer Optional
. The AssetSpec
constructor still accepts None
values, which are coerced to empty collections of the relevant type.docker_executor
and k8s_job_executor
now consider at most 1000 events at a time when loading events from the current run to determine which steps should be launched. This value can be tuned by setting the DAGSTER_EXECUTOR_POP_EVENTS_LIMIT
environment variable in the run process.dagster/retry_on_asset_or_op_failure
tag that can be added to jobs to override run retry behavior for runs of specific jobs. See the docs for more information.build_sensor_for_freshness_checks
to describe when/why it skips evaluating freshness checks.@multi_asset_sensor
.ScheduleDefinition
now properly supports being passed a RunConfig
object.MaterializeResult
, but the function has no type annotation, previously, the IO manager would still be invoked with a None
value. Now, the IO manager is not invoked.AssetSpec
constructor now raises an error if an invalid owner string is passed to it.graph_multi_asset
decorator, the code_version
property on AssetOut
s passed in used to be ignored. Now, it is respected.
dagster-cloud job launch
command did not support specifying asset keys with prefixes in the --asset-key
argument.group:
, code location:
, tag:
, owner:
.
Published by elementl-devtools 5 months ago
Definitions
now has a get_all_asset_specs
method, which allows iterating over properties of the defined assets.
terminate_runs
method to the Python GraphQL Client. (thanks @baumann-t!)DbtCliInvocation
now has a .get_error()
method that can be useful when using dbt.cli(..., raise_on_error=False)
.DynamicPartitionsDefinition
(using partitions_fn
) that caused a crash during job backfills.build_metadata_bounds_checks
API creates asset checks which verify that numeric metadata values on asset materializations fall within min or max values. See the documentation for more information.build_sensor_for_freshness_checks
and Dagster Plus. This API should now work when used with Dagster Plus.dagsterCloudAgent.additionalPodSpecConfig
to the Kubernetes agent Helm chart allowing arbitrary pod configuration to be applied to the agent pod.
Published by elementl-devtools 5 months ago
datetime.utcfromtimestamp
(thanks @dbrtly!)dbt-core==1.8.*
.BackfillPolicy
s, each asset would get materialized in its own run, rather than grouping assets together into a single run.
Published by elementl-devtools 5 months ago
dagster/row_count
.CloudwatchLogsHandler
, ECRPublicClient
, SecretsManagerResource
, SSMResource
thanks @jacob-white-simplisafe
!TableMetadataValue
, TableSchemaMetadataValue
, or TableColumnLineageMetadataValue
defined.BackfillPolicy
of the underlying assets in the job.databricks-sdk
version bumped to 0.17.0
, thanks @lamalex
!dagster code-server start
, thanks @SanjaySiddharth
!@JonathanLai2004
!
Published by elementl-devtools 5 months ago
MaterializeResult
, ObserveResult
, or Output
, you can now include tags that will be attached to the corresponding AssetMaterialization
or AssetObservation
event. These tags will be rendered on these events in the UI.build_last_update_freshness_checks
and build_time_partition_freshness_checks
APIs have been updated to be clearer.%
into the asset graph’s query selector no longer crashes the UI.MetadataValue
have been changed from NamedTuple
s to Pydantic models. NamedTuple
functionality on these classes was not part of Dagster’s stable public API, but usages relying on their tuple-ness may break. For example: calling json.dumps
on collections that include them.dbt-core==1.5.*
has been removed, as it has reached end of life in April 2024.dagster-cloud
CLI where the --deployment
argument was ignored when the DAGSTER_CLOUD_URL
environment variable was set.dagster-cloud-cli
package wouldn’t work unless the dagster-cloud
package was installed as well.
Published by elementl-devtools 6 months ago
TimeWindowPartitionMapping
now supports the start_offset
and end_offset
parameters even when the upstream PartitionsDefinition
is different than the downstream PartitionsDefinition
. The offset is expressed in units of downstream partitions, so TimeWindowPartitionMapping(start_offset=-1)
between an hourly upstream and a daily downstream would map each downstream partition to 48 upstream partitions – those for the same and preceding day.path
metadata from UPathIOManager
inputs. This eliminates the creation of ASSET_OBSERVATION
events for every input on every step for the default I/O manager.owners
on @graph_asset
.DagsterInvalidSubsetError
when trying to launch runs.dagster-cloud branch-deployment
CLI, you can now specify the base deployment. The base deployment will be used for comparing assets for Change Tracking. For example, to set the base deployment to a deployment named staging
: dagster-cloud branch-deployment create-or-update --base-deployment-name staging ...
. Note that once a Branch Deployment is created, the base deployment cannot be changed.
413: Request Entity Too Large
error when uploading a heartbeat to the Dagster Plus servers.
Published by elementl-devtools 6 months ago
@graph_asset
now accepts a tags
argument
load_asset_defs_from_fivetran_instance
. Thanks @lamalex!Duplicate check specs
errors with singular tests ingested as asset checks.source.with_resources(...)