dagster

An orchestration platform for the development, production, and observation of data assets.

APACHE-2.0 License

Downloads
12.2M
Stars
11.1K
Committers
367

dagster - 1.3.5 (core) / 0.19.5 (libraries)

Published by elementl-devtools over 1 year ago

New

  • A new max_materializations_per_minute parameter (with a default of 1) on AutoMaterializePolicy.eager() and AutoMaterializePolicy.lazy() lets you bound the volume of work that may be automatically kicked off for each asset. To restore the previous behavior, explicitly set this limit to None.
  • DailyPartitionsDefinition, HourlyPartitionsDefinition, WeeklyPartitionsDefinition, and MonthlyPartitionsDefinition now support an end_date attribute.
  • [ui] When GraphQL requests time out with 504 errors, a toaster message is now shown indicating the error, instead of failing silently.
  • [dagster-snowflake] The Snowflake I/O managers now support authentication via unencrypted private key.
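To make the effect of an end_date bound concrete, here is a standard-library sketch that enumerates the keys of a daily partitioning between two dates. It does not call Dagster: the helper name daily_partition_keys is hypothetical, and treating the end date as exclusive is an assumption that should be checked against the partitions documentation.

```python
from datetime import date, timedelta


def daily_partition_keys(start_date: str, end_date: str) -> list[str]:
    """Return one ISO-date key per day in [start_date, end_date).

    Hypothetical helper for illustration; the exclusive end is an assumption.
    """
    current = date.fromisoformat(start_date)
    end = date.fromisoformat(end_date)
    keys = []
    while current < end:
        keys.append(current.isoformat())
        current += timedelta(days=1)
    return keys


keys = daily_partition_keys("2023-05-01", "2023-05-04")
print(keys)
```

Without an end bound, a set of keys like this would keep growing as time passes; with one, it is fixed.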

Bugfixes

  • When using AutoMaterializePolicys or build_asset_reconciliation_sensor, a single new data version from an observable source asset could trigger multiple runs of the downstream assets. This has been fixed.
  • Fixed a bug with pythonic resources where raw run config provided to a resource would be ignored.
  • We previously erroneously allowed the use of EnvVar and IntEnvVar within raw run config, although they just returned the name of the env var rather than retrieving its value. Using them there now raises an error directly.
  • [ui] Fixed an issue in the left navigation where code locations with names with URI-encodable characters (e.g. whitespace) could not be expanded.
  • [ui] Fixed an issue where the time shown on the Runs page when a run was starting was shown in an incorrect timezone.
  • [dagster-dbt] Fixed an issue where selecting models by * was being interpreted as a glob pattern, rather than as a dbt selector argument. We now explicitly set the default selection pattern as fqn:*.
  • [dagster-cloud cli] Fixed an issue where dagster-cloud serverless deploy did not create a unique image tag if the --image tag was not specified.
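For readers unsure what "URI-encodable characters" means in the navigation fix above, the sketch below shows how a name containing whitespace changes when it is URI-encoded. The location name is hypothetical, and this only illustrates the general class of mismatch; it is not a description of the actual UI code or its fix.

```python
from urllib.parse import quote, unquote

location_name = "my code location"  # hypothetical name containing whitespace
encoded = quote(location_name)

# The encoded and raw forms differ, so comparing them directly fails;
# decoding first restores the match.
names_match_raw = encoded == location_name
names_match_decoded = unquote(encoded) == location_name
print(encoded, names_match_raw, names_match_decoded)
```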

Community Contributions

  • Added an option to specify op_name on load_assets_from_dbt_project and load_assets_from_dbt_manifest (thanks @wkeifenheim!)
  • [Helm] Added support for connecting to code servers over SSL (thanks @jrouly!)

Documentation

  • New tutorial section on how to manage your own I/O and control dependencies.

Dagster Cloud

  • Added the ability to assign users to teams. A team is a group of users with a shared set of permissions. See the docs for more information.
dagster - 1.3.4 (core) / 0.19.4 (libraries)

Published by elementl-devtools over 1 year ago

New

  • Run monitoring will now detect runs that are stuck in a CANCELING state due to an error during termination and move them into CANCELED. See the docs for more information.
  • TimeWindowPartitionMapping objects are now current-time aware. Consequently, only upstream/downstream partitions that exist at the current time are returned.
  • ExecuteJobResult was renamed to JobExecutionResult (ExecuteJobResult remains a deprecated alias)
  • New AssetSelection.key_prefixes method allows matching asset keys starting with a provided prefix.
  • [dagster-airflow] persistent database URI can now be passed via environment variable
  • [dagster-azure] New ConfigurablePickledObjectADLS2IOManager that uses pythonic config
  • [dagster-fivetran] Fivetran connectors that are broken or incomplete are now ignored
  • [dagster-gcp] New DataProcResource follows the Pythonic resource system. The existing dataproc_resource remains supported.
  • [dagster-k8s] The K8sRunLauncher and k8s_job_executor will now retry the api call to create a Kubernetes Job when it gets a transient error code (500, 503, 504, or 401).
  • [dagster-snowflake] The SnowflakeIOManager now supports private_keys that have been base64 encoded to avoid issues with newlines in the private key. Non-base64 encoded keys are still supported. See the SnowflakeIOManager documentation for more information on base64 encoded private keys.
  • [ui] Unpartitioned assets show up on the backfill page
  • [ui] On the experimental runs page you can open the “view all tags” dialog of a row by pressing the hotkey ‘t’ while hovering that row.
  • [ui] The “scroll-to-pan” feature flag has been removed, and scroll-to-pan is now default functionality.
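The base64 option for Snowflake private keys mentioned above exists to avoid newline problems. A minimal sketch of producing such a value with the standard library follows; the key text is a placeholder, and how the encoded value is then passed to the I/O manager is left to the SnowflakeIOManager documentation.

```python
import base64

# Hypothetical placeholder; a real key would be a multi-line PEM string.
pem_key = "-----BEGIN PRIVATE KEY-----\nEXAMPLEKEYDATA\n-----END PRIVATE KEY-----\n"

encoded_key = base64.b64encode(pem_key.encode("utf-8")).decode("ascii")

# b64encode emits a single line, so the value is safe in env vars or YAML.
round_trip = base64.b64decode(encoded_key).decode("utf-8")
print("\n" in encoded_key, round_trip == pem_key)
```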

Bugfixes

  • The server side polling for events during a live run has had its rate adjusted and no longer uses a fixed interval.
  • [dagster-postgres] Fixed an issue where primary key constraints were not being created for the kvs, instance_info, and daemon_heartbeats tables for existing Postgres storage instances that were migrating from before 1.2.2. This should unblock users relying on the existence of a primary key constraint for replication.
  • Fixed a bug that could cause incorrect counts to be shown for missing asset partitions when partitions are in progress
  • Fixed an issue within SensorResult evaluation where multipartitioned run requests containing a dynamic partition added in a dynamic partitions request object would raise an invalid partition key error.
  • [ui] When trying to terminate a queued or in-progress run from a Run page, forcing termination was incorrectly given as the only option. This has been fixed, and these runs can now be terminated normally.
  • [ui] Fixed an issue on the asset job partitions page where an infinite recursion error would be thrown when using TimeWindowPartitionMapping.
  • [dagster-databricks] Polling for the status of skipped Databricks runs now properly terminates.

Deprecations

  • ExecuteJobResult is now a deprecated alias for the new name, JobExecutionResult.

Community Contributions

  • [dagster-airbyte] When supplying an airbyte_resource to load_assets_from_connections, you may now provide an instance of the AirbyteResource class, rather than just airbyte_resource.configured(...) (thanks @joel-olazagasti!)
  • [dagster-airbyte] Fixed an issue connecting to destinations that support normalization (thanks @nina-j!)
  • Fix an error in the docs code snippets for IO managers (thanks out-running-27!)
  • Added an example showing how to build Dagster's Software-Defined Assets for an analytics workflow with different deployments for local and prod environments. (thanks @PedramNavid!)
  • [dagster-celery] Fixed an issue where the dagster-celery CLI accepted an inconsistent configuration format - it now matches the same format as the celery_executor. Thanks @boenshao!

Documentation

  • New “Managing your own I/O” tutorial section and other minor tutorial improvements.

Dagster Cloud

  • The ECS agent will now display task logs and other debug information when a code location fails to start up.
  • You can now set ecs_timeout in your ECS user code launcher config to extend how long the ECS agent polls for new code servers to start. Extending this timeout is useful if your code server takes an unusually long time to start up - for example, because it uses a very large image.
  • Added support for running the Dagster Cloud Kubernetes agent in a cluster using istio.
dagster - 1.3.3 (core) / 0.19.3 (libraries)

Published by elementl-devtools over 1 year ago

New

  • load_assets_from_package_module and the other core load_assets_from_ methods now accept a source_key_prefix argument, which allows applying a key prefix to all the source assets that are loaded.

  • OpExecutionContext now has an asset_partitions_time_window_for_input method.

  • RunFailureSensorContext now has a get_step_failure_events method.

  • The Pythonic resource system now supports a set of lifecycle hooks which can be used to manage setup and teardown:

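    from dagster import ConfigurableResource
    from pydantic import PrivateAttr

    # MyAPIClient stands in for your own client class.
    class MyAPIClientResource(ConfigurableResource):
        api_key: str
        _internal_client: MyAPIClient = PrivateAttr()

        def setup_for_execution(self, context):
            # Runs before execution starts; build the client here.
            self._internal_client = MyAPIClient(self.api_key)

        def get_all_items(self):
            return self._internal_client.items.get()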
    
  • Added support for specifying input and output config on ConfigurableIOManager.

  • QueuedRunCoordinator and SubmitRunContext are now exposed as public dagster exports.

  • [ui] Downstream cross-location dependencies of all source assets are now visible on the asset graph. Previously these dependencies were only displayed if the source asset was defined as a regular asset.

  • [ui] A new filtering experience is available on the Runs page after enabling feature flag “Experimental Runs table view with filtering”.

  • [dagster-aws] The S3 compute log manager now accepts a show_url_only: true config option, which displays a URL to the S3 file in dagit instead of the contents of the log file.

  • [dagster-aws] PickledObjectS3IOManager now fully supports loading partitioned inputs.

  • [dagster-azure] PickledObjectADLS2IOManager now fully supports loading partitioned inputs.

  • [dagster-gcp] New GCSResource and ConfigurablePickledObjectGCSIOManager follow the Pythonic resource system. The existing gcs_resource and gcs_pickle_io_manager remain supported.

  • [dagster-gcp] New BigQueryResource follows the Pythonic resource system. The existing bigquery_resource remains supported.

  • [dagster-gcp] PickledObjectGCSIOManager now fully supports loading partitioned inputs.

  • [dagster-postgres] The event watching implementation has been moved from listen/notify based to the polling watcher used by MySQL and SQLite.

  • [dagster-slack] Add monitor_all_repositories to make_slack_on_run_failure_sensor, thanks @danielgafni!

  • [dagster-snowflake] New SnowflakeResource follows the Pythonic resource system. The existing snowflake_resource remains supported.
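The lifecycle hooks introduced above pair a setup step with a teardown step around execution. The following plain-Python imitation shows the calling pattern only; it does not use Dagster. FakeClient, MyResource, and run_with_resource are hypothetical names, and teardown_after_execution is assumed here to be the name of the matching teardown hook.

```python
from contextlib import contextmanager


class FakeClient:
    """Hypothetical stand-in for a real API client."""

    def __init__(self, api_key):
        self.api_key = api_key
        self.closed = False

    def close(self):
        self.closed = True


class MyResource:
    """Plain-Python imitation of a resource with lifecycle hooks."""

    def __init__(self, api_key):
        self.api_key = api_key
        self.client = None

    def setup_for_execution(self, context):
        self.client = FakeClient(self.api_key)

    def teardown_after_execution(self, context):
        self.client.close()


@contextmanager
def run_with_resource(resource, context=None):
    """Call setup before the body and teardown after it, even on error."""
    resource.setup_for_execution(context)
    try:
        yield resource
    finally:
        resource.teardown_after_execution(context)


resource = MyResource(api_key="example-key")
with run_with_resource(resource) as r:
    was_open_during_run = not r.client.closed
print(was_open_during_run, resource.client.closed)
```

The try/finally wrapper is the important part of the design: teardown runs whether or not the body raises.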

Bugfixes

  • Multi-asset sensor context methods for partitions now work when partitioned source assets are targeted.
  • Previously, the asset backfill page would incorrectly display negative counts for assets with upstream failures. This has been fixed.
  • In cases where there is an asset which is upstream of one asset produced by a subsettable multi-asset, but downstream of another, Dagster will automatically subset the multi-asset to resolve the underlying cycle in the op dependency graph. In some cases, this process could omit some of the op dependencies, resulting in incorrect execution order. This has been fixed.
  • Fixed an issue with AssetMetadataValue.value that would cause an infinite recursion error.
  • Fixed an issue where observable source assets would show up in the asset graph of jobs that did not include them.
  • Fixed an issue where directly invoking an op or asset with a Pythonic config object containing a discriminated union did not work properly.
  • Fixed a bug where sensors attached to jobs that rely on resources from Definitions were not provided with the required resource definition.

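The multi-asset subsetting fix above concerns a dependency cycle that appears only at the op level. A small standard-library sketch of that situation follows, using hypothetical asset names: a multi-asset produces "a" and "c", while "b" depends on "a" and "c" depends on "b". It illustrates why subsetting removes the cycle and is not Dagster's own resolution code.

```python
from graphlib import CycleError, TopologicalSorter

# Each mapping is node -> set of nodes it depends on.

# As a single op, the multi-asset both feeds "b" and depends on it.
single_op = {"multi": {"b"}, "b": {"multi"}}
try:
    list(TopologicalSorter(single_op).static_order())
    has_cycle = False
except CycleError:
    has_cycle = True

# Subset into two ops, one per output, and the cycle disappears.
subsetted = {"multi[a]": set(), "b": {"multi[a]"}, "multi[c]": {"b"}}
order = list(TopologicalSorter(subsetted).static_order())
print(has_cycle, order)
```

If any of the edges in the subsetted graph were dropped, the sorter could legally emit the ops in a different order, which is the kind of incorrect execution order the fix addresses.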
Dagster Cloud

  • volumes and volumeMounts values have been added to the agent helm chart.

Experimental

  • [dagster-airbyte] load_assets_from_airbyte_instance and load_assets_from_airbyte_project now take a connection_to_auto_materialize_policy_fn for setting AutoMaterializePolicys on Airbyte assets
  • [dagster-airbyte] Introduced experimental support for Airbyte Cloud. See the using Dagster with Airbyte Cloud docs for more information.

Documentation

  • Ever wanted to know more about the files in Dagster projects, including where to put them in your project? Check out the new Dagster project files reference for more info!
  • We’ve made some improvements to the sidenav / information architecture of our docs!
    • The Guides section now contains several new categories, including Working with data assets and Working with tasks
    • The Community section is now under About
  • The Backfills concepts page now includes instructions on how to launch backfills that target ranges of partitions in a single run.
dagster - 1.3.2 (core) / 0.19.2 (libraries)

Published by elementl-devtools over 1 year ago

New

  • Added performance improvements for yielding time-partitioned run requests.
  • The asset backfill page now displays targeted assets in topological order.
  • Replicas can now be specified on Hybrid ECS and K8s agents. In ECS, use the NumReplicas parameter on the agent template in CloudFormation, or the dagsterCloudAgent.replicas field in Helm.
  • Zero-downtime agent updates can now be configured for the ECS agent. Just set the enableZeroDowntimeDeploys parameter to true in the CloudFormation stack for your agent.
  • AssetsDefinition.from_graph, as well as the @graph_asset and @graph_multi_asset decorators, now support specifying AutoMaterializePolicys.
  • [dagstermill] Pythonic resource variant of the dagstermill I/O manager is now available.
  • [dagster-duckdb] New DuckDBResource for connecting to and querying DuckDB databases.
  • [ui] Sensor / Schedule overview pages now allow you to select and start/stop multiple sensors/schedules at once.
  • [ui] Performance improvements to global search for big workspaces.

Bugfixes

  • async def ops/assets no longer prematurely finalize async generators during execution.
  • In some cases, the AutoMaterialize Daemon (and the build_asset_reconciliation_sensor) could incorrectly launch new runs for partitions that already had an in-progress run. This has been fixed.

Breaking Changes

  • Yielding run requests for experimental dynamic partitions via run_request_for_partition now throws an error. Instead, users should yield directly instantiated run requests via RunRequest(partition_key=...).
  • graph_asset and graph_multi_asset now support specifying resource_defs directly (thanks @kmontag42)!

Community Contributions

  • A new node_info_to_auto_materialize_policy_fn param added to load_assets_from_dbt_* functions. (thanks @askvinni)!
  • Added partition_key field to RunStatusSensorContext (thanks @pdstrnadJC)!

Experimental

  • For multi-partitioned assets with a time dimension, the auto-materialize policy now only kicks off materializations for the latest time partition window. Previously, all partitions would be targeted.
  • Added performance improvements to the multi-asset sensor context’s latest_materialization_records_by_key method.
  • The GraphQL API for launching a backfill no longer errors when the backfill targets assets instead of a job and the allPartitions argument is provided.
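The change to multi-partitioned assets with a time dimension, described above, can be pictured as a filter over partition keys. This is a sketch under stated assumptions: the (date, region) tuples and the helper latest_time_window are hypothetical and do not reflect how Dagster represents multi-partition keys internally.

```python
# Hypothetical (date, region) keys for a two-dimensional partitioned asset.
partition_keys = [
    ("2023-05-01", "us"),
    ("2023-05-01", "eu"),
    ("2023-05-02", "us"),
    ("2023-05-02", "eu"),
]


def latest_time_window(keys):
    """Keep only the keys whose time dimension is the most recent one."""
    latest = max(day for day, _ in keys)  # ISO dates sort lexicographically
    return [key for key in keys if key[0] == latest]


targets = latest_time_window(partition_keys)
print(targets)
```

Under the previous behavior all four keys would have been targeted; under the new one only the two keys in the latest window are.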

Documentation

  • Fixed a few typos in various guides.
  • Fixed a formatting issue in the Automating pipelines guide that was causing a 404.
dagster - 1.3.1 (core) / 0.19.1 (libraries)

Published by elementl-devtools over 1 year ago

New

  • Performance improvements when evaluating time-partitioned run requests within sensors and schedules.
  • [ui] Performance improvements when loading the asset catalog and launchpad for deployments with many time-partitioned assets.

Bugfixes

  • Fixed an issue where loading a Definitions object that included sensors attached to multiple jobs would raise an error.
  • Fixed a bug in which Pythonic resources would produce underlying resource values that would fail reference equality checks. This would lead to a conflicting resource version error when using the same Pythonic resource in multiple places.
dagster - 1.3.0 (core) / 0.19.0 (libraries) "Smooth Operator"

Published by elementl-devtools over 1 year ago

Major Changes since 1.2.0 (core) / 0.18.0 (libraries)

Core

  • Auto-materialize policies replace the asset reconciliation sensor - We significantly renovated the APIs used for specifying which assets are scheduled declaratively. Compared to build_asset_reconciliation_sensor, AutoMaterializePolicys work across code locations and allow you to customize the conditions under which each asset is auto-materialized. [docs]
  • Asset backfill page - A new page in the UI for monitoring asset backfills shows the progress of each asset in the backfill.
  • Clearer labels for tracking changes to data and code - Instead of the opaque “stale” indicator, Dagster’s UI now indicates whether code, upstream data, or dependencies have changed. When assets are in violation of their FreshnessPolicys, Dagster’s UI now marks them as “overdue” instead of “late”.
  • Auto-materialization and observable source assets - Assets downstream of an observable source asset now use the source asset observations to determine whether upstream data has changed and assets need to be materialized.
  • Pythonic Config and Resources - The set of APIs introduced in 1.2 is no longer experimental [community memo]. Examples, integrations, and documentation have largely been ported to the new APIs. Existing resources and config APIs will continue to be supported for the foreseeable future. Check out the migration guide to learn how to incrementally adopt the new APIs.

Docs

  • Improved run concurrency docs - You asked (in support), and we answered! This new guide is a one-stop-shop for understanding and implementing run concurrency, whether you’re on Dagster Cloud or deploying to your own infrastructure.
  • Additions to the Intro to Assets tutorial - We’ve added two new sections to the assets tutorial, focused on scheduling and I/O. While we’re close to wrapping things up for the tutorial revamp, we still have a few topics to cover - stay tuned!
  • New guide about building machine learning pipelines - Many of our users learn best by example - this guide is one way we’re expanding our library of examples. In this guide, we walk you through building a simple machine learning pipeline using Dagster.
  • Re-organized Dagster Cloud docs - We overhauled how the Dagster Cloud docs are organized, bringing them more in line with the UI.

Since 1.2.7 (core) / 0.18.7 (libraries)

New

  • Long-running runs can now be terminated after going over a set runtime. See the run termination docs to learn more.
  • Adds a performance improvement to partition status caching for multi-partitioned assets containing a time dimension.
  • [ui] Asset groups are now included in global search.
  • [ui] Assets in the asset catalog have richer status information that matches what is displayed on the asset graph.
  • [dagster-aws] New AthenaClientResource, ECRPublicResource, RedshiftClientResource, S3Resource, S3FileManagerResource, ConfigurablePickledObjectS3IOManager, and SecretsManagerResource follow the Pythonic resource system. The existing APIs remain supported.
  • [dagster-datadog] New DatadogResource follows the Pythonic resource system. The existing datadog_resource remains supported.
  • [dagster-ge] New GEContextResource follows the Pythonic resource system. The existing ge_context_resource remains supported.
  • [dagster-github] New GithubResource follows the Pythonic resource system. The existing github_resource remains supported.
  • [dagster-msteams] New MSTeamsResource follows the Pythonic resource system. The existing msteams_resource remains supported.
  • [dagster-slack] New SlackResource follows the Pythonic resource system. The existing slack_resource remains supported.

Bugfixes

  • Fixed an issue where using pdb.set_trace no longer worked when running Dagster locally using dagster dev or dagit.
  • Fixed a regression where passing custom metadata on @asset or Out caused an error to be thrown.
  • Fixed a regression where certain states of the asset graph would cause GQL errors.
  • [ui] Fixed a bug where assets downstream of source assets would sometimes incorrectly display a “New data” (previously “stale”) tag for assets with materializations generated from ops (as opposed to SDA materializations).
  • [ui] Fixed a bug where URLs for code locations named pipelines or jobs could lead to blank pages.
  • [ui] When configuring a partition-mapped asset backfill, helpful context no longer appears nested within the “warnings” section.
  • [ui] For observable source assets, the asset sidebar now shows a “latest observation” instead of a “latest materialization”.

Breaking Changes

  • By default, resources defined on Definitions are now automatically bound to jobs. This will only result in a change in behavior if you a) have a job with no "io_manager" defined in its resource_defs and b) have supplied an IOManager with key "io_manager" to the resource_defs argument of your Definitions. Prior to 1.3.0, this would result in the job using the default filesystem-based IOManager for the key "io_manager". In 1.3.0, this will result in the "io_manager" supplied to your Definitions being used instead. The BindResourcesToJobs wrapper, introduced in 1.2 to simulate this behavior, no longer has any effect.
  • [dagster-celery-k8s] The default kubernetes namespace for run pods when using the Dagster Helm chart with the CeleryK8sRunLauncher is now the same namespace as the Helm chart, instead of the default namespace. To restore the previous behavior, you can set the celeryK8sRunLauncher.jobNamespace field to the string default.
  • [dagster-snowflake-pandas] Due to a longstanding issue storing Pandas Timestamps in Snowflake tables, the SnowflakePandasIOManager has historically converted all timestamp data to strings before storing it in Snowflake. Now, it will instead ensure that timestamp data has a timezone, and if not, attach the UTC timezone. This allows the timestamp data to be stored as timestamps in Snowflake. If you have been storing timestamp data using the SnowflakePandasIOManager you can set the store_timestamps_as_strings=True configuration to continue storing timestamps as strings. For more information, and instructions for migrating Snowflake tables to use timestamp types, see the Migration Guide.
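The "io_manager" binding change in the first item above comes down to a lookup order. The sketch below models that order with plain dictionaries and strings. It is not Dagster code: resolve_io_manager, the bind_to_jobs flag, and the resource names are all hypothetical stand-ins.

```python
DEFAULT_IO_MANAGER = "filesystem_io_manager"  # stand-in for the built-in default


def resolve_io_manager(job_resource_defs, definitions_resources, bind_to_jobs):
    """Pick the "io_manager" a job would use under the rule described above."""
    if "io_manager" in job_resource_defs:
        return job_resource_defs["io_manager"]  # the job's own setting is kept
    if bind_to_jobs and "io_manager" in definitions_resources:
        return definitions_resources["io_manager"]
    return DEFAULT_IO_MANAGER


shared = {"io_manager": "s3_io_manager"}  # hypothetical Definitions resources
before = resolve_io_manager({}, shared, bind_to_jobs=False)  # pre-1.3.0
after = resolve_io_manager({}, shared, bind_to_jobs=True)  # 1.3.0
explicit = resolve_io_manager({"io_manager": "custom"}, shared, bind_to_jobs=True)
print(before, after, explicit)
```

Only the middle case changes between versions, which matches the two conditions listed in the breaking-change note.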

Changes to experimental APIs

  • Pythonic Resources and Config
    • Enabled passing RunConfig to many APIs which previously would only accept a config dictionary.
    • Enabled passing raw Python objects as resources to many APIs which previously would only accept ResourceDefinition.
    • Added the ability to pass execution config when constructing a RunConfig object.
    • Introduced more clear error messages when trying to mutate state on a Pythonic config or resource object.
    • Improved direct invocation experience for assets, ops, schedules and sensors using Pythonic config and resources. Config and resources can now be passed directly as args or kwargs.
  • The minutes_late and previous_minutes_late properties on the experimental FreshnessPolicySensorContext have been renamed to minutes_overdue and previous_minutes_overdue, respectively.

Removal of deprecated APIs

  • [previously deprecated, 0.15.0] metadata_entries arguments to event constructors have been removed. While MetadataEntry still exists and will only be removed in 2.0, it is no longer passable to any Dagster public API — users should always pass a dictionary of metadata values instead.

Experimental

  • Adds a performance improvement to the multi-asset sensor context’s latest_materialization_records_by_key function.

Documentation

  • The Google BigQuery tutorial and reference pages have been updated to use the new BigQueryPandasIOManager and BigQueryPySparkIOManager.
  • The Snowflake tutorial and reference pages have been updated to use the new SnowflakePandasIOManager and SnowflakePySparkIOManager.

Dagster Cloud

  • Previously, when deprovisioning an agent, code location servers were cleaned up in serial. Now, they’re cleaned up in parallel.
dagster - 1.2.7 (core) / 0.18.7 (libraries)

Published by elementl-devtools over 1 year ago

New

  • Resource access (via both required_resource_keys and Pythonic resources) is now supported in observable source assets.
  • [ui] The asset graph now shows how many partitions of each asset are currently materializing, and blue bands appear on the partition health bar.
  • [ui] Added a new page to monitor an asset backfill.
  • [ui] Performance improvement for Runs page for runs that materialize large numbers of assets.
  • [ui] Performance improvements for Run timeline and left navigation for users with large numbers of jobs or assets.
  • [ui] In the run timeline, consolidate “Ad hoc materializations” rows into a single row.
  • [dagster-dbt] Python 3.10 is now supported.
  • [dagster-aws] The EcsRunLauncher now allows you to customize volumes and mount points for the launched ECS task. See the API docs for more information.
  • [dagster-duckdb, dagster-duckdb-pandas, dagster-duckdb-pyspark] New DuckDBPandasIOManager and DuckDBPySparkIOManager follow the Pythonic resource system. The existing duckdb_pandas_io_manager and duckdb_pyspark_io_manager remain supported.
  • [dagster-gcp, dagster-gcp-pandas, dagster-gcp-pyspark] New BigQueryPandasIOManager and BigQueryPySparkIOManager follow the Pythonic resource system. The existing bigquery_pandas_io_manager and bigquery_pyspark_io_manager remain supported.
  • [dagster-gcp] The BigQuery resource now accepts authentication credentials as configuration. If you pass GCP authentication credentials to gcp_credentials, a temporary file to store the credentials will be created and the GOOGLE_APPLICATION_CREDENTIALS environment variable will be set to the temporary file. When the BigQuery resource is garbage collected, the environment variable will be unset and the temporary file deleted.
  • [dagster-snowflake, dagster-snowflake-pandas, dagster-snowflake-pyspark] New SnowflakePandasIOManager and SnowflakePySparkIOManager follow the Pythonic resource system. The existing snowflake_pandas_io_manager and snowflake_pyspark_io_manager remain supported.
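The BigQuery credentials handling described above (write a temporary file, point GOOGLE_APPLICATION_CREDENTIALS at it, then clean both up) follows a common pattern. A standard-library sketch of that pattern is shown here. It is an illustration only: it uses a context manager rather than garbage collection to trigger cleanup, and temporary_credentials_file and the sample JSON are hypothetical.

```python
import os
import tempfile
from contextlib import contextmanager

ENV_VAR = "GOOGLE_APPLICATION_CREDENTIALS"


@contextmanager
def temporary_credentials_file(credentials_json: str):
    """Write credentials to a temp file and point ENV_VAR at it for the block."""
    handle = tempfile.NamedTemporaryFile("w", suffix=".json", delete=False)
    try:
        handle.write(credentials_json)
        handle.close()
        os.environ[ENV_VAR] = handle.name
        yield handle.name
    finally:
        # Mirror the cleanup described above: unset the variable, delete the file.
        os.environ.pop(ENV_VAR, None)
        os.remove(handle.name)


with temporary_credentials_file('{"type": "service_account"}') as path:
    set_during = os.environ.get(ENV_VAR) == path
    existed_during = os.path.exists(path)
cleaned_up = ENV_VAR not in os.environ and not os.path.exists(path)
print(set_during, existed_during, cleaned_up)
```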

Bugfixes

  • Fixed an issue where dagster dev would periodically emit a harmless but annoying warning every few minutes about a gRPC server being shut down.
  • Fixed a schedule evaluation error that occurred when schedules returned a RunRequest(partition_key=...) object.
  • Fixed a bug that caused errors in the asset reconciliation sensor when the event log includes asset materializations with partitions that aren’t part of the asset’s PartitionsDefinition.
  • Fixed a bug that caused errors in the asset reconciliation sensor when a partitioned asset is removed.
  • Fixed an issue where run_request_for_partition would incorrectly raise an error for a job with a DynamicPartitionsDefinition that was defined with a function.
  • Fixed an issue where defining a partitioned job with unpartitioned assets via define_asset_job would raise an error.
  • Fixed a bug where source asset observations could not be launched from dagit when the asset graph contained partitioned assets.
  • Fixed a bug that caused __ASSET_JOB has no op named ... errors when using automatic run retries.
  • [ui] The asset partition health bar now correctly renders partial failed partitions of multi-dimensional assets in a striped red color.
  • [ui] Fixed an issue where steps that were skipped due to an upstream dependency failure were incorrectly listed as “Preparing” in the right-hand column of the runs timeline.
  • [ui] Fixed markdown base64 image embeds.
  • [ui] Guard against localStorage quota errors when storing launchpad config tabs.
  • [dagster-aws] Fixed an issue where the EcsRunLauncher would fail to launch runs if the use_current_ecs_task_config field was set to False but no task_definition field was set.
  • [dagster-k8s] Fixed an issue introduced in 1.2.6 where older versions of the kubernetes Python package were unable to import the package.

Community Contributions

  • The EcsRunLauncher now allows you to set a capacity provider strategy and customize the ephemeral storage used for launched ECS tasks. See the docs for details. Thanks AranVinkItility!
  • Fixed an issue where freshness policies were not being correctly applied to assets with key prefixes defined via AssetsDefinition.from_op. Thanks @tghanken for the fix!
  • Added the minimum_interval_seconds parameter to enable customizing the evaluation interval on the slack run failure sensor, thanks @ldnicolasmay!
  • Fixed a docs example and updated references, thanks @NicolasPA!

Experimental

  • The Resource annotation for Pythonic resource inputs has been renamed to ResourceParam in preparation for the release of the feature in 1.3.
  • When invoking ops and assets that request resources via parameters directly, resources can now be specified as arguments.
  • Improved various error messages related to Pythonic config and resources.
  • If the Resources Dagit feature flag is enabled, resources will now show up in the overview page and search.

Documentation

dagster - 1.2.6 (core) / 0.18.6 (libraries)

Published by elementl-devtools over 1 year ago

Bugfixes

  • Fixed a GraphQL resolution error which occurred when retrieving metadata for step failures in the event log.

1.2.5 (core) / 0.18.5 (libraries)

New

  • materialize and materialize_to_memory now both accept a selection argument that allows specifying a subset of assets to materialize.
  • MultiPartitionsDefinition is no longer marked experimental.
  • Context methods to access time window partition information now work for MultiPartitionsDefinitions with a time dimension.
  • Improved the performance of the asset reconciliation sensor when a non-partitioned asset depends on a partitioned asset.
  • load_assets_from_package_module and similar methods now accept a freshness_policy, which will be applied to all loaded assets.
  • When the asset reconciliation sensor is scheduling based on freshness policies, and there are observable source assets, the observed versions now inform the data time of the assets.
  • build_sensor_context and build_multi_asset_sensor_context can now take a Definitions object in place of a RepositoryDefinition
  • [UI] Performance improvement for loading asset partition statuses.
  • [dagster-aws] s3_resource now accepts use_ssl and verify configurations.

Bugfixes

  • Fixed a bug that caused an error to be raised when passing a multi-asset into the selection argument on define_asset_job.
  • Fixed a GraphQL error that displays on Dagit load when an asset’s partitions definition is changed from a single-dimensional partitions definition to a MultiPartitionsDefinition.
  • Fixed a bug that caused backfills to fail when spanning assets that live in different code locations.
  • Fixed an error that displays when a code location with a MultiPartitionsMapping (experimental) is loaded.
  • Fixed a bug that caused errors with invalid TimeWindowPartitionMappings to not be bubbled up to the UI.
  • Fixed an issue where the scheduler would sometimes incorrectly handle spring Daylight Saving Time transitions for schedules running at 2AM in a timezone other than UTC.
  • Fixed an issue introduced in the 1.2.4 release where running pdb stopped working when using dagster dev.
  • Fixed an issue where it was possible to create AssetMaterialization objects with a null AssetKey.
  • Previously, if you had a TimeWindowPartitionsDefinition with a non-standard cron schedule and also provided a minute_of_hour or similar argument in build_schedule_from_partitioned_job, Dagster would silently create the wrong cron expression. It now raises an error.
  • The asset reconciliation sensor no longer fails when the event log contains materializations with partitions that aren’t in the asset’s PartitionsDefinition. These partitions are now ignored.
  • Fixed a regression that prevented materializing dynamically partitioned assets from the UI (thanks @planvin!)
  • [UI] On the asset graph, the asset health displayed in the sidebar for the selected asset updates as materializations and failures occur.
  • [UI] The asset partitions page has been adjusted to make materialization and observation event metadata more clear.
  • [UI] Large table schema metadata entries now display within a modal rather than taking up considerable space on the page.
  • [UI] Launching a backfill of a partitioned asset with unpartitioned assets immediately upstream no longer shows the “missing partitions” warning.
  • [dagster-airflow] fixed a bug in the PersistentAirflowDatabase where versions of airflow from 2.0.0 till 2.3.0 would not use the correct connection environment variable name.
  • [dagster-census] fixed a bug with the poll_sync_run function of dagster-census that prevented polling from working correctly (thanks @ldincolasmay!)
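
The spring Daylight Saving Time fix above concerns wall-clock times that do not exist on the transition day. The following is a minimal standard-library sketch of how such a time can be detected. It is not Dagster's scheduler code; the function name and the America/Chicago example are assumptions chosen for illustration.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def is_nonexistent_local_time(naive: datetime, tz_name: str) -> bool:
    """Return True if the naive wall-clock time falls in a DST gap in tz_name."""
    tz = ZoneInfo(tz_name)
    local = naive.replace(tzinfo=tz)
    # A time inside the gap does not survive a round trip through UTC:
    # it comes back shifted forward to a wall-clock time that does exist.
    round_tripped = local.astimezone(timezone.utc).astimezone(tz)
    return round_tripped.replace(tzinfo=None) != naive


# On 2023-03-12, clocks in America/Chicago jumped from 02:00 to 03:00.
gap_day = is_nonexistent_local_time(datetime(2023, 3, 12, 2, 0), "America/Chicago")
normal_day = is_nonexistent_local_time(datetime(2023, 3, 13, 2, 0), "America/Chicago")
```

A scheduler that targets 2AM local time has to decide what to do on the day the check returns True, for example by running at the next valid wall-clock time.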

Deprecations

  • The run_request_for_partition method on JobDefinition and UnresolvedAssetJobDefinition is now deprecated and will be removed in 2.0.0. Instead, directly instantiate a run request with a partition key via RunRequest(partition_key=...).

Documentation

  • Added a missing link to next tutorial section (Thanks Mike Kutzma!)
dagster - 1.2.4 (core) / 0.18.4 (libraries)

Published by elementl-devtools over 1 year ago

New

  • Further performance improvements to the asset reconciliation sensor.
  • Performance improvements to asset backfills with large numbers of partitions.
  • New AssetsDefinition.to_source_assets method to convert a set of assets to SourceAsset objects.
  • (experimental) Added partition mapping that defines dependency relationships between different MultiPartitionsDefinitions.
  • [dagster-mlflow] Removed the mlflow pin from the dagster-mlflow package.
  • [ui] Syntax highlighting now supported in rendered markdown code blocks (from metadata).

Bugfixes

  • When using build_asset_reconciliation_sensor, in some cases duplicate runs could be produced for the same partition of an asset. This has been fixed.

  • When using Pythonic configuration for resources, aliased field names would cause an error. This has been fixed.

  • Fixed an issue where context.asset_partitions_time_window_for_output threw an error when an asset was directly invoked with build_op_context.

  • [dagster-dbt] In some cases, use of ephemeral dbt models could cause the dagster representation of the dbt dependency graph to become incorrect. This has been fixed.

  • [celery-k8s] Fixed a bug that caused JSON deserialization errors when an Op or Asset emitted JSON that doesn't represent a DagsterEvent.

  • Fixed an issue where launching a large backfill while running dagster dev would sometimes fail with a connection error after running for a few minutes.

  • Fixed an issue where dagster dev would sometimes hang when running Dagster code that attempted to read in input via stdin.

  • Fixed an issue where runs that take a long time to import code would sometimes continue running even after they were stopped by run monitoring for taking too long to start.

  • Fixed an issue where AssetSelection.groups() would simultaneously select both source and regular assets and consequently raise an error.

  • Fixed an issue where BindResourcesToJobs would raise errors encapsulating jobs which had config specified at definition-time.

  • Fixed Pythonic config objects erroring when omitting optional values rather than specifying None.

  • Fixed Pythonic config and resources not supporting Enum values.

  • DagsterInstance.local_temp and DagsterInstance.ephemeral now use object instance scoped local artifact storage temporary directories instead of a shared process scoped one, removing a class of thread safety errors that could manifest on initialization.

  • Improved direct invocation behavior for ops and assets which specify resource dependencies as parameters, for instance:

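    from dagster import ConfigurableResource, op

    class MyResource(ConfigurableResource):
        pass

    @op
    def my_op(x: int, y: int, my_resource: MyResource) -> int:
        return x + y

    # Direct invocation: pass the resource as a keyword argument.
    my_op(4, 5, my_resource=MyResource())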
    
  • [dagster-azure] Fixed an issue with an AttributeError being thrown when using the async DefaultAzureCredential (thanks @mpicard)

  • [ui] Fixed an issue introduced in 1.2.3 in which no log levels were selected by default when viewing Run logs, which made it appear as if there were no logs at all.
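
The stdin-related hang fixed above can be pictured with a small standard-library sketch: a child process started with an empty stdin sees end-of-file immediately instead of waiting for input. This only illustrates the general technique and is not Dagster's own subprocess code; the helper name and the child script are assumptions.

```python
import subprocess
import sys


def run_child_without_stdin(code: str) -> str:
    """Run a Python snippet in a child process whose stdin is /dev/null."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        stdin=subprocess.DEVNULL,  # reads in the child return EOF right away
        capture_output=True,
        text=True,
        timeout=30,
    )
    return result.stdout


# The child tries to read all of stdin; with DEVNULL it gets an empty string.
child_code = "import sys; data = sys.stdin.read(); print(repr(data))"
output = run_child_without_stdin(child_code)
```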

Deprecations

  • The environment_vars argument to ScheduleDefinition is deprecated (the argument is currently non-functional; environment variables no longer need to be whitelisted for schedules)

Community Contributions

  • Typos fixed in CHANGES.md (thanks @fridiculous)
  • Links to telemetry docs fixed (thanks @Abbe98)
  • --path-prefix can now be supplied via Helm chart (thanks @mpicard)

Documentation

  • New machine learning pipeline with Dagster guide
  • New example of multi-asset conditional materialization
  • New tutorial section about scheduling
  • New images on the Dagster README

All Changes

https://github.com/dagster-io/dagster/compare/1.2.3...1.2.4

  • f361ef7 - [refactor] delete Materialization (#13030) by @smackesey
  • 7268f46 - Add in progress subsets to the partition cache (#13045) by @johannkm
  • 0032e2c - Add multipartitioned assets with dynamic dimension to toys (#13061) by @clairelin135
  • 89c4ed1 - add docs example for multi-asset conditional materialization (#13054) by @sryza
  • 2bd9a12 - Add docs for source asset observation jobs/schedules (#13062) by @smackesey
  • 971010b - Revert "Add in progress subsets to the partition cache (#13045)" by @johannkm
  • 40a569a - [asset-reconciliation][bug] Fix issue where overly-aggressive runs would be kicked off. (#13069) by @OwenKephart
  • 2edf1ee - tweaks to cross-repo-assets toy (#12973) by @sryza
  • 2addaa5 - Show unauthorized error graphql error message (#13064) by @salazarm
  • ece6c4d - [dagster-io/ui] Make Suggest component a bit more flexible (#13056) by @hellendag
  • d8e98d9 - Fix disabled state for launchpad button submenu (#13078) by @salazarm
  • 296dabb - Re-enable in progress subsets in the partition cache (#13082) by @johannkm
  • dce7bc6 - Add dynamic partitions name resolver to dimension type (#13070) by @clairelin135
  • 60b5e20 - asset sensor test docs (#13065) by @prha
  • 9cafa85 - Fix issue where backfill fails when gRPC server is replaced mid-backfill (#13085) by @gibsondan
  • d86e3a5 - Use instance from sensor/schedule context to instantiate resources, delay until accessed (#13041) by @benpankow
  • e20af6b - Add materializing subset to asset gql (#13046) by @johannkm
  • 90a92bf - feat(helm): add path-prefix to dagit command (#13080) by @mpicard
  • 160f3ec - Use dynamic partition definition name for dimension of multipartition definition (#13090) by @salazarm
  • e86a59c - [typing/static] Fix @repository decorator typing (#12295) by @smackesey
  • 61cc090 - [instance] make local artifact directory scheme thread safe (#13043) by @alangenfeld
  • d1e75d0 - 1.2.3 changelog (#13094) by @jamiedemaria
  • 5fd2bb6 - Ensure pyright venvs use statically legible editable installs (#13089) by @smackesey
  • e780948 - [docs] - Remove finished code from dbt tutorial template (#13091) by @erinkcochran87
  • 961a9f8 - [ui] Upgrade react-markdown (#13092) by @hellendag
  • 1c60da6 - Fix submitting backfills synchronously from graphql (#13093) by @gibsondan
  • dfbabb4 - Test get and set serialized_in_progress_partition_subset (#13063) by @johannkm
  • 6dc4e0c - ExitStack.pop_all -> close (#13050) by @alangenfeld
  • 299cd81 - Automation: versioned docs for 1.2.3 by @elementl-devtools
  • 804e113 - Fraser/rework readme (#12565) by @frasermarlow
  • ed8bc7b - [ui] Use DefaultLogLevels when there is no level state stored (#13109) by @hellendag
  • 5646656 - Set stdin to DEVNULL when opening dagster subprocesses (#13099) by @gibsondan
  • 6ff1cd9 - add a vercel github action to build docs/storybook previews (#13052) by @prha
  • d5db43f - [refactor] Remove frozen{list,dict,tags} classes (#12293) by @smackesey
  • bd4408b - Docs for setting up Gitlab CI, branch deployment guide (#12998) by @prha
  • bb0601a - Add assets def to op context (#13088) by @clairelin135
  • b73a36e - make AssetsDefinition.to_source_assets public (#13073) by @sryza
  • 0b745c3 - [docs] New ML pipeline guide PR (#13100) by @odette-elementl
  • 0d2c1a7 - [freshness-refactor][3/n] Update methods on the CachingDataTimeResolver to work with scalar data time (#12906) by @OwenKephart
  • f18cede - Fixing refs to images in the README (#13126) by @tacastillo
  • d7e906e - restrict vercel builds based on paths (#13129) by @prha
  • f707541 - fix missing snapshots (#13134) by @OwenKephart
  • 7f9d8f4 - Telemetry for dynamic partitions (#12605) by @clairelin135
  • a5c572b - fix missing snapshots (again) (#13136) by @OwenKephart
  • fd195cc - [freshness-refactor][4/n] Simplify scheduling algorithm (#13019) by @OwenKephart
  • 3d6f822 - Deprecate environment_vars argument to ScheduleDefinition, @schedule (#13044) by @smackesey
  • 157f80e - [refactor] delete hourly/daily/weekly/monthly schedule decorators, PartitionScheduleDefinition, build_schedule_from_partition (#13006) by @smackesey
  • 1d0a09e - [refactor] simplify dependency dict typing (#12521) by @smackesey
  • f3af63b - celery-k8s executor: handle stdout that's valid json but not a dagster event (#13143) by @johannkm
  • 3e41d51 - fix typo in CHANGES.md (#13140) by @fridiculous
  • 009573b - Fix branch deployment docs (#13131) by @dpeng817
  • b6bd87a - Update multiple agents docs (#13135) by @dpeng817
  • 1e5d825 - add toy for eager asset reconciliation (#13066) by @sryza
  • cddf96a - trying again by moving the images to the same directory as the readme (#13127) by @tacastillo
  • 4f0dddd - [tech][templates] moving the scaffold project's asset loader outside of the defs (#13103) by @tacastillo
  • bb470bc - [docs][tutorial-revamp] Adding a section for scheduling to the tutorial (#13101) by @tacastillo
  • 3720e26 - replacing corrupt .png image (#13157) by @frasermarlow
  • e6d28a6 - [dagster-dbt] Fix bug when calculating transitive dependencies (#13128) by @OwenKephart
  • 38059c4 - deploy storybook to prod when landing pushes on master (#13159) by @prha
  • 8d8d30a - [dagster-azure] fix: AttributeError: 'coroutine' object has no attribute 'token' (#13110) by @mpicard
  • 42131b0 - Add support for more asset tags (#13153) by @braunjj
  • c3fcacd - [caching-refactor] Remove use of get_and_update_asset_status_cache. (#13151) by @OwenKephart
  • 7ba99aa - [asset-reconciliation][perf] Cache common properties on the TimeWindowPartitionsDefinition (#12981) by @OwenKephart
  • 06e6ba5 - MultiPartitionMapping (#12950) by @clairelin135
  • 602eeb7 - [asset-reconciliation] Fix issue with duplicate runs for partitions with in-progress materializations (#13130) by @OwenKephart
  • 07f6569 - update telemetry documentation links (#13176) by @Abbe98
  • 5550452 - [structured config] respect field aliases for resources (#13177) by @jamiedemaria
  • 8e93585 - [refactor] Remove PartitionSetDefinition (#13145) by @smackesey
  • d4a3258 - TableMetadataValue toy (#13124) by @smackesey
  • 616b613 - fix repository decorator typing (#13180) by @smackesey
  • 1c64b59 - add storybook for @core in addition to @ui (#13184) by @prha
  • b8c20ca - feat(dbt): follow dbt Core's version support constraints (#13189) by @rexledesma
  • 18a8939 - Fix case where a pipeline load takes long enough that run monitoring kills it (#13156) by @gibsondan
  • 417c919 - Allow newlines in error messages in Dagit (#13193) by @gibsondan
  • 410c9d6 - Make sure we always wait for all grpc server processes to spin down in tests (#13146) by @gibsondan
  • 6ae3ddc - unpin mlflow from dagster-mlflow (#13194) by @gibsondan
  • 0ec55e4 - Greatly reduce the number of gRPC calls when doing large asset backfills (#13086) by @gibsondan
  • 305f07c - don't run mlflow tests on python 3.11 (#13197) by @gibsondan
  • cd6c386 - [pythonic config] Treat Optional as both not-required and Noneable (#12975) by @benpankow
  • 7f27f3b - [pythonic config] Add support for enums (#12979) by @benpankow
  • f494b38 - [pythonic resources] Make direct invocation of ops/assets w/ resources easier (#13002) by @benpankow
  • 93db1f4 - [bugfix] make AssetSelection.groups() resolve to only regular assets (#13196) by @smackesey
  • 613fba7 - [fix] Fix behavior of BindResourcesToJobs when jobs have config specified (#13057) by @benpankow
  • 9851c54 - [serdes] remove extraneous @cached_method use (#13206) by @alangenfeld
  • 7c045dd - 1.2.4 changelog (#13226) by @smackesey
  • 2cf765c - 1.2.4 by @elementl-devtools
dagster - 1.2.3 (core) / 0.18.3 (libraries)

Published by elementl-devtools over 1 year ago

New

  • Jobs defined via define_asset_job now auto-infer their partitions definitions if not explicitly defined.
  • Observable source assets can now be run as part of a job via define_asset_job. This allows putting them on a schedule/sensor.
  • Added an instance property to the HookContext object that is passed into Op Hook functions, which can be used to access the current DagsterInstance object for the hook.
  • (experimental) Dynamic partitions definitions can now exist as dimensions of multi-partitions definitions.
  • [dagster-pandas] New create_table_schema_metadata_from_dataframe function to generate a TableSchemaMetadataValue from a Pandas DataFrame. Thanks @AndyBys!
  • [dagster-airflow] New option for setting dag_run configuration on the integration’s database resources.
  • [ui] The asset partitions page now links to the most recent failed or in-progress run for the selected partition.
  • [ui] Asset descriptions have been moved to the top in the asset sidebar.
  • [ui] Log filter switches have been consolidated into a single control, and selected log levels will be persisted locally so that the same selections are used by default when viewing a run.
  • [ui] You can now customize the hour formatting in timestamp display: 12-hour, 24-hour, or automatic (based on your browser locale). This option can be found in User Settings.

Bugfixes

  • In certain situations a few of the first partitions displayed as “unpartitioned” in the health bar despite being materialized. This has now been fixed, but users may need to run dagster asset wipe-partitions-status-cache to see the partitions displayed.
  • Starting 1.1.18, users with a gRPC server that could not access the Dagster instance on user code deployments would see an error when launching backfills as the instance could not instantiate. This has been fixed.
  • Previously, incorrect partition status counts would display for static partitions definitions with duplicate keys. This has been fixed.
  • In some situations, having SourceAssets could prevent the build_asset_reconciliation_sensor from kicking off runs of downstream assets. This has been fixed.
  • The build_asset_reconciliation_sensor is now much more performant in cases where unpartitioned assets are upstream or downstream of static-partitioned assets with a large number of partitions.
  • [dagster-airflow] Fixed an issue where the persistent Airflow DB resource required the user to set the correct Airflow database URI environment variable.
  • [dagster-celery-k8s] Fixed an issue where run monitoring failed when setting the jobNamespace field in the Dagster Helm chart when using the CeleryK8sRunLauncher.
  • [ui] Filtering on the asset partitions page no longer results in keys being presented out of order in the left sidebar in some scenarios.
  • [ui] Launching an asset backfill outside an asset job page now supports partition mapping, even if your selection shares a partition space.
  • [ui] In the run timeline, date/time display at the top of the timeline was sometimes broken for users not using the en-US browser locale. This has been fixed.
dagster - 1.2.2 (core) / 0.18.2 (libraries)

Published by elementl-devtools over 1 year ago

New

  • Dagster is now tested on Python 3.11.

  • Users can now opt in to have resources provided to Definitions bind to their jobs. Opt in by wrapping your job definitions in BindResourcesToJobs. This will become the default behavior in the future.

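    from dagster import BindResourcesToJobs, Definitions, job, op

    @op(required_resource_keys={"foo"})
    def my_op(context):
        # Resources are accessed through context.resources.
        print(context.resources.foo)

    @job
    def my_job():
        my_op()

    # foo_resource is assumed to be defined elsewhere.
    defs = Definitions(
        jobs=BindResourcesToJobs([my_job]),
        resources={"foo": foo_resource},
    )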
    
  • Added dagster asset list and dagster asset materialize commands to Dagster’s command line interface, for listing and materializing software-defined assets.

  • build_schedule_from_partitioned_job now accepts jobs partitioned with a MultiPartitionsDefinition that have a time-partitioned dimension.

  • Added SpecificPartitionsPartitionMapping, which allows an asset, or all partitions of an asset, to depend on a specific subset of the partitions in an upstream asset.

  • load_asset_value now supports SourceAssets.

  • [ui] Ctrl+K has been added as a keyboard shortcut to open global search.

  • [ui] In the run logs table, the timestamp column has been moved to the far left, which will hopefully allow for better visual alignment with op names and tags.

  • [dagster-dbt] A new node_info_to_definition_metadata_fn to load_assets_from_dbt_project and load_assets_from_dbt_manifest allows custom metadata to be attached to the asset definitions generated from these methods.

  • [dagster-celery-k8s] The Kubernetes namespace in which runs using the CeleryK8sRunLauncher are launched can now be configured by setting the jobNamespace field in the Dagster Helm chart under celeryK8sRunLauncherConfig.

  • [dagster-gcp] The BigQuery I/O manager now accepts timeout configuration. Currently, this configuration will only be applied when working with Pandas DataFrames, and will set the number of seconds to wait for a request before using a retry.

  • [dagster-gcp] [dagster-snowflake] [dagster-duckdb] The BigQuery, Snowflake, and DuckDB I/O managers now support self-dependent assets. When a partitioned asset depends on a prior partition of itself, the I/O managers will now load that partition as a DataFrame. For the first partition in the dependency sequence, an empty DataFrame will be returned.

  • [dagster-k8s] k8s_job_op now supports running Kubernetes jobs with more than one pod (Thanks @Taadas).
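
The self-dependent asset behavior described above can be pictured with a small sketch. This is plain Python rather than the I/O managers' actual code: the load_previous_partition helper, the in-memory store, and the use of row lists in place of DataFrames are all assumptions made for illustration.

```python
from typing import Dict, List

Row = Dict[str, int]


def load_previous_partition(
    store: Dict[str, List[Row]], ordered_keys: List[str], current_key: str
) -> List[Row]:
    """Return the rows of the partition preceding current_key, or [] for the first."""
    index = ordered_keys.index(current_key)
    if index == 0:
        # The first partition has no predecessor, so it receives empty input.
        return []
    return store.get(ordered_keys[index - 1], [])


# Hypothetical daily partition keys and stored data.
ordered_keys = ["2023-03-01", "2023-03-02", "2023-03-03"]
store = {"2023-03-01": [{"total": 10}]}

first_input = load_previous_partition(store, ordered_keys, "2023-03-01")
second_input = load_previous_partition(store, ordered_keys, "2023-03-02")
```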

Bugfixes

  • Fixed a bug that caused backfill tags that users set in the UI to not be included on the backfill runs when launching an asset backfill.
  • Fixed a bug that prevented resume from failure re-execution for jobs that contained assets and dynamic graphs.
  • Fixed an issue where the asset reconciliation sensor would issue run requests for assets that were targeted by an active asset backfill, resulting in duplicate runs.
  • Fixed an issue where the asset reconciliation sensor could issue runs more frequently than necessary for assets with FreshnessPolicies having intervals longer than 12 hours.
  • Fixed an issue where AssetValueLoader.load_asset_value() didn’t load transitive resource dependencies correctly.
  • Fixed an issue where constructing a RunConfig object with optional config arguments would lead to an error.
  • Fixed the type annotation on ScheduleEvaluationContext.scheduled_execution_time to not be Optional.
  • Fixed the type annotation on OpExecutionContext.partition_time_window (thanks @elben10).
  • InputContext.upstream_output.log is no longer None when loading a source asset.
  • Pydantic type constraints are now supported by the Pythonic config API.
  • An input resolution bug that occurred in certain conditions when composing graphs with same named ops has been fixed.
  • Invoking an op with collisions between positional args and keyword args now throws an exception.
  • async def ops are now invoked with asyncio.run.
  • TimeWindowPartitionsDefinition now throws an error at definition time when passed an invalid cron schedule instead of at runtime.
  • [ui] Previously, using dynamic partitions with assets that required config would raise an error in the launchpad. This has been fixed.
  • [dagster-dbt] Previously, setting a cron_schedule_timezone inside of the config for a dbt model would not result in that property being set on the generated FreshnessPolicy. This has been fixed.
  • [dagster-gcp] Added a fallback download url for the GCSComputeLogManager when the session does not have permissions to generate signed urls.
  • [dagster-snowflake] In a previous release, functionality was added for the Snowflake I/O manager to attempt to create a schema if it did not already exist. This caused an issue when the schema already existed but the account did not have permission to create the schema. We now check if a schema exists before attempting to create it so that accounts with restricted permissions do not error, but schemas can still be created if they do not exist.
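
The note above that async def ops are invoked with asyncio.run can be illustrated with a generic sketch of the pattern. This is not Dagster's invocation code; the invoke helper and the sample functions are assumptions for illustration.

```python
import asyncio
import inspect


def invoke(fn, *args, **kwargs):
    """Call fn; if it returns a coroutine, run it to completion with asyncio.run."""
    result = fn(*args, **kwargs)
    if inspect.iscoroutine(result):
        # asyncio.run creates an event loop, runs the coroutine, and closes the loop.
        return asyncio.run(result)
    return result


async def add_async(x: int, y: int) -> int:
    await asyncio.sleep(0)  # stand-in for real asynchronous work
    return x + y


def add_sync(x: int, y: int) -> int:
    return x + y


async_total = invoke(add_async, 4, 5)
sync_total = invoke(add_sync, 4, 5)
```

With this pattern, the caller gets a plain return value whether the function is synchronous or asynchronous.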

Breaking Changes

  • validate_run_config no longer accepts pipeline_def or mode arguments. These arguments refer to legacy concepts that were removed in Dagster 1.0, and since then there have been no valid values for them.

Experimental

  • Added experimental support for resource requirements in sensors and schedules. Resources can be specified using required_resource_keys and accessed through the context or specified as parameters:

    @sensor(job=my_job, required_resource_keys={"my_resource"})
    def my_sensor(context):
        files_to_process = context.resources.my_resource.get_files()
        ...

    # Alternatively, request the resource as a typed parameter:
    @sensor(job=my_job)
    def my_sensor(context, my_resource: MyResource):
        files_to_process = my_resource.get_files()
        ...
    

Documentation

  • Added a page on asset selection syntax to the Concepts documentation.

All Changes

https://github.com/dagster-io/dagster/compare/1.2.1...1.2.2

  • fb5de00 - [dagster-gcp-pandas] add timeout config (#12637) by @jamiedemaria
  • b91626c - [docs][tutorial-revamp] Basic tutorial revamp parts 1 through 4 (#12509) by @tacastillo
  • f49ec38 - feat(dbt): add support for --debug (#12722) by @rexledesma
  • 5753806 - Default useful Dagster helm chart features to on (#12737) by @gibsondan
  • 589df01 - [For 1.2] Allow both protobuf 3 and 4 in dagster (#12466) by @gibsondan
  • 953e377 - [toys repo] Export partitioned assets toys (#12733) by @salazarm
  • c9ace2c - [pythonic config] Allow using 'resource_defs' with resource args in assets (#12679) by @benpankow
  • bfc222c - [refactor] Remove symbols deprecated until 1.2 (#12360) by @smackesey
  • 5321cfc - [dagster-snowflake] fix inconsistencies in snowflake resource (#12633) by @jamiedemaria
  • cce3edf - [ui] Update permissions for launching jobs and materializing assets (#12681) by @hellendag
  • 2636f63 - [For 1.2] Change default run monitoring settings (#11512) by @gibsondan
  • 8009dc8 - [For 1.2] Don't include run/job tags in k8s_job_ops k8s config computations (#12345) by @gibsondan
  • 5655fc4 - Remove Partitioned Schedules from docs, fix dagit staleStatusCauses mocks (#12742) by @smackesey
  • aecc22c - [Sensor Testing] Add "Test again" button (#12735) by @salazarm
  • 3e06e5f - [1.2.0] [refactor] Delete DagsterTypeMaterializer (#12516) by @smackesey
  • 2431f5e - change DynamicPartitionsDefinition.__repr__ (#12754) by @sryza
  • a5fdaa8 - Recommended Project Structure Guide (#12656) by @odette-elementl
  • cb54b6f - [pythonic config] Add test showcasing use of Pydantic validators (#12536) by @benpankow
  • e10b0b3 - fix serialization of TableRecord (#10731) by @sryza
  • 3915c7a - [dagster-gcp-pandas] revert flaky test (#12748) by @jamiedemaria
  • 6322734 - [typing/static] PartitionsDefinition covariant type var (#12284) by @smackesey
  • b67f436 - [rename] LogicalVersion -> DataVersion (#12500) by @smackesey
  • aac63ae - [rename] LogicalVersion -> DataVersion in gql/dagit (#12501) by @smackesey
  • f10592f - [rename] Rename LogicalVersion -> DataVersion in docs (#12503) by @smackesey
  • ad9d771 - [dagster-airflow] add make_persistent_airflow_db_resource (#12305) by @Ramshackle-Jamathon
  • 54cbe5f - [ui] Remove usePermissionsDEPRECATED (#12771) by @hellendag
  • cb8b44c - [Asset Details Page] Auto-select partition in materialization dialog. (#12734) by @salazarm
  • 204fec1 - feat(databricks)!: remove create_databricks_job_op (#12600) by @rexledesma
  • 383307e - Make permissions_for_location require keyword args (#12774) by @gibsondan
  • 8d696fa - Remove DynamicPartitionsDefinitions API methods (#12744) by @clairelin135
  • a5e9f77 - Set default memory/CPU for new ECS tasks based on runtime platform (#12767) by @gibsondan
  • a428709 - cron_schedule validation for time window partitions (#12761) by @prha
  • 0181a64 - update some guide titles and descriptions (#12775) by @sryza
  • 909a7a7 - add primary keys to all run tables (#12711) by @prha
  • e13e20c - document PartitionMappings in partitions concepts page (#12768) by @sryza
  • 6531476 - make sure kv/daemon_heartbeat queries explicitly enumerate columns (#12789) by @prha
  • fe6f7c4 - update API docs for freshness policies (#12788) by @OwenKephart
  • 6bcec7d - [apidoc] asset_selection repository -> Definitions (#11303) by @yuhan
  • b2ba3d5 - Fix typing on mem_io_manager (#12791) by @dpeng817
  • 28821f1 - [docs] add note about biquery temp tables (#12747) by @jamiedemaria
  • c2c365d - Remove deprecated MetadataEntry constructors, update entry_data internal refs (#12724) by @smackesey
  • 2f1500e - [dagster-airflow] persistent db docs (#12485) by @Ramshackle-Jamathon
  • 3f8b46d - [pythonic resources][fix] Fix inheriting attributes when extending Pythonic resources (#12781) by @benpankow
  • 1c4f5cb - MetadataEntryUnion (#12725) by @smackesey
  • 3811ff7 - [docs] - Update RBAC docs for Cloud (#12752) by @erinkcochran87
  • d53888d - [dagit] Update the asset partitions / events view after run failures (#12798) by @bengotow
  • 24ed3bd - [dagit] Asset node rendering tweaks (#12764, #12765) (#12794) by @bengotow
  • b508302 - Reflect dynamically-added partitions in asset sidebar immediately (#12778) by @salazarm
  • 06d4499 - [Create Partition Dialog] Autofocus input (#12777) by @salazarm
  • a6914da - Remove PartitionMetadataEntry (#12726) by @smackesey
  • a23f2c7 - Maintain a separate cursor in the asset partition cache pointing back to the earliest in progress run (#12782) by @johannkm
  • b032715 - [ui] Run logs: make Timestamp the first column (#12805) by @hellendag
  • ddddc00 - [snowflake-pyspark] refactor spark session creation in tests (#12738) by @jamiedemaria
  • c402ec0 - [db io managers] refactor tests to test table creation (#12739) by @jamiedemaria
  • 590b725 - [dagit] Fix yaml editor help context when indented on a new line (#12537) (#12804) by @bengotow
  • 1908b11 - make context.upstream_output.log work for source assets (#12787) by @sryza
  • 9553d00 - [typing/runtime] serdes (#12523) by @smackesey
  • af55f6f - [dagit] Show “Observe” on source asset details pages, not “Materialize" (#12803) by @bengotow
  • 90f2c7d - [ui] Disable "Open in launchpad" based on permissions (#12817) by @hellendag
  • 0c2c592 - Add a way to cause partition health data to refetch without passing around refetch callbacks (#12822) by @salazarm
  • 4128578 - [ui] Fix space oddities on Schedules overview table (#12810) by @hellendag
  • 8b6bec5 - [ESLint] Update missing graphql variable type rule to check mutations and subscriptions too (#12779) by @salazarm
  • 1b4fedc - Test get and set new asset cache fields (#12820) by @johannkm
  • 173bdd7 - [duckdb-polars] fix readme (#12499) by @jamiedemaria
  • 73eede0 - [freshness-refactor][1/n] Separate part of CachingInstanceQueryer into CachingDataTimeResolver (#12809) by @OwenKephart
  • eab9b33 - [ui] Affordance for copying asset key (#12829) by @hellendag
  • 0c36c23 - [fix][dagit] Allow resources page to scroll (#12827) by @benpankow
  • 85fb26f - [async op] use asyncio.run (#12785) by @alangenfeld
  • c7a1f7f - [dagster-airflow] remove double encoding of timezones (#12811) by @Ramshackle-Jamathon
  • d7b7189 - [ui] Support Ctrl+K to trigger search (#12840) by @hellendag
  • 9aea76c - Virtualize TagSelector component (#12841) by @salazarm
  • 4caeb4e - [Launchpad] Allow adding new dynamic partition from launchpad UI (#12757) by @salazarm
  • eff1d9e - [tox] download latest pip (#12821) by @alangenfeld
  • 6788d37 - [dagster-snowflake] fix permissions issue when creating schemas (#12802) by @jamiedemaria
  • 2507015 - [pythonic config] Special rendering for env vars in UI (#12446) by @benpankow
  • a3b50ff - nothing input toy (#12849) by @sryza
  • f4b5a34 - latestRunForPartition gql (#12845) by @johannkm
  • 622eefc - [dagster-dbt] add support for node_info_to_definition_metadata_fn (#12831) by @OwenKephart
  • 81c7843 - Remove nothing types from asset graph (#12848) by @salazarm
  • 9a391a0 - [TagSelector] Allow filtering tags in dropdown (#12844) by @salazarm
  • 8caf6ee - Add a tooltip on asset nodes in the graph (#12853) by @salazarm
  • 3988f44 - [pythonic resources] Keep track of utilized env vars in external repo data (#12557) by @benpankow
  • 615268c - Changelog 1.2.0 (#12851) by @dpeng817
  • dab8b87 - add a local download url fallback for GCS (#12815) by @prha
  • 796279a - [freshness-refactor][2/n] Reorganize/Rename/Remove/Document CachingInstanceQueryer methods (#12813) by @OwenKephart
  • d464c1a - fix type annotation for partition_time_window (#12850) by @sryza
  • ac47dc2 - Revert "fix type annotation for partition_time_window" (#12860) by @sryza
  • c60f46b - Automation: versioned docs for 1.2.0 by @elementl-devtools
  • b8014b9 - Add volatility property to CachingStaleStatusResolver (#12855) by @smackesey
  • 8f4d41b - Update CHANGES.md (#12864) by @Ramshackle-Jamathon
  • e18607e - [docs] - Add doc checklist to PR template (#12861) by @erinkcochran87
  • a4320d6 - Fix partition_time_window type error (#12838) by @elben10
  • 2421fd0 - Add 1.2.0 changes to migration.md (#12866) by @clairelin135
  • 7ebb44a - Fix config w/ dynamic partitions error (#12710) by @clairelin135
  • b3fcd49 - [1.2] make ScheduleEvaluationContext.scheduled_execution_time non-optional (#12623) by @sryza
  • 2f53acb - asset materialize CLI (#12691) by @sryza
  • b04cb79 - [refactor] serdes pack/deserialize APIs (#12524) by @smackesey
  • 4d14935 - [refactor] MayHaveInstanceWeakRef error on access non-existent _instance (#12553) by @smackesey
  • f49dc0a - asset list command (#12790) by @sryza
  • 3e0d5b1 - fixups to config_schedule.py (#12874) by @sryza
  • f350d4b - Fix some upcoming lint errors in docstrings (#12704) by @charliermarsh
  • cf997c0 - [db io managers] support self dependent assets in db io managers (#12700) by @jamiedemaria
  • fa086c7 - Fix markdown links in changelog (#12877) by @johannkm
  • 237b8c6 - Fix dagster-k8s job execution when a job has more than 1 pod (#12731) by @Taadas
  • 845d1e9 - take grpcio-health-checking pins out of backcompat suite (#12876) by @gibsondan
  • ca8714f - Fix ruff (#12879) by @gibsondan
  • 2ffd661 - need to explicitly return a known existing column for updates (#12883) by @prha
  • ddd58e5 - Fix docs build (#12884) by @smackesey
  • 75cb437 - add 1.2.1 changelog (#12887) by @prha
  • 63bf6b7 - add SpecificPartitionsPartitionMapping (#12878) by @sryza
  • e8eddef - Automation: versioned docs for 1.2.1 by @elementl-devtools
  • 31dc8bd - [docs] - Apply style guide to Asset versioning and caching guide (#12895) by @erinkcochran87
  • 12a3d70 - [dagster-ui] Sync the search filter to the query string on most pages (#12899) by @bengotow
  • 84ed448 - on asset backfills, propagate backfill tags to runs (#12886) by @sryza
  • 03cbce6 - Support SourceAssets in load_asset_value (#12859) by @sryza
  • 630fb5c - Fix transitive resource deps for asset value loader (#12872) by @dpeng817
  • 49180eb - [typing/static] telemetry (#12714) by @smackesey
  • 6f0d5b6 - add docs page on asset selection syntax (#12871) by @sryza
  • f301c86 - Add dropdown for materializing using launchpad (#12852) by @salazarm
  • 3d9cc2d - [ui] Refactor permissions objects (#12903) by @hellendag
  • c105fea - avoid custom PartitionMapping on IO manager concept page (#12902) by @sryza
  • e088a26 - [bk] Add changelog check step (#12865) by @benpankow
  • 8af8cab - tutorial intro tweaks (#12780) by @sryza
  • 530469f - Move docs style guidelines from pull request template to contribution guide (#12875) by @sryza
  • 36a4192 - remove support for legacy APIs from validate_run_config (#12881) by @sryza
  • 3d9ba7f - make reconciliation perf test names easier to scan, and turn them all on (#12908) by @sryza
  • 78c6654 - Make fetch_flattened_time_window_ranges support arbitrary statuses (#12863) by @johannkm
  • 339419a - Build schedule from multipartitioned job (#12907) by @clairelin135
  • 63184c4 - [ui] Clean up temporary permission code (#12923) by @hellendag
  • cbf85f6 - [op] error on multiple values for argument (#12473) by @alangenfeld
  • f8e23a6 - test assets in dynamic re-execution (#12651) by @alangenfeld
  • 2d8874b - Support requiring resources in sensors, experimentally (#12401) by @benpankow
  • 8f8fac7 - Support requiring resources in schedules, experimentally (#12697) by @benpankow
  • 00b4ada - [pythonic config] support Pydantic constrained types (#12703) by @benpankow
  • 8b8c4b2 - [docs] Add env var docs to Pythonic config experimental docs (#12740) by @benpankow
  • 430d929 - [pythonic resources] Surface utilized env vars in GraphQL (#12558) by @benpankow
  • af4fca6 - [dagit] Allow the lineage tab to appear before the asset definition has loaded (#12191) (#12814) by @bengotow
  • f468ed5 - Support defaulting to selected partitions for Multi-partitioned Assets (#12856) by @salazarm
  • 84031b3 - fix execution plan step output collisions (#12799) by @alangenfeld
  • 811eb80 - [pythonic config] Fix using RunConfig with optional config fields (#12933) by @benpankow
  • e47d172 - [Partitions] Use OrdinalPartitionSelector for STATIC partitioned assets (#12919) by @salazarm
  • 4c53596 - fix ruff (#12936) by @alangenfeld
  • 57f0b7f - Unpin grpcio on dagster for python 3.11 (#12917) by @gibsondan
  • d5c3200 - Allow setting the job_namespace for the celery k8s run launcher (#12911) by @gibsondan
  • 169777d - [dagster-dbt] add cron_schedule_timezone to set of properties parsed from dbt config (#12904) by @OwenKephart
  • 67ecc96 - Redesign partition backfill modal (#12928) by @salazarm
  • d43a5ce - Fix error reporter infinite loop (#12941) by @salazarm
  • ad03463 - Add Python 3.11 to dagster buildkite (#12723) by @gibsondan
  • cb775d4 - Bind top-level resources to jobs at Definitions-time (#12430) by @benpankow
  • d342604 - [asset-reconciliation] Do not reconcile partitions involved in active asset backfills (#12926) by @OwenKephart
  • 2ae1d72 - [asset-reconciliation] Fix behavior that could cause overly-aggressive updates (#12940) by @OwenKephart
  • c13f13a - improve execution plan snapshot use in execute_run (#12660) by @alangenfeld
  • f2ff644 - open backfill options section by default (#12957) by @salazarm
  • 5a03457 - [Partition Backfill Dialog] Update backfill options spacing (#12959) by @salazarm
  • aea2c8b - Resolve transitive resource deps for schedules and sensors (#12946) by @benpankow
  • 4a34e5b - changelog 1.2.2 (#12960) by @yuhan
  • c9ca27f - Fix pandera tests (#12947) by @smackesey
  • 1e4fc7e - [dagit] Use pure-asset backfills instead of hidden asset job backfills (#12948) by @bengotow
  • 3b6c517 - 1.2.2 by @elementl-devtools
dagster - 1.2.1 (core) / 0.18.1 (libraries)

Published by elementl-devtools over 1 year ago

Bugfixes

  • Fixed a bug with postgres storage where daemon heartbeats were failing on instances that had not been migrated with dagster instance migrate after upgrading to 1.2.0.
dagster - 1.2.0 (core) / 0.18.0 (libraries)

Published by Ramshackle-Jamathon over 1 year ago

Major Changes since 1.1.0 (core) / 0.17.0 (libraries)

Core

  • Added a new dagster dev command that can be used to run both Dagit and the Dagster daemon in the same process during local development. [docs]
  • Config and Resources
  • Repository > Definitions [docs]
  • Declarative scheduling
    • The asset reconciliation sensor is now 100x more performant in many situations, meaning that it can handle more assets and more partitions.
    • You can now set freshness policies on time-partitioned assets.
    • You can now hover over a stale asset to learn why that asset is considered stale.
  • Partitions
    • DynamicPartitionsDefinition allows partitioning assets dynamically - you can add and remove partitions without reloading your definitions (experimental). [docs]
    • The asset graph in the UI now displays the number of materialized, missing, and failed partitions for each partitioned asset.
    • Asset partitions can now depend on earlier time partitions of the same asset. Backfills and the asset reconciliation sensor respect these dependencies when requesting runs [example].
    • TimeWindowPartitionMapping now accepts start_offset and end_offset arguments that allow specifying that time partitions depend on earlier or later time partitions of upstream assets [docs].
  • Backfills
    • Dagster now allows backfills that target assets with different partitions, such as a daily asset which rolls up into a weekly asset, as long as the root assets in the selection are partitioned in the same way.
    • You can now choose to pass a range of asset partitions to a single run rather than launching a backfill with a run per partition [instructions].
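
The start_offset and end_offset arguments to TimeWindowPartitionMapping mentioned under Partitions can be pictured with plain date arithmetic. The following is a minimal sketch, not Dagster's implementation. It assumes daily partitions keyed by ISO date, and it assumes each offset shifts the matching end of the downstream partition's window by that many partitions. The helper name upstream_partition_keys is hypothetical.

```python
from datetime import date, timedelta
from typing import List


def upstream_partition_keys(
    downstream_key: str, start_offset: int = 0, end_offset: int = 0
) -> List[str]:
    """Hypothetical helper: list the daily upstream partitions that a
    downstream daily partition would depend on, given offsets in days."""
    day = date.fromisoformat(downstream_key)
    first = day + timedelta(days=start_offset)  # shifted start of the window
    last = day + timedelta(days=end_offset)     # shifted end of the window
    keys = []
    current = first
    while current <= last:
        keys.append(current.isoformat())
        current += timedelta(days=1)
    return keys
```

Under these assumptions, start_offset=-1 with end_offset=0 selects the previous day and the same day, while setting both to -1 selects only the previous day.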

Integrations

  • Weights and Biases - A new integration dagster-wandb with Weights & Biases allows you to orchestrate your MLOps pipelines and maintain ML assets with Dagster. [docs]
  • Snowflake + PySpark - A new integration dagster-snowflake-pyspark allows you to store and load PySpark DataFrames as Snowflake tables using the snowflake_pyspark_io_manager. [docs]
  • Google BigQuery - A new BigQuery I/O manager and new integrations dagster-gcp-pandas and dagster-gcp-pyspark allow you to store and load Pandas and PySpark DataFrames as BigQuery tables using the bigquery_pandas_io_manager and bigquery_pyspark_io_manager. [docs]
  • Airflow - The dagster-airflow integration library was bumped to 1.x.x. With that major bump, the library has been refocused on enabling migration from Airflow to Dagster. Refer to the docs for an in-depth migration guide.
  • Databricks - Changes:
    • Added op factories to create ops for running existing Databricks jobs (create_databricks_run_now_op), as well as submitting one-off Databricks jobs (create_databricks_submit_run_op).
    • Added a new Databricks guide.
    • The previous create_databricks_job_op op factory is now deprecated.

Docs

  • Automating pipelines guide - Check out the best practices for automating your Dagster data pipelines with this new guide. Learn when to use different Dagster tools, such as schedules and sensors, using this guide and its included cheatsheet.
  • Structuring your Dagster project guide - Need some help structuring your Dagster project? Learn about our recommendations for getting started and scaling sustainably.
  • Tutorial revamp - Goodbye cereals and hello HackerNews! We’ve overhauled our intro to assets tutorial to not only focus on a more realistic example, but to touch on more Dagster concepts as you build your first end-to-end pipeline in Dagster. Check it out here.

Stay tuned, as this is only the first part of the overhaul. We’ll be adding more chapters - including automating materializations, using resources, using I/O managers, and more - in the next few weeks.

Since 1.1.21 (core) / 0.17.21 (libraries)

New

  • Freshness policies can now be assigned to assets constructed with @graph_asset and @graph_multi_asset.
  • The project_fully_featured example now uses the built in DuckDB and Snowflake I/O managers.
  • A new “failed” state on asset partitions makes it more clear which partitions did not materialize successfully. The number of failed partitions is shown on the asset graph and a new red state appears on asset health bars and status dots.
  • Hovering over “Stale” asset tags in the Dagster UI now explains why the annotated assets are stale. Reasons can include more recent upstream data, changes to code versions, and more.
  • [dagster-airflow] Support for persisting Airflow DB state has been added with make_persistent_airflow_db_resource. This enables support for Airflow features like pools and cross-dagrun state sharing. In particular, retry-from-failure now works for jobs generated from Airflow DAGs.
  • [dagster-gcp-pandas] The BigQueryPandasTypeHandler now uses the google.cloud.bigquery.Client methods load_table_from_dataframe and query rather than the pandas_gbq library to store and fetch DataFrames.
  • [dagster-k8s] The Dagster Helm chart now only overrides args instead of both command and args for user code deployments, allowing you to include a custom ENTRYPOINT in your Dockerfile that loads your code.
  • The protobuf<4 pin in Dagster has been removed. Dagster now works with either protobuf 3 or protobuf 4.
  • [dagster-fivetran] Added the ability to specify op_tags to build_fivetran_assets (thanks @Sedosa!)
  • @graph_asset and @graph_multi_asset now support passing metadata (thanks @askvinni)!

Bugfixes

  • Fixed a bug that caused descriptions supplied to @graph_asset and @graph_multi_asset to be ignored.
  • Fixed a bug that caused serialization errors when using TableRecord.
  • Fixed an issue where partitions definitions passed to @multi_asset and other functions would register as type errors for mypy and other static analyzers.
  • [dagster-aws] Fixed an issue where the EcsRunLauncher failed to launch runs for Windows tasks.
  • [dagster-airflow] Fixed an issue where pendulum timezone strings for Airflow DAG start_date would not be converted correctly causing runs to fail.
  • [dagster-airbyte] Fixed an issue where attaching I/O managers to Airbyte assets would result in errors.
  • [dagster-fivetran] Fixed an issue where attaching I/O managers to Fivetran assets would result in errors.

Database migration

  • Optional database schema migrations, which can be run via dagster instance migrate:
    • Improves Dagit performance by adding a database index which should speed up job run views.
    • Enables dynamic partitions definitions by creating a database table to store partition keys. This feature is experimental and may require future migrations.
    • Adds a primary key id column to the kvs, daemon_heartbeats and instance_info tables, enforcing that all tables have a primary key.

Breaking Changes

  • The minimum grpcio version supported by Dagster has been increased to 1.44.0 so that Dagster can support both protobuf 3 and protobuf 4. Similarly, the minimum protobuf version supported by Dagster has been increased to 3.20.0. We are working closely with the gRPC team on resolving the upstream issues keeping the upper-bound grpcio pin in place in Dagster, and hope to be able to remove it very soon.

  • Prior to 0.9.19, asset keys were serialized in a legacy format. This release removes support for querying asset events serialized with this legacy format. Contact #dagster-support for tooling to migrate legacy events to the supported version. Users who began using assets after 0.9.19 will not be affected by this change.

  • [dagster-snowflake] The execute_query and execute_queries methods of the SnowflakeResource now have consistent behavior based on the values of the fetch_results and use_pandas_result parameters. If only fetch_results is True, the standard Snowflake result will be returned. If fetch_results and use_pandas_result are both True, a pandas DataFrame will be returned. If fetch_results is False and use_pandas_result is True, an error will be raised. If both are False, no result will be returned.

  • [dagster-snowflake] The execute_queries command now returns a list of DataFrames when use_pandas_result is True, rather than appending the results of each query to a single DataFrame.

  • [dagster-shell] The default behavior of the execute and execute_shell_command functions is now to include any environment variables in the calling op. To restore the previous behavior, you can pass in env={} to these functions.

  • [dagster-k8s] Several Dagster features that were previously disabled by default in the Dagster Helm chart are now enabled by default. These features are:

    • The run queue (by default, without a limit). Runs will now always be launched from the Daemon.
    • Run queue parallelism - by default, up to 4 runs can now be pulled off of the queue at a time (as long as the global run limit or tag-based concurrency limits are not exceeded).
    • Run retries - runs will now retry if they have the dagster/max_retries tag set. You can configure a global number of retries in the Helm chart by setting run_retries.max_retries to a value greater than the default of 0.
    • Schedule and sensor parallelism - by default, the daemon will now run up to 4 sensors and up to 4 schedules in parallel.
    • Run monitoring - Dagster will detect hanging runs and move them into a FAILURE state for you (or start a retry for you if the run is configured to allow retries). By default, runs that have been in STARTING for more than 5 minutes will be assumed to be hanging and will be terminated.

    Each of these features can be disabled in the Helm chart to restore the previous behavior.

  • [dagster-k8s] The experimental k8s_job_op op and execute_k8s_job functions no longer automatically include configuration from a dagster-k8s/config tag on the Dagster job in the launched Kubernetes job. To include raw Kubernetes configuration in a k8s_job_op, you can set the container_config, pod_template_spec_metadata, pod_spec_config, or job_metadata config fields on the k8s_job_op (or arguments to the execute_k8s_job function).

  • [dagster-databricks] The integration has now been refactored to support the official Databricks API.

    • create_databricks_job_op is now deprecated. To submit one-off runs of Databricks tasks, you must now use the create_databricks_submit_run_op.
    • The Databricks host that is passed to the databricks_client resource must now begin with https://.
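
The four combinations of fetch_results and use_pandas_result described in the [dagster-snowflake] item above can be summarized as a small decision function. This is only a sketch of the documented behavior, not the SnowflakeResource code. The helper name expected_result_kind is hypothetical, and ValueError is an assumed stand-in because the notes do not say which error type is raised.

```python
def expected_result_kind(fetch_results: bool, use_pandas_result: bool) -> str:
    """Hypothetical helper that names the outcome for each flag combination."""
    if use_pandas_result and not fetch_results:
        # The notes only say "an error will be raised"; ValueError is an assumption.
        raise ValueError("use_pandas_result=True requires fetch_results=True")
    if fetch_results and use_pandas_result:
        return "pandas DataFrame"
    if fetch_results:
        return "standard Snowflake result"
    return "no result"
```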

Changes to experimental APIs

  • [experimental] LogicalVersion has been renamed to DataVersion and LogicalVersionProvenance has been renamed to DataProvenance.
  • [experimental] Methods on the experimental DynamicPartitionsDefinition to add, remove, and check for existence of partitions have been removed. Refer to documentation for updated API methods.

Removal of deprecated APIs

  • [previously deprecated, 0.15.0] Static constructors on MetadataEntry have been removed.
  • [previously deprecated, 1.0.0] DagsterTypeMaterializer, DagsterTypeMaterializerContext, and @dagster_type_materializer have been removed.
  • [previously deprecated, 1.0.0] PartitionScheduleDefinition has been removed.
  • [previously deprecated, 1.0.0] RunRecord.pipeline_run has been removed (use RunRecord.dagster_run).
  • [previously deprecated, 1.0.0] DependencyDefinition.solid has been removed (use DependencyDefinition.node).
  • [previously deprecated, 1.0.0] The pipeline_run argument to build_resources has been removed (use dagster_run)

Community Contributions

  • Deprecated iteritems usage was removed and changed to the recommended items within dagster-snowflake-pandas (thanks @sethkimmel3)!
  • Refactor to simplify the new @graph_asset decorator (thanks @simonvanderveldt)!

Experimental

  • User-computed DataVersions can now be returned on Output
  • Asset provenance info can be accessed via OpExecutionContext.get_asset_provenance
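
One common way to produce a user-computed data version is to hash the asset's content, so that the version changes only when the data does. The sketch below uses only the standard library and is an illustration rather than anything Dagster prescribes. The helper name content_data_version is hypothetical, and the choice of SHA-256 over canonical JSON is an assumption. The resulting string is the kind of value that could be wrapped in a DataVersion and returned on an Output.

```python
import hashlib
import json


def content_data_version(records: list) -> str:
    """Hypothetical helper: derive a stable version string from JSON-serializable data."""
    # Sort keys and fix separators so equal data always serializes identically.
    canonical = json.dumps(records, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Because the serialization is canonical, two records with the same fields in a different key order yield the same version, while any change in a value yields a different one.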

Documentation

  • The Asset Versioning and Caching Guide now includes a section on user-provided data versions
  • The community contributions doc block "Picking a github issue" was not rendering correctly. This has been fixed (thanks @Sedosa)!
dagster - 1.1.21 (core) / 0.17.21 (libraries)

Published by elementl-devtools over 1 year ago

New

  • Further performance improvements for build_asset_reconciliation_sensor.
  • Dagster now allows you to backfill asset selections that include mapped partition definitions, such as a daily asset which rolls up into a weekly asset, as long as the root assets in your selection share a partition definition.
  • Dagit now includes information about the cause of an asset’s staleness.
  • Improved the error message for non-matching cron schedules in TimeWindowPartitionMappings with offsets. (Thanks Sean Han!)
  • [dagster-aws] The EcsRunLauncher now allows you to configure the runtimePlatform field for the task definitions of the runs that it launches, allowing it to launch runs using Windows Docker images.
  • [dagster-azure] Add support for DefaultAzureCredential for adls2_resource (Thanks Martin Picard!)
  • [dagster-databricks] Added op factories to create ops for running existing Databricks jobs (create_databricks_run_now_op), as well as submitting one-off Databricks jobs (create_databricks_submit_run_op). See the new Databricks guide for more details.
  • [dagster-duckdb-polars] Added a dagster-duckdb-polars library that includes a DuckDBPolarsTypeHandler for use with build_duckdb_io_manager, which allows loading / storing Polars DataFrames from/to DuckDB. (Thanks Pezhman Zarabadi-Poor!)
  • [dagster-gcp-pyspark] New PySpark TypeHandler for the BigQuery I/O manager. Store and load your PySpark DataFrames in BigQuery using bigquery_pyspark_io_manager.
  • [dagster-snowflake] [dagster-duckdb] The Snowflake and DuckDB IO managers can now load multiple partitions in a single step - e.g. when a non-partitioned asset depends on a partitioned asset or a single partition of an asset depends on multiple partitions of an upstream asset. Loading occurs using a single SQL query and returns a single DataFrame.
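
Loading several partitions with one SQL query, as described in the last item above, amounts to filtering on the partition column with a set of keys. The sketch below illustrates the idea with the standard library's sqlite3 module rather than Snowflake or DuckDB, and it returns rows instead of a DataFrame. The function name load_partitions and the table layout are hypothetical, and the real I/O managers may build their queries differently.

```python
import sqlite3
from typing import List, Sequence, Tuple


def load_partitions(
    conn: sqlite3.Connection,
    table: str,
    partition_column: str,
    partition_keys: Sequence[str],
) -> List[Tuple]:
    """Hypothetical helper: fetch rows for several partitions in one query."""
    # One placeholder per key; the key values are bound as parameters.
    placeholders = ", ".join("?" for _ in partition_keys)
    # Table and column names are interpolated, so they must come from trusted code.
    query = f"SELECT * FROM {table} WHERE {partition_column} IN ({placeholders})"
    return conn.execute(query, list(partition_keys)).fetchall()
```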

Bugfixes

  • Previously, if an AssetSelection which matched no assets was passed into define_asset_job, the resulting job would target all assets in the repository. This has been fixed.
  • Fixed a bug that caused the UI to show an error if you tried to preview a future schedule tick for a schedule built using build_schedule_from_partitioned_job.
  • When a non-partitioned non-asset job has an input that comes from a partitioned SourceAsset, we now load all partitions of that asset.
  • Updated the fs_io_manager to store multipartitioned materializations in directory levels by dimension. This resolves a bug on Windows where multipartitioned materializations could not be stored with the fs_io_manager.
  • Schedules and sensors previously timed out when attempting to yield many multipartitioned run requests. This has been fixed.
  • Fixed a bug where context.partition_key would raise an error when executing on a partition range within a single run via Dagit.
  • Fixed a bug that caused the default IO manager to incorrectly raise type errors in some situations with partitioned inputs.
  • [ui] Fixed a bug where partition health would fail to display for certain time window partitions definitions with positive offsets.
  • [ui] Always show the “Reload all” button on the code locations list page, to avoid an issue where the button was not available when adding a second location.
  • [ui] Fixed a bug where users running multiple replicas of dagit would see repeated Definitions reloaded messages on fresh page loads.
  • [ui] The asset graph now shows only the last path component of linked assets for better readability.
  • [ui] The op metadata panel no longer capitalizes metadata keys.
  • [ui] The asset partitions page, asset sidebar and materialization dialog are significantly smoother when viewing assets with a large number of partitions (100k+)
  • [dagster-gcp-pandas] The Pandas TypeHandler for BigQuery now respects user provided location information.
  • [dagster-snowflake] ProgrammingError was imported from the wrong library. This has been fixed. Thanks @herbert-allium!
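
The fs_io_manager fix above stores each dimension of a multipartition key as its own directory level. A rough sketch of that idea follows. It is not the library's actual code. It assumes the multipartition key is a single string whose dimension values are joined by "|", a character that Windows does not allow in file names, and the helper name multipartition_path is hypothetical.

```python
from pathlib import PurePosixPath


def multipartition_path(base_dir: str, asset_name: str, partition_key: str) -> PurePosixPath:
    """Hypothetical helper: place each partition dimension in its own directory level."""
    # Assumed key format: dimension values joined by "|", e.g. "2023-01-01|us".
    dimension_values = partition_key.split("|")
    return PurePosixPath(base_dir, asset_name, *dimension_values)
```

A key with a single dimension contains no delimiter, so it maps to one directory level as before.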

Experimental

  • You can now set an explicit logical version on Output objects rather than using Dagster’s auto-generated versions.
  • New get_asset_provenance method on OpExecutionContext allows fetching logical version provenance for an arbitrary asset key.
  • [ui] You can now create dynamic partitions from the partition selection UI when materializing a dynamically partitioned asset.

All Changes

https://github.com/dagster-io/dagster/compare/1.1.20...1.1.21

  • 4343d59 - dagster-census api docs (#12413) by @yuhan
  • 24b7e9b - graph_asset and graph_multi_asset decorators (#10152) by @sryza
  • aa29161 - [dagster-snowflake-pyspark] fix bug loading partitions (#12472) by @jamiedemaria
  • 2197dec - add graphql fields for querying run tags (#12409) by @prha
  • 8889b48 - Add stale status causes (#11953) by @smackesey
  • 323cdc8 - fix (#12477) by @salazarm
  • 8e21900 - Update Contributing doc with instructions for ruff/pyright (#12481) by @smackesey
  • c472edb - [bigquery] mark bigquery io manager experimental (#12479) by @jamiedemaria
  • f2084d7 - add partial tag autocomplete for run filter input (#12410) by @prha
  • 93e7cc1 - Support env valueFrom in Helm chart (#12425) by @johannkm
  • 6601156 - Update GQL to expose StaleStatus and StaleStatusCause (#11952) by @smackesey
  • 0ffe4a7 - remove timestamp comparisons of code location entries to reduce OSS dagit replica spam (#12407) by @prha
  • 6af6f55 - Fix state status logical version test (#12484) by @smackesey
  • a15d965 - fix ruff (#12486) by @alangenfeld
  • 2c97adf - use opt_nullable_mapping for dagster library versions (#12487) by @alangenfeld
  • 965152d - clarify error when op is missing argument for In (#12456) by @sryza
  • 361b0ee - Remove existing RunConfig class (#12488) by @benpankow
  • 75b6a5c - [pythonic resources] Clean up initialization of env vars, treat resource objects as immutable (#12445) by @benpankow
  • e3c8825 - [structured config] Add support for Selectors w/ pydantic discriminated unions (#11280) by @benpankow
  • 2d43d39 - Allow setting logical version inside op (#12189) by @smackesey
  • 39f525a - Replace usages of nslookupwithnc for user deployments (#11033) by @michaeljguarino
  • 2492317 - Revert "Replace usages of nslookupwithnc for user deployments (#11033)" by @johannkm
  • b7fae37 - [pythonic resources] Last set of class renames (#12490) by @benpankow
  • 763283f - [dagster-azure] Add support for DefaultAzureCredential for adls2_resource (#11309) by @mpicard
  • 1d1f5ce - Add example of customizing task role and execution role arn to the ECS agent docs (#12491) by @gibsondan
  • c4e3e87 - add dagster-duckdb-polars library (#12197) by @pzarabadip
  • 3a05583 - [draft][pythonic config][docs] Introduce intro to Resources doc utilizing Pythonic resources (#12260) by @benpankow
  • efed336 - [pythonic config] Add structured RunConfig object for specifying runtime, job config (#11965) by @benpankow
  • 2f4a0e5 - [draft][pythonic config][docs] Introduce intro to Config doc utilizing Pythonic config (#12349) by @benpankow
  • 4439129 - 1.1.20 changelog (#12506) by @benpankow
  • 3dc1233 - refactor(databricks): lift polling methods up to the client (#12382) by @rexledesma
  • edac939 - [fix] fix sphinx airflow version parsing (#12507) by @benpankow
  • 215ae70 - Automation: versioned docs for 1.1.20 by @elementl-devtools
  • da35211 - [dagit] Expose range-based asset health, use it for partition status rendering (#12302) by @bengotow
  • e3fd740 - [dagit] Use range-based asset health for asset partitions / job partitions pages (#12434) by @bengotow
  • 63b4273 - Fix 1.1.20 changelog codeblock (#12525) by @johannkm
  • 2b1ad7c - [dagit] Delete generated GraphQL types before regenerating (#12518) by @hellendag
  • 024db65 - [refactor] Delete build_solid_context (#12513) by @smackesey
  • 18e7eb6 - [typing/static] serdes (#12522) by @smackesey
  • 3e30026 - [refactor] make ResolvedRunConfig.to_dict use "ops" (#12514) by @smackesey
  • 7e34083 - Improve multipartition performance for get_partition (#12431) by @clairelin135
  • e798edf - feat(databricks): override user agent in resource (#12526) by @rexledesma
  • 7036fca - Bump typing-extensions dep to >=4.4.0 (#12529) by @smackesey
  • 781a8ea - [docs] Snowflake reference page fixes (#12455) by @jamiedemaria
  • 2cc338b - [bugfix] Make assets downstream of partitions never stale in dagit (#12528) by @smackesey
  • 65e92a1 - Enable internal testing for writing asset cached status data (#12497) by @clairelin135
  • 281ec4b - [typing] fix typing in daemon tests (#12475) by @dpeng817
  • f596b2f - dynamic partitions toy (#12533) by @sryza
  • 7875f44 - pare down multi-partition runtime type checking in upath IO manager (#12508) by @sryza
  • 102c0bc - [dagit] Upgrade to Jest 29, allow more time for coverage collection (#12534) by @bengotow
  • a3b7dfc - Add description to invariant check (#12496) by @CodeMySky
  • e00ba72 - [fix] Correctly resolve asset jobs with empty selections (#12531) by @OwenKephart
  • 9d9b072 - [dagit] Support backfills on partition-mapped asset selections (#12458) by @bengotow
  • 4fc6e3e - [dagit] Remove unnecessary usage of <TestProvider> (#12519) by @bengotow
  • 31022e0 - Fix multipartitions w/ fs_io_manager on windows (#12414) by @clairelin135
  • dd4ba07 - Fix master (#12550) by @clairelin135
  • c93eb54 - Booleans that check permissions for specific objects and automatically incorporate per-location checks (#12548) by @gibsondan
  • e867fcc - docs(dagster-dbt): add example to retrieve asset keys from dbt selection (#12323) by @rexledesma
  • 704c638 - Fix IO manager doc snippets (#12545) by @clairelin135
  • 6f2af95 - fix setup.cfg in create package (#12559) by @jamiedemaria
  • 3a5ff3e - Fetch multipartition key from context methods (#12512) by @clairelin135
  • 7d2b6be - Include partition in asset materialization planned event (#12333) by @johannkm
  • 97fe5a0 - handle multiple non-time-window partitions in db io managers (#12517) by @sryza
  • 848c6e8 - Add runtime_platform to EcsRunLauncher (#12566) by @gibsondan
  • 2b22589 - Tag Selector Component (#12564) by @salazarm
  • 1c3633c - Storybook for LaunchAssetChoosePartitionsDialog (#12567) by @salazarm
  • 9ff5327 - Update setting-up-alerts.mdx (#12582) by @johannkm
  • 2726401 - Move useViewport to uiformcore (#12578) by @salazarm
  • 9a519d0 - [ui] Always show "Reload all" button (#12588) by @hellendag
  • a6b91b3 - Bump minimist from 1.2.5 to 1.2.8 in /js_modules/dagit (#12595) by @dependabot[bot]
  • 57830ae - Expose partition definition name (#12585) by @salazarm
  • 6a1696b - add dynamic partitions example (#12494) by @sryza
  • 2faf365 - [dagit] Do not capitalize tag keys in the Dagit op metadata panel (#12591) by @bengotow
  • bc130b8 - [graphql] empty numPartitions and partitionNames for cross-partitioning backfills (#12462) by @sryza
  • 14337e8 - [dagit] Use last path component of linked assets on the asset graph + add tooltips (#12590) by @bengotow
  • 116ec37 - [dagster-snowflake] Fix ProgrammingError import in snowflake_io_manager (#12576) by @herbert-allium
  • 9041d57 - remove dagster-cloud dep from dynamic partitions example (#12603) by @sryza
  • c8bd24a - fix future ticks for partitioned schedules (#12601) by @sryza
  • a41fecf - fix loading partitioned source assets in non-partitioned non-asset jobs (#12586) by @sryza
  • c78992f - Update stale reasons to remove UNKNOWN, add MISSING (#12584) by @smackesey
  • 216c9fc - GQL Add dynamic partition mutation (#12562) by @clairelin135
  • 84201da - Add asset provenance access to OpExecutionContext (#12467) by @smackesey
  • 7e3be46 - feat(databricks): add basic op implementations (#12492) by @rexledesma
  • 2ffb21e - docs(databricks): add guide for Databricks integration (#12326) by @rexledesma
  • d170c85 - migrate some tests away from AssetGroup (#12606) by @sryza
  • 1a3019b - divest permissions for editing sensor from updating sensor cursor (#12589) by @dpeng817
  • f532d6e - [dagit] Fix tests pinned to February (#12616) by @bengotow
  • e403d39 - Move tests, fixtures, and storybooks into folders (#12614) by @salazarm
  • 6ac4cff - Add root stale causes for assets (#12619) by @smackesey
  • dab3831 - docs for graphs that depend on assets (#12597) by @sryza
  • 122d46e - [UI] Dynamic partition creation and selection (#12615) by @salazarm
  • e4418ee - [asset-reconciliation][perf] Do not get/set known_used_data in asset reconciliation loop (#12433) by @OwenKephart
  • e4aeeee - [dagster-gcp-pyspark] Add BigQuery PySpark type handler (#12398) by @jamiedemaria
  • b552e39 - Automating pipeline guide (#12547) by @odette-elementl
  • a1b96c7 - Capture db timeout errors for tag key queries (#12596) by @prha
  • e1f9d7f - Get partitions with planned but not completed materializations, the event log query (#12404) by @johannkm
  • 83c875d - Add failures to partition status cache (#12599) by @johannkm
  • ff7d080 - [bigquery-pandas] pipe gcp location through to type handler (#12587) by @jamiedemaria
  • 08e5e0e - [docs, bigquery] tutorial and reference guide (#12452) by @jamiedemaria
  • 5d9e03c - make freshness policies work with graph_asset and graph_multi_asset (#12630) by @sryza
  • f8acfcd - Revert "Add failures to partition status cache (#12599)" by @johannkm
  • 7f4676b - Revert "Get partitions with planned but not completed materializations, the event log query (#12404)" by @johannkm
  • 445a134 - Modify get_first_partition_window to account for offset (#12504) by @clairelin135
  • a1b4eea - [dagit] Switch to staleStatus for stale tags, display causes in tooltips (#12611) by @bengotow
  • 349a0b0 - Change format of staleness root causes and remove old resolvers (#12628) by @smackesey
  • c6e54a6 - Helm: Support valueFrom in user deployments (#12644) by @johannkm
  • 69f751c - Update asset versioning guide (#12659) by @smackesey
  • 6d17799 - update CHANGES.md for 1.1.21 (#12662) by @sryza
  • ed9e77b - docs(databricks): update guide based on feedback (#12669) by @rexledesma
  • c607767 - 1.1.21 by @elementl-devtools
dagster - 1.1.20 (core) / 0.17.20 (libraries)

Published by benpankow over 1 year ago

New

  • The new @graph_asset and @graph_multi_asset decorators make it more ergonomic to define graph-backed assets.

  • Dagster will auto-infer dependency relationships between single-dimensionally partitioned assets and multipartitioned assets, when the single-dimensional partitions definition is a dimension of the MultiPartitionsDefinition.

  • A new Test sensor / Test schedule button allows you to perform a dry run of your sensor or schedule. See the sensor and schedule docs for more on this functionality.

  • [dagit] Added (back) tag autocompletion in the runs filter, now with improved query performance.

  • [dagit] The Dagster libraries and their versions that were used when loading definitions can now be viewed in the actions menu for each code location.

  • New bigquery_pandas_io_manager can store and load Pandas dataframes in BigQuery.

  • [dagster-snowflake, dagster-duckdb] SnowflakeIOManagers and DuckDBIOManagers can now default to loading inputs as a specified type if a type annotation does not exist for the input.

  • [dagster-dbt] Added the ability to use the “state:” selector.

  • [dagster-k8s] The Helm chart now supports the full kubernetes env var spec for Dagit and the Daemon. E.g.

    dagit:
      env:
      - name: "FOO"
        valueFrom:
          fieldRef:
            fieldPath: metadata.uid
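
The inferred relationship between a single-dimension partitions definition and a MultiPartitionsDefinition that contains it as a dimension can be pictured with plain Python. This is a conceptual sketch only, not Dagster's implementation: the partition keys, the `date` and `region` dimension names, and both helper functions are hypothetical.

```python
from itertools import product

# Hypothetical partition keys, used only for illustration.
dates = ["2023-01-01", "2023-01-02"]
regions = ["us", "eu"]

# A multipartitioned asset has one partition per combination of its dimensions.
multi_keys = [{"date": d, "region": r} for d, r in product(dates, regions)]


def downstream_multi_keys(date_key):
    """Multipartition keys that depend on one single-dimension `date` partition."""
    return [k for k in multi_keys if k["date"] == date_key]


def upstream_date_key(multi_key):
    """The single-dimension partition that a multipartition key depends on."""
    return multi_key["date"]


print(downstream_multi_keys("2023-01-01"))
print(upstream_date_key({"date": "2023-01-02", "region": "eu"}))
```

In this picture, one `date` partition fans out to every multipartition key sharing that date, and each multipartition key maps back to exactly one `date` partition.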
    

Bugfixes

  • Previously, graphs would fail to resolve an input with a custom type and an input manager key. This has been fixed.
  • Fixed a bug where negative partition counts were displayed in the asset graph.
  • Previously, when an asset sensor did not yield run requests, it returned an empty result. This has been updated to yield a meaningful message.
  • Fixed an issue where a non-partitioned asset downstream of a partitioned asset with self-dependencies caused a GQL error in dagit.
  • [dagster-snowflake-pyspark] Fixed a bug where the PySparkTypeHandler was incorrectly loading partitioned data.
  • [dagster-k8s] Fixed an issue where run monitoring sometimes failed to detect that the kubernetes job for a run had stopped, leaving the run hanging.

Documentation

  • Updated contributor docs to reference our new toolchain (ruff, pyright).
  • (experimental) Documentation for the dynamic partitions definition has been added.
  • [dagster-snowflake] The Snowflake I/O Manager reference page now includes information on working with partitioned assets.

All Changes

https://github.com/dagster-io/dagster/compare/1.1.19...1.1.20

  • c488fdb - disable check_same_thread on in-memory sqlite storage (#12229) by @alangenfeld
  • 370093d - [direct invoke] yield implicit Nothing Output (#12309) by @alangenfeld
  • 8a42a96 - Fix multipartitions run length encoding error (#12329) by @clairelin135
  • 268ac07 - [freshness-policies] Allow setting freshness policies when using graph-backed assets (#12357) by @OwenKephart
  • baab234 - Add skip reason to asset sensor (#12343) by @OwenKephart
  • 49fb47f - Fix partitions backfill deserialization error (#12238) by @clairelin135
  • 840fda0 - Move CachingRepositoryData.from_list and from_dict into standalone function (#12321) by @schrockn
  • 9732826 - refactor(databricks): divest from databricks_api in favor of databricks-cli (#12153) by @rexledesma
  • b573daa - nullsafe array index access (#12362) by @salazarm
  • 7962972 - Change schedule button text (#12361) by @dpeng817
  • 17ea06c - [dagit] add full serialized error to graphql errors (#12228) by @alangenfeld
  • 5ebe64a - [dagster-pandas][dagster-pandera] assign a typing_type for generated pandas dataframe DagsterTypes (#12363) by @OwenKephart
  • 5238e6f - [typing/static] Execution API types (#12330) by @smackesey
  • 84aa559 - [refactor] DependencyDefinition renames (#12338) by @smackesey
  • 2779527 - [refactor] execute_step renames (#12354) by @smackesey
  • e81c9b1 - [typing/runtime] Standardize StepInputSource.load_input_object (#12342) by @smackesey
  • 915feb5 - [refactor] Delete NodeInput.solid_name (#12339) by @smackesey
  • 3fe5dcd - [refactor] NodeDefiniton.iterate_solid_defs -> iterate_op_defs (#12336) by @smackesey
  • 85226fd - [refactor] GraphDefinition method renames (#12335) by @smackesey
  • 0d62593 - [2/n][structured config] Enable struct config resources, IO managers to depend on other resources (#11645) by @benpankow
  • c8e4fb2 - [refactor] local var/private arg solid -> node (#12337) by @smackesey
  • ec70f8a - DagsterLibraryRegistry (#12266) by @alangenfeld
  • 47dc694 - [refactor] misc core solid -> node renames (#12368) by @smackesey
  • 15a5c59 - [refactor] dagster._core.definitions.solid_container -> node_container (#12369) by @smackesey
  • ef8da99 - [refactor] Assorted local var solid -> node (#12370) by @smackesey
  • cf847f4 - Fix intermittent dynamic partitions table SQLite concurrency error (#12367) by @clairelin135
  • 7f398b5 - change storage signature for run tags (#12348) by @prha
  • 447f931 - 1.1.19 Changelog (#12378) by @OwenKephart
  • 85c1ac5 - guide to how assets relate to ops and graphs (#12204) by @sryza
  • 39500d8 - [pythonic config] Rename pythonic config classes (#12235) by @benpankow
  • 8783d2f - updates tests to handle new kubernetes resources field (#12395) by @alangenfeld
  • b592861 - [structured config] Migrate resources from project-fully-featured to struct config resources (#11785) by @benpankow
  • fbd6a8f - refactor(databricks): add types to databricks.py (#12364) by @rexledesma
  • cc6ddf9 - refactor(databricks): consolidate types (#12366) by @rexledesma
  • d9f0bda - add dagster_libraries to ListRepositoriesResponse (#12267) by @alangenfeld
  • 18cc0c1 - [graphql] add RepositoryLocation.dagsterLibraryVersions (#12268) by @alangenfeld
  • b31c14f - [dagit] add dagster libraries menu to code location row (#12315) by @alangenfeld
  • 19dac72 - 1.1.19 changelog: reorder code block (#12402) by @yuhan
  • c251806 - refactor(databricks): use databricks_cli's raw api client (#12377) by @rexledesma
  • 532ced5 - [docs] [snowflake] Add partitions to snowflake guide (#12231) by @jamiedemaria
  • cf0779b - Add valid start time check to materialized time partitions subsets (#12403) by @clairelin135
  • 55ec34a - Add api docs for some PartitionsDefinition and PartitionMapping classes (#12365) by @sryza
  • 3fd1174 - Add text to timestamp dropdown (#12379) by @dpeng817
  • 866a100 - [refactor] IExecutionStep.solid_handle -> node_handle (#12371) by @smackesey
  • 82901d3 - [asset-reconciliation] Factor in more run statuses (#12412) by @OwenKephart
  • c132dce - [refactor] *ExecutionContext.solid_config -> op_config (#12372) by @smackesey
  • fa3418e - [refactor] ResolvedRunConfig.solids -> ops (#12373) by @smackesey
  • 5f4cb11 - Automation: versioned docs for 1.1.19 by @elementl-devtools
  • d109e89 - [refactor] assorted Dagstermill renames (#12380) by @smackesey
  • 4ce1f6f - [refactor] Context solid renames (#12374) by @smackesey
  • ded407f - lambda_solid -> solid (#10816) by @smackesey
  • a090efa - [refactor] Assorted pipeline_run -> dagster_run (#12383) by @smackesey
  • a1d5d14 - Make a script to template out new dagster packages (#12389) by @jamiedemaria
  • e660cd8 - [library template] add registry call (#12418) by @alangenfeld
  • 074ae45 - [db io managers] connection refactor (#12258) by @jamiedemaria
  • 4a04e26 - Code location alerting docs (#12411) by @dpeng817
  • be7050e - [refactor] remove @solid decorator (#10952) by @smackesey
  • 9a3a8e2 - [refactor] Delete PipelineRunsFilter (#12384) by @smackesey
  • c39e007 - [db io managers] add default_load_type (#12356) by @jamiedemaria
  • 5eeab19 - [refactor] Delete RunRecord.pipeline_run (#12385) by @smackesey
  • 4ba803c - [refactor] pipeline_run_from_storage -> dagster_run_from_storage (#12386) by @smackesey
  • 1884193 - [Docs RFC] Dynamic Partitions (#12227) by @clairelin135
  • 2e2eca1 - Auto infer multipartition <-> single dimension mapping (#12400) by @clairelin135
  • 79f9ecf - [refactor] execution pipeline_run -> dagster_run (#12388) by @smackesey
  • e1d3579 - [test-api-update] execution_tests/dynamic_tests (#12427) by @smackesey
  • d48a889 - Consider the run worker unhealthy is the job has no active pods but the run is in a non-terminal state (#11510) by @gibsondan
  • 4474da1 - fix: only inspect schema when we may create tables (#12269) by @plaflamme
  • 61ed1c6 - More helpful asset key mismatch errors (#12008) by @benpankow
  • 7536b6f - Add docs for testing schedules/sensors via UI (#12381) by @dpeng817
  • ad6f84f - document missing breaking change in 1.1.19 changelog (#12424) by @sryza
  • 636b58f - BigQuery IO manager (#11425) by @jamiedemaria
  • 8ea58dc - [dagster-gcp-pandas] API docs fix (#12450) by @jamiedemaria
  • 392ee40 - [graphql] launch backfills over assets with different partitionings, if all roots have same partitioning (#11827) by @sryza
  • 1783fad - Fix resolution error with input manager key and custom dagster type (#12449) by @clairelin135
  • bf84d43 - [dagster-dbt] Add ability to use the "state:" selector (#12432) by @OwenKephart
  • 98bc51c - Revert "More helpful asset key mismatch errors (#12008)" (#12459) by @benpankow
  • 2aba792 - [dagit] Add missing React keys to prevent new warning toasts (#12210) by @bengotow
  • 49a07e6 - [CustomConfirmationDialog] Allow overriding the button text (#12444) by @salazarm
  • 8f5f31b - add another todo to create_dagster_package (#12453) by @jamiedemaria
  • 74f70c9 - [dagster-gcp-pandas] register library in init (#12469) by @jamiedemaria
  • 2fc07dc - [bugfix] fix projected logical version resolution for asset downstream of self-dep (#12443) by @smackesey
  • 00a5d89 - fix: resolve correct legacy arguments for emr pyspark step launcher (#12419) by @rexledesma
  • 9116e08 - [dagster-dbt] Add missing files for tests (#12471) by @OwenKephart
  • 27808ac - Add task_role_arn and execution_role_arn to EcsContainerContext (#12358) by @gibsondan
  • 314b968 - dagster-census api docs (#12413) by @yuhan
  • c4f158d - graph_asset and graph_multi_asset decorators (#10152) by @sryza
  • 9fa10cb - [dagster-snowflake-pyspark] fix bug loading partitions (#12472) by @jamiedemaria
  • 736fff5 - Add stale status causes (#11953) by @smackesey
  • 925e596 - Update Contributing doc with instructions for ruff/pyright (#12481) by @smackesey
  • 011e20f - [bigquery] mark bigquery io manager experimental (#12479) by @jamiedemaria
  • 4c74851 - fix (#12477) by @salazarm
  • 2214ad6 - add graphql fields for querying run tags (#12409) by @prha
  • f131a97 - add partial tag autocomplete for run filter input (#12410) by @prha
  • 1e4aa98 - Update GQL to expose StaleStatus and StaleStatusCause (#11952) by @smackesey
  • 6d04e2c - Support env valueFrom in Helm chart (#12425) by @johannkm
  • db78b85 - Fix state status logical version test (#12484) by @smackesey
  • 1f4ccd5 - use opt_nullable_mapping for dagster library versions (#12487) by @alangenfeld
  • c56e843 - fix ruff (#12486) by @alangenfeld
  • f120431 - [pythonic resources] Clean up initialization of env vars, treat resource objects as immutable (#12445) by @benpankow
  • aa3b46e - [structured config] Add support for Selectors w/ pydantic discriminated unions (#11280) by @benpankow
  • 5a0145e - [pythonic resources] Last set of class renames (#12490) by @benpankow
  • fc27360 - Allow setting logical version inside op (#12189) by @smackesey
  • dc0f85a - Add example of customizing task role and execution role arn to the ECS agent docs (#12491) by @gibsondan
  • fc7161b - Remove existing RunConfig class (#12488) by @benpankow
  • 903a297 - [pythonic config] Add structured RunConfig object for specifying runtime, job config (#11965) by @benpankow
  • e46226f - [draft][pythonic config][docs] Introduce intro to Resources doc utilizing Pythonic resources (#12260) by @benpankow
  • c910a6d - [draft][pythonic config][docs] Introduce intro to Config doc utilizing Pythonic config (#12349) by @benpankow
  • f7fa87b - 1.1.20 changelog (#12506) by @benpankow
  • 7eb5a8f - [fix] fix sphinx airflow version parsing (#12507) by @benpankow
  • 8c9f54a - 1.1.20 by @elementl-devtools
dagster - 1.1.19 (core) / 0.17.19 (libraries)

Published by OwenKephart over 1 year ago

New

  • The FreshnessPolicy object now supports a cron_schedule_timezone argument.
  • AssetsDefinition.from_graph now supports a freshness_policies_by_output_name parameter.
  • The @asset_sensor will now display an informative SkipReason when no new materializations have been created since the last sensor tick.
  • AssetsDefinition now has a to_source_asset method, which returns a representation of this asset as a SourceAsset.
  • You can now designate assets as inputs to ops within a graph or graph-based job. E.g.
    from dagster import asset, job, op

    @asset
    def emails_to_send():
        ...

    @op
    def send_emails(emails) -> None:
        ...

    @job
    def send_emails_job():
        send_emails(emails_to_send.to_source_asset())
  • Added a --dagit-host/-h argument to the dagster dev command to allow customization of the host where Dagit runs.
  • [dagster-snowflake, dagster-duckdb] Database I/O managers (Snowflake, DuckDB) now support static partitions, multi-partitions, and dynamic partitions.

Bugfixes

  • Previously, if a description was provided for an op that backed a multi-asset, the op’s description would override the descriptions in Dagit for the individual assets. This has been fixed.
  • Sometimes, when applying an input_manager_key to an asset’s input, incorrect resource config could be used when loading that input. This has been fixed.
  • Previously, the backfill page errored when partitions definitions changed for assets that had been backfilled. This has been fixed.
  • When displaying materialized partitions for multipartitioned assets, Dagit would error if a dimension had zero partitions. This has been fixed.
  • [dagster-k8s] Fixed an issue where setting runK8sConfig in the Dagster Helm chart would not pass configuration through to pods launched using the k8s_job_executor.
  • [dagster-k8s] Previously, using the execute_k8s_job op downstream of a dynamic output would result in k8s jobs with duplicate names being created. This has been fixed.
  • [dagster-snowflake] Previously, if the schema for storing outputs didn’t exist, the Snowflake I/O manager would fail. Now it creates the schema.
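
The duplicate k8s job names fixed above can be illustrated with a small sketch. The changelog only says that the step key is now used to generate the job name; the naming scheme below, the `dagster-job-` prefix, and both helper functions are assumptions made for illustration, not the library's code. The step keys imitate the `op[mapping_key]` form that steps downstream of a dynamic output take.

```python
import hashlib


def job_name_from_run(run_id):
    # Hypothetical scheme keyed only on the run: every step in the run collides.
    return "dagster-job-" + hashlib.sha256(run_id.encode()).hexdigest()[:16]


def job_name_from_step(run_id, step_key):
    # Including the step key gives each step, mapped or not, its own name.
    digest = hashlib.sha256(f"{run_id}:{step_key}".encode()).hexdigest()[:16]
    return "dagster-job-" + digest


run_id = "run-1234"  # hypothetical run ID
step_keys = ["launch_job[a]", "launch_job[b]"]  # hypothetical mapped step keys

print({job_name_from_run(run_id) for _ in step_keys})
print({job_name_from_step(run_id, key) for key in step_keys})
```

The first set contains a single name shared by both mapped steps, while the second contains one name per step.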

Breaking Changes

  • Removed the experimental, undocumented asset_key, asset_partitions, and asset_partitions_defs arguments on Out.
  • @multi_asset no longer accepts Out values in the dictionary passed to its outs argument. This was experimental and deprecated. Instead, use AssetOut.
  • The experimental, undocumented top_level_resources argument to the repository decorator has been renamed to _top_level_resources to emphasize that it should not be set manually.

Community Contributions

  • load_asset_values now accepts resource configuration (thanks @Nintorac!)
  • Previously, when using the UPathIOManager, paths with the "." character in them would be incorrectly truncated, which could result in multiple distinct objects being written to the same path. This has been fixed. (Thanks @spenczar!)
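
To see how a "." in a path can lead to truncation and to distinct objects sharing one path, consider the standard-library sketch below. It shows one plausible mechanism (replacing a suffix rather than appending one) and is not taken from the UPathIOManager source; the `.pkl` extension, the example keys, and both helper functions are hypothetical.

```python
from pathlib import PurePosixPath

EXTENSION = ".pkl"  # hypothetical extension added by an I/O manager


def with_suffix_path(key):
    # with_suffix() replaces everything after the last "." in the final component.
    return PurePosixPath(key).with_suffix(EXTENSION)


def appended_path(key):
    # Appending to the name keeps any dots that are part of the key itself.
    path = PurePosixPath(key)
    return path.with_name(path.name + EXTENSION)


keys = ["models/v1.0", "models/v1.1"]
print([str(with_suffix_path(k)) for k in keys])
print([str(appended_path(k)) for k in keys])
```

With suffix replacement, both keys collapse to `models/v1.pkl`; with appending, they stay distinct as `models/v1.0.pkl` and `models/v1.1.pkl`.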

Experimental

  • [dagster-dbt] Added documentation for our dbt Cloud integration on caching the loading of software-defined assets from a dbt Cloud job.

Documentation

  • Revamped the introduction to the Partitions concepts page to make it clear that non-time-window partitions are equally encouraged.
  • In Navigation, moved the Partitions and Backfill concept pages to their own section underneath Concepts.
  • Moved the Running Dagster locally guide from Deployment to Guides to reflect that OSS and Cloud users can follow it.
  • Added a new guide covering asset versioning and caching.

All Changes

https://github.com/dagster-io/dagster/compare/1.1.18...1.1.19

  • f5eeb35 - feat(dagster-dbt): support dbt-core 1.4.x (#11902) by @rexledesma
  • 573be92 - Fixes/improvements for pyright script (#12175) by @smackesey
  • d949c6e - nux examples in oss 1/: add dagster_cloud.yaml to prep cloud nux onboarding (#12172) by @yuhan
  • 6b7050f - [dagster-dbt] [1/2] Add streaming entrypoint for dbt cli execution (#12086) by @OwenKephart
  • 3a3d1fc - Fix import (#12185) by @OwenKephart
  • 3979da0 - (tick-testing 3/6) type annotations to schedules gql (#12057) by @dpeng817
  • 5974f1c - (tick-testing 4/6) mutation to dry-run sensor (#11616) by @dpeng817
  • 7dbebdc - [db io managers] support static partitions (#12129) by @jamiedemaria
  • 5265a0b - Pass correct config to io manager when using input_manager_key (#12053) by @jamiedemaria
  • 26e219c - [dagster-dbt] [2/2] Add ability to stream events while executing dbt assets (#12100) by @OwenKephart
  • 1d250c3 - Agent downtime alert docs (#12186) by @johannkm
  • 2abe250 - (tick-testing 5/6) Add SensorType (#12021) by @dpeng817
  • 415c875 - (tick-testing 6/6) Add dry run mutations for schedules (#11869) by @dpeng817
  • c1b1dc4 - [dagit] Add some icons for Cloud (#12187) by @hellendag
  • e0ed9c1 - [1/n][structured config] Add ability to runtime-configure struct-config resources (#11773) by @benpankow
  • 8b69bcd - fix pyright (#12193) by @jamiedemaria
  • 1667301 - [dagit] Storybooks for asset partition and event details + dayJS fix (#12178) by @bengotow
  • 7aecf4a - [dagster-wandb] Integration with Weights & Biases (#10470) by @chrishiste
  • 6ad4a55 - Store the instance ref on the grpc server class if it's present, so that grpc api calls can use it if it's there (#12194) by @gibsondan
  • 1435228 - W&B integration follow up 2/ update example to defs (#12059) by @yuhan
  • ad2c225 - nux examples in oss 2/: add required env vars in README's front matter (#12154) by @yuhan
  • be4677c - Sensor testing UI (#12148) by @salazarm
  • a2cba5f - validate time partition keys when adding to a TimeWindowPartitionsSubset (#12195) by @OwenKephart
  • ad650ea - Make Partition GRPC calls with instance_ref args optional (#12196) by @clairelin135
  • 7150223 - Add public api doc for create_repository_using_definitions_args, update typehints, and cleanup Definitions docstring (#12176) by @schrockn
  • 33709f6 - [dagster-airflow] [docs] migration guide updates/considerations (#12198) by @Ramshackle-Jamathon
  • 3ba9203 - [dagster-airflow] refactor airflow_db resources (#12202) by @Ramshackle-Jamathon
  • b247568 - Fix alert docs (#12208) by @johannkm
  • edfe983 - Temporarily disable pyright in BK (#12212) by @smackesey
  • 2c05ba4 - Rename "Loading multiple repositories" section in workspace files docs (#12201) by @gibsondan
  • 796d49b - Make top_level_resources argument _top_level_resources (#12211) by @schrockn
  • 687ff73 - Revert "nux examples in oss 2/: add required env vars in README's front matter" (#12199) by @yuhan
  • 53de50a - Pyright config/script fixes and improvements (#12206) by @smackesey
  • e61dbe8 - Add guide for observable source assets and versioning (#12118) by @schrockn
  • ac068c1 - 1.1.18 Changelog (#12217) by @smackesey
  • 92048a7 - Fix dagster-wandb placeholder (#12222) by @smackesey
  • 88a15ee - Add "--dagit-host" arg to dagster dev (#12220) by @gibsondan
  • b604a4b - diff against origin/master in quick_pyright (#12200) by @sryza
  • 53683f0 - Re-enable pyright in BK (#12214) by @smackesey
  • 6b1a802 - Remove restart-on-failure from local docker agent guide (#12224) by @gibsondan
  • ce76307 - [dagit] Storybooks for MetadataEntry rendering (#12177) by @bengotow
  • 6087efe - Add airbyte guide to experimental guide index (#12207) by @smackesey
  • 601bd2e - Automation: versioned docs for 1.1.18 by @elementl-devtools
  • 5b80529 - Schedules Testing UI (#12160) by @salazarm
  • 3e6d8b8 - Keep dots in paths in UPathIOManager (#12174) by @spenczar
  • b4e58cb - remove Out.asset_key (#12221) by @sryza
  • 0db1186 - [structured config] Fix env vars not working with direct Resource instantiation (#11468) by @benpankow
  • 5d1d126 - allow using SourceAssets to satisfy node inputs in non-asset jobs (#12091) by @sryza
  • 3111417 - Add observe function (source asset analogue to materialize) (#11996) by @smackesey
  • 538660f - treat pyright warnings as errors for buildkite purposes (#12239) by @gibsondan
  • 983cc4f - Update serverless docs with new run isolation default (#12225) by @johannkm
  • d30a792 - docs(dbt-cloud): add instructions to cache dbt Cloud compilation (#11793) by @rexledesma
  • 6337e12 - Allow dagster api grpc --max-workers (#12246) by @smackesey
  • 60bc7e4 - Allow Ursula to test dynamic partition writes (#12215) by @clairelin135
  • 6c3e30b - [docs] fix dbt tutorial error (#12255) by @jamiedemaria
  • 56d4ae3 - [docs] - Correct name of freshness sensor context object (#12250) by @erinkcochran87
  • ec52a24 - document system tags on metadata/tags page (#12256) by @sryza
  • 4e66e83 - [dagster-airflow] remove ref to cli (#12262) by @Ramshackle-Jamathon
  • e480522 - Revamp intro of partitions concepts page (#12257) by @sryza
  • 06c872d - Remove @experimental from build_assets_job and build_source_asset_observation_job (#12252) by @smackesey
  • 161d7fc - [fix] Surface the correct description for multi-assets when a description is provided for the op (#12271) by @OwenKephart
  • 7f4310d - remove legacy APIs from dagster-k8s tests (#12020) by @sryza
  • d798ff5 - Use Definitions in the body of materialize functions (and other public-facing execution functions) (#12038) by @schrockn
  • 0f2d72f - [typing/static] assets (#12003) by @smackesey
  • 9b58d82 - [typing/runtime] Make GraphDefinition.create_adjacency_lists internally public (#12291) by @smackesey
  • bc9f2fb - [typing/runtime] add AssetsDefinition.partition_mappings (#12290) by @smackesey
  • d2a7fde - in docs navigation, make partitions independent from schedules (#12264) by @sryza
  • 085a274 - [typing/static] storage (#12283) by @smackesey
  • 24b6c16 - [api docs] Show PartitionsDefinition doc strings on docs site (#12265) by @jamiedemaria
  • ac1510f - [db io manager] Support MultiPartitions (#12165) by @jamiedemaria
  • 4648e41 - load asset value config (#10991) by @Nintorac
  • 60fde53 - remove some handling for solid in composition.py (#12278) by @sryza
  • 5e36223 - Add shortcuts for '--dagit-host' and '--dagit-port' to dagster dev (#12301) by @gibsondan
  • 9bb417a - [typing/runtime] [gql] eliminate resolver **kwargs (2) (#12241) by @smackesey
  • 8a8ad96 - [typing/runtime] [gql] eliminate resolver **kwargs (4) (#12243) by @smackesey
  • 7938dd6 - [typing/runtime] [gql] eliminate resolver **kwargs (5) (#12244) by @smackesey
  • a830240 - [Runs Table] Open link in new tab (#12307) by @salazarm
  • af4f84d - [typing/runtime] [gql] eliminate resolver **kwargs (3) (#12242) by @smackesey
  • 5dc1bd2 - add types and break up big functions in dagstermill impl (#12274) by @sryza
  • 8af0a43 - [typing/runtime] [gql] eliminate resolver **kwargs (1) (#11724) by @smackesey
  • 977da88 - add cloud to scaffold (#12306) by @slopp
  • 0750ac1 - [typing/runtime] Remove unnecessary isinstance checks (#12287) by @smackesey
  • 232a7ec - [typing/runtime] Make AssetKey.to_string always return AssetKey (#12285) by @smackesey
  • f50333a - [typing/runtime] Refactor cached_method (#12004) by @smackesey
  • b7a4647 - Remove mypy from tox files (#12112) by @smackesey
  • d425ba1 - [typing/runtime] Asset graph comparisons (#12286) by @smackesey
  • 5d763af - refactor(databricks): remove legacy Dagster definitions (#12150) by @rexledesma
  • 9661a54 - docs: update scaffold to use dagster dev (#12297) by @rexledesma
  • 244d949 - [db io managers] create schema if not exist (#11764) by @jamiedemaria
  • f2e2220 - AssetsDefinition.to_source_asset (#12203) by @sryza
  • 6d6c8c6 - in tags doc, fix schedule tag and add sensor tag (#12313) by @sryza
  • 97b108d - Simplify buildkite-build-test-project-image (#12029) by @gibsondan
  • da0af86 - Move repository_definition into subpackage (#12318) by @schrockn
  • 2da80d6 - Break up repository_definition subpackage into multiple files (#12319) by @schrockn
  • 9de4608 - [docs] - Build on local Dagster guide (#12168) by @erinkcochran87
  • a5df78d - [freshness-policies] add a cron_schedule_timezone argument to the FreshnessPolicy class (#12263) by @OwenKephart
  • 4929d2f - [typing/runtime] Refactor subset selection tree (#12289) by @smackesey
  • f0917e4 - Fix unimported symbol BK (#12327) by @smackesey
  • b7049ce - nux examples in oss 3/: sync dagster-io/quickstart-*/setup.py to example/* (#12234) by @yuhan
  • 92a3647 - [dagster-k8s] In the execute_k8s_job op, use step key to generate the k8s job name (#12344) by @OwenKephart
  • e318f3d - Include instance-level / code-location-level runK8sConfig in step pods (#12308) by @gibsondan
  • d703502 - remove unused lazy-repository docs snippet (#12277) by @sryza
  • 00a692a - [asset-reconciliation][perf] Perf regression tests (#12230) by @OwenKephart
  • 4075281 - s/_CacheingDefinitionIndex/CacheingDefinitionIndex/g (#12320) by @schrockn
  • 39f7c10 - [typing/runtime] Massage set flattening (#12288) by @smackesey
  • c350247 - [asset-reconciliation][perf] Better caching of most recent materializations (#12237) by @OwenKephart
  • 1fa7a97 - [db io managers] Dynamic Partition tests (#12216) by @jamiedemaria
  • 94d89b2 - bump limit (#12355) by @OwenKephart
  • 8b60154 - Fix multipartitions run length encoding error (#12329) by @clairelin135
  • 3a81d7f - [freshness-policies] Allow setting freshness policies when using graph-backed assets (#12357) by @OwenKephart
  • eb907dd - nullsafe array index access (#12362) by @salazarm
  • 2fc5a74 - Change schedule button text (#12361) by @dpeng817
  • 43baeee - [dagster-pandas][dagster-pandera] assign a typing_type for generated pandas dataframe DagsterTypes (#12363) by @OwenKephart
  • 31fa364 - Fix partitions backfill deserialization error (#12238) by @clairelin135
  • c1ab9a3 - 1.1.19 Changelog (#12378) by @OwenKephart
  • 3af8cf3 - guide to how assets relate to ops and graphs (#12204) by @sryza
  • a5dfd5e - updates tests to handle new kubernetes resources field (#12395) by @alangenfeld
  • e1d4cbd - 1.1.19 changelog: reorder code block (#12402) by @yuhan
  • 8bf470b - 1.1.19 by @elementl-devtools
dagster - 1.1.18 (core) / 0.17.18 (libraries)

Published by jmsanders over 1 year ago

New

  • Assets with time-window PartitionsDefinitions (e.g. HourlyPartitionsDefinition, DailyPartitionsDefinition) may now have a FreshnessPolicy.
  • [dagster-dbt] When using load_assets_from_dbt_project or load_assets_from_dbt_manifest with dbt-core>=1.4, AssetMaterialization events will be emitted as the dbt command executes, rather than waiting for dbt to complete before emitting events.
  • [dagster-aws] When run monitoring detects that a run unexpectedly crashed or failed to start, an error message in the run’s event log will include log messages from the ECS task for that run to help diagnose the cause of the failure.
  • [dagster-airflow] Added make_ephemeral_airflow_db_resource, which returns a ResourceDefinition for a local-only Airflow database for use in migrated Airflow DAGs.
  • Made some performance improvements for job run queries which can be applied by running dagster instance migrate.
  • [dagit] System tags (code + logical versions) are now shown in the asset sidebar and on the asset details page.
  • [dagit] Source assets that have never been observed are presented more clearly on the asset graph.
  • [dagit] The numbers of materialized and missing partitions are shown on the asset graph and in the asset catalog for partitioned assets.
  • [dagit] Databricks-backed assets are now shown on the asset graph with a small “Databricks” logo.

Bugfixes

  • Fixed a bug where materializations of part of the asset graph did not construct required resource keys correctly.
  • Fixed an issue where observable_source_asset incorrectly required its function to have a context argument.
  • Fixed an issue with serialization of freshness policies, which affected cacheable assets that included these policies, such as those from dagster-airbyte.
  • [dagster-dbt] Previously, the dagster-dbt integration was incompatible with dbt-core>=1.4. This has been fixed.
  • [dagster-dbt] load_assets_from_dbt_cloud_job will now avoid unnecessarily generating docs when compiling a manifest for the job. Compile runs will no longer be kicked off for jobs not managed by this integration.
  • Previously for multipartitioned assets, context.asset_partition_key returned a string instead of a MultiPartitionKey. This has been fixed.
  • [dagster-k8s] Fixed an issue where pods launched by the k8s_job_executor would sometimes unexpectedly fail due to transient 401 errors in certain kubernetes clusters.
  • Fix a bug with nth-weekday-of-the-month handling in cron schedules.

Breaking Changes

  • [dagster-airflow] load_assets_from_airflow_dag no longer creates Airflow db resource definitions. You will need to provide them on Definitions directly.

Deprecations

  • The partitions_fn argument of the DynamicPartitionsDefinition class is now deprecated and will be removed in 2.0.0.

Community Contributions

  • [dagster-wandb] A new integration with Weights & Biases allows you to orchestrate your MLOps pipelines and maintain ML assets with Dagster.
  • Postgres has been updated to 14.6 for Dagster’s helm chart. Thanks @DustyShap!
  • Typo fixed in docs. Thanks @C0DK!
  • You can now pass a callable directly to asset (rather than using @asset in decorator form) to create an asset. Thanks @ns-finkelstein!

Documentation

  • New “Asset versioning and caching” guide
  • [dagster-snowflake] The Snowflake guide has been updated to include PySpark dataframes
  • [dagster-snowflake] The Snowflake guide has been updated to include private key authentication
  • [dagster-airflow] The Airflow migration guide has been updated to include more detailed instructions and considerations for making a migration

All Changes

https://github.com/dagster-io/dagster/compare/1.1.17...1.1.18

  • d6c9255 - Add google analytics tracking query param to slack link from OSS (#12024) by @salazarm
  • 08771c5 - Add --working-directory argument to dagster dev (#12026) by @gibsondan
  • 257fe9b - [dagster-fivetran] Add option to force-create materializations for tables not in API response (#11972) by @benpankow
  • badf42e - docs: removed typo from install.mdx (#12022) by @clayheaton
  • 79dba3f - [dagit] Fix use of fragments causing Apollo caching error in partition health (#12030) by @bengotow
  • 58d3ed6 - [dagster-airflow] use full timestamp for partition name (#12034) by @Ramshackle-Jamathon
  • a8dce03 - [dagster-airflow] airflow retry support (#11954) by @Ramshackle-Jamathon
  • dea08a3 - Change endpoints to the ones that are used by airbyte UI (#12012) by @emilija-omnisend
  • a7dda11 - lint fix (#12044) by @benpankow
  • 32e3776 - feat(dbt-cloud): compile run only if job has environment variable cache (#12042) by @rexledesma
  • fe59ca7 - fix(dbt-cloud): inherit generate docs settings for compile run (#12043) by @rexledesma
  • 303ff04 - skip import test on windows (#12048) by @alangenfeld
  • 7523cbb - 1.1.15 changelog (#12051) by @jamiedemaria
  • 4529f46 - [pyright] [scripts] misc (#11923) by @smackesey
  • 35321cd - accept UndefinedAssetJob in run status sensors (#12054) by @jamiedemaria
  • fd18525 - Enable use of arguments when using the asset function directly instead of as a decorator (#11903) by @nsfinkelstein
  • a920b29 - Automation: versioned docs for 1.1.15 by @elementl-devtools
  • 74eac38 - Bump Postgres to 14.6 (#12015) by @DustyShap
  • 389b47b - make black (#12078) by @alangenfeld
  • 71db85d - Construct required resources correctly when materializing partial asset graphs (#12052) by @jamiedemaria
  • 3799c1a - Fix context.asset_partition_key for multidimensional partitions (#12035) by @clairelin135
  • 2e9e165 - Install dagster-managed-elements when appropriate (#12046) by @jmsanders
  • 08b2a0b - Bump http-cache-semantics from 4.1.0 to 4.1.1 in /js_modules/dagit (#12080) by @dependabot[bot]
  • 8d591c8 - [docs] [dagster-snowflake] update docs with private key auth info (#11746) by @jamiedemaria
  • 4b6c9f5 - in comment, clarify that scraping asset info off of In/Out is legacy (#12089) by @sryza
  • 416bfca - Lazy load ExternalAssetGraph for CachingStaleStatusResolver (#12090) by @smackesey
  • 5533d0f - [telemetry] fix setting __TELEMETRY_ENABLED__ flag on Dagit (#12092) by @benpankow
  • 8d95149 - [pyright] misc (#12102) by @smackesey
  • 7741d8e - Allow observable source assets to have no context argument (#11981) by @schrockn
  • 0a316c0 - [dagster-airflow] remove api's for 1.x.x release (#12023) by @Ramshackle-Jamathon
  • 2c96953 - [dagster-airflow] extract utils into seperate file (#12065) by @Ramshackle-Jamathon
  • a78b479 - [dagster-airflow] move schedule and asset functions into their own files (#12066) by @Ramshackle-Jamathon
  • cae5899 - [dagster-airflow] remove vended airflow code (#12121) by @Ramshackle-Jamathon
  • 2ee7227 - Fix pre-release core => library translation (#12109) by @gibsondan
  • 8d142c4 - add run job index (#12033) by @prha
  • 317ed39 - Add Databricks compute kind tag (#12117) by @braunjj
  • 8d637c9 - [schedules] fix nth weekday of month cron handling (#12130) by @alangenfeld
  • ccea121 - [dagit] Add tests for unpartitioned case, fix for console.error in partition health (#12085) by @bengotow
  • 30dccea - [js] fix lint (#12131) by @alangenfeld
  • 00b4a8e - [pythonic config] Add disabled tests for various more complex config schemas (#12105) by @benpankow
  • fdf5da4 - [docs] Clairfy dagster-pagerduty docs (#12128) by @benpankow
  • 8f374c2 - Test with a 64 bit after_cursor (#12093) by @jmsanders
  • 6cdcb26 - [dagit] Memoize some Intl behavior (#12125) by @hellendag
  • da50274 - [dagit] Add storybook coverage of Asset Table states (#12103) by @bengotow
  • 951fb17 - [dagit] Add Storybook coverage of PartitionHealthSummary rendering (#12116) by @bengotow
  • ec94a93 - [dagit] In development, render toasts for unhandled promise exceptions (#12082) by @bengotow
  • a0a7385 - Include logs in failure message for ECS monitoring failures (#12113) by @gibsondan
  • 15845e8 - [dagster-airflow] 1.x.x api changes (#12067) by @Ramshackle-Jamathon
  • aa22376 - [dagster-airflow] migration limitations (#12124) by @Ramshackle-Jamathon
  • 88ed4f4 - [dagit] Mocks, storybooks and a test for BackfillTable (#12095) by @bengotow
  • a528a10 - [pythonic config] Correctly handle pydantic List types in config (#12106) by @benpankow
  • 22fab35 - comment to explain what define_solid_dictionary_cls does (#12088) by @sryza
  • 2595449 - remove protobuf pin in dagster[test] (#11974) by @yuhan
  • 2335fe8 - [dagster-airflow] drop unique_id param (#12127) by @Ramshackle-Jamathon
  • e0d1588 - Test with a 64 bit after_cursor (#12137) by @jmsanders
  • fd4a9ea - [dagit] Show partition status on the asset graph, catalog pages, add tests (#11914) by @bengotow
  • e26c3d7 - [pythonic config] Correctly handle pydantic Dict/Mapping types in config (#12107) by @benpankow
  • 5e15e0b - release 1.1.17 changelog (#12140) by @Ramshackle-Jamathon
  • 691aaf8 - [docs] Add PySpark to Snowflake reference guide (#11814) by @jamiedemaria
  • a7e764c - Add bugfixes to 1.1.17 changelog (#12141) by @gibsondan
  • 74462a9 - Automation: versioned docs for 1.1.17 by @elementl-devtools
  • 16a2b7a - Ruff 0.0.212 -> 0.0.241 (#12138) by @smackesey
  • fb8e118 - Adopt pyright for typechecking (#10983) by @smackesey
  • dd86dc7 - [1/n] Serialize top-level resources into repository data (#11529) by @benpankow
  • bda3473 - [2/n] Expose top-level resources via GraphQL (#11553) by @benpankow
  • dc2e313 - Fix serdes on FreshnessPolicy object (#12143) by @benpankow
  • f325f61 - Better error output for Pyright script (#12145) by @smackesey
  • 0dfaf13 - [3/n] Display top-level resources in Dagit sidebar (#11554) by @benpankow
  • b7bb255 - Fix misc type errors (#12147) by @smackesey
  • 47a0ec2 - Delete no-longer-needed type-ignores (#12156) by @smackesey
  • 03bcb4e - [dagit] Fix build failure caused by mocks out of sync with query (#12152) by @bengotow
  • 13e6ea2 - Support json output from pyright script and ignore build (#12155) by @smackesey
  • c358b3d - Add pyright_rebuild Makefile target (#12159) by @smackesey
  • 392c71d - [pyright] --find-links for gprcio wheels (#12158) by @alangenfeld
  • 076afe6 - [docs] - Remove Pagination component from layout (#12135) by @erinkcochran87
  • 5f0e3dc - Fix instructions for using serverless from the CLI without GitHub (#12161) by @shalabhc
  • ab04843 - making erin the owner of all docs 👑 (#12139) by @tacastillo
  • f83c3c4 - [dagit] Better Asset DAG states for source assets (#12142) by @bengotow
  • eabebf8 - [gql] Add tags to AssetEventMixin (#11973) by @smackesey
  • 1528921 - type annotations for input_bindings in composition.py (#12087) by @sryza
  • d4d4fd0 - [Dynamic Partitions 1] Storage changes (#11994) by @clairelin135
  • 357006f - [dagit] System tags (code + logical versions) on asset graph + details (#12151) by @bengotow
  • 2a49a54 - [Dynamic Partitions 2] Update DynamicPartitionsDefinition to have name param (#12000) by @clairelin135
  • 1ca1a97 - hide download links until logs are available (#12164) by @prha
  • f5d7597 - (tick-testing 1/6) Rename GrapheneFutureInstigationTick to GrapheneDryRunInstigationTick (#12055) by @dpeng817
  • bb1f954 - [Dynamic Partitions 3] Display dynamic partitions in Dagit (#11900) by @clairelin135
  • 0864f7b - Avoid creating repo on each partitionStats gql call (#12166) by @clairelin135
  • b21b4e6 - [Docs] Removed sneaky citation mark in title on docs (#12120) by @C0DK
  • 667487f - Add 401 to the list of API codes that our k8s client retries on (#12074) by @gibsondan
  • 8468acd - [freshness-policies] Add the ability to calculate the used_data_time of a TimeWindowPartitioned asset (#11607) by @OwenKephart
  • 6c92dde - (tick-testing 2/6) Revamp dry run tick behavior (#12056) by @dpeng817
  • f634aa3 - feat(dagster-dbt): support dbt-core 1.4.x (#11902) by @rexledesma
  • d83f944 - [dagster-dbt] [1/2] Add streaming entrypoint for dbt cli execution (#12086) by @OwenKephart
  • 5288311 - [dagster-dbt] [2/2] Add ability to stream events while executing dbt assets (#12100) by @OwenKephart
  • c312147 - Fix import (#12185) by @OwenKephart
  • 4ee7a2c - Store the instance ref on the grpc server class if it's present, so that grpc api calls can use it if it's there (#12194) by @gibsondan
  • 1390392 - [dagster-wandb] Integration with Weights & Biases (#10470) by @chrishiste
  • f0a7a2e - W&B integration follow up 2/ update example to defs (#12059) by @yuhan
  • 7af4298 - validate time partition keys when adding to a TimeWindowPartitionsSubset (#12195) by @OwenKephart
  • dad164d - Make Partition GRPC calls with instance_ref args optional (#12196) by @clairelin135
  • d78c354 - [dagster-airflow] [docs] migration guide updates/considerations (#12198) by @Ramshackle-Jamathon
  • 83e946c - [dagster-airflow] refactor airflow_db resources (#12202) by @Ramshackle-Jamathon
  • da0b63b - Make top_level_resources argument _top_level_resources (#12211) by @schrockn
  • a90e5ab - Add guide for observable source assets and versioning (#12118) by @schrockn
  • 765ebf7 - 1.1.18 Changelog (#12217) by @smackesey
  • 52af6d8 - Fix dagster-wandb placeholder (#12222) by @smackesey
  • efae55e - 1.1.18 by @elementl-devtools
dagster - 1.1.17 (core) / 0.17.17 (libraries)

Published by gibsondan over 1 year ago

New

  • The dagster-airflow library has been moved to 1.x.x to denote the stability of its APIs going forward.
  • [dagster-airflow] make_schedules_and_jobs_from_airflow_dag_bag has been added to allow for more fine-grained composition of your transformed Airflow DAGs into Dagster.
  • [dagster-airflow] Airflow dag task retries and retry_delay configuration are now converted to op RetryPolicies with all make_dagster_* apis.

Bugfixes

  • Fixed an issue where cron schedules using a form like 0 5 * * mon#1 to execute on a certain day of the week each month executed every week instead.
  • [dagit] Fixed an issue where the asset lineage page sometimes timed out while loading large asset graphs.
  • Fixed an issue where the partitions page sometimes failed to load for partitioned asset jobs.
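
The cron fix above concerns schedules such as 0 5 * * mon#1, which should fire only on the first Monday of each month rather than on every Monday. The sketch below illustrates that intended semantics using only the Python standard library. It is not Dagster's implementation, and the helper names nth_weekday_of_month and matches_mon_hash_1 are hypothetical names chosen for this illustration.

```python
import calendar
from datetime import date


def nth_weekday_of_month(year: int, month: int, weekday: int, n: int) -> date:
    """Return the date of the n-th given weekday (0 = Monday) in a month.

    Hypothetical helper for illustration; assumes the calendar module's
    default first weekday (Monday) has not been changed.
    """
    # monthcalendar() yields weeks as lists of day numbers, with 0 marking
    # days that belong to an adjacent month.
    days = [
        week[weekday]
        for week in calendar.monthcalendar(year, month)
        if week[weekday] != 0
    ]
    return date(year, month, days[n - 1])


def matches_mon_hash_1(day: date) -> bool:
    """True only on the first Monday of the month, the intent of `mon#1`."""
    return day == nth_weekday_of_month(day.year, day.month, calendar.MONDAY, 1)
```

Under this reading, a schedule using mon#1 would match a date such as the first Monday of February 2023 but not the Mondays that follow it in the same month.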

Breaking Changes

  • [dagster-airflow] The use_airflow_template_context, mock_xcom and use_ephemeral_airflow_db params have been dropped. By default, all make_dagster_* apis now use a run-scoped airflow db, similar to how use_ephemeral_airflow_db worked.
  • [dagster-airflow] make_airflow_dag has been removed.
  • [dagster-airflow] make_airflow_dag_for_operator has been removed.
  • [dagster-airflow] make_airflow_dag_containerized has been removed.
  • [dagster-airflow] airflow_operator_to_op has been removed.
  • [dagster-airflow] make_dagster_repo_from_airflow_dags_path has been removed.
  • [dagster-airflow] make_dagster_repo_from_airflow_dag_bag has been removed.
  • [dagster-airflow] make_dagster_repo_from_airflow_example_dags has been removed.
  • [dagster-airflow] The naming convention for ops generated from airflow tasks has been changed to ${dag_id}__${task_id} from airflow_${task_id}_${unique_int}.
  • [dagster-airflow] The naming convention for jobs generated from airflow dags has been changed to ${dag_id} from airflow_${dag_id}.
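
The two naming changes above affect anything that refers to generated ops or jobs by name, such as run config or job selections. The following sketch spells out the old and new patterns as plain string formatting, assuming the templates are exactly as written in the notes. The function names are hypothetical and are not part of dagster-airflow.

```python
def old_op_name(task_id: str, unique_int: int) -> str:
    """Previous convention: airflow_${task_id}_${unique_int}."""
    return f"airflow_{task_id}_{unique_int}"


def new_op_name(dag_id: str, task_id: str) -> str:
    """New convention: ${dag_id}__${task_id} (note the double underscore)."""
    return f"{dag_id}__{task_id}"


def old_job_name(dag_id: str) -> str:
    """Previous convention: airflow_${dag_id}."""
    return f"airflow_{dag_id}"


def new_job_name(dag_id: str) -> str:
    """New convention: the job is named after the DAG id itself."""
    return dag_id
```

For a DAG with id etl and a task with id load, the op would now be named etl__load and the job etl.
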
dagster - 1.1.15 (core) / 0.17.15 (libraries)

Published by shalabhc over 1 year ago

New

  • Definitions now accepts Executor instances in its executor argument, not just ExecutorDefinitions.
  • @multi_asset_sensor now accepts a request_assets parameter, which allows it to directly request that assets be materialized, instead of requesting a run of a job.
  • Improved the performance of instantiating a Definitions when using large numbers of assets or many asset jobs.
  • The job passed to build_schedule_from_partitioned_job no longer needs to have a partitions_def directly assigned to it. Instead, Dagster will infer the partitions from the assets it targets.
  • OpExecutionContext.asset_partition_keys_for_output no longer requires an argument to specify the default output.
  • The “Reload all” button on the Code Locations page in Dagit will now detect changes to a pyproject.toml file that were made while Dagit was running. Previously, Dagit needed to be restarted in order for such changes to be shown.
  • get_run_record_by_id has been added to DagsterInstance to provide easier access to RunRecord objects which expose the start_time and end_time of the run.
  • [dagit] In the “Materialize” modal, you can now choose to pass a range of asset partitions to a single run rather than launching a backfill.
  • [dagster-docker] Added a docker_container_op op and execute_docker_container_op helper function for running ops that launch arbitrary Docker containers. See the docs for more information.
  • [dagster-snowflake-pyspark] The Snowflake I/O manager now supports PySpark DataFrames.
  • [dagster-k8s] The Docker images included in the Dagster Helm chart are now built on the most recently released python:3.x-slim base image.

Bugfixes

  • Previously, the build_asset_reconciliation_sensor could time out when evaluating ticks over large selections of assets, or assets with many partitions. A series of performance improvements should make this much less likely.
  • Fixed a bug that caused a failure when using run_request_for_partition in a sensor that targeted multiple jobs created via define_asset_job.
  • The cost of importing dagster has been reduced.
  • Issues preventing “re-execute from failure” from working correctly with dynamic graphs have been fixed.
  • [dagit] In Firefox, Dagit no longer truncates text unnecessarily in some cases.
  • [dagit] Dagit’s asset graph now allows you to click “Materialize” without rendering the graph if you have too many assets to display.
  • [dagit] Fixed a bug that stopped the backfill page from loading when assets that had previously been backfilled no longer had a PartitionsDefinition.
  • [dagster-k8s] Fixed an issue where k8s_job_op raised an Exception when running pods with multiple containers.
  • [dagster-airbyte] Loosened credentials masking for Airbyte managed ingestion, fixing the Hubspot source, thanks @joel-olazagasti!
  • [dagster-airbyte] When using managed ingestion, Airbyte now pulls all source types available to the instance rather than the workspace, thanks @emilija-omnisend!
  • [dagster-airbyte] Fixed an issue which arose when attaching freshness policies to Airbyte assets and using the multiprocessing executor.
  • [dagster-fivetran] Added the ability to force assets to be output for all specified Fivetran tables during a sync in the case that a sync’s API outputs are missing one or more tables.

Breaking Changes

  • The asset_keys and asset_selection parameters of the experimental @multi_asset_sensor decorator have been replaced with a monitored_assets parameter. This helps disambiguate them from the new request_assets parameter.

Community Contributions

  • A broken docs link in snowflake_quickstart has been fixed, thanks @clayheaton!
  • Troubleshooting help added to helm deployment guide, thanks @adam-bloom!
  • StaticPartitionMapping is now serializable, thanks @AlexanderVR!
  • [dagster-fivetran] build_fivetran_assets now supports group_name, thanks @toddy86!
  • [dagster-azure] AzureBlobComputeManager now supports authentication via DefaultAzureCredential, thanks @mpicard!

Experimental

  • [dagster-airflow] Added a new API, load_assets_from_airflow_dag, that creates graph-backed, partitioned assets based on the provided Airflow DAG.

All Changes

https://github.com/dagster-io/dagster/compare/1.1.14...1.1.15

  • 943cbf0 - Make StaticPartitionMapping serializable. Add autodoc. (#11738) by @AlexanderVR
  • 57ebe59 - Performance improvements for large multi assets (#11782) by @OwenKephart
  • d446e7e - monitored_assets param for multi-asset sensor (#11567) by @sryza
  • 15baf61 - [fix] Fix Snowflake IO manager tests for project_fully_featured (#11781) by @benpankow
  • 41dd55d - Update docs to include wheel workarounds for m1 macs (#11777) by @gibsondan
  • 6b9f3bb - fix run_request_for_partition for sensors that target multiple unreso… (#11780) by @sryza
  • 91f8087 - New "running dagster locally" deployment guide that walks through dagster dev usage (#11741) by @gibsondan
  • 98c0bb6 - Restore grpcio pin for python 3.10 (#11784) by @gibsondan
  • 47cacd0 - Changelog 1.1.11 (#11789) by @dpeng817
  • 6190d23 - docs: add clarification for helm migration guide (#10454) by @adam-bloom
  • b0dcaa3 - [dagit] Bump react-scripts to remove codegen plugin (#11769) by @hellendag
  • 986310a - feat(dagster-dbt): use cached run id to fetch artifacts (#11744) by @rexledesma
  • 0bca404 - feat(dbt-cloud): raise exception if cached compile has not completed (#11792) by @rexledesma
  • a56cae8 - changelog 1.1.11 pt 2 (#11800) by @alangenfeld
  • 47062d0 - Heartbeat once at the beginning of every interval daemon (#11802) by @gibsondan
  • ffb0eda - 1.1.11 changelog: fix link (#11804) by @yuhan
  • d88bbb2 - [dagit] Add missing changelog entries for 1.1.11 (#11803) by @bengotow
  • b0b34f3 - add request_assets param to multi_asset_sensor (#11786) by @sryza
  • 84e343c - Automation: versioned docs for 1.1.11 by @elementl-devtools
  • 8d2ad8f - Log to daemon output when running the schedule code fails or times out (#11805) by @gibsondan
  • 6a9ad23 - 1.1.12 changelog (#11807) by @alangenfeld
  • bded418 - [dagit] When viewing an asset that is a root, do not fetch root used data (#11806) by @bengotow
  • be88042 - [dagit] Fix support for long descriptions in the asset catalog table (#11810) by @bengotow
  • 4b436e0 - Convert accesses using '__ASSET_JOB' to instead use the supported implicit job methods (#11348) by @schrockn
  • 754e4aa - [dagster-azure] Add support for DefaultAzureCredential for AzureBlobComputeLogManager (#11310) by @mpicard
  • b1f6921 - add dagster.yaml tests for nux key (#11820) by @gibsondan
  • 91aff13 - Fix 'dagster dev' command with workspace.yaml passed in (#11819) by @gibsondan
  • 86ffe2c - When can't deserialize asset backfill because asset is no longer partitioned, return it as empty (#11812) by @sryza
  • 2a5c55d - 1.1.13 changelog (#11818) by @alangenfeld
  • b6b0649 - add after_cursor to get_materialization_counts_by_partition (#11759) by @OwenKephart
  • daa8269 - Adds group_label to build_fivetran_assets (#11718) by @toddy86
  • 3400feb - default argument to asset_partition_keys_for_output (#11811) by @sryza
  • 0b08b9e - Specify credentials masking. (#11813) by @joel-olazagasti
  • 1c5ddd1 - Fix multidimensional partition backfills (#11788) by @clairelin135
  • fc96541 - Automation: versioned docs for 1.1.13 by @yuhan
  • da9a774 - Move pyproject.toml => origin generation to a WorkspaceLoadTarget class so that it can be reloaded in dagit (#11821) by @gibsondan
  • 932a5ba - fix tz in time_windows_for_partition_keys (#11825) by @sryza
  • bd66da3 - dynamic re-execution fixes (#11581) by @alangenfeld
  • 3d43e2c - [instance] get_run_record_by_id (#11643) by @alangenfeld
  • bcc8694 - [asset-graph] keep track of source asset partitions defs (#11840) by @OwenKephart
  • a0ec4a8 - [dagster-airbyte] Fix issue using freshness policy w/ Airbyte + multiprocessing executor (#11837) by @benpankow
  • cf7bdf9 - [asset-reconciliation] pre-fetch the results of some queries (#11770) by @OwenKephart
  • 273368d - [asset-reconciliation] Fix case where the partitions definition does not have partitions for some subset of the past 24 hours (#11842) by @OwenKephart
  • b1b88ea - Fix for backoff logic incorrectly storing state (#11848) by @gibsondan
  • 4781d8a - use an AssetGraph to resolve asset selections in asset jobs (#11846) by @sryza
  • 3e4befb - fix(dbt-cloud): remove dbt selector when materializing subset (#11843) by @rexledesma
  • 7a0b7cc - pin sqlalchemy below 2.0.0 (#11871) by @benpankow
  • db68ccf - [dagit] Allow materializing “All” without requiring large graphs are rendered (#11854) by @bengotow
  • 4e59844 - chore(dagster-dbt): pin dbt-core<1.4.0 (#11870) by @rexledesma
  • 26ca189 - stop using python semver parsing for mysql versions (#11868) by @prha
  • 69f5464 - Fix aws ssm tests (#11886) by @gibsondan
  • 28918f8 - unbreak backcompat tests after sqlalchemy upgrade (#11885) by @gibsondan
  • b49679c - [dagit] Fix jumpy code location status spinner (#11801) by @hellendag
  • 144f415 - Add apache-airflow test pin to <2.5.1 (#11888) by @gibsondan
  • 0f4d495 - Fix databricks tests running in release branch (#11887) by @gibsondan
  • 645d12a - Add pin for jupyter-client<8 (#11901) by @gibsondan
  • 7dcc367 - [pyright] [core] storage (#11363) by @smackesey
  • 16ba05b - [pyright] [core] standardize is_context_provided (#11364) by @smackesey
  • e972b08 - [pyright] [core] eliminate funcsigs (#11365) by @smackesey
  • e6471fa - 1.1.14 changelog (#11907) by @gibsondan
  • 16ce5d9 - [pyright] [core] remove builitins star import (#11366) by @smackesey
  • 3953b07 - [readme] update twitter badge (#11892) by @alangenfeld
  • 0eb0b4a - [pyright] [gql] DagsterPipelineRunMetadataValue -> DagsterRunMetadataValue (#11893) by @smackesey
  • 54f573f - [pyright] [core] _core/definitions/asset_reconciliation_sensor (#11717) by @smackesey
  • 751a068 - [pyright] [gql] implementation/events (#11894) by @smackesey
  • 4bb58b6 - [pyright] [gql] implementation/fetch_partition_sets (#11895) by @smackesey
  • 9245694 - Automation: versioned docs for 1.1.14 by @elementl-devtools
  • b31c4c3 - [pyright] examples/tutorial-notebook-assets (#11910) by @smackesey
  • a3db3e0 - [pyright] examples/with-airflow (#11911) by @smackesey
  • d6e8832 - [pyright] [dagit] misc (#11727) by @smackesey
  • 8df7a59 - [pyright] [dagster-dask] misc (#11944) by @smackesey
  • 9a90c52 - [pyright] [dagster-databricks] misc (#11943) by @smackesey
  • f9ea6ce - [pyright] [dagster-test] misc (#11940) by @smackesey
  • e548ea9 - [pyright] [dagster-mysql] misc (#11939) by @smackesey
  • 8b463e8 - [pyright] [dagster-duckdb-pyspark] misc (#11937) by @smackesey
  • 173bd27 - [pyright] [dagster-pandas] misc (#11931) by @smackesey
  • 8eafe06 - [pyright] [dagster-postgres] misc (#11929) by @smackesey
  • 54f4778 - [pyright] [dagster-snowflake] misc (#11926) by @smackesey
  • 77ff219 - [pyright] [examples/assets-dbt-python] misc (#11922) by @smackesey
  • 153a360 - [pyright] [dagster-snowflake] misc (#11928) by @smackesey
  • ae6e9df - [pyright] [dagster-pandera] misc (#11930) by @smackesey
  • 4a1705d - [pyright] [dagster-aws] misc (#11730) by @smackesey
  • 403b525 - [pyright] [dagster-airflow] misc (#11729) by @smackesey
  • e8b4297 - [pyright] examples/quickstart-gcp (#11909) by @smackesey
  • 9c133f6 - [pyright] [dagster-dbt] misc (#11941) by @smackesey
  • cb0bccc - [pyright] [automation] misc (#11726) by @smackesey
  • ddc628a - [pyright] [dagster-msteams] misc (#11932) by @smackesey
  • fcf0fb3 - [pyright] [dagster-snowflake-pandas] misc (#11927) by @smackesey
  • 06dae07 - [pyright] [dagstermill] misc (#11925) by @smackesey
  • bce900c - [pyright] [dagster-docker] misc (#11938) by @smackesey
  • 8f8bdcd - [pyright] [dagster-gcp] misc (#11936) by @smackesey
  • 36472a4 - remove need to provide partitions_def to asset job targeted by build_schedule_from_partitioned_job (#11844) by @sryza
  • 14ebb3d - fix GQL snapshot tests on py310 (#11945) by @smackesey
  • e625673 - [pyright] [dagster-airbyte] misc (#11728) by @smackesey
  • 4220144 - [pyright] [dagster-mlflow] misc (#11948) by @smackesey
  • 72a67af - Fix k8s_job_op with multiple containers (#11916) by @gibsondan
  • be28a55 - [pyright] [gql] capture_error sig fix (#11722) by @smackesey
  • a6023fa - [pyright] [dagster-azure] misc (#11946) by @smackesey
  • 9db20ca - Add code versions to AssetGraph (#11950) by @smackesey
  • 17cd83c - [pyright] [gql] schema/solids#resolve_required_resources (#11898) by @smackesey
  • d1aa827 - refactor GrpcServerProcess (#11960) by @smackesey
  • 6795b17 - Fix s3_resource docstring example (#11757) by @dpeng817
  • 1ae4d02 - [pyright] [core] create RunGroupInfo type (#11368) by @smackesey
  • 0732cc4 - snowflake connector pin (#11970) by @alangenfeld
  • b4ba5a3 - [pyright] [gql] schema/roots/mutation (#11897) by @smackesey
  • 99cf75d - [easy] List dagster_world.mp4 in static files (#11961) by @salazarm
  • 6e9d06e - [pyright] [gql] AssetKey cleanup (#11721) by @smackesey
  • 7d4154f - CachingProjectedLogicalVersionResolver -> CachingStaleStatusResolver (#11951) by @smackesey
  • 77711ce - whitelist EventLogRecord for serdes (#11978) by @OwenKephart
  • 0360777 - document tags_for_partition_fn in _partitioned_config API doc (#11977) by @sryza
  • b92d236 - [pyright] [gql] misc (#11720) by @smackesey
  • d175cc1 - [pyright] [gql] misc type errors 2 (#11947) by @smackesey
  • ac2c60f - [pyright] [gql] implementation/fetch_runs (#11896) by @smackesey
  • 8be6743 - Allow bare executor in Definitions (#11795) by @schrockn
  • 7908c41 - Add opt_iterable_param (#11796) by @schrockn
  • 234f9d2 - Use opt_iterable_param in Definitions (#11797) by @schrockn
  • 6d882f7 - [pyright] [gql] compute log manager (#11725) by @smackesey
  • e6000d7 - [dagster-snowflake-pandas] snowflake-sqlalchemy pin (#11984) by @smackesey
  • d2bc29e - [pyright] [gql] add ResolveInfo everywhere (#11723) by @smackesey
  • 6bb955f - add .ruff_cache to gitignore (#11987) by @Ramshackle-Jamathon
  • aa0c090 - small tweak to airflow integration page (#11985) by @sryza
  • 520f5c9 - Rename _core.decorator_utils.is_context_provided (#11982) by @smackesey
  • 7eafa58 - Add docker_container_op and execute_docker_container (#11831) by @gibsondan
  • ff40c1e - [perf] perf improvement for TimeWindowPartitionsSubset (#11850) by @OwenKephart
  • 84c7e77 - [dagit] Add ErrorBoundary to Dagit to reduce severity of React errors (#11824) by @bengotow
  • fc096a4 - [dagit] Repair markdownToPlaintext test failure (#11995) by @hellendag
  • f508bc4 - [import perf] defer CachingInstanceQueryer imports to avoid storage imports (#11905) by @alangenfeld
  • c647626 - Update error messaging for DB IO managers (#11815) by @jamiedemaria
  • 6fd547a - [import perf] moves to prevent grpc import (#11906) by @alangenfeld
  • 039702c - [import perf] test to prevent regressions (#11969) by @alangenfeld
  • 72e4f4c - remove legacy APIs from dagstermill tests (#11999) by @sryza
  • 75e394c - remove legacy APIs from dagster_shell_tests (#11998) by @sryza
  • 9230481 - [dagit] UI support for launching a single asset run with a range of partition keys (#11866) by @bengotow
  • 4836e9d - [dagit] Add more truncation test cases for Firefox, change const (#12001) by @bengotow
  • 4bb1f56 - Fix black pre-commit hook (#11979) by @smackesey
  • fbb6320 - [3/3 partition status cache] Update graphQL partition data (#10822) by @clairelin135
  • 33e31b0 - Upgrade base image on Dagster Dockerfiles to latest python version (#11863) by @gibsondan
  • 7c7b7ed - [pyright] [examples/docs-snippets] misc (#11921) by @smackesey
  • 743d3a5 - Add last_materialization_record to AssetEntry. (#11919) by @OwenKephart
  • f355bd8 - [asset-reconciliation][perf] Improve prefetch accuracy (#11991) by @OwenKephart
  • 2e18c4d - fix dagstermill tests (#12002) by @sryza
  • 5847229 - Type annotations for workspace code (#11958) by @smackesey
  • 8d8c104 - Consolidate GrpcServerRegistry/ProcessGrpcServerRegistry (#11990) by @smackesey
  • acc99cd - Refactor ProcessRegistryEntry (#11959) by @smackesey
  • 8c9b59f - [pyright] examples/project-fully-featured (#11908) by @smackesey
  • ffdd94a - [pyright] [dagster-k8s] misc (#11935) by @smackesey
  • 6c2fedf - [pyright] [core] tests (#11369) by @smackesey
  • c5c91a1 - [pyright] Type-ignores for various errors related to managed elements (#12014) by @smackesey
  • 66c473c - cleam up get_asset_events (#11913) by @alangenfeld
  • 4280d58 - PySpark type handler for snowflake io manager (#11542) by @jamiedemaria
  • 75e8fa6 - [asset-reconciliation] [perf] do not try to fetch materialization records for sources (#12011) by @OwenKephart
  • f781d93 - test(dagster-dbt): add testing for python 3.10 (#11912) by @rexledesma
  • f1be5a2 - fix: use new GitHub graphql resolvers for issue automation (#11877) by @rexledesma
  • a4df637 - [dagit] Export GhostDaggy with tooltip for Cloud usage (#11967) by @hellendag
  • 4596372 - [dagster-airflow] load_assets_from_airflow_dag (#11876) by @Ramshackle-Jamathon
  • d984b71 - Fix broken doc link for Snowflake credential setup (#12017) by @clayheaton
  • 0d9f907 - [dagster-fivetran] Add option to force-create materializations for tables not in API response (#11972) by @benpankow
  • f2125d5 - [dagit] Fix use of fragments causing Apollo caching error in partition health (#12030) by @bengotow
  • 7bf3b64 - Change endpoints to the ones that are used by airbyte UI (#12012) by @emilija-omnisend
  • aac1eb5 - lint fix (#12044) by @benpankow
  • 6da3aff - feat(dbt-cloud): compile run only if job has environment variable cache (#12042) by @rexledesma
  • e025d31 - fix(dbt-cloud): inherit generate docs settings for compile run (#12043) by @rexledesma
  • 0ca6358 - 1.1.15 changelog (#12051) by @jamiedemaria
  • 5d45afa - 1.1.15 by @elementl-devtools
dagster - 1.1.14 (core) / 0.17.14 (libraries)

Published by prha over 1 year ago

New

  • Large asset graphs can now be materialized in Dagit without needing to first enter an asset subset. Previously, if you wanted to materialize every asset in such a graph, you needed to first enter * as the asset selection before materializing the assets.
  • Added a pin of the sqlalchemy package to <2.0.0 due to a breaking change in that version.
  • Added a pin of the dbt-core package to <1.4.0 due to breaking changes in that release that affected the Dagster dbt integration. We plan to remove this pin in the next release.
  • Added a pin of the jupyter-client package to <8.0 due to an issue with the most recent release causing hangs while running dagstermill ops.

Bugfixes

  • Fixed an issue where the Backfills page in Dagit didn't show partition status for some backfills.
  • [dagster-aws] Fixed an issue where the EcsRunLauncher sometimes waited much longer than intended before retrying after a failure launching a run.
  • [dagster-mysql] Fixed an issue where some implementations of MySQL storage were raising invalid version errors.