An orchestration platform for the development, production, and observation of data assets.
APACHE-2.0 License
Published by mgasner over 3 years ago
[…] ModeDefinition that are not required by a pipeline no longer require runtime configuration. This should make it easier to share modes or resources among multiple pipelines.
[…] RetryRequested is yielded from a notebook using dagstermill.yield_event.
[…] --path-prefix option, leading to failed GraphQL requests and broken pages. This bug was introduced in 0.11.4, and is now fixed.
The update_timestamp column in the runs table is now updated with a UTC timezone, making it consistent with the create_timestamp column.
[…] dagster-pandas on pandas. You can now include any version of pandas. (https://github.com/dagster-io/dagster/issues/3350)
[…] requests in dagster. Now only dagit depends on requests.
[…] pyrsistent in dagster.
[…] --config help message (thanks @pawelad!)
[…] execute_pipeline, the system would use the io manager that handled each output to perform the retrieval. Now, when using execute_pipeline with the default in-process executor, the system directly captures the outputs of solids for use with the result object returned by execute_pipeline. This may lead to slightly different behavior when retrieving outputs if switching between executors and using custom IO managers.
The K8sRunLauncher and CeleryK8sRunLauncher now add a dagster/image tag to pipeline runs to document the image used. The DockerRunLauncher has also been modified to use this tag (previously it used docker/image).
[…] . key shortcut to toggle visibility.
@solid can now decorate async def functions.
[…] PartitionGraphFragment has been fixed.
[…] pipeline_name that is not present in the current repository will now error out when the repository is created.
[…] generatePostgresqlPasswordSecret toggle was added to allow the Helm chart to reference an external secret containing the Postgresql password (thanks @PenguinToast!)
[…] dagit.workspace, which can be useful if you are managing your user deployments in a separate Helm release.
[…] dict values in run_config targeting Permissive / dict config schemas, the ordering is now preserved.
[…] EventMetadataEntry.int greater than 32 bits no longer cause dagit errors.
PresetDefinition.with_additional_config no longer errors if the base config was empty (thanks @esztermarton!)
[…] StatusCode.RESOURCE_EXHAUSTED for a large number of run requests, especially when the requested run configs were large.
Community Contributions
dagster new project now scaffolds setup.py using your local dagster pip version (thanks @taljaards!)
New
If no description is provided to the solid decorator, the docstring will now be used as the solid’s description.
Bugfixes
[…] dagster api execute_step will mistakenly skip a step and output a non-DagsterEvent log. This affected the celery_k8s_job_executor.
Integrations
Community Contributions
[…] dagster new-project, which broke on the 0.11.0 release (Thank you @saulius!)
New
Bugfixes
[…] --path-prefix option. Custom fonts and their CSS have now been removed, and system fonts are now used for both normal and monospace text.
Published by prha over 3 years ago
[…] AssetKeys to solid outputs through either the OutputDefinition or IOManager, which allows Dagster to automatically generate asset lineage information for assets referenced in this way. Direct parents of an asset will appear in the Dagit Asset Catalog. See the asset docs to learn more.
[…] DynamicOutput and map from the last release, this release includes the ability to collect over dynamically mapped outputs. You can see an example here.
[…] partition_days_offset argument to the @daily_schedule decorator that allows you to customize which partition is used for each execution of your schedule. The default value of this parameter is 1, which means that a schedule that runs on day N will fill in the partition for day N-1. To create a schedule that uses the partition for the current day, set this parameter to 0, or increase it to make the schedule use an earlier day’s partition. Similar arguments have also been added for the other partitioned schedule decorators (@monthly_schedule, @weekly_schedule, and @hourly_schedule).
[…] description parameter that takes in a human-readable string description and displays it on the corresponding landing page in Dagit.
AssetMaterialization now accepts a tags argument. Tags can be used to filter assets in Dagit.
The QueuedRunCoordinator daemon is now more resilient to errors while dequeuing runs. Previously runs which could not launch would block the queue. They will now be marked as failed and removed from the queue.
The dagster-daemon process uses fewer resources and spins up fewer subprocesses to load pipeline information. Previously, the scheduler, sensor, and run queue daemon each spun up their own process for this; now they share a single process.
The dagster-daemon process now runs each of its daemons in its own thread. This allows the scheduler, sensor loop, and daemon for launching queued runs to run in parallel, without slowing each other down.
[…] workspace.yaml file to load your pipelines, you can now specify an environment variable for the server’s hostname and port.
[…] dagster run delete CLI command to delete a run and its associated event log entries.
fs_io_manager now defaults the base directory to base_dir via the Dagster instance’s local_artifact_storage configuration. Previously, it defaulted to the directory where the pipeline was executed.
[…] handle_output, load_input, or a type check function, the log output now includes context about which input or output the error occurred during.
We have added the BoolSource config type (similar to the StringSource type). The config value for this type can be a boolean literal or a pointer to an environment variable that is set to a boolean value.
[…] DagsterNoStepsToExecuteException.
The OutputContext passed to the has_output method of MemoizableIOManager now includes a working log.
[…] workspace.yaml file without restarting Dagit. To reload your workspace, navigate to the Status page and press the “Reload all” button in the Workspace section.
[…] step and type filtering now offers fuzzy search, all log event types are now searchable, and visual bugs within the input have been repaired. Additionally, the default setting for “Hide non-matches” has been flipped to true.
[…] grpc_server repository location, Dagit will automatically detect changes and prompt you to reload when the remote server updates.
[…] dagster asset wipe.
snowflake_resource can now be configured to use the SQLAlchemy connector (thanks @basilvetas!)
[…] seed and docs generate are now available as solids in the library dagster-dbt. (thanks @dehume-drizly!)
dagster-spark config schemas now support loading values for all fields via environment variables.
gcs_pickle_io_manager now also retries on 403 Forbidden errors. Previously it retried only on 429 TooManyRequests.
The K8sRunLauncher and CeleryK8sRunLauncher no longer reload the pipeline being executed just before launching it. The previous behavior ensured that the latest version of the pipeline was always being used, but was inconsistent with other run launchers. Instead, to ensure that you’re running the latest version of your pipeline, you can refresh your repository in Dagit by pressing the button next to the repository name.
[…] userDeployments.deployments in the Helm chart, replicaCount now defaults to 1 if not specified.
[…] dagster/dagster-k8s and dagster/dagster-celery-k8s can be used for all processes which don't require user code (Dagit, Daemon, and Celery workers when using the CeleryK8sExecutor). user-code-example can be used for a sample user repository. The prior images (k8s-dagit, k8s-celery-worker, k8s-example) are deprecated.
[…] dagster-k8s, dagster-celery-k8s, user-code-example, and k8s-dagit-example images to a public ECR registry in addition to DockerHub. If you are encountering rate limits when attempting to pull images from DockerHub, you should now be able to pull these images from public.ecr.aws/dagster.
.Values.dagsterHome is now a global variable, available at .Values.global.dagsterHome.
.Values.global.postgresqlSecretName has been introduced, for subcharts to access the Dagster Helm chart’s generated Postgres secret properly.
.Values.userDeployments has been renamed .Values.dagster-user-deployments to reference the subchart’s values. When using Dagster User Deployments, enabling .Values.dagster-user-deployments.enabled will create a workspace.yaml for Dagit to locate gRPC servers with user code. To create the actual gRPC servers, .Values.dagster-user-deployments.enableSubchart should be enabled. To manage the gRPC servers in a separate Helm release, .Values.dagster-user-deployments.enableSubchart should be disabled, and the subchart should be deployed in its own helm release.
Schedules now run in UTC (instead of the system timezone) if no timezone has been set on the schedule. If you’re using a deprecated scheduler like SystemCronScheduler or K8sScheduler, we recommend that you switch to the native Dagster scheduler. The deprecated schedulers will be removed in the next Dagster release.
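As a rough illustration of the partition_days_offset behavior described in these notes, the sketch below computes which daily partition a schedule tick would fill. partition_for_daily_tick is a hypothetical helper written for this example, not part of Dagster, and it assumes the tick is evaluated in UTC, as happens when no timezone is set on the schedule.

```python
from datetime import date, datetime, timedelta, timezone


def partition_for_daily_tick(tick_time: datetime, partition_days_offset: int = 1) -> date:
    """Return the partition date that a daily schedule tick would fill.

    Hypothetical helper for illustration only (not a Dagster API).
    """
    if tick_time.tzinfo is None:
        raise ValueError("tick_time must be timezone-aware")
    # Assumption: with no schedule timezone configured, ticks are evaluated in UTC.
    tick_utc = tick_time.astimezone(timezone.utc)
    # An offset of 1 means a run on day N fills the partition for day N-1.
    return tick_utc.date() - timedelta(days=partition_days_offset)


tick = datetime(2021, 3, 10, 0, 30, tzinfo=timezone.utc)
print(partition_for_daily_tick(tick))      # 2021-03-09 (default offset of 1)
print(partition_for_daily_tick(tick, 0))   # 2021-03-10 (current day's partition)
```

Setting the offset to 0 selects the partition for the day of the tick, and larger values select earlier days.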
Names provided to alias on solids now enforce the same naming rules as solids. You may have to update provided names to meet these requirements.
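If you want to check alias names before upgrading, a small validator along these lines may help. The pattern used here (letters, digits, and underscores only) is an assumption about the naming rules, and is_valid_alias_name is a hypothetical helper, not a Dagster API. Check the naming rules for your Dagster version before relying on it.

```python
import re

# Assumption: names may contain only letters, digits, and underscores.
# The authoritative rule lives in Dagster itself and may be stricter.
_ASSUMED_NAME_PATTERN = re.compile(r"[A-Za-z0-9_]+")


def is_valid_alias_name(name: str) -> bool:
    """Return True if the name matches the assumed naming pattern."""
    return _ASSUMED_NAME_PATTERN.fullmatch(name) is not None


for candidate in ["load_users_v2", "load-users", "load users"]:
    print(candidate, is_valid_alias_name(candidate))
```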
The retries method on Executor should now return a RetryMode instead of a Retries. This will only affect custom Executor classes.
Submitting partition backfills in Dagit now requires dagster-daemon to be running. The instance setting in dagster.yaml to optionally enable daemon-based backfills has been removed, because all backfills are now daemon-based backfills.
# removed, no longer a valid setting in dagster.yaml
backfill:
  daemon_enabled: true
The corresponding value flag dagsterDaemon.backfill.enabled has also been removed from the Dagster helm chart.
[…] dagster.yaml has been removed. The sensor daemon now runs in a continuous loop so this customization is no longer useful.
# removed, no longer a valid setting in dagster.yaml
sensor_settings:
  interval_seconds: 10
The instance argument to RunLauncher.launch_run has been removed. If you have written a custom RunLauncher, you’ll need to update the signature of that method. You can still access the DagsterInstance on the RunLauncher via the _instance parameter.
The has_config_entry, has_configurable_inputs, and has_configurable_outputs properties of solid and composite_solid have been removed.
[…] name argument to PipelineDefinition has been removed, and the argument is now required.
The execute_run_with_structured_logs and execute_step_with_structured_logs internal CLI entry points have been removed. Use execute_run or execute_step instead.
The python_environment key has been removed from workspace.yaml. Instead, to specify that a repository location should use a custom python environment, set the executable_path key within a python_file, python_module, or python_package key. See the docs for more information on configuring your workspace.yaml file.
[…] read or to keys accordingly.
0.10.8 was released with a packaging issue in dagster-postgres. Please upgrade to 0.10.9.
Bugfixes
Community Contributions
[…] seed and docs generate are now available as solids in the library dagster-dbt. (thanks @dehume-drizly!)
New
Dagit now has a global search feature in the left navigation, allowing you to jump quickly to
pipelines, schedules, and sensors across your workspace. You can trigger search by clicking the
search input or with the / keyboard shortcut.
Timestamps in Dagit have been updated to be more consistent throughout the app, and are now
localized based on your browser’s settings.
Added SQLPollingEventWatcher as an alternative to filesystem or DB-specific listen/notify
functionality.
We have added the BoolSource config type (similar to the StringSource type). The config value for
this type can be a boolean literal or a pointer to an environment variable that is set to a boolean
value.
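To illustrate the idea, here is a minimal sketch of how a value of this kind could be resolved. It is not Dagster's implementation: resolve_bool_source is a hypothetical helper, the {"env": ...} shape is assumed to mirror how StringSource values point at environment variables, and the strings accepted as true or false are an assumption made for this example.

```python
import os
from typing import Mapping, Union

# Assumption: these spellings are chosen for the example only.
_TRUE_STRINGS = {"1", "true", "yes"}
_FALSE_STRINGS = {"0", "false", "no"}


def resolve_bool_source(
    value: Union[bool, Mapping[str, str]],
    environ: Mapping[str, str] = os.environ,
) -> bool:
    """Resolve a boolean literal or an {"env": NAME} pointer to a bool."""
    if isinstance(value, bool):
        # A boolean literal is used as-is.
        return value
    if isinstance(value, Mapping) and set(value) == {"env"}:
        name = value["env"]
        if name not in environ:
            raise KeyError(f"environment variable {name!r} is not set")
        raw = environ[name].strip().lower()
        if raw in _TRUE_STRINGS:
            return True
        if raw in _FALSE_STRINGS:
            return False
        raise ValueError(f"{name}={raw!r} is not a recognized boolean")
    raise TypeError("expected a bool or a mapping of the form {'env': NAME}")


print(resolve_bool_source(True))
print(resolve_bool_source({"env": "USE_SSL"}, {"USE_SSL": "false"}))
```

Passing the environment as a parameter keeps the helper easy to test without touching the real process environment.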
The QueuedRunCoordinator daemon is now more resilient to errors while dequeuing runs. Previously
runs which could not launch would block the queue. They will now be marked as failed and removed
from the queue.
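The behavior change can be pictured with a small, self-contained sketch. This is not the QueuedRunCoordinator code: drain_queue, the launch callback, and the status strings are hypothetical names used only to show a dequeue loop in which a run that fails to launch is marked as failed and removed rather than blocking the runs behind it.

```python
from collections import deque
from typing import Callable, Deque, Dict


def drain_queue(queue: Deque[str], launch: Callable[[str], None]) -> Dict[str, str]:
    """Dequeue every run, recording a status for each (illustrative only)."""
    statuses: Dict[str, str] = {}
    while queue:
        # The run leaves the queue whether or not the launch succeeds.
        run_id = queue.popleft()
        try:
            launch(run_id)
            statuses[run_id] = "STARTED"
        except Exception:
            # A failed launch is recorded and the loop moves on to the next run.
            statuses[run_id] = "FAILURE"
    return statuses


def example_launch(run_id: str) -> None:
    # Hypothetical launcher that cannot start one particular run.
    if run_id == "run_b":
        raise RuntimeError("could not launch")


print(drain_queue(deque(["run_a", "run_b", "run_c"]), example_launch))
```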
When deploying your own gRPC server for your pipelines, you can now specify that connecting to that
server should use a secure SSL connection. For example, the following workspace.yaml file specifies
that a secure connection should be used:
load_from:
  - grpc_server:
      host: localhost
      port: 4266
      location_name: 'my_grpc_server'
      ssl: true
The dagster-daemon process uses fewer resources and spins up fewer subprocesses to load pipeline
information. Previously, the scheduler, sensor, and run queue daemon each spun up their own process
for this; now they share a single process.
Integrations
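[…] dagster-k8s, dagster-celery-k8s, user-code-example, and k8s-dagit-example images to a public ECR registry in addition to DockerHub. If you are encountering rate limits when attempting to pull images from DockerHub, you should now be able to pull these images from public.ecr.aws/dagster.
dagster-spark config schemas now support loading values for all fields via environment variables.
Bugfixes
[…] redis.internal […] True in helm chart.
[…] dagster-daemon process sometimes left dangling subprocesses running […]
[…] tag method on solid invocations (as opposed to solid […]
Experimental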
MySQL (via dagster-mysql) is now supported as a backend for event log, run, & schedule storages.
Add the following to your dagster.yaml to use MySQL for storage:
run_storage:
  module: dagster_mysql.run_storage
  class: MySQLRunStorage
  config:
    mysql_db:
      username: { username }
      password: { password }
      hostname: { hostname }
      db_name: { database }
      port: { port }
event_log_storage:
  module: dagster_mysql.event_log
  class: MySQLEventLogStorage
  config:
    mysql_db:
      username: { username }
      password: { password }
      hostname: { hostname }
      db_name: { db_name }
      port: { port }
schedule_storage:
  module: dagster_mysql.schedule_storage
  class: MySQLScheduleStorage
  config:
    mysql_db:
      username: { username }
      password: { password }
      hostname: { hostname }
      db_name: { db_name }
      port: { port }