dagster | Python Ecosystem Directory

dagster -

Published by prha over 4 years ago

New

RepositoryDefinition now takes schedule_defs and partition_set_defs directly. The loading
scheme for these definitions via repository.yaml under the scheduler: and partitions: keys
is deprecated and expected to be removed in 0.8.0.
Mark published modules as python 3.8 compatible.
The dagster-airflow package supports loading all Airflow DAGs within a directory path, file path,
or Airflow DagBag.
The dagster-airflow package supports loading all 23 DAGs in Airflow example_dags folder and
execution of 17 of them (see: make_dagster_repo_from_airflow_example_dags).
The dagster-celery CLI tools now allow you to pass additional arguments through to the underlying
celery CLI, e.g., running dagster-celery worker start -n my-worker -- --uid=42 will pass the
--uid flag to celery.
It is now possible to create a PresetDefinition that has no environment defined.
Added dagster schedule debug command to help debug scheduler state.
The SystemCronScheduler now verifies that a cron job has been successfully added to the
crontab when turning a schedule on, and shows an error message if unsuccessful.
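
The dagster-celery pass-through noted above (everything after a bare -- goes to the underlying celery CLI) follows a common command-line convention. The sketch below illustrates that convention only; it is not the dagster-celery implementation, and split_passthrough is a hypothetical helper name.

```python
from typing import List, Tuple


def split_passthrough(argv: List[str]) -> Tuple[List[str], List[str]]:
    """Split argv at the first bare "--".

    Arguments before the separator belong to the wrapper CLI; arguments
    after it are forwarded untouched to the underlying tool.
    """
    if "--" not in argv:
        return list(argv), []
    idx = argv.index("--")
    return argv[:idx], argv[idx + 1:]


own_args, forwarded = split_passthrough(
    ["worker", "start", "-n", "my-worker", "--", "--uid=42"]
)
print(own_args)   # ['worker', 'start', '-n', 'my-worker']
print(forwarded)  # ['--uid=42']
```

Splitting on the first separator only means any later -- is itself forwarded, which leaves the underlying tool free to interpret it.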

Breaking Changes

A dagster instance migrate is required for this release to support the new experimental assets
view.
Runs created prior to 0.7.8 will no longer render their execution plans as DAGs. We are only
rendering execution plans that have been persisted. Logs are still available.
Path is no longer valid in config schemas. Use str or dagster.String instead.
Removed the @pyspark_solid decorator - its functionality, which was experimental, is subsumed by
requiring a StepLauncher resource (e.g. emr_pyspark_step_launcher) on the solid.

Dagit

Merged "re-execute", "single-step re-execute", "resume/retry" buttons into one "re-execute" button
with three dropdown selections on the Run page.

Experimental

Added new asset_key string parameter to Materializations and created a new “Assets” tab in Dagit
to view pipelines and runs associated with these keys. The API and UI of these asset-based features are
likely to change, but feedback is welcome and will be used to inform these changes.
Added an emr_pyspark_step_launcher that enables launching PySpark solids in EMR. The
"simple_pyspark" example demonstrates how it’s used.

Bugfix

Fixed an issue when running Jupyter notebooks in a Python 2 kernel through dagstermill with dagster
running in Python 3.
Improved error messages produced when dagstermill spins up an in-notebook context.
Fixed an issue with retrieving step events from CompositeSolidResult objects.

dagster -

Published by prha over 4 years ago

Breaking Changes

If you are launching runs using DagsterInstance.launch_run, this method now takes a run id instead of an instance of PipelineRun. Additionally, DagsterInstance.create_run and DagsterInstance.create_empty_run have been replaced by DagsterInstance.get_or_create_run and DagsterInstance.create_run_for_pipeline.
If you have implemented your own RunLauncher, there are two required changes:
- RunLauncher.launch_run takes a pipeline run that has already been created. You should remove any calls to instance.create_run in this method.
- Instead of calling startPipelineExecution (defined in dagster_graphql.client.query.START_PIPELINE_EXECUTION_MUTATION) in the run launcher, you should call startPipelineExecutionForCreatedRun (defined in dagster_graphql.client.query.START_PIPELINE_EXECUTION_FOR_CREATED_RUN_MUTATION).
- Refer to the RemoteDagitRunLauncher for an example implementation.

New

Improvements to preset and solid subselection in the playground: subselection now shows an inline preview of the pipeline instead of a modal, and the correct subselection is chosen when selecting a preset.
Improvements to log searching: tokenization and autocompletion when searching for message types and for specific steps.
You can now view the structure of pipelines from historical runs, even if that pipeline no longer exists in the loaded repository or has changed structure.
Historical execution plans are now viewable, even if the pipeline has changed structure.
Added metadata link to raw compute logs for all StepStart events in PipelineRun view and Step view.
Improved error handling for the scheduler. If a scheduled run has config errors, the errors are persisted to the event log for the run and can be viewed in Dagit.

Bugfix

No longer manually dispose sqlalchemy engine in dagster-postgres
Made boto3 dependency in dagster-aws more flexible (#2418)
Fixed tooltip UI cleanup in partitioned schedule view

Documentation

Brand new documentation site, available at https://docs.dagster.io
The tutorial has been restructured to multiple sections, and the examples in intro_tutorial have been rearranged to separate folders to reflect this.

dagster -

Published by prha over 4 years ago

Breaking Changes

The execute_pipeline_with_mode and execute_pipeline_with_preset APIs have been dropped in
favor of new top level arguments to execute_pipeline, mode and preset.
The use of RunConfig to pass options to execute_pipeline has been deprecated, and RunConfig
will be removed in 0.8.0.
The execute_solid_within_pipeline and execute_solids_within_pipeline APIs, intended to support
tests, now take new top level arguments mode and preset.

New

The dagster-aws Redshift resource now supports providing an error callback to debug failed
queries.
We now persist serialized execution plans for historical runs. They will render correctly even if
the pipeline structure has changed or if it does not exist in the current loaded repository.
Clicking on a pipeline tag in the Runs view will apply that tag as a filter.

Bugfix

Fixed a bug where telemetry logger would create a log file (but not write any logs) even when
telemetry was disabled.

Experimental

The dagster-airflow package supports ingesting Airflow dags and running them as dagster pipelines
(see: make_dagster_pipeline_from_airflow_dag). This is in the early experimentation phase.
Improved the layout of the experimental partition runs table on the Schedules detailed view.

Documentation

Fixed a grammatical error (Thanks @flowersw!)

dagster -

Published by prha over 4 years ago

Breaking Changes

The default sqlite and dagster-postgres implementations have been altered to extract the
event step_key field as a column, to enable faster per-step queries. You will need to run
dagster instance migrate to update the schema. You may optionally migrate your historical event
log data to extract the step_key using the migrate_event_log_data function. This will ensure
that your historical event log data will be captured in future step-key based views. This
event_log data migration can be invoked as follows:
```
from dagster.core.storage.event_log.migration import migrate_event_log_data
from dagster import DagsterInstance

migrate_event_log_data(instance=DagsterInstance.get())
```
We have made pipeline metadata serializable and persist that along with run information.
While there are no user-facing features to leverage this yet, it does require an instance migration.
Run dagster instance migrate. If you have already run the migration for the event_log changes
above, you do not need to run it again. Any unforeseen errors related to the new snapshot_id
in the runs table or the new snapshots table are related to this migration.
dagster-pandas ColumnTypeConstraint has been removed in favor of ColumnDTypeFnConstraint and
ColumnDTypeInSetConstraint.

New

You can now specify that dagstermill output notebooks be yielded as an output from dagstermill
solids, in addition to being materialized.
You may now set the extension on files created using the FileManager machinery.
dagster-pandas typed PandasColumn constructors now support pandas 1.0 dtypes.
The Dagit Playground has been restructured to make the relationship between Preset, Partition
Sets, Modes, and subsets more clear. All of these buttons have been reconciled and moved to the
left side of the Playground.
Config sections that are required but not filled out in the Dagit playground are now detected
and labeled in orange.
dagster-celery config now supports using env: to load from environment variables.
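
The env: form mentioned above lets a config value name an environment variable instead of holding a literal. A rough sketch of how such a value might be resolved follows; resolve_config_value and the variable name CELERY_BROKER_URL are illustrative assumptions, not dagster-celery internals.

```python
import os
from typing import Any


def resolve_config_value(value: Any) -> Any:
    """Return a literal unchanged, or look up {"env": NAME} in the environment."""
    if isinstance(value, dict) and set(value) == {"env"}:
        name = value["env"]
        if name not in os.environ:
            raise KeyError(f"environment variable {name!r} is not set")
        return os.environ[name]
    return value


# Demo value only; a real deployment would set this outside the process.
os.environ["CELERY_BROKER_URL"] = "redis://localhost:6379/0"

print(resolve_config_value({"env": "CELERY_BROKER_URL"}))  # redis://localhost:6379/0
print(resolve_config_value("pyamqp://guest@localhost//"))  # pyamqp://guest@localhost//
```

Failing loudly on an unset variable is a design choice in this sketch; it avoids silently passing an empty value downstream.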

Bugfix

Fixed a bug where selecting a preset in dagit would not populate tags specified on the pipeline
definition.
Fixed a bug where metadata attached to a raised Failure was not displayed in the error modal in
dagit.
Fixed an issue where reimporting dagstermill and calling dagstermill.get_context() outside of
the parameters cell of a dagstermill notebook could lead to unexpected behavior.
Fixed an issue with connection pooling in dagster-postgres, improving responsiveness when using
the Postgres-backed storages.

Experimental

Added a longitudinal view of runs on the Schedule tab for scheduled, partitioned pipelines.
Includes views of run status, execution time, and materializations across partitions. The UI is
in flux and is currently optimized for daily schedules, but feedback is welcome.

dagster -

Published by alangenfeld over 4 years ago

Dagit

Dagit now looks up an available port on which to run when the default port is
not available. (Thanks @rparrapy!)
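
The fallback described above can be sketched with the standard socket module. This is a generic illustration, not Dagit's actual code; find_available_port is a hypothetical name and 3000 is used only as an example preferred port.

```python
import socket


def find_available_port(preferred: int, host: str = "127.0.0.1") -> int:
    """Return `preferred` if it can be bound, else a free port chosen by the OS."""
    for candidate in (preferred, 0):  # port 0 asks the OS for any free port
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind((host, candidate))
            except OSError:
                continue  # port is taken or not permitted; try the next option
            return sock.getsockname()[1]
    raise RuntimeError("no port available")


print(find_available_port(3000))
```

The probe socket is closed before the port is returned, so another process could claim the port in between; a real server would normally keep the bound socket and serve on it directly.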

dagster_pandas

Hydration and materialization are now configurable on dagster_pandas dataframes.

dagster_aws

The s3_resource no longer uses an unsigned session by default.

Bugfixes

Type check messages are now displayed in Dagit.
Failure metadata is now surfaced in Dagit.
Dagit now correctly displays the execution time of steps that error.
Error messages now appear correctly in console logging.
GCS storage is now more robust to transient failures.
Fixed an issue where some event logs could be duplicated in Dagit.
Fixed an issue when reading config from an environment variable that wasn't set.
Fixed an issue when loading a repository or pipeline from a file target on Windows.
Fixed an issue where deleted runs could cause the scheduler page to crash in Dagit.

Documentation

Expanded and improved docs and error messages.

dagster -

Published by alangenfeld over 4 years ago

Docs

New docs site at docs.dagster.io.
dagster.readthedocs.io is currently stale due to availability issues.

New

Improvements to S3 Resource. (Thanks @dwallace0723!)
Better error messages in Dagit.
Better font/styling support in Dagit.
Changed OutputDefinition to take is_required rather than is_optional argument. This is to
remain consistent with changes to Field in 0.7.1 and to avoid confusion
with python's typing and dagster's definition of Optional, which indicates None-ability,
rather than existence. is_optional is deprecated and will be removed in a future version.
Added support for Flower in dagster-k8s.
Added support for environment variable config in dagster-snowflake.

Bugfixes

Improved performance in Dagit waterfall view.
Fixed bug when executing solids downstream of a skipped solid.
Improved navigation experience for pipelines in Dagit.
Fixes for the dagster-aws CLI tool.
Fixed issue starting Dagit without DAGSTER_HOME set on Windows.
Fixed pipeline subset execution in partition-based schedules.

dagster -

Published by alangenfeld over 4 years ago

New

It is now possible to configure a dagit instance to disable executing pipeline runs in a local
subprocess.
Resource initialization, teardown, and associated failure states now emit structured events
visible in Dagit. Structured events for pipeline errors and multiprocess execution have been
consolidated and rationalized.
Support Redis queue provider in dagster-k8s Helm chart.
Support external postgresql in dagster-k8s Helm chart.

Bugfix

Fixed an issue with inaccurate timings on some resource initializations.
Fixed an issue that could cause the multiprocess engine to spin forever.
Fixed an issue with default value resolution when a config value was set using SourceString.
Fixed an issue when loading logs from a pipeline belonging to a different repository in Dagit.
Fixed an issue where the CLI command dagster schedule up would fail in certain scenarios
with the SystemCronScheduler.

Pandas

Column constraints can now be configured to permit NaN values.

Dagstermill

Removed a spurious dependency on sklearn.

Docs

Improvements and fixes to docs.
Restored dagster.readthedocs.io.

Experimental

An initial implementation of solid retries, throwing a RetryRequested exception, was added.
This API is experimental and likely to change.
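
To make the idea concrete, here is a generic sketch of exception-driven retries. The RetryRequested class below is a local stand-in defined for this example, and run_with_retries and max_retries are hypothetical names; none of this is the dagster API, which the note above says is likely to change.

```python
class RetryRequested(Exception):
    """Local stand-in for an exception a computation raises to ask for a retry."""


def run_with_retries(compute, max_retries=2):
    """Call compute(attempt) until it succeeds or the retries run out."""
    attempt = 0
    while True:
        try:
            return compute(attempt)
        except RetryRequested:
            if attempt >= max_retries:
                raise  # out of retries: surface the request as a failure
            attempt += 1


def flaky(attempt):
    # Fails on the first two attempts, then succeeds.
    if attempt < 2:
        raise RetryRequested()
    return f"succeeded on attempt {attempt}"


print(run_with_retries(flaky))  # succeeded on attempt 2
```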

Other

Renamed property runtime_type to dagster_type in definitions. The following are deprecated
and will be removed in a future version.
- InputDefinition.runtime_type is deprecated. Use InputDefinition.dagster_type instead.
- OutputDefinition.runtime_type is deprecated. Use OutputDefinition.dagster_type instead.
- CompositeSolidDefinition.all_runtime_types is deprecated. Use CompositeSolidDefinition.all_dagster_types instead.
- SolidDefinition.all_runtime_types is deprecated. Use SolidDefinition.all_dagster_types instead.
- PipelineDefinition.has_runtime_type is deprecated. Use PipelineDefinition.has_dagster_type instead.
- PipelineDefinition.runtime_type_named is deprecated. Use PipelineDefinition.dagster_type_named instead.
- PipelineDefinition.all_runtime_types is deprecated. Use PipelineDefinition.all_dagster_types instead.
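
A rename like this is commonly handled by keeping the old name as a property that warns and delegates to the new one. The class below is a toy stand-in written for illustration, not dagster's InputDefinition, and dagster's own deprecation mechanism may differ.

```python
import warnings


class InputDefinition:
    """Toy stand-in, not dagster's class: shows a deprecated alias only."""

    def __init__(self, name, dagster_type):
        self.name = name
        self.dagster_type = dagster_type

    @property
    def runtime_type(self):
        # Old name: still works, but tells the caller to migrate.
        warnings.warn(
            "runtime_type is deprecated, use dagster_type instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.dagster_type


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    definition = InputDefinition("num", int)
    print(definition.runtime_type is definition.dagster_type)  # True
    print(caught[0].category.__name__)  # DeprecationWarning
```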

dagster -

Published by alangenfeld over 4 years ago

New

It is now possible to use Postgres to back schedule storage by configuring
dagster_postgres.PostgresScheduleStorage on the instance.
Added the execute_pipeline_with_mode API to allow executing a pipeline in test with a specific
mode without having to specify RunConfig.
Experimental support for retries in the Celery executor.
It is now possible to set run-level priorities for backfills run using the Celery executor by
passing --celery-base-priority to dagster pipeline backfill.
Added the @weekly schedule decorator.

Deprecations

The dagster-ge library has been removed from this release due to drift from the underlying
Great Expectations implementation.

dagster-pandas

PandasColumn now includes an is_optional flag, replacing the previous
ColumnExistsConstraint.
You can now pass the ignore_missing_values flag to PandasColumn in order to apply column
constraints only to the non-missing rows in a column.
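
The effect of ignore_missing_values can be pictured without pandas. This is a plain-Python sketch of the behaviour described above, not the dagster-pandas implementation; column_passes, is_missing, and is_positive are hypothetical helpers.

```python
import math


def is_missing(value) -> bool:
    """Treat None and float NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def column_passes(values, constraint, ignore_missing_values=False) -> bool:
    """Check `constraint` on every row, optionally skipping missing rows."""
    if ignore_missing_values:
        values = [v for v in values if not is_missing(v)]
    return all(constraint(v) for v in values)


def is_positive(value) -> bool:
    return value is not None and value > 0


column = [1.0, float("nan"), 3.0]
print(column_passes(column, is_positive))                              # False
print(column_passes(column, is_positive, ignore_missing_values=True))  # True
```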

dagster-k8s

The Helm chart now includes provision for an Ingress and for multiple Celery queues.

Documentation

Improvements and fixes.

dagster -

Published by alangenfeld over 4 years ago

New

Added the IntSource type, which lets integers be set from environment variables in config.
You may now set tags on pipeline definitions. These will resolve in the following cases:
1. Loading in the playground view in Dagit will pre-populate the tag container.
2. Loading partition sets from the preset/config picker will pre-populate the tag container with
  the union of pipeline tags and partition tags, with partition tags taking precedence.
3. Executing from the CLI will generate runs with the pipeline tags.
4. Executing programmatically using the execute_pipeline api will create a run with the union
  of pipeline tags and RunConfig tags, with RunConfig tags taking precedence.
5. Scheduled runs (both launched and executed) will have the union of pipeline tags and the
  schedule tags function, with the schedule tags taking precedence.
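
The "union, with precedence" rule in the cases above amounts to a dictionary merge in which the more specific tags overwrite pipeline-level tags on key collisions. A minimal sketch, with merge_tags and the tag names as hypothetical examples:

```python
def merge_tags(pipeline_tags, override_tags):
    """Union of both tag sets; override_tags win when a key appears in both."""
    merged = dict(pipeline_tags)
    merged.update(override_tags)
    return merged


pipeline_tags = {"team": "data", "priority": "low"}
partition_tags = {"priority": "high", "partition": "2020-03-01"}

print(merge_tags(pipeline_tags, partition_tags))
# {'team': 'data', 'priority': 'high', 'partition': '2020-03-01'}
```
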
Output materialization configs may now yield multiple Materializations, and the tutorial has
been updated to reflect this.
We now export the SolidExecutionContext in the public API so that users can correctly type hint
solid compute functions.

Dagit

Pipeline run tags are now preserved when resuming/retrying from Dagit.
Scheduled run stats are now grouped by partition.
A "preparing" section has been added to the execution viewer. This shows steps that are in
progress of starting execution.
Markers emitted by the underlying execution engines are now visualized in the Dagit execution
timeline.

Bugfix

Resume/retry now works as expected in the presence of solids that yield optional outputs.
Fixed an issue where dagster-celery workers were failing to start in the presence of config
values that were None.
Fixed an issue with attempting to set threads_per_worker on Dask distributed clusters.

dagster-postgres

All postgres config may now be set using environment variables in config.

dagster-aws

The s3_resource now exposes a list_objects_v2 method corresponding to the underlying boto3
API. (Thanks, @basilvetas!)
Added the redshift_resource to access Redshift databases.

dagster-k8s

The K8sRunLauncher config now includes the load_kubeconfig and kubeconfig_file options.

Documentation

Fixes and improvements.

Dependencies

dagster-airflow no longer pins its werkzeug dependency.

Community

We've added opt-in telemetry to Dagster so we can collect usage statistics in order to inform
development priorities. Telemetry data will motivate projects such as adding features in
frequently-used parts of the CLI and adding more examples in the docs in areas where users
encounter more errors.

We will not see or store solid definitions (including generated context) or pipeline definitions
(including modes and resources). We will not see or store any data that is processed within solids
and pipelines.

If you'd like to opt in to telemetry, please add the following to $DAGSTER_HOME/dagster.yaml:
```
telemetry:
  enabled: true
```
Thanks to @basilvetas and @hspak for their contributions!

dagster -

Published by alangenfeld over 4 years ago

Breaking Changes

default_value in Field no longer accepts native instances of python enums. Instead
the underlying string representation in the config system must be used.
default_value in Field no longer accepts callables.
The dagster_aws imports have been reorganized; you should now import resources from
dagster_aws.<AWS service name>. dagster_aws provides s3, emr, redshift, and cloudwatch
modules.
The dagster_aws S3 resource no longer attempts to model the underlying boto3 API, and you can
now just use any boto3 S3 API directly on a S3 resource, e.g.
context.resources.s3.list_objects_v2. (#2292)

New

New Playground view in dagit showing an interactive config map
Improved storage and UI for showing schedule attempts
Added the ability to set default values in InputDefinition
Added CLI command dagster pipeline launch to launch runs using a configured RunLauncher
Added ability to specify pipeline run tags using the CLI
Added a pdb utility to SolidExecutionContext to help with debugging, available within a solid as context.pdb
Added PresetDefinition.with_additional_config to allow for config overrides
Added resource name to log messages generated during resource initialization
Added grouping tags for runs that have been retried / reexecuted.

Bugfix

Fixed a bug where date range partitions with a specified end date were clipping the last day
Fixed an issue where some schedule attempts that failed to start would be marked running forever.
Fixed the @weekly partitioned schedule decorator
Fixed timezone inconsistencies between the runs view and the schedules view
Integers are now accepted as valid values for Float config fields
Fixed an issue when executing dagstermill solids with config that contained quote characters.

dagstermill

The Jupyter kernel to use may now be specified when creating dagster notebooks with the --kernel flag.

dagster-dbt

dbt_solid now has a Nothing input to allow for sequencing

dagster-k8s

Added get_celery_engine_config to select celery engine, leveraging Celery infrastructure

Documentation

Improvements to the airline and bay bikes demos
Improvements to our dask deployment docs (Thanks jswaney!!)

dagster - Waiting To Exhale

Published by asingh16 over 4 years ago

🎆 🚢 🎆 Dagster 0.7.0: Waiting To Exhale 😤 😌 🍵

We are pleased to announce version 0.7.0 of Dagster, codenamed “Waiting To Exhale”. We set out to make Dagster a solution for production-grade pipelines on modern cloud infrastructure. In service of that goal, we needed to fill missing gaps and incorporate feedback from the community at large.

Our last release, 0.6.0, expanded Dagster from local developer experience to a hostable product, allowing for scheduling, execution, and monitoring of pipelines in the cloud.

This release goes further, supporting pipelines with 100s and 1000s of nodes, deployable to modern, scalable cloud infrastructure, with dramatically improved monitoring tools, as well as other features.

Given this, 0.7.0 introduces the following:

Revamped, Scalable Dagit: A completely redesigned Dagit with a more intuitive navigation structure, beautiful look-and-feel, and massive performance improvements to handle pipelines with hundreds or even thousands of nodes.
Execution Viewer: Executing and historical runs within Dagit now use a new live-updating, queryable waterfall viewer. See below for a preview of the new UI:

https://media.giphy.com/media/Rhx6ujovXlvuKaLCGY/giphy.gif

A Dagster-K8s library which provides the ability to launch runs in ephemeral Kubernetes Pods, as well as an early helm chart for executing pipelines.
A Dagster-Celery library designed to work with K8s that provides global resource management using dedicated queues, and distributed execution of dagster pipelines across a cluster.
Streamlined scheduler configuration and new backfill APIs and tools to help manage your scheduled workflows in production.
A Dagster-Pandas integration that provides useful APIs for dataframe validation, summary statistics emission, and auto-documentation in dagit so that you can better understand and control how data flows through your pipelines.
Redesigned documentation, examples, and guides to help flesh out the core ideas behind the system.

Warning

There are a substantial number of breaking changes in the 0.7.0 release. These changes affect the scheduler system, config system, required resources, and the type system. We apologize for the thrash, and thank you for bearing with us!

For more info on changes check out the following resources:

Changelog: https://github.com/dagster-io/dagster/blob/master/CHANGES.md

0.7.0 migration guide: https://github.com/dagster-io/dagster/blob/master/070_MIGRATION.md

dagster - 0.4.0

Published by natekupp over 5 years ago

API Changes

There is now a new top-level configuration section storage which controls whether or not
execution should store intermediate values and the history of pipeline runs on the filesystem,
on S3, or in memory. The dagster CLI now includes options to list and wipe pipeline run
history. Facilities are provided for user-defined types to override the default serialization
used for storage.
Similarly, there is a new configuration for RunConfig where the user can specify
intermediate value storage via an API.
OutputDefinition now contains an explicit is_optional parameter and defaults to being
not optional.
New functionality in dagster.check: is_list
New functionality in dagster.seven: py23-compatible FileNotFoundError, json.dump,
json.dumps.
Dagster default logging is now multiline for readability.
The Nothing type now allows dependencies to be constructed between solids that do not have
data dependencies.
Many error messages have been improved.
throw_on_user_error has been renamed to raise_on_error in all APIs, public and private

GraphQL

The GraphQL layer has been extracted out of Dagit into a separate dagster-graphql package.
startSubplanExecution has been replaced by executePlan.
startPipelineExecution now supports reexecution of pipeline subsets.

Dagit

It is now possible to reexecute subsets of a pipeline run from Dagit.
Dagit's Execute tab now opens runs in separate browser tabs and a new Runs tab allows you to
browse and view historical runs.
Dagit no longer scaffolds configuration when creating new Execute tabs. This functionality will
be refined and revisited in the future.
Dagit's Explore tab is more performant on large DAGs.
The dagit -q command line flag has been deprecated in favor of a separate command-line
dagster-graphql utility.
The execute button is now greyed out when Dagit is offline.
The Dagit UI now includes more contextual cues to make the solid in focus and its connections
more salient.
Dagit no longer offers to open materializations on your machine. Clicking an on-disk
materialization now copies the path to your clipboard.
Pressing Ctrl-Enter now starts execution in Dagit's Execute tab.
Dagit properly shows List and Nullable types in the DAG view.

Dagster-Airflow

Dagster-Airflow includes functions to dynamically generate containerized (DockerOperator-based)
and uncontainerized (PythonOperator-based) Airflow DAGs from Dagster pipelines and config.

Libraries

Dagster integration code with AWS, Great Expectations, Pandas, Pyspark, Snowflake, and Spark
has been reorganized into a new top-level libraries directory. These modules are now
importable as dagster_aws, dagster_ge, dagster_pandas, dagster_pyspark,
dagster_snowflake, and dagster_spark.
Removed dagster-sqlalchemy and dagma

Examples

Added the event-pipeline-demo, a realistic web event data pipeline using Spark and Scala.
Added the Pyspark pagerank example, which demonstrates how to incrementally introduce dagster
into existing data processing workflows.

Documentation

Docs have been expanded, reorganized, and reformatted.

dagster - 0.2.8.post3

Published by schrockn over 5 years ago

Hotfix to not put config values in error messages. Had to re-release because of packaging errors in the upload to PyPI (.pyc files or similar were included).

dagster - v.0.2.8.post0

Published by schrockn almost 6 years ago

Pushing an update because dagit 0.2.8 was getting out-of-date code.

dagster - v0.2.8

Published by schrockn almost 6 years ago

Version bump to deal with likely pypi issue around using a fourth-level version number
Added more elegant syntax for building solid and context configs

dagster - v.0.2.7

Published by schrockn almost 6 years ago

Version 0.2.7 Release Notes

The most notable changes in this release are a bunch of improvements to dagit, in particular hot reloading and the in-browser rendering of Python errors. Also, the ability to scaffold configs from the command line is the first fruit of the rearchitecting of the config system.

Dagster improvements:
- Added scaffold_config command which generates the template of a yaml file needed to drive the execution of a particular pipeline
- Added the ability to automatically serialize intermediate inputs as they flow between solids. Consider this alpha quality. It is currently hard-coded to write out to /tmp/dagster/runs/<<run_id>>
Dagit improvements:
- Hot-Reloading and in-browser rendering of python errors.
- Scrolling and performance improvements
- Keyboard shortcuts to navigate between solids using arrow keys
- In-app previews of notebooks for dagstermill solids

dagster - v0.2.6

Published by schrockn about 6 years ago

Changes:

'run_id' value automatically included in ExecutionContext context
stack. This is a uuid.
Config system update:

This is a significant change in the config system. The top level environment objects (and all descendants) are now part of the dagster type system. Unique types are generated on a per-pipeline basis. This unlocks a few things:

The entirety of yaml config files are now type-checked in the same fashion as the user-defined config.
One can now pass dictionaries to execute_pipeline that mimic the yaml files exactly. You no longer have to use the dagster.config APIs (although those still work)
The entire config system is queryable via graphql (and therefore shows up in dagit). This adds some noise to the type browser (we can mitigate that soon), but this will enable the building of a config editor that is fully aware of the dagster type system.
This has one breaking change. The yaml file's format has changed slightly.

Previously:

context:
   name: context_name
   config: some_config_value

Now:

context:
   context_name:
       config: some_config_value

BREAKING CHANGE: Config format change. See above.
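
For configs built as dictionaries, the change shown above can be applied mechanically. The helper below is a hypothetical migration sketch based only on the before/after shapes in these notes; upgrade_context_config is not part of dagster.

```python
def upgrade_context_config(old):
    """Move the context name from a "name" field to the key of a nested dict."""
    context = old["context"]
    # Everything except "name" becomes the body of the named context entry.
    body = {key: value for key, value in context.items() if key != "name"}
    return {**old, "context": {context["name"]: body}}


old_config = {"context": {"name": "context_name", "config": "some_config_value"}}
print(upgrade_context_config(old_config))
# {'context': {'context_name': {'config': 'some_config_value'}}}
```

The function returns a new dictionary and leaves the input untouched, so the old config stays available for comparison.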

dagster - v0.2.5

Published by schrockn about 6 years ago

Version bump to 0.2.5 (#227)

Added the Type Explorer in Dagit. You can now browse all the types
declared in a pipeline.
Added the --watch/--no-watch flag to dagit. This allows you to turn
off watching in cases where there are too many files below the
current working directory.

dagster - v.0.2.4

Published by schrockn about 6 years ago

This version bump contains a few changes (including one breaking
change).

New, radically improved version of dagit. Vertical layout, and a
beautiful new design. H/T to @bengotow for this spectacular work.
All types now require names. This is a breaking change for
ConfigDictionary, which did not require a name. You will
have to change your calls to ConfigDictionary or
ConfigDefinition.config_dict to include a name that is unique to the
Pipeline.
Solids default to take no config definition, rather than a config
definition typed as any.

dagster - v0.2.3

Published by schrockn about 6 years ago

Driving factor to release this is a bug in the command line interface in 0.2.2 (https://github.com/dagster-io/dagster/issues/207)

Other changes in this release:

CLI interface has changed slightly. Whenever dagit or dagster needs to
specify a function to load a repo or a pipeline, use the -n/--fn-name
flag combo. Before, this was split out into two different use cases in
dagster.
We now have the ability to reuse a single solid definition multiple
times within the same pipeline using the SolidInstance API. See the
corresponding tutorial section for more details.
Documentation continues to improve.