You are viewing an unreleased or outdated version of the documentation

Changelog#

1.6.6 (core) / 0.22.6 (libraries)#

New#

  • Dagster officially supports Python 3.12.
  • dagster-polars has been added as an integration. Thanks @danielgafni!
  • [dagster-dbt] @dbt_assets now supports loading projects with semantic models.
  • [dagster-dbt] @dbt_assets now supports loading projects with model versions.
  • [dagster-dbt] get_asset_key_for_model now supports retrieving asset keys for seeds and snapshots. Thanks @aksestok!
  • [dagster-duckdb] The Dagster DuckDB integration supports DuckDB version 0.10.0.
  • [UPath I/O manager] If a non-partitioned asset is updated to have partitions, the file containing the non-partitioned asset data will be deleted when the partitioned asset is materialized, rather than raising an error.

Bugfixes#

  • Fixed an issue where creating a backfill of assets with dynamic partitions and a backfill policy would sometimes fail with an exception.
  • Fixed an issue with the type annotations on the @asset decorator causing a false positive in Pyright strict mode. Thanks @tylershunt!
  • [ui] On the asset graph, nodes are slightly wider allowing more text to be displayed, and group names are no longer truncated.
  • [ui] Fixed an issue where the groups in the asset graph would not update after an asset was switched between groups.
  • [dagster-k8s] Fixed an issue where setting the security_context field on the k8s_job_executor didn't correctly set the security context on the launched step pods. Thanks @krgn!

Experimental#

  • Observable source assets can now yield ObserveResults with no data_version.
  • You can now include FreshnessPolicys on observable source assets. These assets will be considered “Overdue” when the latest value for the “dagster/data_time” metadata value is older than what’s allowed by the freshness policy.
  • [ui] In Dagster Cloud, a new feature flag allows you to enable an overhauled asset overview page with a high-level stakeholder view of the asset’s health, properties, and column schema.

Documentation#

  • Updated docs to reflect newly-added support for Python 3.12.

Dagster Cloud#

  • [kubernetes] Fixed an issue where the Kubernetes agent would sometimes leave dangling kubernetes services if the agent was interrupted during the middle of being terminated.

1.6.5 (core) / 0.22.5 (libraries)#

New#

  • Within a backfill or within auto-materialize, when submitting runs for partitions of the same assets, runs are now submitted in lexicographical order of partition key, instead of in an unpredictable order.
  • [dagster-k8s] Include k8s pod debug info in run worker failure messages.
  • [dagster-dbt] Events emitted by DbtCliResource now include metadata from the dbt adapter response. This includes fields like rows_affected, query_id from the Snowflake adapter, or bytes_processed from the BigQuery adapter.

Bugfixes#

  • A previous change prevented asset backfills from grouping multiple assets into the same run when using BackfillPolicies under certain conditions. While the backfills would still execute in the proper order, this could lead to more individual runs than necessary. This has been fixed.
  • [dagster-k8s] Fixed an issue introduced in the 1.6.4 release where upgrading the Helm chart without upgrading the Dagster version used by user code caused failures in jobs using the k8s_job_executor.
  • [instigator-tick-logs] Fixed an issue where invoking context.log.exception in a sensor or schedule did not properly capture exception information.
  • [asset-checks] Fixed an issue where additional dependencies for dbt tests modeled as Dagster asset checks were not properly being deduplicated.
  • [dagster-dbt] Fixed an issue where dbt model, seed, or snapshot names with periods were not supported.

Experimental#

  • @observable_source_asset-decorated functions can now return an ObserveResult. This allows including metadata on the observation, in addition to a data version. This is currently only supported for non-partitioned assets.
  • [auto-materialize] A new AutoMaterializeRule.skip_on_not_all_parents_updated_since_cron class allows you to construct AutoMaterializePolicys which wait for all parents to be updated after the latest tick of a given cron schedule.
  • [Global op/asset concurrency] Ops and assets now take run priority into account when claiming global op/asset concurrency slots.

Documentation#

  • Fixed an error in our asset checks docs. Thanks @vaharoni!
  • Fixed an error in our Dagster Pipes Kubernetes docs. Thanks @cameronmartin!
  • Fixed an issue on the Hello Dagster! guide that prevented it from loading.
  • Add specific capabilities of the Airflow integration to the Airflow integration page.
  • Re-arranged sections in the I/O manager concept page to make info about using I/O versus resources more prominent.

0.9.17#

New

  • [dagster-dask] Allow connecting to an existing scheduler via its address
  • [dagster-aws] Importing dagster_aws.emr no longer transitively importing dagster_spark
  • [dagster-dbr] CLI solids now emit materializations

Community contributions

  • Docs fix (Thanks @kaplanbora!)

Bug fixes

  • PipelineDefinition 's that do not meet resource requirements for its types will now fail at definition time
  • Dagit bugfixes and improvements
  • Fixed an issue where a run could be left hanging if there was a failure during launch

Deprecated

  • We now warn if you return anything from a function decorated with @pipeline. This return value actually had no impact at all and was ignored, but we are making changes that will use that value in the future. By changing your code to not return anything now you will avoid any breaking changes with zero user-visible impact.

0.9.16#

Breaking Changes

  • Removed DagsterKubernetesPodOperator in dagster-airflow.
  • Removed the execute_plan mutation from dagster-graphql.
  • ModeDefinition, PartitionSetDefinition, PresetDefinition, @repository, @pipeline, and ScheduleDefinition names must pass the regular expression r"^[A-Za-z0-9_]+$" and not be python keywords or disallowed names. See DISALLOWED_NAMES in dagster.core.definitions.utils for exhaustive list of illegal names.
  • dagster-slack is now upgraded to use slackclient 2.x - this means that this resource will only support Python 3.6 and above.
  • [K8s] Added a health check to the helm chart for user deployments, which relies on a new dagster api grpc-health-check cli command present in Dagster 0.9.16 and later.

New

  • Add helm chart configurations to allow users to configure a K8sRunLauncher, in place of the CeleryK8sRunLauncher.
  • “Copy URL” button to preserve filter state on Run page in dagit

Community Contributions

  • Dagster CLI options can now be passed in via environment variables (Thanks @xinbinhuang!)
  • New --limit flag on the dagster run list command (Thanks @haydarai!)

Bugfixes

  • Addressed performance issues loading the /assets table in dagit. Requires a data migration to create a secondary index by running dagster instance reindex.
  • Dagit bugfixes and improvements

0.9.15#

Breaking Changes

  • CeleryDockerExecutor no longer requires a repo_location_name config field.
  • executeRunInProcess was removed from dagster-graphql.

New

  • Dagit: Warn on tab removal in playground
  • Display versions CLI: Added a new CLI that displays version information for a memoized run. Called via dagster pipeline list_versions.
  • CeleryDockerExecutor accepts a network field to configure the network settings for the Docker container it connects to for execution.
  • Dagit will now set a statement timeout on supported instance DBs. Defaults to 5s and can be controlled with the --db-statement-timeout flag

Community Contributions

  • dagster grpc requirements are now more friendly for users (thanks @jmo-qap!)
  • dagster.utils now has is_str (thanks @monicayao!)
  • dagster-pandas can now load dataframes from pickle (thanks @mrdrprofuroboros!)
  • dagster-ge validation solid factory now accepts name (thanks @haydarai!)

Bugfixes

  • Dagit bugfixes and improvements
  • Fixed an issue where dagster could fail to load large pipelines.
  • Fixed a bug where experimental arg warning would be thrown even when not using versioned dagster type loaders.
  • Fixed a bug where CeleryDockerExecutor was failing to execute pipelines unless they used a legacy workspace config.
  • Fixed a bug where pipeline runs using IntMetadataEntryData could not be visualized in dagit.

Experimental

  • Improve the output structure of dagster-dbt solids.
  • Version-based memoization over outputs stored in the intermediate store now works

Documentation

  • Fix a code snippet rendering issue in Overview: Assets & Materializations
  • Fixed all python code snippets alignment across docs examples

0.9.14#

New

  • Steps down stream of a failed step no longer report skip events and instead simply do not execute.
  • dagit-debug can load multiple debug files.
  • dagit now has a Debug Console Logging feature flag accessible at /flags .
  • Telemetry metrics are now taken when scheduled jobs are executed.
  • With memoized reexecution, we now only copy outputs that current plan won't generate
  • Document titles throughout dagit

Community Contributions

  • [dagster-ge] solid factory can now handle arbitrary types (thanks @sd2k!)
  • [dagster-dask] utility options are now available in loader/materializer for Dask DataFrame (thanks @kinghuang!)

Bugfixes

  • Fixed an issue where run termination would sometimes be ignored or leave the execution process hanging
  • [dagster-k8s] fixed issue that would cause timeouts on clusters with many jobs
  • Fixed an issue where reconstructable was unusable in an interactive environment, even when the pipeline is defined in a different module.
  • Bugfixes and UX improvements in dagit

Experimental

  • AssetMaterializations now have an optional “partition” attribute

0.9.13#

Bugfixes

  • Fixes an issue using build_reconstructable_pipeline.
  • Improved loading times for the asset catalog in Dagit.

Documentations

  • Improved error messages when invoking dagit from the CLI with bad arguments.