Fix: Use drop cascade in janitor by erindru · Pull Request #5133 · SQLMesh/sqlmesh

erindru · 2025-08-12T01:44:55Z

Second attempt at #5098

The solution in #5098 unfortunately needed to read in all the view snapshot records in order to work out the dependency graph of what should be dropped, because the way our state database is currently structured means that this operation can't easily be pushed down to the database level.

Fetching a large number of snapshots from state sync can cause memory problems in other state sync implementations that buffer instead of stream, so this approach was considered a non-starter.

In addition, it turns out that having dangling pointers in state is not as bad as I originally thought, thanks to "create if not exists"-style logic elsewhere.

So this PR does the following:

Implement DROP CASCADE in the SnapshotEvaluator cleanup method. This leaves dangling pointers in state, but that is considered acceptable.
Check that it works by reproducing a scenario on Postgres that was previously failing.
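
To make the failure mode concrete, here is a minimal toy model in Python of an engine that tracks dependencies between objects, as Postgres does for views. It is not SQLMesh code: `ToyCatalog`, `DependencyError`, and their methods are hypothetical names used only for illustration. A plain drop fails while a dependent view still exists, whereas a cascading drop removes the dependents as well.

```python
class DependencyError(Exception):
    """Raised when an object cannot be dropped because others depend on it."""


class ToyCatalog:
    """Hypothetical in-memory stand-in for an engine that tracks dependencies."""

    def __init__(self) -> None:
        # Maps each object name to the set of objects that depend on it.
        self.dependents: dict[str, set[str]] = {}

    def create(self, name: str, depends_on: tuple[str, ...] = ()) -> None:
        self.dependents.setdefault(name, set())
        for parent in depends_on:
            self.dependents.setdefault(parent, set()).add(name)

    def drop(self, name: str, cascade: bool = False) -> None:
        children = self.dependents.get(name, set())
        if children and not cascade:
            raise DependencyError(f"cannot drop {name}: other objects depend on it")
        # Drop dependents first; iterate over a copy because the set is mutated.
        for child in list(children):
            self.drop(child, cascade=True)
        self.dependents.pop(name, None)
        for deps in self.dependents.values():
            deps.discard(name)


catalog = ToyCatalog()
catalog.create("expired_table")
catalog.create("unexpired_view", depends_on=("expired_table",))
catalog.drop("expired_table", cascade=True)  # without cascade=True this raises
```

In this model the view disappears together with the table. In the PR, the state records for such a view would remain, which is the "dangling pointer" trade-off described above.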

erindru · 2025-08-12T01:46:40Z

sqlmesh/core/engine_adapter/athena.py

        return None

-    def drop_table(self, table_name: TableName, exists: bool = True) -> None:
+    def drop_table(self, table_name: TableName, exists: bool = True, **kwargs: t.Any) -> None:


Since drop_view() already had **kwargs I figured it was ok to add it to drop_table() as well

erindru · 2025-08-12T01:48:08Z

sqlmesh/core/snapshot/evaluator.py

    def delete(self, name: str, **kwargs: t.Any) -> None:
        _check_table_db_is_physical_schema(name, kwargs["physical_schema"])
-        self.adapter.drop_table(name)
+        self.adapter.drop_table(name, cascade=kwargs.pop("cascade", False))


I deliberately just pop off the cascade argument because delete() is called with a bunch of other arguments that aren't relevant for drop_table() but end up making their way to the exp.Drop AST node in the EngineAdapter if they aren't filtered out here.
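
A minimal sketch of why the explicit pop matters, assuming a simplified `drop_table()` that copies every extra keyword argument into its drop arguments. Both functions are hypothetical stand-ins rather than the real SQLMesh methods, and `some_other_flag` is a made-up argument name.

```python
import typing as t


def drop_table(name: str, exists: bool = True, **kwargs: t.Any) -> t.Dict[str, t.Any]:
    """Hypothetical adapter method: every extra kwarg ends up in the drop arguments."""
    return {"this": name, "exists": exists, **kwargs}


def delete(name: str, **kwargs: t.Any) -> t.Dict[str, t.Any]:
    """Hypothetical strategy method: forwards only `cascade`, discarding the rest."""
    return drop_table(name, cascade=kwargs.pop("cascade", False))


# Unrelated arguments passed to delete() never reach drop_table().
args = delete("db.tbl", physical_schema="db", some_other_flag=True, cascade=True)
```

Forwarding `**kwargs` wholesale would instead have put `physical_schema` and `some_other_flag` into the drop arguments as well.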

themisvaltinos · 2025-08-18T15:23:48Z

sqlmesh/core/snapshot/evaluator.py

+                    # we need to set cascade=true or we will get a 'cant drop because other objects depend on it'-style
+                    # error on engines that enforce referential integrity, such as Postgres
+                    # this situation can happen when a snapshot expires but downstream view snapshots that reference it have not yet expired
+                    cascade=True,


Should we control this with a flag that is set depending on whether the engine supports cascade (maybe from the schema differ)? Unless I'm doing something wrong, when I run the janitor on a BigQuery project, for example, it stops when it tries to drop a table, with the error Syntax error: Expected end of input but got keyword CASCADE at

Relevant docs: https://cloud.google.com/bigquery/docs/reference/standard-sql/data-definition-language#drop_table_statement. It seems CASCADE is supported for schemas but not for tables.

should we control this with a flag which is set if the engine supports cascade or not (maybe from the schema differ)?

Imo, this should happen further downstream, e.g., in the adapter itself.

Oh, nice catch, the downside of not running this test across all engines.

I naively thought SQLGlot would not generate the CASCADE output if it's unsupported (even if cascade=true on the AST node), but I guess the fact that it does is why it's not considered a validator.

I'll improve the coverage and make sure this works on all engines

erindru · 2025-08-19T04:51:50Z

.circleci/continue_config.yml

-            branches:
-              only:
-                - main
+          #filters:


TODO: revert prior to merge

erindru · 2025-08-19T04:54:41Z

tests/core/engine_adapter/integration/test_integration.py

+        EnvironmentSuffixTarget.CATALOG,
+    ],
+)
+def test_janitor(


This test inits a project, creates a dev env, invalidates it, runs the janitor to clean it up and checks it was cleaned up

Across every engine we support

For every EnvironmentSuffixTarget we support

themisvaltinos

LGTM. Nice, the supported cascade list in the adapters is a much better solution than the flag I was suggesting for only this use case, since it covers every operation. Thanks for adding the integration test!

georgesittas

Did a quick pass, looks reasonable.

georgesittas · 2025-08-20T13:45:54Z

sqlmesh/core/engine_adapter/base.py

+        if cascade and kind.upper() in self.SUPPORTED_DROP_CASCADE_OBJECT_KINDS:
+            drop_args["cascade"] = cascade


Since we're expecting CASCADE semantics, should this warn if cascade is unsupported?

I think we don't really "expect" the CASCADE semantics per se. We just want to delete our stuff without error. We're forced to do CASCADE because otherwise the engine won't let us delete.

That's fair; I was thinking that a warning could simply surface the fact that some objects were left orphan because we couldn't cascade. Although, if we don't know which objects depend on the removed thing, then its value is probably questionable anyway.

Yeah I did consider warning but then I noticed we silently do nothing elsewhere so decided to keep with that strategy
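
The behaviour settled on in this thread can be sketched roughly as follows. This is a simplified, hypothetical adapter that builds SQL strings directly, not the real EngineAdapter, and the lists of supported kinds are assumed values for illustration. Only the `SUPPORTED_DROP_CASCADE_OBJECT_KINDS` name and the guard condition come from the diff above.

```python
import typing as t


class BaseAdapter:
    """Hypothetical, simplified adapter; not the real SQLMesh EngineAdapter."""

    # Object kinds for which this engine accepts DROP ... CASCADE (assumed values).
    SUPPORTED_DROP_CASCADE_OBJECT_KINDS: t.List[str] = ["SCHEMA", "TABLE", "VIEW"]

    def drop_sql(self, kind: str, name: str, cascade: bool = False) -> str:
        sql = f"DROP {kind.upper()} IF EXISTS {name}"
        # Emit CASCADE only when requested and supported for this object kind;
        # otherwise the request is silently ignored, with no warning.
        if cascade and kind.upper() in self.SUPPORTED_DROP_CASCADE_OBJECT_KINDS:
            sql += " CASCADE"
        return sql


class SchemaOnlyCascadeAdapter(BaseAdapter):
    """Models an engine where CASCADE applies to schemas but not to tables."""

    SUPPORTED_DROP_CASCADE_OBJECT_KINDS = ["SCHEMA"]


full = BaseAdapter().drop_sql("table", "db.t", cascade=True)
limited = SchemaOnlyCascadeAdapter().drop_sql("table", "db.t", cascade=True)
```

Here `full` ends with CASCADE while `limited` does not, so the caller can always pass `cascade=True` and leave the engine-specific decision to the adapter.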

erindru mentioned this pull request Aug 12, 2025

Fix: Include unexpired downstream views when cleaning up expired tables #5098

Closed

erindru marked this pull request as ready for review August 12, 2025 02:14

erindru force-pushed the erin/janitor-drop-cascade branch 4 times, most recently from 0718546 to 23340ce Compare August 17, 2025 23:02

themisvaltinos reviewed Aug 18, 2025

erindru force-pushed the erin/janitor-drop-cascade branch from a90feca to 095fbdc Compare August 19, 2025 04:51

themisvaltinos approved these changes Aug 20, 2025

georgesittas approved these changes Aug 20, 2025

izeigerman approved these changes Aug 20, 2025

erindru force-pushed the erin/janitor-drop-cascade branch from 00b6249 to de3ad7f Compare August 20, 2025 20:27

erindru added 10 commits August 20, 2025 21:28

Fix: Use drop cascade in janitor

2498ffb

mypy

37fb0e3

Revert "mypy"

10b0797

This reverts commit 23340ce.

Fix test

046b42e

Add janitor test across all adapters and fix drop cascade in BigQuery

057d24c

Enable cloud engines

1daf71e

Add supported drop cascade object indicators

65fb037

Fix test

879417c

reinstate branch filter

242d4da

Fix test

3fd21fc

erindru force-pushed the erin/janitor-drop-cascade branch from 3dc4b23 to 3fd21fc Compare August 20, 2025 21:28

erindru merged commit f73cdfe into main Aug 20, 2025
27 of 30 checks passed

erindru deleted the erin/janitor-drop-cascade branch August 20, 2025 22:58

4 participants
