Fix!: Avoid using rendered query when computing the data hash by izeigerman · Pull Request #5256 · SQLMesh/sqlmesh

izeigerman · 2025-08-28T22:14:44Z

This update eliminates the need for a rendered query when computing the model’s data hash. This means that the categorizer can now decide that the snapshot represents a metadata change even though the data hash has changed.

Doing this has the following benefits:

Faster fingerprint calculation
Eliminates the dependency on the SQLGlot optimizer when calculating fingerprints, which brings us one step closer to eliminating snapshot record migrations caused by product upgrades

tobymao · 2025-08-28T22:18:04Z

sqlmesh/core/model/definition.py

+    def _data_hash_values(self) -> t.List[str]:
+        return [
+            *self._data_hash_values_no_query,
+            gen(self.query, comments=False),


can you get rid of this one too and just hash the raw sql?

If we hash raw sql, doesn't that make it sensitive to whitespace changes?

yes, that's why we now also check is_metadata_only_change when data hashes don't match
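
The exchange above can be illustrated with a minimal sketch. This is not SQLMesh's implementation: `raw_data_hash`, `canonical_form`, and `categorize` are hypothetical names, and the whitespace-collapsing `canonical_form` is only a stand-in for comparing parsed queries. It shows why a raw-SQL hash changes on a formatting-only edit, and how a second check made when the hashes differ can still classify that edit as metadata-only.

```python
import hashlib


def raw_data_hash(raw_sql: str) -> str:
    """Hash the query text exactly as written (hypothetical helper)."""
    return hashlib.sha256(raw_sql.encode("utf-8")).hexdigest()


def canonical_form(raw_sql: str) -> str:
    """Stand-in for a parsed, formatting-insensitive representation.

    A real check would compare parsed expressions; collapsing whitespace
    is only an assumption made to keep this sketch dependency-free.
    """
    return " ".join(raw_sql.split())


def categorize(old_sql: str, new_sql: str) -> str:
    """Classify a change between two versions of a query."""
    if raw_data_hash(old_sql) == raw_data_hash(new_sql):
        return "unchanged"
    # Hashes differ, so fall back to the more expensive semantic comparison.
    if canonical_form(old_sql) == canonical_form(new_sql):
        return "metadata_only"
    return "data_change"


old = "SELECT a, b FROM t"
reformatted = "SELECT a,\n       b\nFROM t"
edited = "SELECT a, c FROM t"

print(categorize(old, old))
print(categorize(old, reformatted))
print(categorize(old, edited))
```

The cheap hash comparison handles the common case where nothing changed, and the costlier comparison runs only when the raw text differs.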

tobymao · 2025-08-29T23:03:34Z

sqlmesh/core/audit/definition.py


+    @property
+    def query(self) -> t.Union[exp.Query, d.JinjaQuery]:
+        return t.cast(t.Union[exp.Query, d.JinjaQuery], self.query_.parse(self.dialect))


is this called multiple times? should we cache this somehow?

i see, you use a class?

yes, it's cached inside ParsableQuery

tobymao · 2025-08-29T23:07:31Z

sqlmesh/core/model/common.py

+    _parsed: t.Optional[exp.Expression] = None
+    _parsed_dialect: t.Optional[str] = None
+
+    def parse(self, dialect: str) -> exp.Expression:


does the dialect ever change for a model outside of a test?

It shouldn't, why? I'd rather implement this correctly in case circumstances change.
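
A dialect-aware cache of the kind discussed here could look like the sketch below. The body of `parse` is not shown in the diff, so the invalidation rule is an assumption. `CachedSql`, `parse_count`, and `_do_parse` are hypothetical, and the "parser" just splits the string so the example has no dependencies.

```python
import typing as t
from dataclasses import dataclass


@dataclass
class CachedSql:
    """Hypothetical holder of raw SQL with a dialect-aware parse cache."""

    sql: str
    parse_count: int = 0  # Instrumentation for this sketch only.
    _parsed: t.Optional[t.Tuple[str, ...]] = None
    _parsed_dialect: t.Optional[str] = None

    def parse(self, dialect: str) -> t.Tuple[str, ...]:
        # Re-parse on first use or when the dialect differs from the cached one.
        if self._parsed is None or self._parsed_dialect != dialect:
            self._parsed = self._do_parse(dialect)
            self._parsed_dialect = dialect
        return self._parsed

    def _do_parse(self, dialect: str) -> t.Tuple[str, ...]:
        # Stand-in for a real parser: tag the tokens with the dialect.
        self.parse_count += 1
        return (dialect, *self.sql.split())


query = CachedSql("SELECT 1")
query.parse("duckdb")
query.parse("duckdb")     # served from the cache
query.parse("snowflake")  # dialect changed, so the cache is refreshed
print(query.parse_count)
```

Storing the dialect next to the parsed result costs one extra field and keeps the cache correct if the dialect ever does change.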

tobymao · 2025-08-29T23:07:59Z

sqlmesh/core/model/common.py

+        cls, parsed_expression: exp.Expression, dialect: str, use_meta_sql: bool = False
+    ) -> ParsableSql:
+        sql = (
+            parsed_expression.meta.get("sql") or parsed_expression.sql(dialect=dialect)


what are the situations where we wouldn't want to use meta sql?

when I'm using a custom loader and do create_sql_model directly with a query. I don't think we can trust the correctness of the meta sql in that case.
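
The trade-off in this answer can be sketched as follows. The diff above is truncated, so how `use_meta_sql` gates the `meta` lookup is an assumption here. `FakeExpression` and `sql_for_storage` are hypothetical stand-ins, not SQLGlot or SQLMesh APIs.

```python
import typing as t


class FakeExpression:
    """Hypothetical stand-in for a parsed expression with a `meta` dict."""

    def __init__(self, canonical_sql: str, meta: t.Optional[t.Dict[str, str]] = None):
        self._canonical_sql = canonical_sql
        self.meta = meta or {}

    def sql(self, dialect: str) -> str:
        # A real generator would render dialect-specific SQL.
        return self._canonical_sql


def sql_for_storage(
    expression: FakeExpression, dialect: str, use_meta_sql: bool = False
) -> str:
    """Pick the text to store: trusted original SQL, else regenerated SQL."""
    original = expression.meta.get("sql") if use_meta_sql else None
    return original or expression.sql(dialect=dialect)


# Loaded from a file: the original text recorded in `meta` can be trusted.
from_file = FakeExpression("SELECT 1", meta={"sql": "select  1"})
# Built by a custom loader: `meta` may be stale or missing, so regenerate.
from_loader = FakeExpression("SELECT 2", meta={"sql": "select 1"})

print(sql_for_storage(from_file, "duckdb", use_meta_sql=True))
print(sql_for_storage(from_loader, "duckdb"))
```

Defaulting `use_meta_sql` to `False` means a caller has to opt in before the recorded text is trusted.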

izeigerman requested a review from a team August 28, 2025 22:14

tobymao reviewed Aug 28, 2025

izeigerman force-pushed the chore-use-raw-query-in-fingerprint branch 3 times, most recently from 0ad4b2b to e4f3f1e, August 29, 2025 21:11

tobymao reviewed Aug 29, 2025

izeigerman added 8 commits September 2, 2025 09:05

Fix!: Avoid using rendered query when computing the data hash

f0798ba

switch to storing raw sql

77f4197

fix tests

353dbdd

switch to storing raw sql in audits

07b0ec5

use raw sql for pre- / post- statements

6062c94

cosmetic

7156542

test original sql

988379c

extend the audit load test

26226d9

izeigerman force-pushed the chore-use-raw-query-in-fingerprint branch from 0602afa to 26226d9, September 2, 2025 16:18

tobymao approved these changes Sep 2, 2025

izeigerman merged commit 75f825e into main Sep 2, 2025
45 of 46 checks passed

izeigerman deleted the chore-use-raw-query-in-fingerprint branch September 2, 2025 19:30

blecourt-private mentioned this pull request Nov 3, 2025

Format change in query of parent causes child model to be classified as indirectly modified #5573

Open
