planner: fix correlated subquery resolution failure in window-containing expressions#66814
planner: fix correlated subquery resolution failure in window-containing expressions#66814qw4990 wants to merge 2 commits intopingcap:masterfrom
Conversation
|
Review Complete Findings: 1 issues ℹ️ Learn more details on Pantheon AI. |
|
Hi @qw4990. Thanks for your PR. PRs from untrusted users cannot be marked as trusted with I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Repository UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughPreserves child plan columns during a two-pass projection build so correlated subqueries inside window-function-containing expressions can resolve outer references; adds a test covering correlated EXISTS with window functions and increases test sharding. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
pkg/planner/core/casetest/correlated/correlated_test.go (1)
74-107: Reuse the existing cascades test harness here.This regression is in shared planner construction code, so wrapping it in
testkit.RunTestUnderCascades(...)likeTestCorrelatedSubquerywould keep both planner modes covered by the same fixture.♻️ Suggested shape
func TestCorrelatedExistsWithWindowFunction(t *testing.T) { - store := testkit.CreateMockStore(t) - tk := testkit.NewTestKit(t, store) - tk.MustExec("use test") - tk.MustExec("create table t1 (c1 int)") - ... + testkit.RunTestUnderCascades(t, func(t *testing.T, tk *testkit.TestKit, _, _ string) { + tk.MustExec("use test") + tk.MustExec("create table t1 (c1 int)") + ... + }) }As per coding guidelines "Prefer extending existing test suites and fixtures over creating new scaffolding".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/planner/core/casetest/correlated/correlated_test.go` around lines 74 - 107, TestCorrelatedExistsWithWindowFunction should be executed under both planners using the cascades test harness: wrap the existing test logic inside testkit.RunTestUnderCascades(...) so the same body runs under both planner modes (similar to how TestCorrelatedSubquery is exercised); locate the test function TestCorrelatedExistsWithWindowFunction and replace its direct use of testkit.NewTestKit/tk.MustExec/tk.MustQuery calls with a closure passed to testkit.RunTestUnderCascades that receives a store and runs the same assertions, ensuring the test fixture and statements (including table creation, inserts, and queries) are executed inside that closure.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@pkg/planner/core/casetest/correlated/correlated_test.go`:
- Around line 74-107: TestCorrelatedExistsWithWindowFunction should be executed
under both planners using the cascades test harness: wrap the existing test
logic inside testkit.RunTestUnderCascades(...) so the same body runs under both
planner modes (similar to how TestCorrelatedSubquery is exercised); locate the
test function TestCorrelatedExistsWithWindowFunction and replace its direct use
of testkit.NewTestKit/tk.MustExec/tk.MustQuery calls with a closure passed to
testkit.RunTestUnderCascades that receives a store and runs the same assertions,
ensuring the test fixture and statements (including table creation, inserts, and
queries) are executed inside that closure.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 9051ac2c-fa63-4710-b198-6d25679ef583
📒 Files selected for processing (3)
pkg/planner/core/casetest/correlated/BUILD.bazelpkg/planner/core/casetest/correlated/correlated_test.gopkg/planner/core/logical_plan_builder.go
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #66814 +/- ##
================================================
- Coverage 77.6702% 77.5084% -0.1619%
================================================
Files 2009 1930 -79
Lines 549988 538415 -11573
================================================
- Hits 427177 417317 -9860
+ Misses 121102 121096 -6
+ Partials 1709 2 -1707
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
| // processes those fields, so correlated subqueries inside them can still resolve | ||
| // references to the outer query's columns. Column pruning removes any that end | ||
| // up unused. | ||
| if !considerWindow && hasWindowField { |
There was a problem hiding this comment.
[P1] Aliased outer columns can still fail correlated resolution in the deferred window pass
Why: The carry-through logic deduplicates by UniqueID via schema.Contains(col), so if a child column is already projected under an alias (e.g. a AS x), the original FieldName for a is not preserved in the projection output. Correlated subquery resolution is purely name-based via expression.FindFieldName(outerName, v), which matches only FieldName.ColName — not OrigColName. This means a query like:
SELECT a AS x, row_number() OVER () + (SELECT a FROM dual) FROM t;can fail at planning with [planner:1054] Unknown column 'a' in 'field list', even though this PR is intended to fix exactly this class of correlated subquery + window function errors.
Evidence:
if !schema.Contains(col)dedup atpkg/planner/core/logical_plan_builder.go:1386usesUniqueID-based containment (pkg/expression/schema.go:193), blocking the alias-original-name carry-through.- Name-based lookup at
pkg/planner/core/expression_rewriter.go:2619:idx, err = expression.FindFieldName(outerName, v)only matchesFieldName.ColName, which is set to the alias bypkg/planner/core/logical_plan_builder.go:1000.
There was a problem hiding this comment.
Thanks for addressing this! Once the fix is pushed, feel free to re-request review and I'll verify the aliased-column resolution path is covered.
ℹ️ Learn more details on Pantheon AI.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/planner/core/logical_plan_builder.go`:
- Around line 1409-1412: The cloned column created via
col.Clone().(*expression.Column) has its UniqueID reassigned via
b.ctx.GetSessionVars().AllocPlanColumnID() but its cached hash is not
invalidated; call the column's CleanHashCode() after resetting UniqueID to clear
the old hash cache so downstream hash-based operations and deduplication work
correctly (follow the same pattern as in getResultCTESchema). Locate the
clone/UniqueID reassignment in the block that appends to proj.Exprs and
schema.Append and add the CleanHashCode() call on the cloned *expression.Column
immediately after setting UniqueID.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: f2a26ff7-2121-47f6-bf5d-5727bfd35a35
📒 Files selected for processing (2)
pkg/planner/core/casetest/correlated/correlated_test.gopkg/planner/core/logical_plan_builder.go
🚧 Files skipped from review as they are similar to previous changes (1)
- pkg/planner/core/casetest/correlated/correlated_test.go
| cloned := col.Clone().(*expression.Column) | ||
| cloned.UniqueID = b.ctx.GetSessionVars().AllocPlanColumnID() | ||
| proj.Exprs = append(proj.Exprs, col) | ||
| schema.Append(cloned) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
printf '\n== expression.Column.Clone implementation ==\n'
rg -n -C6 'func .*Column.*Clone\(' pkg/expression
printf '\n== Existing hash-cache invalidation sites ==\n'
rg -n -C3 'CleanHashCode\(' pkg/planner/core/logical_plan_builder.go pkg/expression
printf '\n== UniqueID reassignment after cloning in this file ==\n'
rg -n -C3 'Clone\(\)|UniqueID = .*AllocPlanColumnID\(' pkg/planner/core/logical_plan_builder.go
Repository: pingcap/tidb
Length of output: 11304
Add CleanHashCode() after reassigning UniqueID to maintain hash consistency.
Line 1410 changes UniqueID on a cloned column without invalidating its hash cache. The Clone() method preserves the cached hashcode, and your file already establishes this pattern in getResultCTESchema (line 7413) where the comment states: "Since you have reallocated unique id here, the old-cloned-cached hash code is not valid anymore." Inconsistency here can cause the clone to hash identically to the original, defeating deduplication or leading to key collisions in hash-based lookups.
Suggested fix
cloned := col.Clone().(*expression.Column)
cloned.UniqueID = b.ctx.GetSessionVars().AllocPlanColumnID()
+ cloned.CleanHashCode()
proj.Exprs = append(proj.Exprs, col)
schema.Append(cloned)
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| cloned := col.Clone().(*expression.Column) | |
| cloned.UniqueID = b.ctx.GetSessionVars().AllocPlanColumnID() | |
| proj.Exprs = append(proj.Exprs, col) | |
| schema.Append(cloned) | |
| cloned := col.Clone().(*expression.Column) | |
| cloned.UniqueID = b.ctx.GetSessionVars().AllocPlanColumnID() | |
| cloned.CleanHashCode() | |
| proj.Exprs = append(proj.Exprs, col) | |
| schema.Append(cloned) |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/planner/core/logical_plan_builder.go` around lines 1409 - 1412, The
cloned column created via col.Clone().(*expression.Column) has its UniqueID
reassigned via b.ctx.GetSessionVars().AllocPlanColumnID() but its cached hash is
not invalidated; call the column's CleanHashCode() after resetting UniqueID to
clear the old hash cache so downstream hash-based operations and deduplication
work correctly (follow the same pattern as in getResultCTESchema). Locate the
clone/UniqueID reassignment in the block that appends to proj.Exprs and
schema.Append and add the CleanHashCode() call on the cloned *expression.Column
immediately after setting UniqueID.
|
/test unit-test |
|
@qw4990: PRs from untrusted users cannot be marked as trusted with DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: guo-shaoge, hawkingrei The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
[LGTM Timeline notifier]Timeline:
|
|
Duplicated with #66869 |
What problem does this PR solve?
Issue Number: close #57594
Problem Summary: planner: fix correlated subquery resolution failure in window-containing expressions
What changed and how does it work?
Root cause
buildProjectionis invoked twice when window functions are present:considerWindow=false): fields containing window functions are entirely skipped with a zero placeholder — correlated subqueries inside them are not processed.buildWindowFunctionsthen builds aLogicalWindowon top, whose schema only retains the placeholder columns. The original table columns are lost.considerWindow=true): the full expression (including the correlated subquery) is now processed, but the plan's schema no longer contains the original table columns, so correlated column resolution fails.Solution
In the first
buildProjectionpass, when window-containing fields are deferred, also carry through the child plan's columns into the projection output. This preserves them through toLogicalWindowso they remain visible for correlated subquery resolution in the second pass. Column pruning removes any that are ultimately unused.What changed
pkg/planner/core/logical_plan_builder.go: After the field loop inbuildProjection, when!considerWindowand there are window fields, append the child plan's columns that aren't already in the projection schema.pkg/planner/core/casetest/correlated/correlated_test.go: AddedTestCorrelatedExistsWithWindowFunctionregression test covering empty table, data correctness, and the original reproduction case.Check List
Tests
Side effects
Documentation
Release note
Please refer to Release Notes Language Style Guide to write a quality release note.
Summary by CodeRabbit
Bug Fixes
Tests
Chores