planner: fix ONLY_FULL_GROUP_BY check for NATURAL JOIN columns by fixdb · Pull Request #66868 · pingcap/tidb

fixdb · 2026-03-10T06:55:40Z

What problem does this PR solve?

Issue Number: close #65387

Problem Summary:

What changed and how does it work?

When using NATURAL JOIN or JOIN ... USING, the output schema is coalesced and may not contain all original column references. The ONLY_FULL_GROUP_BY validation was only checking OutputNames() which doesn't include the coalesced-away columns from FullSchema.

This caused queries like:
SELECT t0.c0 FROM t0 NATURAL RIGHT JOIN t1 GROUP BY NULL

to bypass the ONLY_FULL_GROUP_BY check, allowing non-deterministic results when firstrow() picked different values from parallel hash aggregation workers.

The fix checks FullNames in addition to OutputNames() when the plan is a LogicalJoin with a FullSchema, ensuring columns from NATURAL JOINs are properly validated.

Check List

Tests

Unit test
Integration test
Manual test (add detailed scripts or steps below)
No need to test
- I checked and no code files have been changed.

Side effects

Performance regression: Consumes more CPU
Performance regression: Consumes more Memory
Breaking backward compatibility

Documentation

Release note

planner: fix wrong result for ONLY_FULL_GROUP_BY check for NATURAL JOIN columns.

Summary by CodeRabbit

Tests
- Added comprehensive test coverage for NATURAL JOIN queries with ONLY_FULL_GROUP_BY mode, including validation of error handling and aggregation behavior.
Bug Fixes
- Improved column name resolution for NATURAL JOIN and USING clause scenarios to properly handle complex join schemas.

When using NATURAL JOIN or JOIN ... USING, the output schema is coalesced and may not contain all original column references. The ONLY_FULL_GROUP_BY validation was only checking OutputNames() which doesn't include the coalesced-away columns from FullSchema. This caused queries like: SELECT t0.c0 FROM t0 NATURAL RIGHT JOIN t1 GROUP BY NULL to bypass the ONLY_FULL_GROUP_BY check, allowing non-deterministic results when firstrow() picked different values from parallel hash aggregation workers. The fix checks FullNames in addition to OutputNames() when the plan is a LogicalJoin with a FullSchema, ensuring columns from NATURAL JOINs are properly validated.

pantheon-ai · 2026-03-10T06:55:47Z

❌ Failed to Create Project

We encountered an error while creating your project. Please try again or contact support if the issue persists.

ti-chi-bot · 2026-03-10T06:55:52Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign qw4990 for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

pkg/planner/OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

tiprow · 2026-03-10T06:55:58Z

Hi @fixdb. Thanks for your PR.

PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test all.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

hawkingrei · 2026-03-10T06:55:59Z

/ok-to-test

coderabbitai · 2026-03-10T06:56:41Z

📝 Walkthrough

Walkthrough

This change fixes undeterministic query results for NATURAL JOIN operations under ONLY_FULL_GROUP_BY constraints by adding fallback column name resolution logic and corresponding test coverage for edge cases involving coalesced join columns.

Changes

Cohort / File(s)	Summary
Test Coverage for NATURAL JOIN Edge Cases `pkg/planner/core/issuetest/planner_issue_test.go`	Added test blocks verifying ONLY_FULL_GROUP_BY behavior with NATURAL JOIN, including error cases for coalesced column selection, aggregated query success paths, and normal join operations.
Column Resolution Fallback for NATURAL/USING Joins `pkg/planner/core/logical_plan_builder.go`	Implemented fallback resolution in colNameResolver.Leave to check join.FullNames when columns are not found in current output names for coalesced NATURAL JOIN / USING join scenarios.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Possibly related PRs

PR #66273: Implements redundant-column remapping/resolution for USING/NATURAL joins with overlapping column-resolution behavior modifications for join-based name handling.

Suggested labels

lgtm, AI-Testing

Suggested reviewers

winoros
AilinKid
guo-shaoge

Poem

🐰 A natural join that danced with flair,
With columns coalesced here and there,
Now resolution finds its place,
No more undeterministic grace—
Tests and logic hand in hand,
Make queries stable, safe, and grand! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately describes the main change: fixing ONLY_FULL_GROUP_BY validation for NATURAL JOIN columns.
Description check	✅ Passed	The description includes issue reference, clear problem summary, explanation of the fix, test coverage checkbox marked, and release note.
Linked Issues check	✅ Passed	Code changes [logical_plan_builder.go] implement the fix by checking FullNames for NATURAL JOIN columns, and test changes [planner_issue_test.go] verify ONLY_FULL_GROUP_BY behavior with NATURAL JOIN, addressing issue `#65387`.
Out of Scope Changes check	✅ Passed	All changes are focused on fixing ONLY_FULL_GROUP_BY validation for NATURAL JOIN columns; no unrelated modifications detected.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.5.0)

Command failed

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

ti-chi-bot · 2026-03-10T07:07:02Z

[FORMAT CHECKER NOTIFICATION]

Notice: To remove the do-not-merge/needs-tests-checked label, please finished the tests then check the finished items in description.

For example:

Tests

Unit test

Integration test

Manual test (add detailed scripts or steps below)

No code

_{📖 For more info, you can check the "Contribute Code" section in the development guide.}

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/planner/core/issuetest/planner_issue_test.go`:
- Around line 149-171: Add a sibling regression that mirrors the NATURAL JOIN
cases but uses JOIN ... USING(c0): inside the same test block (the one that
calls prepareSharedTestKit and creates t0/t1), add queries that assert the USING
path behaves the same — e.g. use tk.MustContainErrMsg for "select c0 from t0
right join t1 using(c0) group by null" and for "select t0.c0 from t0 right join
t1 using(c0) group by null" with the same ONLY_FULL_GROUP_BY error messages,
confirm aggregation works with "select max(t0.c0) from t0 right join t1
using(c0) group by null" and confirm the normal non-GROUP BY query "select
t0.c0, t1.c0 from t0 right join t1 using(c0) order by t1.c0" returns the
expected rows; place these next to the existing NATURAL JOIN assertions so the
coalesced-column path for USING is covered.

In `@pkg/planner/core/logical_plan_builder.go`:
- Around line 3500-3506: addGbyOrSingleValueColName() currently only checks
p.OutputNames() to mark grouped/single-value columns, so when the parent is a
logicalop.LogicalJoin with coalesced OutputNames it misses columns present in
join.FullNames; update addGbyOrSingleValueColName() to mirror the fallback used
elsewhere: if c.p is a *logicalop.LogicalJoin and join.FullNames != nil, call
expression.FindFieldName(join.FullNames, v.Name) (same way as in the referenced
block) and, when found, record the corresponding join.FullNames[index] into the
grouped-name set (the same map where OutputNames results are stored) so GROUP BY
columns that only appear in FullNames are recognized as grouped.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 694f01ee-34b9-4147-9073-a0473cc3a630

📥 Commits

Reviewing files that changed from the base of the PR and between 88db262 and 4d660a2.

📒 Files selected for processing (2)

pkg/planner/core/issuetest/planner_issue_test.go
pkg/planner/core/logical_plan_builder.go

coderabbitai · 2026-03-10T07:12:05Z

pkg/planner/core/issuetest/planner_issue_test.go

+	// Test ONLY_FULL_GROUP_BY with NATURAL JOIN.
+	// For NATURAL JOIN, the columns are coalesced in the output schema,
+	// but we should still detect non-aggregated columns from the FullSchema.
+	// only-full-group-by-natural-join
+	{
+		tk := prepareSharedTestKit(t)
+		tk.MustExec("create table t0(c0 int)")
+		tk.MustExec("create table t1(c0 int)")
+		tk.MustExec("insert into t0 values (1), (2)")
+		tk.MustExec("insert into t1 values (1), (3)")
+		tk.MustExec("set @@sql_mode = default")
+		tk.MustQuery("select @@sql_mode REGEXP 'ONLY_FULL_GROUP_BY'").Check(testkit.Rows("1"))
+		// Select the coalesced column should error
+		tk.MustContainErrMsg("select c0 from t0 natural right join t1 group by null",
+			"[planner:1055]Expression #1 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'test.t1.c0' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by")
+		// Select the explicit t0.c0 column should also error (was a bug before fix)
+		tk.MustContainErrMsg("select t0.c0 from t0 natural right join t1 group by null",
+			"[planner:1055]Expression #1 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'test.t0.c0' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by")
+		// Using aggregation should work
+		tk.MustQuery("select max(t0.c0) from t0 natural right join t1 group by null").Check(testkit.Rows("1"))
+		// Normal query without GROUP BY should work
+		tk.MustQuery("select t0.c0, t1.c0 from t0 natural right join t1 order by t1.c0").Check(testkit.Rows("1 1", "<nil> 3"))
+	}


⚠️ Potential issue | 🟡 Minor

Add a JOIN ... USING regression alongside this NATURAL JOIN case.

The planner fix and PR scope cover both NATURAL JOIN and JOIN ... USING, but this block only protects the NATURAL JOIN syntax. A small sibling case like ... JOIN t1 USING(c0) GROUP BY NULL would keep the shared coalesced-column path covered too.

As per coding guidelines, "**/*_test.go: MUST add a regression test and verify it fails before fix and passes after fix for bug fixes".

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@pkg/planner/core/issuetest/planner_issue_test.go` around lines 149 - 171, Add a sibling regression that mirrors the NATURAL JOIN cases but uses JOIN ... USING(c0): inside the same test block (the one that calls prepareSharedTestKit and creates t0/t1), add queries that assert the USING path behaves the same — e.g. use tk.MustContainErrMsg for "select c0 from t0 right join t1 using(c0) group by null" and for "select t0.c0 from t0 right join t1 using(c0) group by null" with the same ONLY_FULL_GROUP_BY error messages, confirm aggregation works with "select max(t0.c0) from t0 right join t1 using(c0) group by null" and confirm the normal non-GROUP BY query "select t0.c0, t1.c0 from t0 right join t1 using(c0) order by t1.c0" returns the expected rows; place these next to the existing NATURAL JOIN assertions so the coalesced-column path for USING is covered.

coderabbitai · 2026-03-10T07:12:05Z

pkg/planner/core/logical_plan_builder.go

+		} else if join, isJoin := c.p.(*logicalop.LogicalJoin); isJoin && join.FullSchema != nil {
+			// For NATURAL JOIN or JOIN ... USING, the output schema is coalesced and
+			// may not contain all original columns. We need to check FullNames as well.
+			idx, err = expression.FindFieldName(join.FullNames, v.Name)
+			if err == nil && idx >= 0 {
+				c.names[join.FullNames[idx]] = struct{}{}
+			}


⚠️ Potential issue | 🟠 Major

Mirror this FullNames fallback when recording grouped columns too.

This only fixes the “referenced column” side of the old ONLY_FULL_GROUP_BY checker. addGbyOrSingleValueColName() still resolves GROUP BY / single-value columns through p.OutputNames() only, so valid queries like SELECT t0.c0 FROM t0 NATURAL RIGHT JOIN t1 GROUP BY t0.c0 will now be rejected: t0.c0 is found here, but never recognized as grouped.

Suggested follow-up

func addGbyOrSingleValueColName(p base.LogicalPlan, colName *ast.ColumnName, gbyOrSingleValueColNames map[*types.FieldName]struct{}) { idx, err := expression.FindFieldName(p.OutputNames(), colName) - if err != nil || idx < 0 { - return - } - gbyOrSingleValueColNames[p.OutputNames()[idx]] = struct{}{} + if err == nil && idx >= 0 { + gbyOrSingleValueColNames[p.OutputNames()[idx]] = struct{}{} + return + } + if join, ok := p.(*logicalop.LogicalJoin); ok && join.FullSchema != nil { + idx, err = expression.FindFieldName(join.FullNames, colName) + if err == nil && idx >= 0 { + gbyOrSingleValueColNames[join.FullNames[idx]] = struct{}{} + } + } }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@pkg/planner/core/logical_plan_builder.go` around lines 3500 - 3506, addGbyOrSingleValueColName() currently only checks p.OutputNames() to mark grouped/single-value columns, so when the parent is a logicalop.LogicalJoin with coalesced OutputNames it misses columns present in join.FullNames; update addGbyOrSingleValueColName() to mirror the fallback used elsewhere: if c.p is a *logicalop.LogicalJoin and join.FullNames != nil, call expression.FindFieldName(join.FullNames, v.Name) (same way as in the referenced block) and, when found, record the corresponding join.FullNames[index] into the grouped-name set (the same map where OutputNames results are stored) so GROUP BY columns that only appear in FullNames are recognized as grouped.

tiprow · 2026-03-10T07:15:13Z

@fixdb: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name	Commit	Details	Required	Rerun command
fast_test_tiprow	`4d660a2`	link	true	`/test fast_test_tiprow`

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

ti-chi-bot · 2026-03-10T07:54:38Z

@fixdb: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name	Commit	Details	Required	Rerun command
idc-jenkins-ci-tidb/check_dev_2	`4d660a2`	link	true	`/test check-dev2`
idc-jenkins-ci-tidb/mysql-test	`4d660a2`	link	true	`/test mysql-test`
idc-jenkins-ci-tidb/check_dev	`4d660a2`	link	true	`/test check-dev`
pull-integration-e2e-test	`4d660a2`	link	true	`/test pull-integration-e2e-test`
idc-jenkins-ci-tidb/unit-test	`4d660a2`	link	true	`/test unit-test`
pull-unit-test-next-gen	`4d660a2`	link	true	`/test pull-unit-test-next-gen`

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

ti-chi-bot bot added do-not-merge/needs-triage-completed release-note Denotes a PR that will be considered when it comes time to generate release notes. do-not-merge/needs-tests-checked labels Mar 10, 2026

ti-chi-bot bot added sig/planner SIG: Planner size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Mar 10, 2026

ti-chi-bot bot added the ok-to-test Indicates a PR is ready to be tested. label Mar 10, 2026

ti-chi-bot bot removed the do-not-merge/needs-triage-completed label Mar 10, 2026

coderabbitai bot reviewed Mar 10, 2026

View reviewed changes

Conversation

fixdb commented Mar 10, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What problem does this PR solve?

What changed and how does it work?

Check List

Release note

Summary by CodeRabbit

Uh oh!

pantheon-ai bot commented Mar 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ti-chi-bot bot commented Mar 10, 2026

Uh oh!

tiprow bot commented Mar 10, 2026

Uh oh!

hawkingrei commented Mar 10, 2026

Uh oh!

coderabbitai bot commented Mar 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested labels

Suggested reviewers

Poem

❌ Failed checks (1 warning)

Uh oh!

ti-chi-bot bot commented Mar 10, 2026

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 10, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 10, 2026

Choose a reason for hiding this comment

Uh oh!

tiprow bot commented Mar 10, 2026

Uh oh!

ti-chi-bot bot commented Mar 10, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

fixdb commented Mar 10, 2026 •

edited by coderabbitai bot

Loading

pantheon-ai bot commented Mar 10, 2026 •

edited

Loading

coderabbitai bot commented Mar 10, 2026 •

edited

Loading