planner: fix ONLY_FULL_GROUP_BY check for NATURAL JOIN columns#66868
planner: fix ONLY_FULL_GROUP_BY check for NATURAL JOIN columns#66868fixdb wants to merge 1 commit intopingcap:masterfrom
Conversation
When using NATURAL JOIN or JOIN ... USING, the output schema is coalesced and may not contain all original column references. The ONLY_FULL_GROUP_BY validation was only checking OutputNames() which doesn't include the coalesced-away columns from FullSchema. This caused queries like: SELECT t0.c0 FROM t0 NATURAL RIGHT JOIN t1 GROUP BY NULL to bypass the ONLY_FULL_GROUP_BY check, allowing non-deterministic results when firstrow() picked different values from parallel hash aggregation workers. The fix checks FullNames in addition to OutputNames() when the plan is a LogicalJoin with a FullSchema, ensuring columns from NATURAL JOINs are properly validated.
|
❌ Failed to Create Project We encountered an error while creating your project. Please try again or contact support if the issue persists. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
Hi @fixdb. Thanks for your PR. PRs from untrusted users cannot be marked as trusted with I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
/ok-to-test |
📝 WalkthroughWalkthroughThis change fixes undeterministic query results for NATURAL JOIN operations under ONLY_FULL_GROUP_BY constraints by adding fallback column name resolution logic and corresponding test coverage for edge cases involving coalesced join columns. Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 golangci-lint (2.5.0)Command failed Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
[FORMAT CHECKER NOTIFICATION] Notice: To remove the For example:
📖 For more info, you can check the "Contribute Code" section in the development guide. |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/planner/core/issuetest/planner_issue_test.go`:
- Around line 149-171: Add a sibling regression that mirrors the NATURAL JOIN
cases but uses JOIN ... USING(c0): inside the same test block (the one that
calls prepareSharedTestKit and creates t0/t1), add queries that assert the USING
path behaves the same — e.g. use tk.MustContainErrMsg for "select c0 from t0
right join t1 using(c0) group by null" and for "select t0.c0 from t0 right join
t1 using(c0) group by null" with the same ONLY_FULL_GROUP_BY error messages,
confirm aggregation works with "select max(t0.c0) from t0 right join t1
using(c0) group by null" and confirm the normal non-GROUP BY query "select
t0.c0, t1.c0 from t0 right join t1 using(c0) order by t1.c0" returns the
expected rows; place these next to the existing NATURAL JOIN assertions so the
coalesced-column path for USING is covered.
In `@pkg/planner/core/logical_plan_builder.go`:
- Around line 3500-3506: addGbyOrSingleValueColName() currently only checks
p.OutputNames() to mark grouped/single-value columns, so when the parent is a
logicalop.LogicalJoin with coalesced OutputNames it misses columns present in
join.FullNames; update addGbyOrSingleValueColName() to mirror the fallback used
elsewhere: if c.p is a *logicalop.LogicalJoin and join.FullNames != nil, call
expression.FindFieldName(join.FullNames, v.Name) (same way as in the referenced
block) and, when found, record the corresponding join.FullNames[index] into the
grouped-name set (the same map where OutputNames results are stored) so GROUP BY
columns that only appear in FullNames are recognized as grouped.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 694f01ee-34b9-4147-9073-a0473cc3a630
📒 Files selected for processing (2)
pkg/planner/core/issuetest/planner_issue_test.gopkg/planner/core/logical_plan_builder.go
| // Test ONLY_FULL_GROUP_BY with NATURAL JOIN. | ||
| // For NATURAL JOIN, the columns are coalesced in the output schema, | ||
| // but we should still detect non-aggregated columns from the FullSchema. | ||
| // only-full-group-by-natural-join | ||
| { | ||
| tk := prepareSharedTestKit(t) | ||
| tk.MustExec("create table t0(c0 int)") | ||
| tk.MustExec("create table t1(c0 int)") | ||
| tk.MustExec("insert into t0 values (1), (2)") | ||
| tk.MustExec("insert into t1 values (1), (3)") | ||
| tk.MustExec("set @@sql_mode = default") | ||
| tk.MustQuery("select @@sql_mode REGEXP 'ONLY_FULL_GROUP_BY'").Check(testkit.Rows("1")) | ||
| // Select the coalesced column should error | ||
| tk.MustContainErrMsg("select c0 from t0 natural right join t1 group by null", | ||
| "[planner:1055]Expression #1 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'test.t1.c0' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by") | ||
| // Select the explicit t0.c0 column should also error (was a bug before fix) | ||
| tk.MustContainErrMsg("select t0.c0 from t0 natural right join t1 group by null", | ||
| "[planner:1055]Expression #1 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'test.t0.c0' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by") | ||
| // Using aggregation should work | ||
| tk.MustQuery("select max(t0.c0) from t0 natural right join t1 group by null").Check(testkit.Rows("1")) | ||
| // Normal query without GROUP BY should work | ||
| tk.MustQuery("select t0.c0, t1.c0 from t0 natural right join t1 order by t1.c0").Check(testkit.Rows("1 1", "<nil> 3")) | ||
| } |
There was a problem hiding this comment.
Add a JOIN ... USING regression alongside this NATURAL JOIN case.
The planner fix and PR scope cover both NATURAL JOIN and JOIN ... USING, but this block only protects the NATURAL JOIN syntax. A small sibling case like ... JOIN t1 USING(c0) GROUP BY NULL would keep the shared coalesced-column path covered too.
As per coding guidelines, "**/*_test.go: MUST add a regression test and verify it fails before fix and passes after fix for bug fixes".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/planner/core/issuetest/planner_issue_test.go` around lines 149 - 171, Add
a sibling regression that mirrors the NATURAL JOIN cases but uses JOIN ...
USING(c0): inside the same test block (the one that calls prepareSharedTestKit
and creates t0/t1), add queries that assert the USING path behaves the same —
e.g. use tk.MustContainErrMsg for "select c0 from t0 right join t1 using(c0)
group by null" and for "select t0.c0 from t0 right join t1 using(c0) group by
null" with the same ONLY_FULL_GROUP_BY error messages, confirm aggregation works
with "select max(t0.c0) from t0 right join t1 using(c0) group by null" and
confirm the normal non-GROUP BY query "select t0.c0, t1.c0 from t0 right join t1
using(c0) order by t1.c0" returns the expected rows; place these next to the
existing NATURAL JOIN assertions so the coalesced-column path for USING is
covered.
| } else if join, isJoin := c.p.(*logicalop.LogicalJoin); isJoin && join.FullSchema != nil { | ||
| // For NATURAL JOIN or JOIN ... USING, the output schema is coalesced and | ||
| // may not contain all original columns. We need to check FullNames as well. | ||
| idx, err = expression.FindFieldName(join.FullNames, v.Name) | ||
| if err == nil && idx >= 0 { | ||
| c.names[join.FullNames[idx]] = struct{}{} | ||
| } |
There was a problem hiding this comment.
Mirror this FullNames fallback when recording grouped columns too.
This only fixes the “referenced column” side of the old ONLY_FULL_GROUP_BY checker. addGbyOrSingleValueColName() still resolves GROUP BY / single-value columns through p.OutputNames() only, so valid queries like SELECT t0.c0 FROM t0 NATURAL RIGHT JOIN t1 GROUP BY t0.c0 will now be rejected: t0.c0 is found here, but never recognized as grouped.
Suggested follow-up
func addGbyOrSingleValueColName(p base.LogicalPlan, colName *ast.ColumnName, gbyOrSingleValueColNames map[*types.FieldName]struct{}) {
idx, err := expression.FindFieldName(p.OutputNames(), colName)
- if err != nil || idx < 0 {
- return
- }
- gbyOrSingleValueColNames[p.OutputNames()[idx]] = struct{}{}
+ if err == nil && idx >= 0 {
+ gbyOrSingleValueColNames[p.OutputNames()[idx]] = struct{}{}
+ return
+ }
+ if join, ok := p.(*logicalop.LogicalJoin); ok && join.FullSchema != nil {
+ idx, err = expression.FindFieldName(join.FullNames, colName)
+ if err == nil && idx >= 0 {
+ gbyOrSingleValueColNames[join.FullNames[idx]] = struct{}{}
+ }
+ }
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/planner/core/logical_plan_builder.go` around lines 3500 - 3506,
addGbyOrSingleValueColName() currently only checks p.OutputNames() to mark
grouped/single-value columns, so when the parent is a logicalop.LogicalJoin with
coalesced OutputNames it misses columns present in join.FullNames; update
addGbyOrSingleValueColName() to mirror the fallback used elsewhere: if c.p is a
*logicalop.LogicalJoin and join.FullNames != nil, call
expression.FindFieldName(join.FullNames, v.Name) (same way as in the referenced
block) and, when found, record the corresponding join.FullNames[index] into the
grouped-name set (the same map where OutputNames results are stored) so GROUP BY
columns that only appear in FullNames are recognized as grouped.
|
@fixdb: The following test failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
@fixdb: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
What problem does this PR solve?
Issue Number: close #65387
Problem Summary:
What changed and how does it work?
When using NATURAL JOIN or JOIN ... USING, the output schema is coalesced and may not contain all original column references. The ONLY_FULL_GROUP_BY validation was only checking OutputNames() which doesn't include the coalesced-away columns from FullSchema.
This caused queries like:
SELECT t0.c0 FROM t0 NATURAL RIGHT JOIN t1 GROUP BY NULL
to bypass the ONLY_FULL_GROUP_BY check, allowing non-deterministic results when firstrow() picked different values from parallel hash aggregation workers.
The fix checks FullNames in addition to OutputNames() when the plan is a LogicalJoin with a FullSchema, ensuring columns from NATURAL JOINs are properly validated.
Check List
Tests
Side effects
Documentation
Release note
Summary by CodeRabbit
Tests
Bug Fixes