planner: enable cd-c algorithm by default for join reorder | tidb-test=pr/2676#66349
guo-shaoge wants to merge 39 commits into pingcap:master
Conversation
Signed-off-by: guo-shaoge <shaoge1994@163.com>
|
I've accepted your request and will start reviewing the pull request. I'll conduct a thorough review covering code quality, potential issues, and implementation details. ⏳ This process typically takes 10-30 minutes depending on the complexity of the changes.
|
|
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: The full list of commands accepted by this bot can be found here. Details: Needs approval from an approver in each of these files. Approvers can indicate their approval by writing |
|
remainder: add inTest check if has remaining edge |
| } | ||
| }) | ||
| if p.SCtx().GetSessionVars().EnableOuterJoinReorder { | ||
| p, err := joinorder.Optimize(p) |
P2: joinorder.Optimize can hit a fallback where it logs WARN and returns the original join tree when it can’t consume all predicate-carrying edges (pkg/planner/core/joinorder/join_order.go:458). With CD-C becoming default-on here, this could be triggered by complex join graphs and silently skip reordering (correctness preserved, but potential perf regressions + noisy WARNs). Consider downgrading the log / adding a metric, or providing a more deterministic fallback path.
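The "metric instead of WARN" suggestion could be sketched roughly like this; `reorderWithFallback` and the counter are hypothetical names, not the actual TiDB code, and a real fix would hook into TiDB's metrics registry:

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// joinReorderFallbackCount is a hypothetical counter standing in for a real
// metric. Counting fallbacks keeps them observable without emitting a WARN
// log line for every complex join graph once CD-C is default-on.
var joinReorderFallbackCount atomic.Int64

// reorderWithFallback sketches the fallback path: when not all
// predicate-carrying edges were consumed, keep the original join tree and
// bump the counter instead of logging at WARN level.
func reorderWithFallback(consumedAllEdges bool) string {
	if !consumedAllEdges {
		joinReorderFallbackCount.Add(1) // visible via metrics, no log spam
		return "original-tree"
	}
	return "reordered-tree"
}

func main() {
	fmt.Println(reorderWithFallback(false))
	fmt.Println(joinReorderFallbackCount.Load())
}
```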
| } | ||
| }) | ||
| if p.SCtx().GetSessionVars().EnableOuterJoinReorder { | ||
| p, err := joinorder.Optimize(p) |
P1 (see pkg/planner/core/joinorder/join_order.go:507): With default tidb_opt_cartesian_join_order_threshold=0 (CartesianJoinOrderThreshold), CD-C’s 2nd pass forces allowNoEQ=true and then multiplies cumCost by cartesianFactor (0), making cartesian/no-EQ joins cost 0. This can drive severe plan regressions/OOM once this path is default; please ensure the penalty never reduces cost (e.g., clamp factor >=1 / additive penalty) and/or don’t run the 2nd pass when cartesian is disabled.
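The clamp suggested above could look roughly like the following; `applyCartesianPenalty`, `cumCost`, and `cartesianFactor` are illustrative stand-ins for the fields in join_order.go, not the actual implementation:

```go
package main

import "fmt"

// applyCartesianPenalty sketches the reviewer's suggestion: the penalty
// factor for cartesian/no-EQ joins must never reduce the cost. Clamping the
// configured factor to >= 1 keeps cost monotone even when
// tidb_opt_cartesian_join_order_threshold is left at its default of 0.
func applyCartesianPenalty(cumCost, cartesianFactor float64) float64 {
	if cartesianFactor < 1 {
		cartesianFactor = 1 // a factor of 0 would zero out the cost entirely
	}
	return cumCost * cartesianFactor
}

func main() {
	fmt.Println(applyCartesianPenalty(100, 0)) // clamped: still 100, never 0
	fmt.Println(applyCartesianPenalty(100, 4)) // penalized: 400
}
```

An additive penalty (`cumCost + penalty`) would work equally well; the invariant either way is that a "disabled" setting must not make cartesian plans look free.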
| }) | ||
| if p.SCtx().GetSessionVars().EnableOuterJoinReorder { | ||
| p, err := joinorder.Optimize(p) | ||
| return p, false, err |
P2 (see pkg/planner/core/joinorder/join_order.go:273): tidb_opt_join_reorder_threshold is effectively ignored (threshold check commented out; useGreedy := true), making DP unreachable. This is a compatibility/operability regression for users relying on DP join reorder with threshold > 0.
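The compatibility behavior this comment asks to preserve could be restored with a gate like the following; names are illustrative, not the actual TiDB implementation:

```go
package main

import "fmt"

// chooseJoinReorderAlgo sketches the documented semantics of
// tidb_opt_join_reorder_threshold: exhaustive DP for join groups at or below
// the threshold, greedy otherwise. A hardcoded useGreedy := true makes the
// "dp" branch unreachable, which is the regression flagged above.
func chooseJoinReorderAlgo(groupSize, threshold int) string {
	if threshold > 0 && groupSize <= threshold {
		return "dp"
	}
	return "greedy"
}

func main() {
	fmt.Println(chooseJoinReorderAlgo(5, 12))  // small group, DP is feasible
	fmt.Println(chooseJoinReorderAlgo(20, 12)) // large group, fall back to greedy
	fmt.Println(chooseJoinReorderAlgo(5, 0))   // threshold disabled, greedy
}
```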
| failpoint.Return(p2, false, err) | ||
| } | ||
| }) | ||
| if p.SCtx().GetSessionVars().EnableOuterJoinReorder { |
P2: EnableOuterJoinReorder now selects which join reorder engine runs (joinorder.Optimize vs optimizeRecursive), impacting pure-inner-join queries too. That changes the var’s operational meaning beyond “whether outer joins participate”; consider a separate session var/flag for engine selection and keep this var scoped to outer-join behavior.
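The suggested decoupling could be sketched like this; `JoinReorderEngine` is a hypothetical variable that does not exist in TiDB today, shown only to illustrate keeping engine selection separate from the outer-join semantics of the existing variable:

```go
package main

import "fmt"

// sessionVars is a toy model: one field keeps its original meaning (whether
// outer joins participate in reordering) and a separate, hypothetical field
// selects the reorder engine.
type sessionVars struct {
	EnableOuterJoinReorder bool   // existing semantics: outer joins may be reordered
	JoinReorderEngine      string // hypothetical: "legacy" or "cdc"
}

// pickEngine dispatches on the dedicated engine flag, so toggling
// EnableOuterJoinReorder no longer changes which optimizer path runs for
// pure-inner-join queries.
func pickEngine(v sessionVars) string {
	if v.JoinReorderEngine == "cdc" {
		return "joinorder.Optimize"
	}
	return "optimizeRecursive"
}

func main() {
	fmt.Println(pickEngine(sessionVars{EnableOuterJoinReorder: true, JoinReorderEngine: "cdc"}))
	fmt.Println(pickEngine(sessionVars{EnableOuterJoinReorder: true, JoinReorderEngine: "legacy"}))
}
```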
|
@pantheon-ai[bot]: adding LGTM is restricted to approvers and reviewers in OWNERS files. Details: In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
| failpoint.Return(p2, false, err) | ||
| } | ||
| }) | ||
| if p.SCtx().GetSessionVars().EnableOuterJoinReorder { |
P2: Routing the default join-reorder through joinorder.Optimize (via EnableOuterJoinReorder) bypasses the legacy extractJoinGroup path that checks tidb_opt_join_reorder_through_sel (TiDBOptJoinReorderThroughSel, around L53). This makes that knob ineffective in default configs. Should the new joinorder path honor it (or document/rename it as legacy-only)?
|
/retest |
Signed-off-by: guo-shaoge <shaoge1994@163.com>
|
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the settings. Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 Walkthrough
This PR introduces schema-aligned join condition handling and replaces failpoint-based control flow with session variable gating for outer join reordering. Core planner logic is enhanced with condition alignment functions, and test expectations across multiple suites are updated to reflect the resulting query plan changes.
Changes
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
🚥 Pre-merge checks | ✅ 1 | ❌ 2
❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
🧹 Nitpick comments (2)
pkg/planner/core/rule_join_reorder.go (2)
295-300: Remove stale failpoint placeholder block.The commented failpoint block is dead code and now just adds confusion around the active control path.
🧹 Proposed cleanup
```diff
- // failpoint.Inject("enableCDCJoinReorder", func(val failpoint.Value) {
- //     if val.(bool) {
- //         p2, err := joinorder.Optimize(p)
- //         failpoint.Return(p2, false, err)
- //     }
- // })
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/planner/core/rule_join_reorder.go` around lines 295 - 300, Remove the stale commented failpoint block that references failpoint.Inject("enableCDCJoinReorder") and the inner call to joinorder.Optimize(p) (variables p and p2); simply delete the entire commented-out block so the active control flow is not obscured by dead code, and ensure no other references to enableCDCJoinReorder remain in this scope.
291-303: Please attach targeted planner test evidence for this optimizer-path switch.Given this changes join reorder dispatch logic, include the exact planner and integration test commands/results in the PR thread before merge.
As per coding guidelines
pkg/planner/**: Run targeted planner unit tests (go test -run <TestName> -tags=intest,deadlock) and update rule testdata when needed for changes to planner rules or logical/physical plans.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/planner/core/rule_join_reorder.go` around lines 291 - 303, Add targeted planner unit and integration tests that exercise both branches of the join reorder dispatch (when p.SCtx().GetSessionVars().EnableOuterJoinReorder is true and false) so CI and reviewers can see exact planner outputs; create/modify tests (e.g., in a new or existing pkg/planner/*_test.go such as rule_join_reorder_test.go) that invoke the planner on representative queries and assert or capture the produced logical/physical plans for the Optimize and optimizeRecursive paths, update any rule testdata expected outputs under pkg/planner/testdata as needed, and run the commands to produce evidence: go test -run <TestName> -tags=intest,deadlock (and the relevant unit test runs) then paste the exact command lines and their outputs (planner and integration results) into the PR thread for review.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@pkg/planner/core/rule_join_reorder.go`:
- Around line 295-300: Remove the stale commented failpoint block that
references failpoint.Inject("enableCDCJoinReorder") and the inner call to
joinorder.Optimize(p) (variables p and p2); simply delete the entire
commented-out block so the active control flow is not obscured by dead code, and
ensure no other references to enableCDCJoinReorder remain in this scope.
- Around line 291-303: Add targeted planner unit and integration tests that
exercise both branches of the join reorder dispatch (when
p.SCtx().GetSessionVars().EnableOuterJoinReorder is true and false) so CI and
reviewers can see exact planner outputs; create/modify tests (e.g., in a new or
existing pkg/planner/*_test.go such as rule_join_reorder_test.go) that invoke
the planner on representative queries and assert or capture the produced
logical/physical plans for the Optimize and optimizeRecursive paths, update any
rule testdata expected outputs under pkg/planner/testdata as needed, and run the
commands to produce evidence: go test -run <TestName> -tags=intest,deadlock (and
the relevant unit test runs) then paste the exact command lines and their
outputs (planner and integration results) into the PR thread for review.
ℹ️ Review info
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (26)
pkg/planner/core/casetest/binaryplan/testdata/binary_plan_suite_out.json
pkg/planner/core/casetest/binaryplan/testdata/binary_plan_suite_xut.json
pkg/planner/core/casetest/cbotest/testdata/analyze_suite_out.json
pkg/planner/core/casetest/cbotest/testdata/analyze_suite_xut.json
pkg/planner/core/casetest/enforcempp/BUILD.bazel
pkg/planner/core/casetest/enforcempp/enforce_mpp_test.go
pkg/planner/core/casetest/hint/testdata/integration_suite_out.json
pkg/planner/core/casetest/hint/testdata/integration_suite_xut.json
pkg/planner/core/casetest/mpp/testdata/integration_suite_out.json
pkg/planner/core/casetest/mpp/testdata/integration_suite_xut.json
pkg/planner/core/casetest/rule/rule_outer2inner_test.go
pkg/planner/core/casetest/rule/testdata/join_reorder_suite_out.json
pkg/planner/core/casetest/rule/testdata/outer2inner_out.json
pkg/planner/core/casetest/rule/testdata/outer2inner_xut.json
pkg/planner/core/casetest/testdata/json_plan_suite_out.json
pkg/planner/core/casetest/testdata/json_plan_suite_xut.json
pkg/planner/core/casetest/tpcds/testdata/tpcds_suite_out.json
pkg/planner/core/casetest/tpcds/testdata/tpcds_suite_xut.json
pkg/planner/core/rule_join_reorder.go
tests/integrationtest/r/planner/core/casetest/index/index.result
tests/integrationtest/r/planner/core/casetest/rule/rule_join_reorder.result
tests/integrationtest/r/planner/core/issuetest/planner_issue.result
tests/integrationtest/r/planner/core/join_reorder2.result
tests/integrationtest/r/planner/core/plan_cache.result
tests/integrationtest/r/planner/core/rule_join_reorder.result
tests/integrationtest/r/select.result
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/integrationtest/r/tpch.result`:
- Line 447: The EXPLAIN predicate printer is dropping relation/alias qualifiers,
producing unsatisfiable text like eq(tpch50.nation.n_name, "JAPAN") for joined
instances; update the predicate/expr formatter (e.g., the function that formats
scalar expressions / column references used by the HashJoin explain path) to
include the per-source alias or qualifier (the join input alias such as n1/n2)
when rendering ColumnRef nodes so the predicate shows eq(n1.n_name, "JAPAN") and
eq(n2.n_name, "INDIA"); locate the formatter used by HashJoin explain output and
change it to prefer the relation alias (or join alias) over the fully qualified
catalog.table name when an alias is present.
| ├─TableReader(Build) root data:TableFullScan | ||
| │ └─TableFullScan cop[tikv] table:customer keep order:false | ||
| └─HashJoin(Probe) root inner join, equal:[eq(tpch50.lineitem.l_orderkey, tpch50.orders.o_orderkey)] | ||
| └─HashJoin root inner join, equal:[eq(tpch50.customer.c_nationkey, tpch50.nation.n_nationkey)], other cond:or(and(eq(tpch50.nation.n_name, "JAPAN"), eq(tpch50.nation.n_name, "INDIA")), and(eq(tpch50.nation.n_name, "INDIA"), eq(tpch50.nation.n_name, "JAPAN"))) |
🧩 Analysis chain
🏁 Script executed:
# First, let's find and examine the tpch.result file
cd /repo 2>/dev/null || cd .
fd -t f "tpch.result" | head -5
Repository: pingcap/tidb
Length of output: 91
🏁 Script executed:
# Get context around line 447 - show lines 430-470 to see full segment
if [ -f "tests/integrationtest/r/tpch.result" ]; then
sed -n '430,470p' tests/integrationtest/r/tpch.result | cat -n
fi
Repository: pingcap/tidb
Length of output: 3462
🏁 Script executed:
# Find Q7 SQL query - typically before the EXPLAIN output
# Look backwards from line 447 for the SQL query
if [ -f "tests/integrationtest/r/tpch.result" ]; then
sed -n '1,450p' tests/integrationtest/r/tpch.result | grep -n "Q7" -A 30 | head -50
fi
Repository: pingcap/tidb
Length of output: 1152
🏁 Script executed:
# Search for the Q7 SQL definition in the codebase
rg "SELECT.*nation.*n1.*nation.*n2" -A 5 -B 5 2>/dev/null | head -100
Repository: pingcap/tidb
Length of output: 38
🏁 Script executed:
# Look for TPCHqueries or test data defining Q7
rg "from.*nation.*n1.*from.*nation.*n2\|from.*nation.*n1.*,.*nation.*n2" 2>/dev/null | head -20
Repository: pingcap/tidb
Length of output: 38
EXPLAIN output loses alias distinction in predicate rendering.
Line 447 displays other cond:or(and(eq(tpch50.nation.n_name, "JAPAN"), eq(tpch50.nation.n_name, "INDIA")), ...), which appears textually unsatisfiable. While the planner correctly executes the self-join via distinct table:n1 and table:n2 accesses, the predicate rendering should preserve alias information to clarify that the comparisons apply to different nation instances (e.g., eq(n1.n_name, "JAPAN"), eq(n2.n_name, "INDIA")). The current formatting masks the actual join semantics.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/integrationtest/r/tpch.result` at line 447, The EXPLAIN predicate
printer is dropping relation/alias qualifiers, producing unsatisfiable text like
eq(tpch50.nation.n_name, "JAPAN") for joined instances; update the
predicate/expr formatter (e.g., the function that formats scalar expressions /
column references used by the HashJoin explain path) to include the per-source
alias or qualifier (the join input alias such as n1/n2) when rendering ColumnRef
nodes so the predicate shows eq(n1.n_name, "JAPAN") and eq(n2.n_name, "INDIA");
locate the formatter used by HashJoin explain output and change it to prefer the
relation alias (or join alias) over the fully qualified catalog.table name when
an alias is present.
Signed-off-by: guo-shaoge <shaoge1994@163.com>
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #66349 +/- ##
================================================
- Coverage 77.6833% 77.0531% -0.6303%
================================================
Files 2008 1930 -78
Lines 549801 539861 -9940
================================================
- Hits 427104 415980 -11124
- Misses 120989 123863 +2874
+ Partials 1708 18 -1690
Flags with carried forward coverage won't be shown. Click here to find out more.
|
Signed-off-by: guo-shaoge <shaoge1994@163.com>
|
Review Complete Findings: 0 issues ℹ️ Learn more details on Pantheon AI. |
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/planner/core/joinorder/conflict_detector.go`:
- Around line 785-802: The cloned ScalarFunction's cached hash becomes stale
after you mutate its arguments in the block handling *expression.ScalarFunction
(where you call e.Clone(), assign newFunc.GetArgs()[i] = newArgs[i] after
alignExprNotNullWithSchema), so update the code to invalidate/recompute the hash
cache: after the mutation loop call expression.ReHashCode(newFunc) (or rebuild
the function via expression.NewFunctionInternal(...) instead of mutating the
clone) so the canonicalhashcode reflects the new arguments.
ℹ️ Review info
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
pkg/planner/core/casetest/rule/testdata/outer_to_semi_join_suite_out.json
pkg/planner/core/casetest/rule/testdata/outer_to_semi_join_suite_xut.json
pkg/planner/core/casetest/tpch/testdata/tpch_suite_out.json
pkg/planner/core/casetest/tpch/testdata/tpch_suite_xut.json
pkg/planner/core/joinorder/conflict_detector.go
| case *expression.ScalarFunction: | ||
| args := e.GetArgs() | ||
| changed := false | ||
| newArgs := make([]expression.Expression, len(args)) | ||
| for i := range args { | ||
| newArgs[i] = alignExprNotNullWithSchema(args[i], schema) | ||
| if newArgs[i] != args[i] { | ||
| changed = true | ||
| } | ||
| } | ||
| if !changed { | ||
| return e | ||
| } | ||
| newFunc := e.Clone().(*expression.ScalarFunction) | ||
| for i := range newArgs { | ||
| newFunc.GetArgs()[i] = newArgs[i] | ||
| } | ||
| return newFunc |
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Inspect ScalarFunction struct definition and Clone() implementation
echo "=== ScalarFunction struct and Clone() implementation ==="
rg -n -A20 'type ScalarFunction struct' pkg/expression/scalar_function.go
echo -e "\n=== ScalarFunction Clone() method ==="
rg -n -A30 'func \(sf \*ScalarFunction\) Clone\(' pkg/expression/scalar_function.go
echo -e "\n=== ScalarFunction GetArgs() method ==="
rg -n -A15 'func \(sf \*ScalarFunction\) GetArgs\(' pkg/expression/scalar_function.go
echo -e "\n=== Hash-related fields/methods in ScalarFunction ==="
rg -n '(hashcode|canonicalhashcode)' pkg/expression/scalar_function.go
echo -e "\n=== Expression interface hashcode/canonicalhashcode ==="
rg -n -A5 'func \(sf \*ScalarFunction\) (HashCode|CanonicalHashCode)\(' pkg/expression/
Repository: pingcap/tidb
Length of output: 1723
🌐 Web query:
TiDB ScalarFunction Clone() implementation arg mutation safety
💡 Result:
TiDB’s (*ScalarFunction).Clone() is arg-mutation safe (with one important caveat about RetType).
Why args are safe to mutate after Clone()
ScalarFunction.Clone() builds a new ScalarFunction and clones the underlying builtinFunc via sf.Function.Clone(). (raw.githubusercontent.com)
All builtin signatures embed baseBuiltinFunc, whose cloneFrom() deep-clones the argument expressions:
- it allocates a new []Expression
- and appends arg.Clone() for each original arg (raw.githubusercontent.com)
So after cloning:
- the clone’s GetArgs() slice is independent
- each argument expression node is also cloned
- replacing/rewriting args in the clone (e.g. clone.GetArgs()[i] = newExpr) will not affect the original
Caveat: RetType is shallow-copied
Both ScalarFunction.RetType and baseBuiltinFunc.tp are copied by pointer (explicitly marked shallow in code via struct tags / assignments). (raw.githubusercontent.com)
So do not mutate clone.RetType (or the builtin ret type) in-place unless you first clone the FieldType; otherwise you risk affecting other references to the same *types.FieldType.
Citations:
- 1: https://raw.githubusercontent.com/pingcap/tidb/master/pkg/expression/scalar_function.go
- 2: https://raw.githubusercontent.com/pingcap/tidb/master/pkg/expression/builtin.go
- 3: https://raw.githubusercontent.com/pingcap/tidb/master/pkg/expression/scalar_function.go
🌐 Web query:
TiDB expression.ScalarFunction cached hash Clone behavior
💡 Result:
In TiDB (pkg/expression), ScalarFunction caches its hash in two fields: hashcode (used by HashCode()) and canonicalhashcode (used by CanonicalHashCode()). (raw.githubusercontent.com)
Clone behavior (cached hash):
- (*ScalarFunction).Clone() does not copy the cached hashcode into the clone; it only clones the underlying builtin Function and copies metadata. So a cloned ScalarFunction will typically have an empty hashcode cache and will recompute it on the first HashCode() call. (raw.githubusercontent.com)
- Clone() does copy canonicalhashcode if it is already present (slices.Clone(sf.canonicalhashcode)). (raw.githubusercontent.com)
How the cache is (re)computed:
- HashCode() returns the cached sf.hashcode if it’s already set; otherwise it calls ReHashCode(sf) to rebuild it. (raw.githubusercontent.com)
- If you mutate a scalar function’s arguments “in place”, TiDB expects you to call ReHashCode(sf) afterward (“used after we change the argument in place”). (raw.githubusercontent.com)
Related gotcha:
RemapColumn() explicitly clears the cache (newSf.hashcode = nil) after rewriting args, ensuring the hash will be recomputed. (raw.githubusercontent.com)
Citations:
- 1: https://raw.githubusercontent.com/pingcap/tidb/master/pkg/expression/scalar_function.go
- 2: https://raw.githubusercontent.com/pingcap/tidb/master/pkg/expression/scalar_function.go
- 3: https://raw.githubusercontent.com/pingcap/tidb/master/pkg/expression/scalar_function.go
- 4: https://raw.githubusercontent.com/pingcap/tidb/master/pkg/expression/scalar_function.go
- 5: https://raw.githubusercontent.com/pingcap/tidb/master/pkg/expression/scalar_function.go
- 6: https://raw.githubusercontent.com/pingcap/tidb/master/pkg/expression/scalar_function.go
🌐 Web query:
TiDB pkg/expression ScalarFunction architecture design
💡 Result:
TiDB pkg/expression ScalarFunction: architecture/design (high level)
1) Where ScalarFunction sits in the expression system
- TiDB builds an expression tree (columns/constants/scalar-functions, etc.) and evaluates it during execution. The “refactor builtin” design emphasizes compile-time selection of typed function signatures so execution avoids repeated type-branching and
Datum conversions. [5] ScalarFunction is the expression node that represents “a function returning a value”, and it mainly wraps: FuncName (a case-insensitive string), RetType (the static return FieldType), and Function (a builtinFunc implementation that does the real work). [1]
2) Creation path: name → function class → concrete signature (builtinFunc)
- The central factory is
newFunctionImpl(...), used byNewFunction,NewFunctionBase,NewFunctionTryFold, etc. [1] - Flow (simplified):
- Handle a few special forms (e.g.,
CAST,GET_VAR, sysdate-now rewrite). [1] - Lookup
funcNamein the globalfuncsregistry (and extension function registry). If missing, return “function not exists”. [1] - Do some argument/type preprocessing (e.g., infer NULL constant types for most functions). [1]
- Call
functionClass.getFunction(ctx, args)to select/build the best-matching builtin signature (the actualbuiltinFunc). [1][4][5] - Construct
ScalarFunction{ FuncName, RetType, Function }. Optionally run an init/check callback. [1] - Optionally run constant folding (default
NewFunctionfolds). [1]
- Handle a few special forms (e.g.,
3) builtinFunc and “signature structs” (how a function is implemented)
- TiDB models a SQL function as:
- a function class (does arg-count checks, return-type inference, overload resolution), and
- one or more signatures (concrete implementations), each implementing typed
evalXX methods. This pattern is explicitly described in the refactor blog and dev guide. [4][5]
- Best-practice implementation steps (from the dev guide):
- Register the function in
funcs map (name → functionClass)
getFunction(...)to build abaseBuiltinFunc(via helpers likenewBaseBuiltinFuncWithTp) and return a signature struct - Implement
evalString/evalInt/evalReal/… depending on return type. [4][5]
- Register the function in
4) Common embedding: baseBuiltinFunc carries shared metadata and behavior
- Most builtin signature structs embed
baseBuiltinFunc, which stores:args []Expression, return typetp *types.FieldType- pushdown signature id
pbCode tipb.ScalarFuncSig - collation/coercibility metadata
- vectorization and thread-safety/shareability flags, plus notes that new fields must be thread-safe/immutable if the expression can be shared across sessions. [2]
5) Execution: row-based and vectorized evaluation
ScalarFunction exposes vectorized evaluation entrypoints like VecEvalInt/Real/String/Decimal/Time/Duration/JSON/..., which delegate to the underlying builtinFunc’s vecEvalXX methods. [1]
ScalarFunctionreports itself vectorizable only if both the function and its children are vectorizable (Vectorized()checks function + children). [1] - Builtin vectorization code is partly generator-driven (
//go:generate ... *_vec.go). [2]
6) Pushdown architecture hook: protobuf signature + PB conversion
- Builtin functions carry a
tipb.ScalarFuncSig(pbCode) used for coprocessor/TiKV/TiFlash pushdown encoding. [2] PbConverterconverts an expression tree intotipb.Exprprotobuf; if conversion fails, TiDB treats the expression as not pushdown-able (internal error “cannot be pushed down”). [3]
Key takeaway: ScalarFunction is a thin expression wrapper around a selected, typed builtinFunc signature. Most “architecture” lives in (a) the factory (newFunctionImpl) that selects a signature from the funcs registry via functionClass.getFunction, (b) the shared baseBuiltinFunc metadata (types/collation/pushdown id/vectorization), and (c) the execution/pushdown adapters (VecEval* and PB conversion). [1][2][3][4][5]
Clear hash caches after mutating cloned ScalarFunction arguments.
On lines 798–801, the code mutates newFunc.GetArgs()[i] after Clone(). While the argument slice itself is safely deep-cloned, the ScalarFunction carries a cached canonicalhashcode that is copied into the clone. Mutating args without invalidating this cache leaves it stale and inconsistent with the new argument values.
Add expression.ReHashCode(newFunc) after the mutation loop to clear and recompute the hash caches, or alternatively use expression.NewFunctionInternal(...) to reconstruct the function (as suggested in the original review).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/planner/core/joinorder/conflict_detector.go` around lines 785 - 802, The
cloned ScalarFunction's cached hash becomes stale after you mutate its arguments
in the block handling *expression.ScalarFunction (where you call e.Clone(),
assign newFunc.GetArgs()[i] = newArgs[i] after alignExprNotNullWithSchema), so
update the code to invalidate/recompute the hash cache: after the mutation loop
call expression.ReHashCode(newFunc) (or rebuild the function via
expression.NewFunctionInternal(...) instead of mutating the clone) so the
canonicalhashcode reflects the new arguments.
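The invalidate-after-mutation pattern described above can be shown with a toy model; `scalarFunc` here is a hypothetical stand-in for expression.ScalarFunction, and `reHash` plays the role that expression.ReHashCode plays in TiDB:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// scalarFunc caches a hash of its arguments, like expression.ScalarFunction
// caches hashcode/canonicalhashcode.
type scalarFunc struct {
	args     []string
	hashcode []byte
}

// hash returns the cached value if present, recomputing it otherwise.
func (f *scalarFunc) hash() []byte {
	if f.hashcode == nil {
		h := fnv.New64a()
		for _, a := range f.args {
			h.Write([]byte(a))
		}
		f.hashcode = h.Sum(nil)
	}
	return f.hashcode
}

// reHash clears the cache so the next hash() call recomputes it — the
// equivalent of calling expression.ReHashCode after an in-place mutation.
func (f *scalarFunc) reHash() { f.hashcode = nil }

func main() {
	orig := &scalarFunc{args: []string{"a", "b"}}
	_ = orig.hash() // populate the cache

	// Clone carries the cached hash over, as Clone() does for canonicalhashcode.
	clone := &scalarFunc{args: append([]string(nil), orig.args...), hashcode: orig.hashcode}
	clone.args[1] = "c" // mutate an argument, as the reviewed code does
	clone.reHash()      // without this, clone.hash() would return the stale value

	fmt.Println(string(clone.hash()) == string(orig.hash())) // differs after recompute
}
```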
Signed-off-by: guo-shaoge <shaoge1994@163.com>
Signed-off-by: guo-shaoge <shaoge1994@163.com>
…/tidb into post_cdc_impl Signed-off-by: guo-shaoge <shaoge1994@163.com>
Signed-off-by: guo-shaoge <shaoge1994@163.com>
Signed-off-by: guo-shaoge <shaoge1994@163.com>
Force-pushed 4d69d5c to 2d85b13 (Compare)
Signed-off-by: guo-shaoge <shaoge1994@163.com>
Signed-off-by: guo-shaoge <shaoge1994@163.com>
Signed-off-by: guo-shaoge <shaoge1994@163.com>
|
/retest |
Signed-off-by: guo-shaoge <shaoge1994@163.com>
Force-pushed 5cd0122 to 6e80f05 (Compare)
| "Warn": [ | ||
| "Warning 1815 leading hint is inapplicable, check if the leading hint table has join conditions with other tables" | ||
| ] | ||
| "Warn": null |
Great, also an improvement for leading hint.
Signed-off-by: guo-shaoge <shaoge1994@163.com>
| "Selection root isnull(test.t3.i)", | ||
| "└─HashJoin root left outer join, left side:TableReader, equal:[eq(test.t1.i, test.t2.i)]", | ||
| "Projection root test.t1.i, test.t2.i, <nil>->test.t3.i", | ||
| "└─HashJoin root anti semi join, left side:HashJoin, equal:[eq(test.t2.i, test.t3.i)]", |
Is this expected? outer join --> anti semi
Because we have the rule that converts outer joins to anti semi joins (#64959).
For this case, the first scenario matches (IS NULL on the join condition column), which is why the outer join is converted to an anti semi join.
| alignedEQConds = alignScalarFuncsNotNullWithSchema(alignedEQConds, join.Schema()) | ||
| alignedNonEQConds := alignExprsNotNullWithSchema(e.nonEQConds, join.Schema()) |
Seems tricky, could you add some comments for these 2 lines
|
@guo-shaoge: The following test failed, say
Full PR test history. Your PR dashboard. Details: Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
@guo-shaoge: The following tests failed, say
Full PR test history. Your PR dashboard. Details: Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
What problem does this PR solve?
Issue Number: close #66088
Problem Summary:
What changed and how does it work?
Check List
Tests
Side effects
Documentation
Release note
Please refer to Release Notes Language Style Guide to write a quality release note.
Summary by CodeRabbit
Bug Fixes
Tests