*: fix potential bug by ai by bufferflies · Pull Request #10312 · tikv/pd

bufferflies · 2026-03-07T11:56:23Z

What problem does this PR solve?

Issue Number: Ref #10326

What is changed and how does it work?

Check List

Tests

Unit test
Integration test
Manual test (add detailed scripts or steps below)
No code

Code changes

Has the configuration change
Has HTTP APIs changed (Don't forget to add the declarative for the new API)
Has persistent data change

Side effects

Possible performance regression
Increased code complexity
Breaking backward compatibility

Related changes

PR to update pingcap/docs/pingcap/docs-cn:
PR to update pingcap/tiup:
Need to cherry-pick to the release branch

Release note

None.

Summary by CodeRabbit

Bug Fixes
- Stricter input validation and clearer client-facing errors for scheduler config updates (engine, rule, alias, timeouts, store IDs, ranges), safer control flow to avoid runtime errors and unintended pauses/resumes, and more accurate HTTP status codes for invalid input.
Tests
- Added tests for invalid field types, malformed ranges, wrong scheduler types, and config update error cases to prevent regressions.

Signed-off-by: tongjian <1045931706@qq.com>

coderabbitai · 2026-03-07T11:57:36Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

📝 Walkthrough

Walkthrough

Adds defensive input validation and type checks across multiple scheduler handlers, tightens HTTP error responses and control flow to avoid panics/nil derefs, and introduces HTTP/unit tests covering invalid input scenarios.

Changes

Cohort / File(s)	Summary
Balance Range Scheduler `pkg/schedule/schedulers/balance_range.go`, `pkg/schedule/schedulers/balance_range_test.go`	Validate `engine`, `rule`, `alias` as strings and restrict `engine` to allowed values; convert/validate `rule` via `core.NewRule`; parse optional `timeout` string to duration with positivity check; update error messages. Adds test `TestAddBalanceRangeJobWithInvalidFieldType`.
Evict Leader Scheduler `pkg/schedule/schedulers/evict_leader.go`, `pkg/schedule/schedulers/evict_leader_test.go`	`pauseLeaderTransferIfStoreNotExist` early-returns when store tracked and surfaces PauseLeaderTransfer errors. `updateConfig` accepts `ranges` as `[]string` or `[]any` with per-element validation and falls back to existing ranges when `id` present. Adds HTTP tests for valid string-array ranges and invalid element types.
Grant Hot Region Scheduler `pkg/schedule/schedulers/grant_hot_region.go`, `pkg/schedule/schedulers/grant_hot_region_test.go`	Validate `"store-leader-id"` is a string before parsing to uint64; return 400 on wrong type. Adds test asserting 400 Bad Request for invalid `store-leader-id` type.
Transfer Witness Leader Scheduler `pkg/schedule/schedulers/transfer_witness_leader.go`, `pkg/schedule/schedulers/transfer_witness_leader_test.go`	`RecvRegionInfo` now safely checks type assertion and returns nil on mismatch to avoid panics. Adds test `TestRecvRegionInfoWithWrongSchedulerType`.
Scheduler Controller `pkg/schedule/schedulers/scheduler_controller.go`	Cache `RecvRegionInfo` result in `CheckTransferWitnessLeader` and return early on nil to prevent nil channel usage.
Scheduler API `server/api/scheduler.go`	Change error responses: use BadRequest for invalid client inputs, preserve default timeout when option missing, use upstream response status on redirect delete errors, and ensure early returns after sending error responses.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested reviewers

okJiang
JmPotato

Poem

🐰 I hopped through fields with twitching nose,

engines checked, rules lined in tidy rows,
timeouts parsed, and ranges neatly typed,
no panics, no nils — my work's well-typed,
carrot cake later, for careful code that glows.

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 8.33% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check	❓ Inconclusive	The PR title '*: fix potential bug by ai' is overly vague and generic, using non-descriptive phrasing like 'bug' without specifying what issues are being fixed or what areas are affected.	Revise the title to be more specific and descriptive, e.g., 'Add input validation and nil checks to scheduler handlers' or 'Fix nil dereference and type assertion bugs in schedulers'.

✅ Passed checks (1 passed)

Check name	Status	Explanation
Description check	✅ Passed	The PR description follows the required template structure with sections for issue number, changes, checklist, and release notes. However, the 'What is changed and how does it work?' section lacks a detailed commit message explaining the actual implementation details and rationale for the changes across multiple scheduler files.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

server/api/scheduler.go (1)
145-151: ⚠️ Potential issue | 🟠 Major

Malformed timeout values still fall back to 30m.

apiutil.CollectStringOption returns ErrOptionNotExist both when timeout is absent and when it is present but not a string (pkg/utils/apiutil/apiutil.go:337-345). With this branch, a payload like {"timeout":123} still appends the default and creates the scheduler instead of returning 400. Check raw key presence in input before applying the default.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/api/scheduler.go` around lines 145 - 151, The branch treating
ErrOptionNotExist as "use default" is incorrect because
apiutil.CollectStringOption returns ErrOptionNotExist both for a missing key and
for a key with a non-string value; update the logic in the scheduler handler
around apiutil.CollectStringOption("timeout", input, collector) to first check
raw presence of the "timeout" key in input (e.g., via a map lookup on input) and
only call collector(defaultTimeout) when the key is absent; if the key exists
but CollectStringOption returns ErrOptionNotExist (meaning the value is
malformed), return a 400 using h.r.JSON with the error message instead of
falling back to default; keep the existing behavior for other errors.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/schedule/schedulers/balance_range.go`:
- Around line 351-353: The early-nil-start guard in checkJob is letting an
invalid running job be marked finished but then allows checkJob to continue and
reach prepare/Schedule; fix this by detecting job == nil or job.Start == nil,
calling finishLocked(job) to mark it finished/clean up, and then immediately
return (stop further processing) from checkJob so it cannot proceed to prepare
or Schedule; update the branch around job == nil || job.Start == nil in checkJob
to perform finishLocked and return immediately.

In `@pkg/schedule/schedulers/evict_leader.go`:
- Around line 176-182: The helper currently conflates "store already configured"
with "we paused it", causing rollbacks in updateConfig to resume stores that
this request never paused; change the helper used in evict_leader logic (the
function that checks conf.StoreIDWithRanges and calls
conf.cluster.PauseLeaderTransfer) to return separate flags (e.g., exists and
pausedByRequest or a small result struct) instead of a single boolean, set
pausedByRequest only when conf.cluster.PauseLeaderTransfer(id, constant.In)
succeeds, and update all callers (including resumeLeaderTransferIfExist and the
rollback paths in updateConfig) to only call resumeLeaderTransfer when
pausedByRequest is true (leave stores that merely existed untouched).

In `@pkg/schedule/schedulers/grant_leader.go`:
- Around line 268-293: The error branches in the ranges parsing block always
call handler.config.cluster.ResumeLeaderTransfer(id, constant.Out) which can
unpause existing stores; change the logic in
pkg/schedule/schedulers/grant_leader.go so ResumeLeaderTransfer is only called
if this request previously paused the store. Introduce a local boolean (e.g.,
paused) set to true where PauseLeaderTransfer is invoked for new stores (the
existing PauseLeaderTransfer call around lines ~257-263) and then check that
paused flag before calling handler.config.cluster.ResumeLeaderTransfer(id,
constant.Out) in the ranges validation error/default branches inside the ranges
handling switch.

---

Outside diff comments:
In `@server/api/scheduler.go`:
- Around line 145-151: The branch treating ErrOptionNotExist as "use default" is
incorrect because apiutil.CollectStringOption returns ErrOptionNotExist both for
a missing key and for a key with a non-string value; update the logic in the
scheduler handler around apiutil.CollectStringOption("timeout", input,
collector) to first check raw presence of the "timeout" key in input (e.g., via
a map lookup on input) and only call collector(defaultTimeout) when the key is
absent; if the key exists but CollectStringOption returns ErrOptionNotExist
(meaning the value is malformed), return a 400 using h.r.JSON with the error
message instead of falling back to default; keep the existing behavior for other
errors.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d0e9d009-8d30-4d17-8018-37955c835928

📥 Commits

Reviewing files that changed from the base of the PR and between 95cde21 and db0ecac.

📒 Files selected for processing (12)

pkg/schedule/schedulers/balance_range.go
pkg/schedule/schedulers/balance_range_test.go
pkg/schedule/schedulers/evict_leader.go
pkg/schedule/schedulers/evict_leader_test.go
pkg/schedule/schedulers/grant_hot_region.go
pkg/schedule/schedulers/grant_hot_region_test.go
pkg/schedule/schedulers/grant_leader.go
pkg/schedule/schedulers/grant_leader_test.go
pkg/schedule/schedulers/scheduler_controller.go
pkg/schedule/schedulers/transfer_witness_leader.go
pkg/schedule/schedulers/transfer_witness_leader_test.go
server/api/scheduler.go

coderabbitai · 2026-03-07T12:11:22Z

pkg/schedule/schedulers/balance_range.go

+	if job == nil || job.Start == nil {
+		return true
+	}


⚠️ Potential issue | 🟡 Minor

Nil-start jobs are still eligible for one scheduling round.

This avoids the panic, but checkJob keeps going after finishLocked (Lines 470-496). A running job restored without Start will be marked finished here and can still reach prepare/Schedule once in the same tick. Return immediately after finishing that invalid job state.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@pkg/schedule/schedulers/balance_range.go` around lines 351 - 353, The early-nil-start guard in checkJob is letting an invalid running job be marked finished but then allows checkJob to continue and reach prepare/Schedule; fix this by detecting job == nil or job.Start == nil, calling finishLocked(job) to mark it finished/clean up, and then immediately return (stop further processing) from checkJob so it cannot proceed to prepare or Schedule; update the branch around job == nil || job.Start == nil in checkJob to perform finishLocked and return immediately.

coderabbitai · 2026-03-07T12:11:22Z

pkg/schedule/schedulers/evict_leader.go

+	if _, exist := conf.StoreIDWithRanges[id]; exist {
+		return true, nil
+	}
+	if err := conf.cluster.PauseLeaderTransfer(id, constant.In); err != nil {
+		return false, err
 	}
-	return true, nil
+	return false, nil


⚠️ Potential issue | 🟠 Major

Bad updates can now unpause an existing evict-leader store.

Because this helper now returns true for stores that are already configured, the rollback paths in updateConfig still call resumeLeaderTransferIfExist(id) on 400s (Lines 424, 442, 449, 454, and 464) even though this request never paused that store. A request like {"store_id":1,"batch":0} will now fail validation and also clear the scheduler’s pause state for store 1. Track pausedByRequest separately from exists, and only resume when this request actually called PauseLeaderTransfer.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@pkg/schedule/schedulers/evict_leader.go` around lines 176 - 182, The helper currently conflates "store already configured" with "we paused it", causing rollbacks in updateConfig to resume stores that this request never paused; change the helper used in evict_leader logic (the function that checks conf.StoreIDWithRanges and calls conf.cluster.PauseLeaderTransfer) to return separate flags (e.g., exists and pausedByRequest or a small result struct) instead of a single boolean, set pausedByRequest only when conf.cluster.PauseLeaderTransfer(id, constant.In) succeeds, and update all callers (including resumeLeaderTransferIfExist and the rollback paths in updateConfig) to only call resumeLeaderTransfer when pausedByRequest is true (leave stores that merely existed untouched).

pkg/schedule/schedulers/grant_leader.go

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

pkg/schedule/schedulers/balance_range.go (1)

350-351: ⚠️ Potential issue | 🔴 Critical

shouldFinished still panics on nil job or missing Start.

This still dereferences job.Start unconditionally, so the new regression case with job == nil / job.Start == nil will panic instead of returning safely.

Suggested fix

 func (job *balanceRangeSchedulerJob) shouldFinished() bool {
+	if job == nil || job.Start == nil {
+		return true
+	}
 	return time.Since(*job.Start) > job.Timeout
 }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@pkg/schedule/schedulers/balance_range.go` around lines 350 - 351, The
shouldFinished method currently dereferences job.Start unconditionally causing a
panic when job or job.Start is nil; update
balanceRangeSchedulerJob.shouldFinished to first check for job != nil and
job.Start != nil and return false (not finished) if either is nil, then perform
the time.Since(*job.Start) > job.Timeout comparison; ensure you reference the
balanceRangeSchedulerJob receiver, its Start field and Timeout field when adding
the nil-guards so the behavior is safe for the regression case.

♻️ Duplicate comments (1)

pkg/schedule/schedulers/balance_range.go (1)

467-475: ⚠️ Potential issue | 🟠 Major

Return immediately after timing out the job.

Once this branch marks the job as finished, checkJob still falls through into prepare and can schedule one more operator in the same tick. Stop here after cleanup.

Suggested fix

 	if job.shouldFinished() {
 		if err := s.conf.finishLocked(index); err != nil {
 			balancePersistFailedCounter.Inc()
 			return false
 		}
 		s.cleanJobStatus(cluster, job)
 		balanceRangeExpiredCounter.Inc()
 		log.Info("balance key range job finished due to timeout", zap.String("alias", job.Alias), zap.Uint64("job-id", job.JobID))
+		return false
 	}

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@pkg/schedule/schedulers/balance_range.go` around lines 467 - 475, The
checkJob branch that handles timeout needs to return immediately after marking
the job finished to avoid falling through into prepare; update the block inside
checkJob where it calls job.shouldFinished(), s.conf.finishLocked(index),
s.cleanJobStatus(cluster, job), balanceRangeExpiredCounter.Inc(), and the
log.Info call so that it returns false (or otherwise exits) right after cleanup
and logging, preventing prepare/prepareRange from scheduling another operator in
the same tick.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@pkg/schedule/schedulers/balance_range.go`:
- Around line 350-351: The shouldFinished method currently dereferences
job.Start unconditionally causing a panic when job or job.Start is nil; update
balanceRangeSchedulerJob.shouldFinished to first check for job != nil and
job.Start != nil and return false (not finished) if either is nil, then perform
the time.Since(*job.Start) > job.Timeout comparison; ensure you reference the
balanceRangeSchedulerJob receiver, its Start field and Timeout field when adding
the nil-guards so the behavior is safe for the regression case.

---

Duplicate comments:
In `@pkg/schedule/schedulers/balance_range.go`:
- Around line 467-475: The checkJob branch that handles timeout needs to return
immediately after marking the job finished to avoid falling through into
prepare; update the block inside checkJob where it calls job.shouldFinished(),
s.conf.finishLocked(index), s.cleanJobStatus(cluster, job),
balanceRangeExpiredCounter.Inc(), and the log.Info call so that it returns false
(or otherwise exits) right after cleanup and logging, preventing
prepare/prepareRange from scheduling another operator in the same tick.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c16d46e1-b0bf-4117-a0bd-4a0dfe8d55ed

📥 Commits

Reviewing files that changed from the base of the PR and between 9d2b70f and 65ccf3a.

📒 Files selected for processing (3)

pkg/schedule/schedulers/balance_range.go
pkg/schedule/schedulers/balance_range_test.go
pkg/schedule/schedulers/evict_leader.go

🚧 Files skipped from review as they are similar to previous changes (1)

pkg/schedule/schedulers/evict_leader.go

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

pkg/schedule/schedulers/evict_leader.go (1)

176-180: ⚠️ Potential issue | 🟠 Major

Keep alreadyConfigured separate from pausedByRequest.

This helper now returns the same true value for two different states, but the rollback paths at Line 424, Line 442, Line 449, Line 454, and Line 464 still treat it as “this request paused the store”. A rejected update can therefore clear the existing pause on an already-configured store even though this request never called PauseLeaderTransfer.

💡 One way to fix it

+type pauseLeaderTransferResult struct {
+	alreadyConfigured bool
+	pausedByRequest   bool
+}
+
-func (conf *evictLeaderSchedulerConfig) pauseLeaderTransferIfStoreNotExist(id uint64) (bool, error) {
+func (conf *evictLeaderSchedulerConfig) pauseLeaderTransferIfStoreNotExist(id uint64) (pauseLeaderTransferResult, error) {
 	conf.RLock()
 	defer conf.RUnlock()
 	if _, exist := conf.StoreIDWithRanges[id]; exist {
-		return true, nil
+		return pauseLeaderTransferResult{alreadyConfigured: true}, nil
 	}
 	if err := conf.cluster.PauseLeaderTransfer(id, constant.In); err != nil {
-		return false, err
+		return pauseLeaderTransferResult{}, err
 	}
-	return true, nil
+	return pauseLeaderTransferResult{pausedByRequest: true}, nil
 }

-	exist, err = handler.config.pauseLeaderTransferIfStoreNotExist(id)
+	pauseState, err = handler.config.pauseLeaderTransferIfStoreNotExist(id)
 	...
-		handler.config.resumeLeaderTransferIfExist(id)
+		if pauseState.pausedByRequest {
+			handler.config.resumeLeaderTransferIfExist(id)
+		}
 		...
-	} else if exist {
+	} else if pauseState.alreadyConfigured {
 		ranges = handler.config.getRanges(id)
 	}

Also applies to: 423-465

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@pkg/schedule/schedulers/evict_leader.go` around lines 176 - 180, The helper
that checks/conflicts and calls conf.cluster.PauseLeaderTransfer conflates
"alreadyConfigured" and "pausedByRequest" into a single boolean; change it to
return two distinct booleans (e.g., alreadyConfigured, pausedByRequest) so
callers can distinguish stores that were paused by this request from stores that
were already configured; update the helper (the code that reads
conf.StoreIDWithRanges and calls conf.cluster.PauseLeaderTransfer) to set
alreadyConfigured=true when the store exists in conf.StoreIDWithRanges and set
pausedByRequest=true only when PauseLeaderTransfer is actually invoked
successfully; then update all rollback paths that currently assume the single
true to only call the resume/unpause logic when pausedByRequest is true (leave
alreadyConfigured alone).

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/schedule/schedulers/balance_range.go`:
- Around line 121-135: The parsed timeout can be negative which causes
shouldFinished() to immediately return true; after parsing timeoutStr with
time.ParseDuration in the block surrounding timeoutStr and err, add a validation
that the resulting timeout is strictly positive (e.g., timeout > 0) and if not,
return a 400 via handler.rd.JSON with a clear message (e.g., "timeout must be
positive") and do not assign job.Timeout; keep existing handling of empty string
and parse errors intact and reference timeoutStr, time.ParseDuration,
job.Timeout and shouldFinished() when locating the change.

---

Duplicate comments:
In `@pkg/schedule/schedulers/evict_leader.go`:
- Around line 176-180: The helper that checks/conflicts and calls
conf.cluster.PauseLeaderTransfer conflates "alreadyConfigured" and
"pausedByRequest" into a single boolean; change it to return two distinct
booleans (e.g., alreadyConfigured, pausedByRequest) so callers can distinguish
stores that were paused by this request from stores that were already
configured; update the helper (the code that reads conf.StoreIDWithRanges and
calls conf.cluster.PauseLeaderTransfer) to set alreadyConfigured=true when the
store exists in conf.StoreIDWithRanges and set pausedByRequest=true only when
PauseLeaderTransfer is actually invoked successfully; then update all rollback
paths that currently assume the single true to only call the resume/unpause
logic when pausedByRequest is true (leave alreadyConfigured alone).

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 0dfccb36-3fe0-44a2-9af9-d9cd001700a9

📥 Commits

Reviewing files that changed from the base of the PR and between 65ccf3a and 1d91e5f.

📒 Files selected for processing (3)

pkg/schedule/schedulers/balance_range.go
pkg/schedule/schedulers/balance_range_test.go
pkg/schedule/schedulers/evict_leader.go

🚧 Files skipped from review as they are similar to previous changes (1)

pkg/schedule/schedulers/balance_range_test.go

coderabbitai · 2026-03-10T03:37:23Z

pkg/schedule/schedulers/balance_range.go

+	if timeoutVal, exists := input["timeout"]; exists {
+		timeoutStr, ok := timeoutVal.(string)
+		if !ok {
+			handler.rd.JSON(w, http.StatusBadRequest, "timeout must be a string")
 			return
 		}
-		job.Timeout = timeout
+		if len(timeoutStr) > 0 {
+			timeout, err := time.ParseDuration(timeoutStr)
+			if err != nil {
+				handler.rd.JSON(w, http.StatusBadRequest, fmt.Sprintf("timeout:%s is invalid", timeoutStr))
+				return
+			}
+			job.Timeout = timeout
+		}
 	}


⚠️ Potential issue | 🟡 Minor

Consider validating that timeout is positive.

time.ParseDuration accepts negative durations (e.g., "-5m"). A negative timeout would cause shouldFinished() to return true immediately, finishing the job without performing any work. Add a check to reject non-positive timeouts.

🛡️ Suggested fix

if len(timeoutStr) > 0 { timeout, err := time.ParseDuration(timeoutStr) if err != nil { handler.rd.JSON(w, http.StatusBadRequest, fmt.Sprintf("timeout:%s is invalid", timeoutStr)) return } + if timeout <= 0 { + handler.rd.JSON(w, http.StatusBadRequest, "timeout must be positive") + return + } job.Timeout = timeout }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@pkg/schedule/schedulers/balance_range.go` around lines 121 - 135, The parsed timeout can be negative which causes shouldFinished() to immediately return true; after parsing timeoutStr with time.ParseDuration in the block surrounding timeoutStr and err, add a validation that the resulting timeout is strictly positive (e.g., timeout > 0) and if not, return a 400 via handler.rd.JSON with a clear message (e.g., "timeout must be positive") and do not assign job.Timeout; keep existing handling of empty string and parse errors intact and reference timeoutStr, time.ParseDuration, job.Timeout and shouldFinished() when locating the change.

codecov · 2026-03-10T03:40:38Z

Codecov Report

❌ Patch coverage is 37.97468% with 49 lines in your changes missing coverage. Please review.
✅ Project coverage is 78.82%. Comparing base (525524f) to head (1d91e5f).
⚠️ Report is 12 commits behind head on master.

Additional details and impacted files

@@            Coverage Diff             @@
##           master   #10312      +/-   ##
==========================================
+ Coverage   78.78%   78.82%   +0.04%     
==========================================
  Files         525      527       +2     
  Lines       70824    70973     +149     
==========================================
+ Hits        55796    55944     +148     
- Misses      11004    11012       +8     
+ Partials     4024     4017       -7

Flag	Coverage Δ
unittests	`78.82% <37.97%> (+0.04%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

liyishuai

Tesed on my side. The AI review is worth considering. Otherwise LGTM.

ti-chi-bot · 2026-03-10T04:12:34Z

@liyishuai: adding LGTM is restricted to approvers and reviewers in OWNERS files.

Details

In response to this:

Tesed on my side. The AI review is worth considering. Otherwise LGTM.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

ti-chi-bot · 2026-03-10T04:12:35Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: liyishuai
Once this PR has been reviewed and has the lgtm label, please assign bufferflies for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Signed-off-by: tongjian <1045931706@qq.com>

ti-chi-bot · 2026-03-10T08:27:47Z

@bufferflies: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name	Commit	Details	Required	Rerun command
pull-unit-test-next-gen-1	`c4f9234`	link	true	`/test pull-unit-test-next-gen-1`
pull-unit-test-next-gen-2	`c4f9234`	link	true	`/test pull-unit-test-next-gen-2`

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

okJiang

Please link the corresponding issue and explain your changes in the PR description.

Even when using AI to fix bugs, we should still follow the process of submitting a bug issue -> pull request. The process of submitting a bug issue can also be completed by AI, and we should try to follow atomic commits, with one PR fixing one issue.

ai scan potential panix

db0ecac

Signed-off-by: tongjian <1045931706@qq.com>

ti-chi-bot bot added do-not-merge/needs-linked-issue release-note-none Denotes a PR that doesn't merit a release note. dco-signoff: yes Indicates the PR's author has signed the dco. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Mar 7, 2026

coderabbitai bot reviewed Mar 7, 2026

View reviewed changes

bufferflies force-pushed the bugfix/ai branch from 9d2b70f to 65ccf3a Compare March 9, 2026 12:24

coderabbitai bot reviewed Mar 9, 2026

View reviewed changes

bufferflies force-pushed the bugfix/ai branch from 65ccf3a to 1d91e5f Compare March 10, 2026 03:30

ti-chi-bot bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Mar 10, 2026

coderabbitai bot reviewed Mar 10, 2026

View reviewed changes

liyishuai approved these changes Mar 10, 2026

View reviewed changes

bufferflies mentioned this pull request Mar 10, 2026

schedule, server: harden scheduler config input validation #10311

Closed

ti-chi-bot bot removed the do-not-merge/needs-linked-issue label Mar 10, 2026

remove unuseless change

c4f9234

Signed-off-by: tongjian <1045931706@qq.com>

bufferflies force-pushed the bugfix/ai branch from 1d91e5f to c4f9234 Compare March 10, 2026 08:21

ti-chi-bot bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Mar 10, 2026

bufferflies changed the title *: fix potential panic by ai *: fix potential bug by ai Mar 10, 2026

bufferflies requested a review from okJiang March 10, 2026 08:26

okJiang reviewed Mar 10, 2026

View reviewed changes

Conversation

bufferflies commented Mar 7, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What problem does this PR solve?

What is changed and how does it work?

Check List

Release note

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Mar 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Estimated code review effort

Suggested reviewers

Poem

❌ Failed checks (1 warning, 1 inconclusive)

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 7, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 7, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 10, 2026

Choose a reason for hiding this comment

Uh oh!

codecov bot commented Mar 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

liyishuai left a comment

Choose a reason for hiding this comment

Uh oh!

ti-chi-bot bot commented Mar 10, 2026

Uh oh!

ti-chi-bot bot commented Mar 10, 2026

Uh oh!

ti-chi-bot bot commented Mar 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

okJiang left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

bufferflies commented Mar 7, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Mar 7, 2026 •

edited

Loading

codecov bot commented Mar 10, 2026 •

edited

Loading

ti-chi-bot bot commented Mar 10, 2026 •

edited

Loading