Skip to content

tests: fix nil panic in TestDisableSchedulingServiceFallback#10282

Open
JmPotato wants to merge 1 commit intotikv:masterfrom
JmPotato:fix/flaky-disable-scheduling-fallback
Open

tests: fix nil panic in TestDisableSchedulingServiceFallback#10282
JmPotato wants to merge 1 commit intotikv:masterfrom
JmPotato:fix/flaky-disable-scheduling-fallback

Conversation

@JmPotato
Copy link
Member

@JmPotato JmPotato commented Mar 3, 2026

What problem does this PR solve?

Issue Number: Close #8926

TestDisableSchedulingServiceFallback is flaky due to a nil pointer dereference inside an assert.Eventually callback. The test calls tc.GetPrimaryServer().GetCluster().IsBackgroundJobsRunning() without nil checks, causing a SIGSEGV panic when either GetPrimaryServer() or GetCluster() returns nil during the scheduling server's initialization window.

Additionally, the test used re.NotNil() (a require assertion) inside an Eventually callback. Since Eventually may run the callback in a goroutine context, calling FailNow() from a non-test goroutine can cause a panic.

What is changed and how does it work?

  1. Replaced re.NotNil() require assertions inside the first Eventually callback with proper nil-check-and-return-false guards, preventing FailNow panics from non-test goroutines.
  2. Added nil guards for GetPrimaryServer() and GetCluster() in the second Eventually callback before calling IsBackgroundJobsRunning(), matching the existing defensive pattern already used in the sibling test TestSchedulingServiceFallback.
Add nil guards in Eventually callbacks to prevent nil pointer dereference.
Replace re.NotNil() require assertions with nil-check-and-return-false
guards to avoid FailNow panics from non-test goroutines.

Check List

Tests

  • Unit test

Release note

None.

Add nil guards for GetPrimaryServer() and GetCluster() inside the
Eventually callback, matching the existing pattern used in
TestSchedulingServiceFallback. Also replace require assertions
(re.NotNil) with nil-check-and-return-false inside the Eventually
callback to avoid FailNow panics from non-test goroutines.

Close tikv#8926

Signed-off-by: JmPotato <github@ipotato.me>
@ti-chi-bot ti-chi-bot bot added release-note-none Denotes a PR that doesn't merit a release note. dco-signoff: yes Indicates the PR's author has signed the dco. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Mar 3, 2026
@coderabbitai
Copy link

coderabbitai bot commented Mar 3, 2026

📝 Walkthrough

Walkthrough

Add defensive null checks in TestDisableSchedulingServiceFallback to prevent nil pointer dereferences when accessing server and cluster objects. Replace direct calls to GetServer() and GetRaftCluster() with guarded access patterns that safely return false if either is nil.

Changes

Cohort / File(s) Summary
Flaky test stabilization
tests/integrations/mcs/scheduling/server_test.go
Added defensive null checks in test callback to guard against nil dereferences on suite.pdLeader.GetServer() and GetRaftCluster(). Restructured call to IsSchedulingControllerRunning() with guarded sequential access pattern returning false if intermediate objects are nil.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

  • tikv/pd#10167 — Applies similar defensive nil-check refactoring patterns across test suites to eliminate flaky nil dereferences and timing issues

Suggested labels

size/S

Suggested reviewers

  • bufferflies
  • lhy1024
  • rleungx

Poem

🐰 A nil that dared to rear its head,
Now guarded safe with checks instead,
No panic crashes in the night,
The flaky test shines stable and bright! ✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly summarizes the main change: fixing a nil panic in the TestDisableSchedulingServiceFallback test.
Linked Issues check ✅ Passed The PR successfully addresses all coding requirements from issue #8926 by adding nil guards in Eventually callbacks and replacing require assertions with explicit nil-check guards to prevent panics.
Out of Scope Changes check ✅ Passed All changes are scoped to fixing the test flakiness in TestDisableSchedulingServiceFallback; no unrelated modifications are present.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Description check ✅ Passed The PR description comprehensively addresses the problem, explains the root causes (nil pointer dereference and FailNow panics in goroutines), and details the exact changes made.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov
Copy link

codecov bot commented Mar 3, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 78.80%. Comparing base (5885cec) to head (d29e625).
⚠️ Report is 16 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff            @@
##           master   #10282    +/-   ##
========================================
  Coverage   78.80%   78.80%            
========================================
  Files         523      525     +2     
  Lines       70529    70824   +295     
========================================
+ Hits        55580    55815   +235     
- Misses      10955    10996    +41     
- Partials     3994     4013    +19     
Flag Coverage Δ
unittests 78.80% <ø> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@JmPotato
Copy link
Member Author

JmPotato commented Mar 4, 2026

/retest

@ti-chi-bot ti-chi-bot bot added the needs-1-more-lgtm Indicates a PR needs 1 more LGTM. label Mar 5, 2026
@ti-chi-bot
Copy link
Contributor

ti-chi-bot bot commented Mar 5, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: okJiang

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot
Copy link
Contributor

ti-chi-bot bot commented Mar 5, 2026

[LGTM Timeline notifier]

Timeline:

  • 2026-03-05 10:14:11.21012936 +0000 UTC m=+439495.788208544: ☑️ agreed by okJiang.

@ti-chi-bot ti-chi-bot bot added the approved label Mar 5, 2026
@okJiang
Copy link
Member

okJiang commented Mar 5, 2026

/retest

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved dco-signoff: yes Indicates the PR's author has signed the dco. needs-1-more-lgtm Indicates a PR needs 1 more LGTM. release-note-none Denotes a PR that doesn't merit a release note. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

TestDisableSchedulingServiceFallback is flaky

2 participants