*: migrate simple failpoint.Inject to failpoint.InjectCall#10300
*: migrate simple failpoint.Inject to failpoint.InjectCall#10300JmPotato wants to merge 1 commit intotikv:masterfrom
Conversation
Migrate 19 simple `failpoint.Inject()` call sites to the newer
`failpoint.InjectCall()` / `failpoint.EnableCall()` APIs. This
eliminates closure bodies from production code and consolidates
test injection logic in test files with type-safe callbacks.
Categories migrated:
- Empty/nil callbacks (5 sites): bare InjectCall with no args
- Sleep-only callbacks (12 sites): move time.Sleep to test-side EnableCall
- Panic callbacks (2 sites): move panic to test-side EnableCall
For "pause" failpoints, the pattern is converted from
`failpoint.Enable(path, "pause")` to a channel-based
`failpoint.EnableCall(path, func() { <-ch })` approach.
Signed-off-by: JmPotato <github@ipotato.me>
📝 WalkthroughWalkthroughThis pull request systematically migrates failpoint invocations across the codebase from using Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~30 minutes Possibly related issues
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 7
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/integrations/mcs/tso/api_test.go`:
- Line 137: The failpoint enabled with
failpoint.EnableCall("github.com/tikv/pd/server/delayStartServerLoop", ...)
should be disabled in a deferred call to guarantee cleanup on test failure:
immediately after calling failpoint.EnableCall, add a defer
failpoint.Disable("github.com/tikv/pd/server/delayStartServerLoop") so the
failpoint is always cleared, and then remove the later explicit
failpoint.Disable call that currently appears at the end of the test.
In `@tests/integrations/mcs/tso/keyspace_group_manager_test.go`:
- Around line 466-467: The failpoint enable call for
"github.com/tikv/pd/pkg/keyspace/pauseFinishSplitBeforeTxn" uses a channel ch
but lacks a fail-safe cleanup; change the code to call failpoint.EnableCall(...)
and immediately defer
failpoint.Disable("github.com/tikv/pd/pkg/keyspace/pauseFinishSplitBeforeTxn")
so the failpoint is disabled on any early return/panic, and then remove the
later unconditional cleanup that disables the same failpoint to avoid
double-handling; reference the ch variable and the EnableCall / Disable symbols
to locate and update the test.
In `@tests/server/cluster/cluster_test.go`:
- Line 880: The failpoint opened with
failpoint.EnableCall("github.com/tikv/pd/server/cluster/versionChangeConcurrency",
...) isn't disabled if an assertion between lines 880–894 fails; immediately
after calling failpoint.EnableCall(...) add a deferred cleanup like defer
failpoint.Disable("github.com/tikv/pd/server/cluster/versionChangeConcurrency")
so the failpoint is always disabled even on test failures, mirroring the pattern
used in TestRegionStatistics and ensuring proper cleanup around the enabled
failpoint.
In `@tests/server/member/member_test.go`:
- Around line 242-243: The failpoint enabled via
failpoint.EnableCall("github.com/tikv/pd/server/raftclusterIsBusy", ...) is not
guaranteed to be disabled on early test exit; after successfully enabling the
failpoint (the call creating ch and calling failpoint.EnableCall) immediately
register cleanup with defer: defer
failpoint.Disable("github.com/tikv/pd/server/raftclusterIsBusy") and also defer
close(ch) (or otherwise signal the goroutine) so the failpoint is always
disabled and the channel closed even if the test fails early.
In `@tests/server/server_test.go`:
- Around line 225-226: The test enables a failpoint with
failpoint.EnableCall("github.com/tikv/pd/server/delayProcess", func() { <-ch })
and creates a channel ch but does not guarantee cleanup on test exit; wrap the
enable + channel creation with a deferred cleanup that uses a sync.Once to both
disable the failpoint
(failpoint.Disable("github.com/tikv/pd/server/delayProcess")) and close ch so
that regardless of assertion failures or panics the failpoint is disabled and
the blocked goroutines are released; place the sync.Once and the defer
immediately after creating ch and calling failpoint.EnableCall to ensure the
cleanup always runs.
In `@tests/server/tso/tso_test.go`:
- Line 138: Enable the failpoint with
failpoint.EnableCall("github.com/tikv/pd/pkg/tso/delaySyncTimestamp", ...) and
immediately register cleanup to disable it (e.g., defer
failpoint.DisableCall("github.com/tikv/pd/pkg/tso/delaySyncTimestamp") or
t.Cleanup(func(){
failpoint.DisableCall("github.com/tikv/pd/pkg/tso/delaySyncTimestamp") })) right
after the EnableCall succeeds so any early test failures won't leak the
failpoint into other tests.
In `@tools/pd-ctl/tests/keyspace/keyspace_test.go`:
- Line 55: The failpoint is enabled via failpoint.EnableCall (wrapped by
re.NoError) but not guaranteed to be disabled if the test fails early;
immediately after enabling the failpoint (the call to failpoint.EnableCall),
register a teardown that disables it (use failpoint.DisableCall or
failpoint.Disable) via t.Cleanup(...) or defer to ensure the failpoint is always
turned off for that test; update the test that calls failpoint.EnableCall to add
this cleanup registration right after the successful enable so later tests are
not polluted.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: b1456bfe-b696-4dfc-a1e0-23bdec775603
📒 Files selected for processing (33)
client/clients/tso/client.gopkg/election/leadership.gopkg/election/leadership_test.gopkg/gc/gc_state_manager.gopkg/gc/gc_state_manager_test.gopkg/keyspace/tso_keyspace_group.gopkg/mcs/scheduling/server/cluster.gopkg/schedule/checker/checker_controller.gopkg/schedule/checker/rule_checker_test.gopkg/schedule/operator/operator_controller.gopkg/schedule/operator/operator_controller_test.gopkg/storage/kv/etcd_kv.gopkg/storage/storage_tso_test.gopkg/tso/tso.goserver/api/region.goserver/cluster/cluster.goserver/cluster/cluster_test.goserver/grpc_service.goserver/server.gotests/integrations/client/client_test.gotests/integrations/mcs/router/server_test.gotests/integrations/mcs/tso/api_test.gotests/integrations/mcs/tso/keyspace_group_manager_test.gotests/integrations/tso/client_test.gotests/server/api/region_test.gotests/server/api/version_test.gotests/server/cluster/cluster_test.gotests/server/member/member_test.gotests/server/server_test.gotests/server/tso/tso_test.gotests/server/watch/leader_watch_test.gotools/pd-ctl/tests/keyspace/keyspace_group_test.gotools/pd-ctl/tests/keyspace/keyspace_test.go
| func TestTSOServerStartFirst(t *testing.T) { | ||
| re := require.New(t) | ||
| re.NoError(failpoint.Enable("github.com/tikv/pd/server/delayStartServerLoop", `return(true)`)) | ||
| re.NoError(failpoint.EnableCall("github.com/tikv/pd/server/delayStartServerLoop", func() { time.Sleep(2 * time.Second) })) |
There was a problem hiding this comment.
Defer failpoint disable to ensure cleanup on test failure.
The failpoint.Disable at line 198 is not deferred. If any assertion fails between lines 137 and 198, the failpoint remains enabled and could affect subsequent tests. Based on learnings: "Keep failpoints enabled only for tests; disable immediately after."
🛡️ Proposed fix to defer disable
re.NoError(failpoint.EnableCall("github.com/tikv/pd/server/delayStartServerLoop", func() { time.Sleep(2 * time.Second) }))
+ defer failpoint.Disable("github.com/tikv/pd/server/delayStartServerLoop")
ctx, cancel := context.WithCancel(context.Background())Then remove the explicit disable at line 198:
- re.NoError(failpoint.Disable("github.com/tikv/pd/server/delayStartServerLoop"))
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| re.NoError(failpoint.EnableCall("github.com/tikv/pd/server/delayStartServerLoop", func() { time.Sleep(2 * time.Second) })) | |
| re.NoError(failpoint.EnableCall("github.com/tikv/pd/server/delayStartServerLoop", func() { time.Sleep(2 * time.Second) })) | |
| defer failpoint.Disable("github.com/tikv/pd/server/delayStartServerLoop") | |
| ctx, cancel := context.WithCancel(context.Background()) |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/integrations/mcs/tso/api_test.go` at line 137, The failpoint enabled
with failpoint.EnableCall("github.com/tikv/pd/server/delayStartServerLoop", ...)
should be disabled in a deferred call to guarantee cleanup on test failure:
immediately after calling failpoint.EnableCall, add a defer
failpoint.Disable("github.com/tikv/pd/server/delayStartServerLoop") so the
failpoint is always cleared, and then remove the later explicit
failpoint.Disable call that currently appears at the end of the test.
| ch := make(chan struct{}) | ||
| re.NoError(failpoint.EnableCall("github.com/tikv/pd/pkg/keyspace/pauseFinishSplitBeforeTxn", func() { <-ch })) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, find and read the test file around lines 466-525
cat -n tests/integrations/mcs/tso/keyspace_group_manager_test.go | sed -n '450,530p'Repository: tikv/pd
Length of output: 3923
Use defer to make failpoint cleanup failure-safe.
The test has multiple early-exit paths (panics from Must* functions and re.NoError() assertions at lines 471, 481, 487, 491, 497, 500, 503–505, 514–516, 519–520). If the test fails before reaching line 524–525, the failpoint remains enabled and contaminates subsequent tests.
Suggested fix
ch := make(chan struct{})
re.NoError(failpoint.EnableCall("github.com/tikv/pd/pkg/keyspace/pauseFinishSplitBeforeTxn", func() { <-ch }))
+defer func() {
+ select {
+ case <-ch:
+ default:
+ close(ch)
+ }
+ re.NoError(failpoint.Disable("github.com/tikv/pd/pkg/keyspace/pauseFinishSplitBeforeTxn"))
+}()
// Create the keyspace group `oldID` with keyspaces [444, 555, 666].
oldID := suite.allocID()Then remove the unconditional cleanup at lines 524–525.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| ch := make(chan struct{}) | |
| re.NoError(failpoint.EnableCall("github.com/tikv/pd/pkg/keyspace/pauseFinishSplitBeforeTxn", func() { <-ch })) | |
| ch := make(chan struct{}) | |
| re.NoError(failpoint.EnableCall("github.com/tikv/pd/pkg/keyspace/pauseFinishSplitBeforeTxn", func() { <-ch })) | |
| defer func() { | |
| select { | |
| case <-ch: | |
| default: | |
| close(ch) | |
| } | |
| re.NoError(failpoint.Disable("github.com/tikv/pd/pkg/keyspace/pauseFinishSplitBeforeTxn")) | |
| }() |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/integrations/mcs/tso/keyspace_group_manager_test.go` around lines 466 -
467, The failpoint enable call for
"github.com/tikv/pd/pkg/keyspace/pauseFinishSplitBeforeTxn" uses a channel ch
but lacks a fail-safe cleanup; change the code to call failpoint.EnableCall(...)
and immediately defer
failpoint.Disable("github.com/tikv/pd/pkg/keyspace/pauseFinishSplitBeforeTxn")
so the failpoint is disabled on any early return/panic, and then remove the
later unconditional cleanup that disables the same failpoint to avoid
double-handling; reference the ch variable and the EnableCall / Disable symbols
to locate and update the test.
| store := newMetaStore(storeID, "mock://tikv-1:1", "2.1.0", metapb.StoreState_Up, getTestDeployPath(storeID)) | ||
| var wg sync.WaitGroup | ||
| re.NoError(failpoint.Enable("github.com/tikv/pd/server/cluster/versionChangeConcurrency", `return(true)`)) | ||
| re.NoError(failpoint.EnableCall("github.com/tikv/pd/server/cluster/versionChangeConcurrency", func() { time.Sleep(500 * time.Millisecond) })) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, locate and inspect the test file around lines 880-895
head -n 900 tests/server/cluster/cluster_test.go | tail -n 30Repository: tikv/pd
Length of output: 1186
🏁 Script executed:
# Get more context around the failpoint enable/disable
sed -n '870,900p' tests/server/cluster/cluster_test.goRepository: tikv/pd
Length of output: 1206
🏁 Script executed:
# Let's look at the broader function context to see if there's a defer pattern or cleanup strategy
sed -n '850,900p' tests/server/cluster/cluster_test.goRepository: tikv/pd
Length of output: 1769
🏁 Script executed:
# Check if there are other tests in this file that use defer for failpoint cleanup
rg -A 2 "failpoint.EnableCall" tests/server/cluster/cluster_test.goRepository: tikv/pd
Length of output: 217
🏁 Script executed:
# Check if other tests in the file use defer for failpoint cleanup
rg -B 5 "failpoint.Disable" tests/server/cluster/cluster_test.go | head -60Repository: tikv/pd
Length of output: 2553
🏁 Script executed:
# Search across the entire codebase for failpoint defer patterns
rg -A 1 "defer.*failpoint.Disable" --type go | head -30Repository: tikv/pd
Length of output: 33
🏁 Script executed:
# Look at the test function name and surrounding context to get complete picture
sed -n '860,905p' tests/server/cluster/cluster_test.goRepository: tikv/pd
Length of output: 1606
Defer the failpoint disable to ensure cleanup even if an assertion fails.
The test has multiple assertions between the failpoint.EnableCall at line 880 and the failpoint.Disable at line 894. If any assertion fails (lines 888–889), the disable never executes, leaving the failpoint enabled for subsequent tests. Use a deferred cleanup immediately after enabling the failpoint, as shown in similar tests in this file (e.g., TestRegionStatistics).
Suggested fix
re.NoError(failpoint.EnableCall("github.com/tikv/pd/server/cluster/versionChangeConcurrency", func() { time.Sleep(500 * time.Millisecond) }))
+defer func() {
+ re.NoError(failpoint.Disable("github.com/tikv/pd/server/cluster/versionChangeConcurrency"))
+}()🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/server/cluster/cluster_test.go` at line 880, The failpoint opened with
failpoint.EnableCall("github.com/tikv/pd/server/cluster/versionChangeConcurrency",
...) isn't disabled if an assertion between lines 880–894 fails; immediately
after calling failpoint.EnableCall(...) add a deferred cleanup like defer
failpoint.Disable("github.com/tikv/pd/server/cluster/versionChangeConcurrency")
so the failpoint is always disabled even on test failures, mirroring the pattern
used in TestRegionStatistics and ensuring proper cleanup around the enabled
failpoint.
| ch := make(chan struct{}) | ||
| re.NoError(failpoint.EnableCall("github.com/tikv/pd/server/raftclusterIsBusy", func() { <-ch })) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
cat -n tests/server/member/member_test.go | sed -n '230,260p'Repository: tikv/pd
Length of output: 1197
🏁 Script executed:
cat -n tests/server/member/member_test.go | sed -n '220,250p'Repository: tikv/pd
Length of output: 1276
Guard failpoint teardown with defer in this test.
Cleanup currently depends on reaching the tail. Early failures at lines 244–247 can leave the failpoint enabled and block goroutines, violating the requirement to disable failpoints immediately after test execution.
Suggested fix
ch := make(chan struct{})
re.NoError(failpoint.EnableCall("github.com/tikv/pd/server/raftclusterIsBusy", func() { <-ch }))
+defer func() {
+ select {
+ case <-ch:
+ default:
+ close(ch)
+ }
+ re.NoError(failpoint.Disable("github.com/tikv/pd/server/raftclusterIsBusy"))
+}()
post(t, re, addr1+"/pd/api/v1/leader/resign", "")
leader2 := waitLeaderChange(re, cluster, leader1)
t.Log("leader2:", leader2)
re.NotEqual(leader1, leader2)
-close(ch)
-re.NoError(failpoint.Disable("github.com/tikv/pd/server/raftclusterIsBusy"))📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| ch := make(chan struct{}) | |
| re.NoError(failpoint.EnableCall("github.com/tikv/pd/server/raftclusterIsBusy", func() { <-ch })) | |
| ch := make(chan struct{}) | |
| re.NoError(failpoint.EnableCall("github.com/tikv/pd/server/raftclusterIsBusy", func() { <-ch })) | |
| defer func() { | |
| select { | |
| case <-ch: | |
| default: | |
| close(ch) | |
| } | |
| re.NoError(failpoint.Disable("github.com/tikv/pd/server/raftclusterIsBusy")) | |
| }() | |
| post(t, re, addr1+"/pd/api/v1/leader/resign", "") | |
| leader2 := waitLeaderChange(re, cluster, leader1) | |
| t.Log("leader2:", leader2) | |
| re.NotEqual(leader1, leader2) |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/server/member/member_test.go` around lines 242 - 243, The failpoint
enabled via failpoint.EnableCall("github.com/tikv/pd/server/raftclusterIsBusy",
...) is not guaranteed to be disabled on early test exit; after successfully
enabling the failpoint (the call creating ch and calling failpoint.EnableCall)
immediately register cleanup with defer: defer
failpoint.Disable("github.com/tikv/pd/server/raftclusterIsBusy") and also defer
close(ch) (or otherwise signal the goroutine) so the failpoint is always
disabled and the channel closed even if the test fails early.
| ch := make(chan struct{}) | ||
| re.NoError(failpoint.EnableCall("github.com/tikv/pd/server/delayProcess", func() { <-ch })) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Inspect the test function containing lines 225-260 to understand full context
sed -n '180,280p' tests/server/server_test.goRepository: tikv/pd
Length of output: 2934
🏁 Script executed:
#!/bin/bash
# Confirm: are there any function-level defer statements for this test?
sed -n '1,225p' tests/server/server_test.go | grep -n "func\|defer" | tail -20Repository: tikv/pd
Length of output: 665
🏁 Script executed:
#!/bin/bash
# Inspect the exact line numbers: find TestGRPCRateLimit and pinpoint failpoint operations
grep -n "failpoint.EnableCall\|failpoint.Disable\|close(ch)" tests/server/server_test.go | grep -A5 -B5 "delayProcess"Repository: tikv/pd
Length of output: 239
🏁 Script executed:
#!/bin/bash
# Final confirmation: check what assertions are between failpoint enable and cleanup
sed -n '226,261p' tests/server/server_test.goRepository: tikv/pd
Length of output: 1099
Guarantee failpoint and channel cleanup on every test exit path
At Line 226, the failpoint enable is not guarded by deferred cleanup. Multiple assertions follow (lines 251–252), and if any fails, the failpoint remains enabled and the channel remains unclosed. This causes:
- Failpoint state to leak and pollute subsequent tests
- Goroutines spawned at lines 227–250 to remain blocked on
<-ch
Wrap cleanup with defer and sync.Once to ensure it runs regardless of assertion failures:
Proposed fix
ch := make(chan struct{})
+ var releaseOnce sync.Once
+ release := func() {
+ releaseOnce.Do(func() {
+ close(ch)
+ re.NoError(failpoint.Disable("github.com/tikv/pd/server/delayProcess"))
+ })
+ }
re.NoError(failpoint.EnableCall("github.com/tikv/pd/server/delayProcess", func() { <-ch }))
+ defer release()
var wg sync.WaitGroup
wg.Add(2)
go func() {
defer wg.Done()
resp, err := grpcPDClient.GetRegion(context.Background(), &pdpb.GetRegionRequest{
Header: &pdpb.RequestHeader{ClusterId: leaderServer.GetClusterID()},
RegionKey: []byte(""),
})
re.Empty(resp.GetHeader().GetError())
if err != nil {
errCh <- err.Error()
} else {
okCh <- struct{}{}
}
}()
grpcPDClient1, conn1 := testutil.MustNewGrpcClient(re, addr)
defer conn1.Close()
go func() {
defer wg.Done()
resp, err := grpcPDClient1.GetRegion(context.Background(), &pdpb.GetRegionRequest{
Header: &pdpb.RequestHeader{ClusterId: leaderServer.GetClusterID()},
RegionKey: []byte(""),
})
re.Empty(resp.GetHeader().GetError())
if err != nil {
errCh <- err.Error()
} else {
okCh <- struct{}{}
}
}()
errStr := <-errCh
re.Contains(errStr, "rate limit exceeded")
- close(ch)
- re.NoError(failpoint.Disable("github.com/tikv/pd/server/delayProcess"))
+ release()
<-okCh
wg.Wait()🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/server/server_test.go` around lines 225 - 226, The test enables a
failpoint with failpoint.EnableCall("github.com/tikv/pd/server/delayProcess",
func() { <-ch }) and creates a channel ch but does not guarantee cleanup on test
exit; wrap the enable + channel creation with a deferred cleanup that uses a
sync.Once to both disable the failpoint
(failpoint.Disable("github.com/tikv/pd/server/delayProcess")) and close ch so
that regardless of assertion failures or panics the failpoint is disabled and
the blocked goroutines are released; place the sync.Once and the defer
immediately after creating ch and calling failpoint.EnableCall to ensure the
cleanup always runs.
| } | ||
|
|
||
| re.NoError(failpoint.Enable("github.com/tikv/pd/pkg/tso/delaySyncTimestamp", `return(true)`)) | ||
| re.NoError(failpoint.EnableCall("github.com/tikv/pd/pkg/tso/delaySyncTimestamp", func() { time.Sleep(time.Second) })) |
There was a problem hiding this comment.
Register failpoint disable immediately after enabling
Line 138 enables a failpoint but cleanup is only executed at Line 156. Any earlier fatal assertion can leak the failpoint into subsequent tests.
Suggested fix
re.NoError(failpoint.EnableCall("github.com/tikv/pd/pkg/tso/delaySyncTimestamp", func() { time.Sleep(time.Second) }))
+defer func() {
+ re.NoError(failpoint.Disable("github.com/tikv/pd/pkg/tso/delaySyncTimestamp"))
+}()
...
-re.NoError(failpoint.Disable("github.com/tikv/pd/pkg/tso/delaySyncTimestamp"))🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/server/tso/tso_test.go` at line 138, Enable the failpoint with
failpoint.EnableCall("github.com/tikv/pd/pkg/tso/delaySyncTimestamp", ...) and
immediately register cleanup to disable it (e.g., defer
failpoint.DisableCall("github.com/tikv/pd/pkg/tso/delaySyncTimestamp") or
t.Cleanup(func(){
failpoint.DisableCall("github.com/tikv/pd/pkg/tso/delaySyncTimestamp") })) right
after the EnableCall succeeds so any early test failures won't leak the
failpoint into other tests.
| re.NoError(failpoint.Enable("github.com/tikv/pd/pkg/keyspace/acceleratedAllocNodes", `return(true)`)) | ||
| re.NoError(failpoint.Enable("github.com/tikv/pd/pkg/tso/fastGroupSplitPatroller", `return(true)`)) | ||
| re.NoError(failpoint.Enable("github.com/tikv/pd/server/delayStartServerLoop", `return(true)`)) | ||
| re.NoError(failpoint.EnableCall("github.com/tikv/pd/server/delayStartServerLoop", func() { time.Sleep(2 * time.Second) })) |
There was a problem hiding this comment.
Use cleanup-based failpoint teardown to avoid cross-test pollution
Line 55 enables a failpoint, but teardown depends on reaching Line 119. If any re.NoError fails earlier, this failpoint can remain enabled and affect later tests. Please register disable with t.Cleanup (or immediate defer) at enable time.
Suggested fix
re.NoError(failpoint.Enable("github.com/tikv/pd/pkg/keyspace/acceleratedAllocNodes", `return(true)`))
re.NoError(failpoint.Enable("github.com/tikv/pd/pkg/tso/fastGroupSplitPatroller", `return(true)`))
re.NoError(failpoint.EnableCall("github.com/tikv/pd/server/delayStartServerLoop", func() { time.Sleep(2 * time.Second) }))
+t.Cleanup(func() {
+ re.NoError(failpoint.Disable("github.com/tikv/pd/server/delayStartServerLoop"))
+ re.NoError(failpoint.Disable("github.com/tikv/pd/pkg/tso/fastGroupSplitPatroller"))
+ re.NoError(failpoint.Disable("github.com/tikv/pd/pkg/keyspace/acceleratedAllocNodes"))
+})
...
-re.NoError(failpoint.Disable("github.com/tikv/pd/pkg/keyspace/acceleratedAllocNodes"))
-re.NoError(failpoint.Disable("github.com/tikv/pd/pkg/tso/fastGroupSplitPatroller"))
-re.NoError(failpoint.Disable("github.com/tikv/pd/server/delayStartServerLoop"))🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tools/pd-ctl/tests/keyspace/keyspace_test.go` at line 55, The failpoint is
enabled via failpoint.EnableCall (wrapped by re.NoError) but not guaranteed to
be disabled if the test fails early; immediately after enabling the failpoint
(the call to failpoint.EnableCall), register a teardown that disables it (use
failpoint.DisableCall or failpoint.Disable) via t.Cleanup(...) or defer to
ensure the failpoint is always turned off for that test; update the test that
calls failpoint.EnableCall to add this cleanup registration right after the
successful enable so later tests are not polluted.
| @@ -125,8 +125,10 @@ func TestExitWatch(t *testing.T) { | |||
| re.NoError(failpoint.Enable("github.com/tikv/pd/pkg/utils/etcdutil/fastTick", "return(true)")) | |||
There was a problem hiding this comment.
Does this pr intend to exhaustively cover the migration of all failpoint.Injects? Found one missing here
There was a problem hiding this comment.
This PR only covers three "simple" patterns: empty/nil callbacks, sleep-only callbacks, and panic-only callbacks. The fastTick failpoints here modify local variables inside the closure (e.g., unhealthyTimeout = 5 * time.Second), which requires a different migration strategy (passing pointers via InjectCall). Will handle these in a follow-up PR.
|
@YuhaoZhang00: adding LGTM is restricted to approvers and reviewers in OWNERS files. DetailsIn response to this: Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: YuhaoZhang00 The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
/retest |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #10300 +/- ##
==========================================
+ Coverage 78.78% 78.83% +0.05%
==========================================
Files 525 527 +2
Lines 70824 70890 +66
==========================================
+ Hits 55796 55889 +93
+ Misses 11004 10995 -9
+ Partials 4024 4006 -18
Flags with carried forward coverage won't be shown. Click here to find out more. 🚀 New features to boost your workflow:
|
What problem does this PR solve?
Issue Number: ref #10297
What is changed and how does it work?
Migrate 19 simple
failpoint.Inject()call sites to the newerfailpoint.InjectCall()/failpoint.EnableCall()APIs. This eliminatesclosure bodies from production code and consolidates test injection logic
in test files with type-safe callbacks.
Categories migrated:
Inject("name", func(){})/nil→ bareInjectCall("name")time.Sleepto test-sideEnableCallpanicto test-sideEnableCallFor "pause" failpoints, the pattern is converted from
failpoint.Enable(path, "pause")to a channel-basedfailpoint.EnableCall(path, func() { <-ch })approach.Check List
Tests
Release note
Summary by CodeRabbit
Tests
Chores