resource_group: set FillRate in non-trickle token bucket responses by JmPotato · Pull Request #10322 · tikv/pd

JmPotato · 2026-03-10T05:55:26Z

What problem does this PR solve?

Issue Number: Close #10251

When the resource group FillRate is much larger than actual consumption (e.g., FillRate=1,000,000 vs consumption≈200 RU/s), requests experience ~1s latency spikes.

Root cause: The server's request() and assignSlotTokens() never set FillRate in non-trickle responses. The client's modifyTokenCounter uses cfg.newFillRate = float64(bucket.GetSettings().FillRate), which gets 0. After the granted tokens are depleted, durationFromTokens() returns InfDuration, causing all subsequent OnRequestWait calls to fail with ErrClientResourceGroupThrottled or context.DeadlineExceeded.

What is changed and how does it work?

Set FillRate on all three non-trickle return paths in the server so the client limiter can refill tokens locally:

No-slot path in request() — when the slot doesn't exist (initial periodic report with requiredToken=0), return FillRate = gtb.getFillRate() (group-level fill rate).
requiredToken<=0 path in assignSlotTokens() — when the slot exists but no tokens are requested, return FillRate = ts.fillRate (per-slot fill rate).
Full-satisfy path in assignSlotTokens() — when the slot has enough capacity to fully satisfy the request, return FillRate = ts.fillRate (per-slot fill rate).

Trickle paths are intentionally left unchanged to avoid double-counting with the client's own trickle-based fill rate calculation (cfg.newFillRate = FillRate + granted/trickleTime).

Set FillRate on all three non-trickle return paths in the server:
1. No-slot path in request() — initial periodic report with no slot yet
2. requiredToken<=0 path in assignSlotTokens() — zero-demand requests
3. Full-satisfy path in assignSlotTokens() — capacity covers demand

Trickle paths are intentionally left unchanged to avoid double-counting
with the client's own trickle-based fill rate calculation.

Check List

Tests

Unit test
Integration test

Code changes

Has no configuration change
Has no HTTP API change
Has no persistent data change

Side effects

No performance regression
No increased code complexity
No breaking backward compatibility

Release note

Fix the issue that the resource group token bucket server does not set FillRate in non-trickle responses, which causes the client limiter to have fillRate=0 and results in ~1s latency spikes when the resource group FillRate is much larger than actual consumption.

Summary by CodeRabbit

Bug Fixes

Improved token bucket rate limiting to consistently propagate fill rates across all token grant scenarios, preventing double-counting issues and ensuring accurate token refilling behavior.

When the server grants tokens in non-trickle mode (capacity sufficient), the response previously carried FillRate=0. This caused the client limiter to have fillRate=0, making durationFromTokens() return InfDuration after the granted tokens were depleted, resulting in ~1s latency spikes. Set FillRate on all three non-trickle return paths in the server: 1. No-slot path in request() — initial periodic report with no slot yet 2. requiredToken<=0 path in assignSlotTokens() — zero-demand requests 3. Full-satisfy path in assignSlotTokens() — capacity covers demand Trickle paths are intentionally left unchanged to avoid double-counting with the client's own trickle-based fill rate calculation. Close tikv#10251 Signed-off-by: JmPotato <github@ipotato.me>

ti-chi-bot · 2026-03-10T05:55:33Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign andremouche for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

coderabbitai · 2026-03-10T05:55:48Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 059f52cd-eb20-4479-a16d-306f36442df8

📥 Commits

Reviewing files that changed from the base of the PR and between 58252c5 and c80979d.

📒 Files selected for processing (3)

pkg/mcs/resourcemanager/server/token_buckets.go
pkg/mcs/resourcemanager/server/token_buckets_test.go
tests/integrations/mcs/resourcemanager/resource_manager_test.go

📝 Walkthrough

Walkthrough

The pull request ensures FillRate is consistently propagated in token bucket settings during slot creation and token assignment across multiple code paths. When a new slot is created or tokens are assigned, the resulting TokenBucket.Settings now includes the appropriate FillRate value, with special handling for trickle versus non-trickle response modes.

Changes

Cohort / File(s)	Summary
Token Bucket FillRate Propagation `pkg/mcs/resourcemanager/server/token_buckets.go`	Added `FillRate` field assignment to `TokenBucket.Settings` in three scenarios: when creating new slots in `GroupTokenBucket.request()`, and when assigning tokens in `tokenSlot.assignSlotTokens()` for both zero-requirement and direct-grant cases.
FillRate Propagation Tests `pkg/mcs/resourcemanager/server/token_buckets_test.go`	Added assertions to verify `FillRate` propagation behavior, ensuring non-trickle responses include the slot's `FillRate` and trickle responses zero it out to prevent double-counting.
Integration Test Coverage `tests/integrations/mcs/resourcemanager/resource_manager_test.go`	Added `TestNonTrickleFillRate` integration test with two scenarios testing `FillRate` propagation in non-trickle token grant paths for resource groups.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

resourcemanager: assign tokens to the slot whose ru tracker isn't initialized #10181: Modifies the same token_buckets.go file with related token allocation and slot initialization behavior, including assignSlotTokens and GroupTokenBucket request paths.

Suggested labels

size/L, release-note-none, lgtm, approved

Suggested reviewers

rleungx
disksing

Poem

🐰 A fill rate lost in the token stream,
Now flowing through each slot with gleam,
No more the empty bucket's cry,
When trickles pause and rates run dry! 🪣✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly and specifically describes the main change: setting FillRate in non-trickle token bucket responses for resource groups.
Description check	✅ Passed	The PR description thoroughly addresses the template requirements: clearly states the linked issue, explains the problem and solution with detailed technical context, includes commit message, and documents test coverage, code changes, and release notes.
Linked Issues check	✅ Passed	The PR successfully addresses issue `#10251` by setting FillRate on all three non-trickle response paths (no-slot path, requiredToken<=0 path, and full-satisfy path) to enable client-side local token refilling and prevent the zero-fillrate starved state.
Out of Scope Changes check	✅ Passed	All changes are tightly scoped to the issue: FillRate propagation in non-trickle paths in token_buckets.go, corresponding test additions in token_buckets_test.go and resource_manager_test.go, with no unrelated modifications.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

ti-chi-bot · 2026-03-10T06:08:54Z

@JmPotato: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name	Commit	Details	Required	Rerun command
pull-unit-test-next-gen-1	`c80979d`	link	true	`/test pull-unit-test-next-gen-1`
pull-unit-test-next-gen-2	`c80979d`	link	true	`/test pull-unit-test-next-gen-2`
pull-unit-test-next-gen-3	`c80979d`	link	true	`/test pull-unit-test-next-gen-3`

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

ti-chi-bot bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. dco-signoff: yes Indicates the PR's author has signed the dco. labels Mar 10, 2026

ti-chi-bot bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Mar 10, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

resource_group: set FillRate in non-trickle token bucket responses#10322

resource_group: set FillRate in non-trickle token bucket responses#10322
JmPotato wants to merge 1 commit intotikv:masterfrom
JmPotato:fix-10251

JmPotato commented Mar 10, 2026 •

edited by coderabbitai bot

Loading

Uh oh!

ti-chi-bot bot commented Mar 10, 2026

Uh oh!

coderabbitai bot commented Mar 10, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested labels

Suggested reviewers

Poem

❌ Failed checks (1 warning)

Uh oh!

ti-chi-bot bot commented Mar 10, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

JmPotato commented Mar 10, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What problem does this PR solve?

What is changed and how does it work?

Check List

Release note

Summary by CodeRabbit

Bug Fixes

Uh oh!

ti-chi-bot bot commented Mar 10, 2026

Uh oh!

coderabbitai bot commented Mar 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested labels

Suggested reviewers

Poem

❌ Failed checks (1 warning)

Uh oh!

ti-chi-bot bot commented Mar 10, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

JmPotato commented Mar 10, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Mar 10, 2026 •

edited

Loading