Conversation
|
Skipping CI for Draft Pull Request. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds a new runtime-only "PD Flaky Fix" skill and a maintained flaky-fix playbook, plus minor codex/gitignore changes to track the skills index. Changes
Sequence Diagram(s)sequenceDiagram
participant Agent
participant Repo as "Repository / Workspace"
participant GitHub as "GitHub Issues & PRs"
participant CI as "CI / Verification"
Agent->>GitHub: Search for flaky PD/TiKV issues
GitHub-->>Agent: Return candidate issue
Agent->>Repo: Create branch, run local tests & root-cause analysis
Repo-->>Agent: Evidence & test results
Agent->>Agent: Apply minimal TDD patch
Agent->>Repo: Run verification commands
Repo-->>Agent: Verification results
Agent->>GitHub: Open draft PR (target: upstream master) with required output contract
GitHub->>CI: Trigger CI for PR
CI-->>GitHub: Report verification status
GitHub-->>Agent: PR created / mergeable state
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Tip Try Coding Plans. Let us write the prompt for your AI agent so you can ship faster (with fewer bugs). Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #10287 +/- ##
==========================================
+ Coverage 78.78% 78.85% +0.07%
==========================================
Files 527 527
Lines 70916 70920 +4
==========================================
+ Hits 55870 55927 +57
+ Misses 11026 10984 -42
+ Partials 4020 4009 -11
Flags with carried forward coverage won't be shown. Click here to find out more. 🚀 New features to boost your workflow:
|
897c922 to
71761a4
Compare
|
/ok-to-retest |
There was a problem hiding this comment.
Actionable comments posted: 6
🧹 Nitpick comments (2)
.codex/skills/pd-flaky-corpus-refresh/scripts/build_flaky_pr_corpus.sh (2)
134-160: Consider documenting the 160 PR limit for gh enrichment.The magic number
160limits how many PRs are enriched via the GitHub API. Consider adding a comment explaining this limit (e.g., rate limiting, performance) or making it configurable via an environment variable.📝 Suggested improvement
+# Limit API enrichment to avoid rate limits and keep refresh fast. +GH_ENRICH_LIMIT = 160 + if gh_enhance: - for row in rows[:160]: + for row in rows[:GH_ENRICH_LIMIT]:🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.codex/skills/pd-flaky-corpus-refresh/scripts/build_flaky_pr_corpus.sh around lines 134 - 160, The hardcoded limit rows[:160] in the gh_enhance enrichment loop is a magic number; make it configurable and document it by replacing the literal with a variable (e.g., pr_enrich_limit) read from an environment variable with a sensible default (160) and add a brief comment explaining it's to control GitHub API/rate/perf impact; update any references in the same block that assume 160 and keep functions like extract_test_names and classify unchanged.
260-260:datetime.utcnow()is deprecated in Python 3.12+ and should be replaced withdatetime.now(datetime.timezone.utc).The deprecated method returns a naive datetime that can be misinterpreted as local time by other operations. Using
datetime.now(datetime.timezone.utc)creates a timezone-aware UTC datetime, which is the recommended approach for forward compatibility.Note: The proposed fix changes the output format from
2026-03-10T14:30:45.123456Zto2026-03-10T14:30:45.123456+00:00(both are valid ISO 8601 UTC representations). If the literal "Z" suffix is required, use.isoformat().replace('+00:00', 'Z')instead.📝 Proposed fix
-summary_lines.append(f"- Generated at: {datetime.datetime.utcnow().isoformat()}Z") +summary_lines.append(f"- Generated at: {datetime.datetime.now(datetime.timezone.utc).isoformat()}")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.codex/skills/pd-flaky-corpus-refresh/scripts/build_flaky_pr_corpus.sh at line 260, Replace the deprecated datetime.datetime.utcnow() call used in the summary_lines.append f-string with a timezone-aware UTC timestamp: call datetime.datetime.now(datetime.timezone.utc) (or, if you need the trailing "Z", use .isoformat().replace('+00:00', 'Z') on that result); ensure datetime.timezone is imported or referenced (e.g., datetime.timezone) so the change applies to the summary_lines.append(f"- Generated at: ...") expression where the timestamp is built.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.codex/skills/pd-flaky-corpus-refresh/references/corpus-method.md:
- Around line 34-35: The docs claim `panic_guard` triggers on "panic or nil" but
the implementation in the classify function of build_flaky_pr_corpus.sh only
checks for "panic"; make them consistent by updating the documentation in
corpus-method.md (the `panic_guard` bullet) to state it is triggered by `panic`
only (or alternatively modify the `classify` function to also detect `nil` if
you intend that behavior); reference the `panic_guard` and `race_elimination`
labels and the `classify` function to locate the code to change.
In @.codex/skills/pd-flaky-corpus-refresh/scripts/build_flaky_pr_corpus.sh:
- Around line 54-75: The classify function currently only matches "panic" to
assign the panic_guard label but corpus-method.md expects it to also match
"nil"; update the classify function so the panic check includes nil (e.g.,
change the condition that adds "panic_guard" to check for "panic" or "nil" in
title_lower and/or kws), or alternatively update corpus-method.md to reflect the
current behavior—locate the classify function and the "panic_guard" branch and
add the "nil" check to align implementation with the documentation.
In @.codex/skills/pd-flaky-corpus-refresh/SKILL.md:
- Around line 13-14: Replace the hardcoded absolute paths
`/Users/jiangxianjie/.codex/skills/pd-flaky-fix/references/flaky-pr-corpus.jsonl`
and
`/Users/jiangxianjie/.codex/skills/pd-flaky-fix/references/flaky-fix-playbook.md`
with repository-relative or environment-driven paths; update SKILL.md to
reference `./references/flaky-pr-corpus.jsonl` and
`./references/flaky-fix-playbook.md` (or use an env var like
$REPO_ROOT/references/... and document the required REPO_ROOT), and ensure any
scripts or readers that consume these paths (search for functions/classes that
parse SKILL.md or load these files) are updated to resolve relative paths
against the repository root or the provided env var.
- Around line 30-46: Replace hardcoded absolute paths in the workflow examples
with repository-relative paths: update the build command to invoke
.codex/skills/pd-flaky-corpus-refresh/scripts/build_flaky_pr_corpus.sh (remove
the absolute PD_REPO=/Users/... prefix or change it to a relative PD_REPO if
needed), and change the verification and summary commands to use
.codex/skills/pd-flaky-fix/references/flaky-pr-corpus.jsonl,
.codex/skills/pd-flaky-fix/references/flaky-fix-playbook.md, and
.codex/skills/pd-flaky-corpus-refresh/references/last-refresh-summary.md
respectively so all examples reference repository-root relative paths instead of
/Users/jiangxianjie/...
In @.codex/skills/pd-flaky-fix/references/flaky-fix-playbook.md:
- Line 104: Fix the typo in the markdown heading: change "cooridinator" to
"coordinator" in the line starting with "#2625: cooridinator: fix the issue
caused a deadlock when deleting scheduler" so the title reads "#2625:
coordinator: fix the issue caused a deadlock when deleting scheduler".
In @.codex/skills/pd-flaky-fix/references/pd-flaky-workflow.md:
- Around line 54-57: Replace the hardcoded absolute paths in the ripgrep
commands with repo-root relative paths: update the two commands that run rg -n
"<TestName>|<keyword>" and rg -n "\"pr_number\"|\"fix_patterns\"" to point to
.codex/skills/pd-flaky-fix/references/flaky-fix-playbook.md and
.codex/skills/pd-flaky-fix/references/flaky-pr-corpus.jsonl respectively (i.e.,
remove the /Users/jiangxianjie/ prefix so the commands use relative paths).
---
Nitpick comments:
In @.codex/skills/pd-flaky-corpus-refresh/scripts/build_flaky_pr_corpus.sh:
- Around line 134-160: The hardcoded limit rows[:160] in the gh_enhance
enrichment loop is a magic number; make it configurable and document it by
replacing the literal with a variable (e.g., pr_enrich_limit) read from an
environment variable with a sensible default (160) and add a brief comment
explaining it's to control GitHub API/rate/perf impact; update any references in
the same block that assume 160 and keep functions like extract_test_names and
classify unchanged.
- Line 260: Replace the deprecated datetime.datetime.utcnow() call used in the
summary_lines.append f-string with a timezone-aware UTC timestamp: call
datetime.datetime.now(datetime.timezone.utc) (or, if you need the trailing "Z",
use .isoformat().replace('+00:00', 'Z') on that result); ensure
datetime.timezone is imported or referenced (e.g., datetime.timezone) so the
change applies to the summary_lines.append(f"- Generated at: ...") expression
where the timestamp is built.
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 88cc8ff3-5b23-4055-8acb-6cdb06fd9d59
📒 Files selected for processing (8)
.codex/skills/pd-flaky-corpus-refresh/SKILL.md.codex/skills/pd-flaky-corpus-refresh/references/corpus-method.md.codex/skills/pd-flaky-corpus-refresh/references/last-refresh-summary.md.codex/skills/pd-flaky-corpus-refresh/scripts/build_flaky_pr_corpus.sh.codex/skills/pd-flaky-fix/SKILL.md.codex/skills/pd-flaky-fix/references/flaky-fix-playbook.md.codex/skills/pd-flaky-fix/references/flaky-pr-corpus.jsonl.codex/skills/pd-flaky-fix/references/pd-flaky-workflow.md
.agents/skills/pd-flaky-corpus-refresh/references/corpus-method.md
Outdated
Show resolved
Hide resolved
.agents/skills/pd-flaky-corpus-refresh/scripts/build_flaky_pr_corpus.sh
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Actionable comments posted: 5
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.agents/skills/pd-flaky-corpus-refresh/scripts/build_flaky_pr_corpus.sh:
- Around line 59-60: The current heuristic only checks title_lower for the
substring "panic" before appending "panic_guard" to pats, which misses
nil-dereference fixes; update the conditional around title_lower and pats so it
also matches common nil-related tokens (e.g., "nil", "nil pointer", "nil deref",
"nil dereference", "nil-pointer", "nilptr", "nil check") and append
"panic_guard" when any of those appear; locate the conditional that references
title_lower and pats and broaden the match set so nil-driven fixes are
classified the same as explicit "panic" fixes.
In @.agents/skills/pd-flaky-corpus-refresh/SKILL.md:
- Around line 12-15: Replace hardcoded user-specific absolute paths with
repo-root/skills-root based variables: use SKILLS_ROOT and PD_REPO environment
variables (falling back to git rev-parse --show-toplevel) where the script
invocation references build_flaky_pr_corpus.sh and where file checks reference
flaky-pr-corpus.jsonl and flaky-fix-playbook.md; update the commands that call
pd-flaky-corpus-refresh/scripts/build_flaky_pr_corpus.sh to export SKILLS_ROOT
and PD_REPO variables and switch wc/rg invocations to use
"$SKILLS_ROOT/pd-flaky-fix/references/flaky-pr-corpus.jsonl" and
"$SKILLS_ROOT/pd-flaky-fix/references/flaky-fix-playbook.md" instead of the
hardcoded /Users/jiangxianjie/... paths.
In @.agents/skills/pd-flaky-fix/references/pd-flaky-workflow.md:
- Around line 55-57: Replace the hardcoded user-specific absolute paths used in
the two ripgrep invocations that search for "<TestName>|<keyword>" and
"\"pr_number\"|\"fix_patterns\"" with a portable SKILLS_ROOT-based lookup:
compute SKILLS_ROOT defaulting to the repository root (via git rev-parse) if
unset and use "$SKILLS_ROOT/pd-flaky-fix/references/..." for both search targets
so the commands work for other users and CI.
- Around line 88-92: The PR creation command hardcodes the fork owner in the
--head argument ("--head okJiang:$(git branch --show-current)"), which breaks
for other users; change it to let gh infer the head or use only the current
branch name (e.g., replace the --head value with the branch name from $(git
branch --show-current) or omit --head entirely) so the command becomes generic
and works for any fork/current user when running the gh pr create invocation.
In @.agents/skills/pd-flaky-fix/SKILL.md:
- Around line 68-71: Update the verification gate text to require a
repo-baseline validation before PR creation by adding a requirement to run `make
check` plus the narrowest relevant `go test` invocation (using appropriate build
tags) for the touched scope; keep the focused `make gotest` step but change the
minimum to at least `make basic-test` for touched packages and ensure full
commands and outcomes are recorded (do not claim success without command
evidence).
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 583f8ba5-84b0-484e-bbe8-cc6ef883f5bf
📒 Files selected for processing (10)
.agents/skills/pd-flaky-corpus-refresh/SKILL.md.agents/skills/pd-flaky-corpus-refresh/references/corpus-method.md.agents/skills/pd-flaky-corpus-refresh/references/last-refresh-summary.md.agents/skills/pd-flaky-corpus-refresh/scripts/build_flaky_pr_corpus.sh.agents/skills/pd-flaky-fix/SKILL.md.agents/skills/pd-flaky-fix/references/flaky-fix-playbook.md.agents/skills/pd-flaky-fix/references/flaky-pr-corpus.jsonl.agents/skills/pd-flaky-fix/references/pd-flaky-workflow.md.codex/skills.gitignore
✅ Files skipped from review due to trivial changes (1)
- .codex/skills
.agents/skills/pd-flaky-corpus-refresh/scripts/build_flaky_pr_corpus.sh
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
🧹 Nitpick comments (1)
.agents/skills/pd-flaky-corpus-refresh/scripts/build_flaky_pr_corpus.sh (1)
134-160: Consider logging suppressed exceptions for debugging.The GH enhancement loop appropriately tolerates API failures (lines 145-146, 159-160), maintaining the
local_gitbaseline when enrichment fails. This aligns with SKILL.md's degradation handling.However, the bare
except Exception: continueon line 159-160 silently swallows all errors, which could make debugging difficult if unexpected issues occur during enrichment.💡 Optional: Add debug logging for suppressed exceptions
except Exception: + # Uncomment for debugging: print(f"GH enrich failed for PR #{pr}: {e}", file=sys.stderr) continueOr use a counter to report how many enrichment attempts failed in the summary.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.agents/skills/pd-flaky-corpus-refresh/scripts/build_flaky_pr_corpus.sh around lines 134 - 160, The bare except in the GH enhancement loop silently swallows errors; update the except block in the for-loop that calls subprocess.run (inside the gh_enhance branch) to log the caught exception and context (e.g., pr number) instead of just continuing — use the existing logger or print to stderr and include exception info (stack/str) and consider incrementing a local failure counter to report how many enrichments failed; ensure the exception handling still continues to the next row so degradation behavior is preserved.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In @.agents/skills/pd-flaky-corpus-refresh/scripts/build_flaky_pr_corpus.sh:
- Around line 134-160: The bare except in the GH enhancement loop silently
swallows errors; update the except block in the for-loop that calls
subprocess.run (inside the gh_enhance branch) to log the caught exception and
context (e.g., pr number) instead of just continuing — use the existing logger
or print to stderr and include exception info (stack/str) and consider
incrementing a local failure counter to report how many enrichments failed;
ensure the exception handling still continues to the next row so degradation
behavior is preserved.
ℹ️ Review info
Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 9c98ac19-d0d8-4343-a63d-db2cde7992bd
📒 Files selected for processing (4)
.agents/skills/pd-flaky-corpus-refresh/SKILL.md.agents/skills/pd-flaky-corpus-refresh/scripts/build_flaky_pr_corpus.sh.agents/skills/pd-flaky-fix/SKILL.md.agents/skills/pd-flaky-fix/references/pd-flaky-workflow.md
|
/retest |
1 similar comment
|
/retest |
Signed-off-by: okjiang <819421878@qq.com>
Signed-off-by: okjiang <819421878@qq.com>
Signed-off-by: okjiang <819421878@qq.com>
Drop the corpus refresh bundle and raw corpus artifact from this PR, inline the runtime workflow into pd-flaky-fix, and simplify the checked-in playbook so the remaining skill stays focused on the reviewed flaky-fix flow. Signed-off-by: okjiang <819421878@qq.com>
d0a6d78 to
2475c70
Compare
|
/retest |
Signed-off-by: okjiang <819421878@qq.com>
What problem does this PR solve?
PD needs a repository-maintained flaky-fix skill that can be reviewed with
the same conventions as the codebase. The earlier version of this PR bundled
infrequent maintenance tooling and a raw generated corpus alongside the runtime
skill, which made the review scope larger than necessary and left some
workflow guidance split across multiple files.
Issue Number: ref #10159
What is changed and how does it work?
This PR keeps only the runtime
pd-flaky-fixskill in the repository andscopes the reviewed content to that flow. The skill now contains its issue
selection, dedupe, branch setup, verification, and draft-PR commands directly
in
SKILL.md, so reviewers can audit the full runtime procedure in one place.The playbook is also simplified so shared guidance is stated once and each
pattern keeps only its fit signals and minimal fix shape.
The maintenance-only
pd-flaky-corpus-refreshbundle and the rawflaky-pr-corpus.jsonlartifact are removed from this iteration. The checked-inplaybook remains as the reviewed summary artifact, and the skill now explicitly
falls back to direct evidence when no historical analog applies. The playbook
also states that representative PR titles are kept verbatim from upstream for
traceability, even when the source title contains a typo.
The
.codex/skillslink and.gitignoreexception remain so the in-repositoryskill is discoverable from the Codex runtime without tracking unrelated local
Codex state.
Check List
Tests
Manual verification:
Code changes
Side effects
Related changes
pingcap/docs/pingcap/docs-cn: N/Apingcap/tiup: N/ARelease note
Summary by CodeRabbit
Documentation
Chores