feat(smp): track per-task CPU residency to avoid unnecessary cross-core TLB shootdowns #84
base: main
Pull request overview
This PR adds per-task CPU residency tracking to optimize TLB (Translation Lookaside Buffer) shootdown operations. By tracking which CPUs a process has run on, the system can avoid sending unnecessary cross-core TLB invalidation IPIs to CPUs that haven't executed the process.
Key changes:
- Added AxCpuMask tracking to ProcessData to record CPU residency
- Implemented on_enter/on_leave hooks to update the CPU mask when tasks are scheduled on/off CPUs
- Added on_cpu_mask() method to query which CPUs a task has run on
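As a rough illustration of how a residency mask can narrow the set of shootdown targets, here is a minimal sketch. It is not the PR's code: `CpuMask` is a hypothetical 64-bit stand-in for AxCpuMask, and `shootdown_targets` is an invented helper name.

```rust
/// Hypothetical stand-in for AxCpuMask: one bit per CPU, up to 64 CPUs.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
struct CpuMask(u64);

impl CpuMask {
    /// Sets or clears the bit for `cpu`.
    fn set(&mut self, cpu: usize, value: bool) {
        assert!(cpu < 64, "this sketch supports at most 64 CPUs");
        if value {
            self.0 |= 1u64 << cpu;
        } else {
            self.0 &= !(1u64 << cpu);
        }
    }

    /// Returns whether the bit for `cpu` is set.
    fn get(&self, cpu: usize) -> bool {
        cpu < 64 && ((self.0 >> cpu) & 1) == 1
    }
}

/// Returns the remote CPUs that would need a TLB invalidation IPI:
/// every CPU in the residency mask except the one issuing the flush,
/// which can invalidate its own TLB locally.
fn shootdown_targets(residency: CpuMask, this_cpu: usize, num_cpus: usize) -> Vec<usize> {
    (0..num_cpus)
        .filter(|&cpu| cpu != this_cpu && residency.get(cpu))
        .collect()
}

fn main() {
    // Assume the process has run on CPUs 0 and 2 of a 4-CPU system.
    let mut residency = CpuMask::default();
    residency.set(0, true);
    residency.set(2, true);

    // Flushing from CPU 0 targets only CPU 2; CPUs 1 and 3 are skipped.
    let targets = shootdown_targets(residency, 0, 4);
    println!("IPI targets: {:?}", targets);
}
```

Without the mask, the conservative choice is to send the IPI to every other CPU in the system.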
core/src/task.rs
Outdated
let mut on_cpu_mask = self.proc_data.on_cpu_mask.write();
on_cpu_mask.set(axhal::percpu::this_cpu_id(), false);
Copilot
AI
Dec 11, 2025
Race condition: Clearing the CPU bit in on_leave creates a race when multiple threads from the same process run concurrently. Since ProcessData is shared via Arc among all threads in a process, if Thread A from the process leaves CPU X while Thread B from the same process is still running (or about to run) on CPU X, the bit for CPU X will be incorrectly cleared. The comment on line 195 says "The CPUs on which the task has run" which suggests historical tracking, but the current implementation attempts to track "currently running" which requires more sophisticated synchronization or reference counting per CPU.
/// The CPUs on which the task has run.
pub on_cpu_mask: RwLock<AxCpuMask>,
Copilot
AI
Dec 11, 2025
The documentation states "The CPUs on which the task has run", which implies historical tracking of all CPUs where the process has executed. However, the implementation in on_leave (lines 157-158) clears CPU bits when leaving, suggesting it tracks "currently running CPUs" instead. This discrepancy between documentation and implementation needs to be resolved. Clarify whether this should track historical CPU usage (never clear bits) or current CPU presence (which requires reference counting per CPU to handle multiple threads).
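For the "current CPU presence" option, one possible shape of per-CPU reference counting is sketched below. This is an assumption-laden illustration, not a proposed patch: `CpuResidency`, `MAX_CPUS`, and the `mask()` helper are hypothetical names, and the sketch uses `std` atomics in place of the kernel's own primitives. It addresses only the counting race described above.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Assumed upper bound on CPUs, chosen only for this sketch.
const MAX_CPUS: usize = 8;

/// Hypothetical per-process tracker: one counter per CPU, counting how many
/// threads of the process have entered that CPU and not yet left it.
struct CpuResidency {
    running: [AtomicUsize; MAX_CPUS],
}

impl CpuResidency {
    fn new() -> Self {
        Self {
            running: std::array::from_fn(|_| AtomicUsize::new(0)),
        }
    }

    /// Called when a thread of the process is scheduled onto `cpu`.
    fn on_enter(&self, cpu: usize) {
        self.running[cpu].fetch_add(1, Ordering::AcqRel);
    }

    /// Called when a thread of the process is scheduled off `cpu`.
    /// The CPU stays marked as long as another thread's enter is outstanding.
    fn on_leave(&self, cpu: usize) {
        let prev = self.running[cpu].fetch_sub(1, Ordering::AcqRel);
        debug_assert!(prev > 0, "on_leave without a matching on_enter");
    }

    /// Bitmask of CPUs with a nonzero count.
    fn mask(&self) -> u64 {
        self.running
            .iter()
            .enumerate()
            .filter(|(_, count)| count.load(Ordering::Acquire) > 0)
            .fold(0u64, |mask, (cpu, _)| mask | (1u64 << cpu))
    }
}

fn main() {
    let residency = CpuResidency::new();

    // Thread B's enter on CPU 1 is recorded before thread A's leave.
    residency.on_enter(1); // thread A
    residency.on_enter(1); // thread B
    residency.on_leave(1); // thread A leaves; CPU 1 must stay marked
    println!("mask after A leaves: {:#b}", residency.mask());

    residency.on_leave(1); // thread B leaves; CPU 1 is now clear
    println!("mask after B leaves: {:#b}", residency.mask());
}
```

With a plain bitmask, thread A's leave in the sequence above would clear CPU 1 even though thread B is still there, which is the race the first comment describes.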
…re TLB shootdowns
This PR improves the SMP and IPI infrastructure and introduces per-task CPU residency
tracking to reduce unnecessary cross-core TLB shootdowns.
Background
Cross-core TLB flushes are expensive and should be avoided when possible.
Previously, the kernel lacked sufficient information to determine which CPUs
a task had actually run on, resulting in overly conservative TLB shootdowns.
In addition, the underlying IPI mechanism had several limitations that
prevented reliable cross-core synchronization.
What these PRs do
send_ipi where notifications could only target a single CPU
Result
Related work