feat(smp): track per-task CPU residency to avoid unnecessary cross-core TLB shootdowns #84

li041 · 2025-12-11T08:10:28Z

This PR improves the SMP and IPI infrastructure and introduces per-task CPU residency
tracking to reduce unnecessary cross-core TLB shootdowns.

Background

Cross-core TLB flushes are expensive and should be avoided when possible.
Previously, the kernel lacked sufficient information to determine which CPUs
a task had actually run on, resulting in overly conservative TLB shootdowns.
In addition, the underlying IPI mechanism had several limitations that
prevented reliable cross-core synchronization.

What these PRs do

Fix and enhance the low-level IPI mechanism
- Fix bugs in send_ipi where notifications could only target a single CPU
- Improve IPI synchronization semantics to support reliable cross-core coordination

Enable cross-core TLB flush
- Provide the required IPI-based infrastructure for remote TLB invalidation
- Support lazy TLB flush when possible

Track per-task CPU residency
- Record the set of CPUs a task has run on
- Use this information to limit TLB shootdowns to relevant CPUs only
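
As a rough illustration of how a residency mask could narrow the set of shootdown targets, here is a minimal sketch. `CpuMask` and `shootdown_targets` are hypothetical names made up for this example, and a plain 64-bit mask stands in for `AxCpuMask`; the actual types and IPI calls in this PR may differ.

```rust
/// Hypothetical stand-in for a CPU mask such as `AxCpuMask`.
/// Bit `n` is set when the process has been recorded on CPU `n`.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
pub struct CpuMask(u64);

impl CpuMask {
    /// Sets or clears the bit for `cpu`.
    pub fn set(&mut self, cpu: usize, value: bool) {
        assert!(cpu < 64, "this sketch supports at most 64 CPUs");
        if value {
            self.0 |= 1u64 << cpu;
        } else {
            self.0 &= !(1u64 << cpu);
        }
    }

    /// Returns whether the bit for `cpu` is set.
    pub fn get(&self, cpu: usize) -> bool {
        cpu < 64 && ((self.0 >> cpu) & 1) == 1
    }
}

/// Lists the remote CPUs that would need a TLB-flush IPI.
/// The local CPU is skipped because it can flush its own TLB directly.
pub fn shootdown_targets(residency: CpuMask, this_cpu: usize, cpu_count: usize) -> Vec<usize> {
    (0..cpu_count.min(64))
        .filter(|&cpu| cpu != this_cpu && residency.get(cpu))
        .collect()
}
```

With a mask covering CPUs 0, 2 and 3 and the flush initiated on CPU 2, only CPUs 0 and 3 would receive an IPI; a conservative implementation without the mask would notify every other core.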

Result

- Avoids unnecessary cross-core TLB shootdowns
- Reduces IPI traffic and synchronization overhead
- Improves scalability on SMP systems

Related work

Copilot

Pull request overview

This PR adds per-task CPU residency tracking to optimize TLB (Translation Lookaside Buffer) shootdown operations. By tracking which CPUs a process has run on, the system can avoid sending unnecessary cross-core TLB invalidation IPIs to CPUs that haven't executed the process.

Key changes:

- Added AxCpuMask tracking to ProcessData to record CPU residency
- Implemented on_enter/on_leave hooks to update CPU mask when tasks are scheduled on/off CPUs
- Added on_cpu_mask() method to query which CPUs a task has run on

Copilot · 2025-12-11T08:13:48Z

core/src/task.rs

+        let mut on_cpu_mask = self.proc_data.on_cpu_mask.write();
+        on_cpu_mask.set(axhal::percpu::this_cpu_id(), false);


Race condition: Clearing the CPU bit in on_leave creates a race when multiple threads from the same process run concurrently. Since ProcessData is shared via Arc among all threads in a process, if Thread A from the process leaves CPU X while Thread B from the same process is still running (or about to run) on CPU X, the bit for CPU X will be incorrectly cleared. The comment on line 195 says "The CPUs on which the task has run" which suggests historical tracking, but the current implementation attempts to track "currently running" which requires more sophisticated synchronization or reference counting per CPU.
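
One way to get "currently running" semantics without this race is a per-CPU counter instead of a single bit. The following is only a sketch under assumptions: `CpuResidency`, `MAX_CPUS` and the explicit `cpu` parameter are hypothetical, and the real hooks would obtain the CPU id from `axhal::percpu::this_cpu_id()` as in the diff above.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Assumed upper bound on the number of CPUs for this sketch.
const MAX_CPUS: usize = 8;

/// Hypothetical per-process residency tracker shared by all threads.
/// Each slot counts how many threads of the process are on that CPU.
pub struct CpuResidency {
    running: [AtomicUsize; MAX_CPUS],
}

impl CpuResidency {
    pub fn new() -> Self {
        Self {
            running: std::array::from_fn(|_| AtomicUsize::new(0)),
        }
    }

    /// Called when a thread of the process is scheduled onto `cpu`.
    pub fn on_enter(&self, cpu: usize) {
        self.running[cpu].fetch_add(1, Ordering::AcqRel);
    }

    /// Called when a thread of the process is scheduled off `cpu`.
    /// The CPU stays resident while other threads are still counted.
    pub fn on_leave(&self, cpu: usize) {
        let prev = self.running[cpu].fetch_sub(1, Ordering::AcqRel);
        debug_assert!(prev > 0, "on_leave without a matching on_enter");
    }

    /// Returns true while at least one thread is counted on `cpu`.
    pub fn is_resident(&self, cpu: usize) -> bool {
        self.running[cpu].load(Ordering::Acquire) > 0
    }
}
```

With counters, the order in which Thread A's `on_leave` and Thread B's `on_enter` run on the same CPU no longer matters: the CPU is reported as resident until the last thread has left.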

Copilot · 2025-12-11T08:13:48Z

core/src/task.rs

+    /// The CPUs on which the task has run.
+    pub on_cpu_mask: RwLock<AxCpuMask>,


The documentation states "The CPUs on which the task has run", which implies historical tracking of all CPUs where the process has executed. However, the implementation in on_leave (lines 157-158) clears CPU bits when leaving, suggesting it tracks "currently running CPUs" instead. This discrepancy between documentation and implementation needs to be resolved. Clarify whether this should track historical CPU usage (never clear bits) or current CPU presence (requires reference counting per CPU to handle multiple threads).
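
For the historical interpretation, a minimal sketch could look like the following. `HistoricalCpuMask` is a hypothetical type for illustration, with an atomic 64-bit word standing in for `RwLock<AxCpuMask>`; it is not the PR's implementation.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical set-only mask: bit `n` is set once any thread of the
/// process has run on CPU `n`, and leaving a CPU never clears it.
pub struct HistoricalCpuMask(AtomicU64);

impl HistoricalCpuMask {
    pub const fn new() -> Self {
        Self(AtomicU64::new(0))
    }

    /// Records that a thread of the process was scheduled onto `cpu`.
    pub fn on_enter(&self, cpu: usize) {
        assert!(cpu < 64, "this sketch supports at most 64 CPUs");
        self.0.fetch_or(1u64 << cpu, Ordering::AcqRel);
    }

    /// Intentionally a no-op: historical tracking keeps the bit set.
    pub fn on_leave(&self, _cpu: usize) {}

    /// Returns the raw bitmask of CPUs the process has run on so far.
    pub fn snapshot(&self) -> u64 {
        self.0.load(Ordering::Acquire)
    }
}
```

Because bits are only ever set, concurrent threads of the same process cannot erase each other's state. The trade-off is that the mask can only grow, so shootdowns may still reach CPUs the process left long ago.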

Copilot AI review requested due to automatic review settings December 11, 2025 08:10

Copilot started reviewing on behalf of li041 December 11, 2025 08:10

Copilot AI reviewed Dec 11, 2025

AsakuraMizu mentioned this pull request Dec 16, 2025

feat(ipi): tie ipi to smp feature, add sync mechanism for ipi and support lazy flush tlb Starry-OS/arceos#29

Draft

AsakuraMizu marked this pull request as draft December 16, 2025 07:18

feat(smp): track per-task CPU residency to avoid unnecessary cross-core TLB shootdowns (bee9a77)

li041 force-pushed the main branch from 9bf9b81 to bee9a77 Compare December 23, 2025 06:55

li041 marked this pull request as ready for review December 23, 2025 07:18
