ABI: Convert dual CR3 to single CR3 #19

@laijs

Background:

PVM is a pagetable-based virtualization system in which kernel and user space are separated by using separate page tables. Currently, a process's address space is defined by two values: the CR3 register and MSR_PVM_SWITCH_CR3 (a PVM virtual MSR). An address-space switch is performed by the hypercall PVM_HC_LOAD_PGTBL, which loads both CR3 and MSR_PVM_SWITCH_CR3; a switch between kernel and user mode swaps the two values.
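The current behavior can be illustrated with a toy model. This is only a sketch: `struct toy_vcpu`, `toy_load_pgtbl()`, and `toy_switch_mode()` are hypothetical names invented for illustration, the two-argument form is a simplification of the real hypercall, and none of this is the actual PVM code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of the dual-CR3 vCPU state; not the real PVM structures. */
struct toy_vcpu {
    uint64_t cr3;        /* page-table root currently in use */
    uint64_t switch_cr3; /* stands in for the switch-CR3 virtual MSR */
    bool user_mode;      /* false = kernel mode, true = user mode */
};

/*
 * Models PVM_HC_LOAD_PGTBL as described above: issued from kernel mode,
 * it loads both roots at once. The kernel root becomes active and the
 * user root is parked in the switch-CR3 slot.
 */
void toy_load_pgtbl(struct toy_vcpu *v, uint64_t kernel_pgd, uint64_t user_pgd)
{
    assert(!v->user_mode);
    v->cr3 = kernel_pgd;
    v->switch_cr3 = user_pgd;
}

/* Models a kernel<->user mode switch: the two roots are swapped. */
void toy_switch_mode(struct toy_vcpu *v)
{
    uint64_t tmp = v->cr3;

    v->cr3 = v->switch_cr3;
    v->switch_cr3 = tmp;
    v->user_mode = !v->user_mode;
}
```

In this model the guest supplies both roots, which is why it carries the responsibility for keeping kernel and user mappings separate.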

Goal:

This ABI deviates from the native x86 architecture, where a single CR3 defines the address space, and should be converted to use a single CR3. Using a single CR3 does not eliminate the separation: the hypervisor will manage two underlying shadow page tables to maintain proper kernel/user isolation. This change would also remove MSR_PVM_SWITCH_CR3 and the user_pgd argument from PVM_HC_LOAD_PGTBL.
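One way to picture the proposed single-CR3 model is sketched below. It assumes the hypervisor keeps a pair of shadow roots for each guest CR3 and picks one based on the current mode; `struct toy_shadow_pair`, `toy_load_cr3()`, and `toy_active_root()` are hypothetical names, and the shadow roots are passed in rather than built, so this shows only the selection logic and not a real implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-address-space record kept by the hypervisor. */
struct toy_shadow_pair {
    uint64_t guest_cr3;   /* the single CR3 value the guest sees */
    uint64_t kernel_root; /* shadow root used while the guest is in kernel mode */
    uint64_t user_root;   /* shadow root used while the guest is in user mode */
};

struct toy_single_vcpu {
    struct toy_shadow_pair shadow;
    bool user_mode; /* false = kernel mode, true = user mode */
};

/*
 * The guest loads one CR3. The hypervisor associates it with two shadow
 * roots (supplied by the caller here; a real hypervisor would build them).
 */
void toy_load_cr3(struct toy_single_vcpu *v, uint64_t guest_cr3,
                  uint64_t kernel_root, uint64_t user_root)
{
    v->shadow.guest_cr3 = guest_cr3;
    v->shadow.kernel_root = kernel_root;
    v->shadow.user_root = user_root;
}

/* Returns the shadow root to run on; the guest-visible CR3 never changes. */
uint64_t toy_active_root(const struct toy_single_vcpu *v)
{
    return v->user_mode ? v->shadow.user_root : v->shadow.kernel_root;
}
```

The guest-visible state shrinks to one value, while the choice between the two roots moves from the guest into the hypervisor.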

Benefits of the current dual-CR3 design:
• The guest explicitly manages which pages belong to kernel CR3 and which to user CR3, taking responsibility for proper separation.
• It allows reuse of the existing Linux kernel KPTI (Kernel Page Table Isolation) logic inside the PVM guest — the main reason why the current dual-CR3 implementation is relatively simple.

Drawbacks of dual-CR3:
• It deviates from the native x86 architecture, making the ABI less clear.
• Future kernels may remove KPTI once CPUs affected by the Meltdown bug are obsolete (possibly in 10–20 years), making this approach unsustainable long-term.
• It wastes an extra 4 KB root page table per process in the guest.

After adopting a single-CR3 model:
Pros:
• Clear, native x86-compliant ABI.

Cons:
• More complex logic required in the hypervisor to carefully manage shadow page tables that distinguish between kernel and user mappings.
• The new implementation must go beyond simple KPTI and fully emulate native x86 behavior.
