Intel Arc Pro B60 - Host Freeze with VFIO Passthrough and ReBAR Enabled

# Intel Arc Pro B60 - VFIO Passthrough Issue and Small BAR Compute Question

## Description

I am experiencing host system freezes when using Intel Arc Pro B60 24GB with VFIO/KVM passthrough on Proxmox VE. I've done some investigation and would like to ask if there are any known workarounds or if my assumptions about the cause are correct.

---

## Environment

### Proxmox Host

| Component | Details |
|-----------|---------|
| **GPU** | Intel Arc Pro B60 24GB (PCI ID: 8086:e211) |
| **GPU Firmware** | BMG_21.1174 |
| **GOP (vBIOS)** | 23.0.1061 |
| **CPU** | Intel Core Ultra 7 265K |
| **Motherboard** | ASUS PRIME Z890-P WIFI-CSM (BIOS 2401) |
| **RAM** | 96GB DDR5 |
| **Host OS** | Proxmox VE 9.x |
| **Host Kernel** | 6.17.4-1-pve |

### Guest VM

| Component | Details |
|-----------|---------|
| **OS** | Ubuntu 24.04.3 LTS |
| **Kernel** | 6.14.0-1008-intel |
| **Driver** | xe |

---

## What I Observed

### With ReBAR Enabled

**At VM startup**, I see this warning:

```
QEMU: kvm: warning: vfio_container_dma_map(0x380000000000, 0x800000000, ...) = -22 (Invalid argument)
0000:04:00.0: PCI peer-to-peer transactions on BARs are not supported
```

**BAR configuration** (from Proxmox host):

```
Region 2: Memory at a000000000 (64-bit, prefetchable) [size=32G]
BAR 2: current size: 32GB, supported: 256MB 512MB 1GB 2GB 4GB 8GB 16GB 32GB
```

**IOMMU configuration** (from Proxmox host):

```
DMAR: Host address width 42
```

**After 6-24 hours**, the host freezes. I observed three different freeze behaviors:

1. **Full Host Freeze**: Entire Proxmox becomes unresponsive, requires hard reset
2. **Partial Host Freeze**: VMs continue running, web UI partially works, but shell/SSH becomes inaccessible
3. **VM Freeze**: VM shows "running (internal-error)", host stays responsive

**Kernel soft lockup** (captured from console before host became unresponsive):

```
[33275.695601] watchdog: BUG: soft lockup - CPU#14 stuck for 5212s! [ps:360867]
[33255.695867]  ? x64_sys_call+0x1151/0x2330
[33255.695868]  ? do_syscall_64+0xb8/0xa30
[33255.695869]  ? x64_sys_call+0x2093/0x2330
[33255.695869]  ? do_syscall_64+0xb8/0xa30
[33255.695870]  ? count_memcg_events+0xd7/0x1a0
[33255.695872]  ? handle_mm_fault+0x254/0x370
[33255.695874]  ? do_user_addr_fault+0x2f8/0x830
[33255.695875]  ? irqentry_exit_to_user_mode+0x2e/0x290
[33255.695877]  ? irqentry_exit+0x43/0x50
[33255.695878]  ? exc_page_fault+0x90/0x1b0
[33255.695879]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
```

The CPU was stuck for ~87 minutes (5212 seconds) before the watchdog reported it.

### With ReBAR Manually Disabled

To test, I manually limited the BAR to 256MB via VFIO-PCI rebind on Proxmox host (and disabled ReBAR option in BIOS): 

```bash
echo "0000:04:00.0" > /sys/bus/pci/drivers/vfio-pci/unbind
echo "0000:04:00.0" > /sys/bus/pci/drivers/vfio-pci/bind

lspci -vvv -s 04:00.0 | grep "Region 2"
# Region 2: Memory at a000000000 (64-bit, prefetchable) [size=256M]

qm start 101

dmesg | tail -20 | grep -i "vfio\|dma\|p2p"
# [611.243671] vfio-pci 0000:04:00.0: resetting
# [611.350629] vfio-pci 0000:04:00.0: reset done
# [611.375782] vfio-pci 0000:05:00.0: enabling
```

**Result**: The host no longer freezes.

However, inside the VM I see:

```
$ lspci -vvs 01:00.0 | grep Region
    Region 0: Memory at 80000000 (64-bit, non-prefetchable) [size=16M]
    Region 2: Memory at 380000000000 (64-bit, prefetchable) [size=256M]

$ python3 -c "import torch; print(torch.xpu.is_available(), torch.xpu.device_count())"

WARNING: Small BAR detected for device 0000:01:00.0
XPU device count is zero!
False 0
```

GPU compute is disabled with small BAR configuration.

---

## What I Already Tried

| Fix Attempted | Result |
|---------------|--------|
| Update GPU firmware in a bare-metal Windows machine (Windows driver 101.8306) | Freeze still occurred |
| Update BIOS (2207 → 2401) | Freeze still occurred (escalated from VM freeze to full host freeze) |
| Disable ASPM (via setpci) | Freeze still occurred |
| Enable GPU reset methods ("bus") | Immediate host crash when restarting VM |
| Kernel parameters (pcie_aspm=off, etc.) | Freeze still occurred |

---

## My Assumptions (I Could Be Wrong)

Based on what I observed:

1. The DMA mapping failure (`-22 EINVAL`) at address `0x380000000000` with size `0x800000000` might be related to the host's 42-bit IOMMU address width?

2. The host freeze might occur when something tries to access memory regions that weren't properly mapped due to the DMA mapping failure?

3. The kernel soft lockup call stack (`exc_page_fault` → `handle_mm_fault`) suggests a page fault handler is waiting for a PCIe/DMA transaction that never completes?

4. The "Small BAR detected" warning in the VM causes the compute runtime to disable XPU functionality?

---

## Questions

1. Is there a known workaround for using Arc Pro B60 with VFIO passthrough and ReBAR?

2. Are there any kernel parameters, QEMU options, or driver settings I might have missed?

3. Is there a way to enable compute functionality with small BAR configuration? I understand there might be performance tradeoffs.

4. Are my assumptions about the cause reasonable, or should I be looking at something else?

5. Should I be filing this somewhere else (xe driver GitLab, QEMU, etc.)?

---

## Environment Summary

```
Host:
  Proxmox VE 9.x, Kernel 6.17.4-1-pve
  ASUS PRIME Z890-P WIFI-CSM (BIOS 2401)
  Intel Core Ultra 7 265K, 96GB DDR5
  IOMMU: Host address width 42

GPU:
  Intel Arc Pro B60 24GB (8086:e211)
  Firmware: BMG_21.1174
  GOP: 23.0.1061

Guest VM:
  Ubuntu 24.04.3 LTS, Kernel 6.14.0-1008-intel
  Driver: xe
  PyTorch: 2.7.0+xpu
```
---

*Report date: 2026-01-07*  
*This report was prepared with assistance from Claude Code.*


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Intel Arc Pro B60 - Host Freeze with VFIO Passthrough and ReBAR Enabled #880

Intel Arc Pro B60 - VFIO Passthrough Issue and Small BAR Compute Question

Description

Environment

Proxmox Host

Guest VM

What I Observed

With ReBAR Enabled

With ReBAR Manually Disabled

What I Already Tried

My Assumptions (I Could Be Wrong)

Questions

Environment Summary

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Component	Details
GPU	Intel Arc Pro B60 24GB (PCI ID: 8086:e211)
GPU Firmware	BMG_21.1174
GOP (vBIOS)	23.0.1061
CPU	Intel Core Ultra 7 265K
Motherboard	ASUS PRIME Z890-P WIFI-CSM (BIOS 2401)
RAM	96GB DDR5
Host OS	Proxmox VE 9.x
Host Kernel	6.17.4-1-pve

Fix Attempted	Result
Update GPU firmware in a bare-metal Windows machine (Windows driver 101.8306)	Freeze still occurred
Update BIOS (2207 → 2401)	Freeze still occurred (escalated from VM freeze to full host freeze)
Disable ASPM (via setpci)	Freeze still occurred
Enable GPU reset methods ("bus")	Immediate host crash when restarting VM
Kernel parameters (pcie_aspm=off, etc.)	Freeze still occurred

Intel Arc Pro B60 - Host Freeze with VFIO Passthrough and ReBAR Enabled #880

Description

Intel Arc Pro B60 - VFIO Passthrough Issue and Small BAR Compute Question

Description

Environment

Proxmox Host

Guest VM

What I Observed

With ReBAR Enabled

With ReBAR Manually Disabled

What I Already Tried

My Assumptions (I Could Be Wrong)

Questions

Environment Summary

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions