@Wescoeur
Contributor

A VDI can have the `JRN_RELINK` tag, which triggers a parent change.
With the LVM driver this can fail after a host reboot, because in that
situation the parent is not active. Before this fix:
```
Jan  2 16:01:16 xcp-host-1 SMGC: [3802739] Coalescing *85ff84a8[VHD](65.000G///33.836G|n) -> *cf4a78da[VHD](65.000G///52.555G|n)
Jan  2 16:01:16 xcp-host-1 SMGC: [3802739] ==> Coalesce apparently already done: skipping
Jan  2 16:01:16 xcp-host-1 SM: [3802739] lock: tried lock /var/lock/sm/1f74d512-a410-be6f-c816-8a1e43ea1801/sr, acquired: True (exists: True)
Jan  2 16:01:16 xcp-host-1 SMGC: [3802739] Got sm-config for 568e74bb[VHD](65.000G///65.133G|n): {'relinking': 'True', 'import_task': 'OpaqueRef:28d03111-0fee-0a21-e659-a1e90a1f3a91', 'vdi_type': 'vhd', 'vhd-parent': '85ff84a8-153a-4fff-9c59-0fbab710d373'}
Jan  2 16:01:16 xcp-host-1 SMGC: [3802739] Set relinking = True for 568e74bb[VHD](65.000G///65.133G|n)
Jan  2 16:01:16 xcp-host-1 SMGC: [3802739] Got sm-config for 568e74bb[VHD](65.000G///65.133G|n): {'relinking': 'True', 'import_task': 'OpaqueRef:28d03111-0fee-0a21-e659-a1e90a1f3a91', 'vdi_type': 'vhd', 'vhd-parent': '85ff84a8-153a-4fff-9c59-0fbab710d373'}
Jan  2 16:01:16 xcp-host-1 SMGC: [3802739] Got sm-config for dd9c344a[VHD](65.000G///8.000M|n): {'relinking': 'True', 'import_task': 'OpaqueRef:28d03111-0fee-0a21-e659-a1e90a1f3a91', 'vdi_type': 'vhd', 'vhd-parent': '85ff84a8-153a-4fff-9c59-0fbab710d373'}
Jan  2 16:01:16 xcp-host-1 SMGC: [3802739] Set relinking = True for dd9c344a[VHD](65.000G///8.000M|n)
Jan  2 16:01:16 xcp-host-1 SMGC: [3802739] Got sm-config for dd9c344a[VHD](65.000G///8.000M|n): {'relinking': 'True', 'import_task': 'OpaqueRef:28d03111-0fee-0a21-e659-a1e90a1f3a91', 'vdi_type': 'vhd', 'vhd-parent': '85ff84a8-153a-4fff-9c59-0fbab710d373'}
Jan  2 16:01:16 xcp-host-1 SM: [3802739] LVMCache: refreshing
Jan  2 16:01:16 xcp-host-1 fairlock[3906]: /run/fairlock/devicemapper acquired
Jan  2 16:01:16 xcp-host-1 fairlock[3906]: /run/fairlock/devicemapper sent '3802739 - 20917096.949649855-'
Jan  2 16:01:16 xcp-host-1 SM: [3802739] ['/sbin/lvs', '--noheadings', '--units', 'b', '-o', '+lv_tags', '/dev/VG_XenStorage-1f74d512-a410-be6f-c816-8a1e43ea1801']
Jan  2 16:01:16 xcp-host-1 SM: [3802739]   pread SUCCESS
Jan  2 16:01:16 xcp-host-1 fairlock[3906]: /run/fairlock/devicemapper released
Jan  2 16:01:16 xcp-host-1 SM: [3802739] ['/usr/bin/vhd-util', 'scan', '-f', '-m', 'VHD-*', '-l', 'VG_XenStorage-1f74d512-a410-be6f-c816-8a1e43ea1801']
Jan  2 16:01:18 xcp-host-1 SM: [3802739]   pread SUCCESS
Jan  2 16:01:18 xcp-host-1 SMGC: [3802739] SR 1f74 ('sr001-clu001-tdeaz-az07-svc-data23') (538 VDIs in 68 VHD trees): no changes
Jan  2 16:01:18 xcp-host-1 SMGC: [3802739]   Relinking 568e74bb[VHD](65.000G///65.133G|n) from *85ff84a8[VHD](65.000G///33.836G|n) to *cf4a78da[VHD](65.000G///52.555G|n)
Jan  2 16:01:18 xcp-host-1 SM: [3802739] lock: opening lock file /var/lock/sm/lvm-1f74d512-a410-be6f-c816-8a1e43ea1801/568e74bb-3eab-4e15-974a-a34a4bf7d779
Jan  2 16:01:18 xcp-host-1 SM: [3802739] lock: acquired /var/lock/sm/lvm-1f74d512-a410-be6f-c816-8a1e43ea1801/568e74bb-3eab-4e15-974a-a34a4bf7d779
Jan  2 16:01:18 xcp-host-1 SM: [3802739] Refcount for lvm-1f74d512-a410-be6f-c816-8a1e43ea1801:568e74bb-3eab-4e15-974a-a34a4bf7d779 (0, 0) + (1, 0) => (1, 0)
Jan  2 16:01:18 xcp-host-1 SM: [3802739] Refcount for lvm-1f74d512-a410-be6f-c816-8a1e43ea1801:568e74bb-3eab-4e15-974a-a34a4bf7d779 set => (1, 0b)
Jan  2 16:01:18 xcp-host-1 fairlock[3906]: /run/fairlock/devicemapper acquired
Jan  2 16:01:18 xcp-host-1 SM: [3802739] ['/sbin/lvchange', '-ay', '/dev/VG_XenStorage-1f74d512-a410-be6f-c816-8a1e43ea1801/VHD-568e74bb-3eab-4e15-974a-a34a4bf7d779']
Jan  2 16:01:18 xcp-host-1 fairlock[3906]: /run/fairlock/devicemapper sent '3802739 - 20917099.141269115-'
Jan  2 16:01:18 xcp-host-1 SM: [3802739]   pread SUCCESS
Jan  2 16:01:18 xcp-host-1 fairlock[3906]: /run/fairlock/devicemapper released
Jan  2 16:01:18 xcp-host-1 SM: [3802739] lock: released /var/lock/sm/lvm-1f74d512-a410-be6f-c816-8a1e43ea1801/568e74bb-3eab-4e15-974a-a34a4bf7d779
Jan  2 16:01:18 xcp-host-1 SM: [3802739] ['/usr/bin/vhd-util', 'modify', '--debug', '-p', '/dev/VG_XenStorage-1f74d512-a410-be6f-c816-8a1e43ea1801/VHD-cf4a78da-316f-4fd9-a82d-07a47c142a39', '-n', '/dev/VG_XenStorage-1f74d512-a410-be6f-c816-8a1e43ea1801/VHD-568e74bb-3eab-4e15-974a-a34a4bf7d779']
Jan  2 16:01:18 xcp-host-1 SM: [3802739] FAILED in util.pread: (rc 2) stdout: 'failed to set parent to '/dev/VG_XenStorage-1f74d512-a410-be6f-c816-8a1e43ea1801/VHD-cf4a78da-316f-4fd9-a82d-07a47c142a39': -2
Jan  2 16:01:18 xcp-host-1 SM: [3802739] ', stderr: ''
Jan  2 16:01:18 xcp-host-1 SM: [3802739] lock: released /var/lock/sm/1f74d512-a410-be6f-c816-8a1e43ea1801/sr
Jan  2 16:01:18 xcp-host-1 SMGC: [3802739] *~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*
Jan  2 16:01:18 xcp-host-1 SMGC: [3802739]          ***********************
Jan  2 16:01:18 xcp-host-1 SMGC: [3802739]          *  E X C E P T I O N  *
Jan  2 16:01:18 xcp-host-1 SMGC: [3802739]          ***********************
Jan  2 16:01:18 xcp-host-1 SMGC: [3802739] coalesce: EXCEPTION <class 'util.CommandException'>, No such file or directory
Jan  2 16:01:18 xcp-host-1 SMGC: [3802739]   File "/opt/xensource/sm/cleanup.py", line 2024, in coalesce
Jan  2 16:01:18 xcp-host-1 SMGC: [3802739]     self._coalesce(vdi)
Jan  2 16:01:18 xcp-host-1 SMGC: [3802739]   File "/opt/xensource/sm/cleanup.py", line 2228, in _coalesce
Jan  2 16:01:18 xcp-host-1 SMGC: [3802739]     vdi._relinkSkip()
Jan  2 16:01:18 xcp-host-1 SMGC: [3802739]   File "/opt/xensource/sm/cleanup.py", line 963, in _relinkSkip
Jan  2 16:01:18 xcp-host-1 SMGC: [3802739]     child._setParent(self.parent)
Jan  2 16:01:18 xcp-host-1 SMGC: [3802739]   File "/opt/xensource/sm/cleanup.py", line 1396, in _setParent
Jan  2 16:01:18 xcp-host-1 SMGC: [3802739]     vhdutil.setParent(self.path, parent.path, parent.raw)
Jan  2 16:01:18 xcp-host-1 SMGC: [3802739]   File "/opt/xensource/sm/vhdutil.py", line 215, in setParent
Jan  2 16:01:18 xcp-host-1 SMGC: [3802739]     ioretry(cmd)
Jan  2 16:01:18 xcp-host-1 SMGC: [3802739]   File "/opt/xensource/sm/vhdutil.py", line 94, in ioretry
Jan  2 16:01:18 xcp-host-1 SMGC: [3802739]     errlist=[errno.EIO, errno.EAGAIN])
Jan  2 16:01:18 xcp-host-1 SMGC: [3802739]   File "/opt/xensource/sm/util.py", line 347, in ioretry
Jan  2 16:01:18 xcp-host-1 SMGC: [3802739]     return f()
Jan  2 16:01:18 xcp-host-1 SMGC: [3802739]   File "/opt/xensource/sm/vhdutil.py", line 93, in <lambda>
Jan  2 16:01:18 xcp-host-1 SMGC: [3802739]     return util.ioretry(lambda: util.pread2(cmd, text=text),
Jan  2 16:01:18 xcp-host-1 SMGC: [3802739]   File "/opt/xensource/sm/util.py", line 255, in pread2
Jan  2 16:01:18 xcp-host-1 SMGC: [3802739]     return pread(cmdlist, quiet=quiet, text=text)
Jan  2 16:01:18 xcp-host-1 SMGC: [3802739]   File "/opt/xensource/sm/util.py", line 217, in pread
Jan  2 16:01:18 xcp-host-1 SMGC: [3802739]     raise CommandException(rc, str(cmdlist), stderr.strip())
Jan  2 16:01:18 xcp-host-1 SMGC: [3802739]
Jan  2 16:01:18 xcp-host-1 SMGC: [3802739] *~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*
Jan  2 16:01:18 xcp-host-1 SMGC: [3802739] Coalesce failed, skipping
```
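
In the log above, only the child volume (`VHD-568e74bb-...`) is activated with `lvchange -ay` before `vhd-util modify` runs, so the path of the new parent (`VHD-cf4a78da-...`) does not exist and the call fails with rc 2 (`No such file or directory`). The diff itself is not reproduced here; the following is only a minimal sketch of the idea, assuming the remedy is to activate the parent's logical volume before relinking. The names `relink_commands`, `relink` and the `run` parameter are hypothetical and are not part of the SM code base; the two command lines are copied from the log.

```python
import subprocess
from typing import Callable, List, Sequence

# Hypothetical runner type: anything that executes one command line.
Runner = Callable[[Sequence[str]], object]


def _run(cmd: Sequence[str]) -> None:
    """Default runner: execute the command and raise on a non-zero exit."""
    subprocess.run(list(cmd), check=True)


def relink_commands(vg_path: str, child_uuid: str, parent_uuid: str) -> List[List[str]]:
    """Build the command sequence for relinking a child VHD to a new parent.

    Assumption: both logical volumes must be active so that their device
    nodes exist before vhd-util rewrites the parent locator.
    """
    child = f"{vg_path}/VHD-{child_uuid}"
    parent = f"{vg_path}/VHD-{parent_uuid}"
    return [
        ["/sbin/lvchange", "-ay", child],
        # The step that is missing in the failing log: activate the parent too.
        ["/sbin/lvchange", "-ay", parent],
        ["/usr/bin/vhd-util", "modify", "--debug", "-p", parent, "-n", child],
    ]


def relink(vg_path: str, child_uuid: str, parent_uuid: str, run: Runner = _run) -> None:
    """Run the relink sequence in order, stopping at the first failure."""
    for cmd in relink_commands(vg_path, child_uuid, parent_uuid):
        run(cmd)
```

Passing a recording function as `run` (for example `calls.append` on a list) lets the ordering be checked without touching LVM. A real implementation would also need to deactivate the parent again and respect the SM reference counting visible in the log, which this sketch leaves out.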

Signed-off-by: Ronan Abhamon <ronan.abhamon@vates.tech>
@MarkSymsCtx MarkSymsCtx merged commit 5b01f83 into xapi-project:master Jan 15, 2026
2 checks passed
@Wescoeur Wescoeur deleted the ran-fix-relink-reboot-upstream branch January 15, 2026 09:45