STP for Live Update NetworkAttachmentDefinition Reference (VEP 140)#39
STP for Live Update NetworkAttachmentDefinition Reference (VEP 140)#39yossisegev wants to merge 1 commit intoRedHatQE:mainfrom
Conversation
WalkthroughA new test plan document for Live Update NetworkAttachmentDefinition Reference (Hotpluggable NAD Ref) feature was added, detailing QE strategy, test scenarios, and requirements traceability for testing live migration of VMs with updated NAD references. Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes 🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Report bugs in Issues Welcome! 🎉This pull request will be automatically processed with the following features: 🔄 Automatic Actions
📋 Available CommandsPR Status Management
Review & Approval
Testing & Validation
Cherry-pick Operations
Label Management
✅ Merge RequirementsThis PR will be automatically approved when the following conditions are met:
📊 Review ProcessApprovers and ReviewersApprovers:
Reviewers:
Available Labels
💡 Tips
For more information, please refer to the project documentation or contact the maintainers. |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@stps/sig-network/hotpluggable-nad-ref-stp.md`:
- Line 32: Fix the typos across the document by replacing the incorrect tokens
with the corrected spellings: change "NAd reference" to "NAD reference",
"dependecny" -> "dependency", "implmenetation" -> "implementation", "backgrond"
-> "background", "letncy cannot be measured,for example" -> "latency cannot be
measured, for example", "evetually" -> "eventually", "is mot affected" -> "is
not affected", "dependancy" -> "dependency", "migartion" -> "migration",
"refelcted" -> "reflected", "environemnt" -> "environment", and "migratin" ->
"migration"; ensure you update each occurrence of the quoted phrases (e.g., the
header/text containing "NAd reference", the paragraphs with
"dependecny"/"implmenetation", the sentence with "letncy...", etc.) so the
document reads correctly and consistently.
- Around line 156-162: Update the traceability table to add a Requirement row
for the "P1 feature-gate disabled behavior" mapping the in-scope item "Verify
behavior when LiveUpdateNADRef feature-gate is disabled (attempting to change
NAD reference produces failure/error)" to a Test Scenario that attempts a
networkName update with the feature-gate off and asserts failure/error, and add
another Requirement row for "MAC preservation" mapping the in-scope item
"Verifying that guest interface identity (e.g. MAC) is preserved where
applicable" to a Test Scenario that changes NAD reference and verifies the guest
MAC is unchanged; also update the existing VEP-140 (NFR) regression row to
explicitly mention "interface hot-plug regression" in the Requirement
Summary/Test Scenario to reflect the specific regression concern named elsewhere
in the document.
|
|
||
| | Check | Done | Details/Notes | Comments | | ||
| |:---------------------------------------|:-----|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------| | ||
| | **Review Requirements** | [V] | VEP #140: Change NAD reference on running VM via migration; bridge binding; no RestartRequired when only networkName (i.e. the NAd reference) changes; migration triggered when networkName changes. | | |
There was a problem hiding this comment.
Fix spelling/typo errors across the document.
Several typos are present. Line 91 is the most impactful — "is mot affected" inverts the intended meaning.
| Line | Current | Corrected |
|---|---|---|
| 32 | NAd reference |
NAD reference |
| 47 | dependecny, implmenetation |
dependency, implementation |
| 83 | backgrond |
background |
| 85 | letncy cannot be measured,for example |
latency cannot be measured, for example |
| 87 | evetually |
eventually |
| 91 | is mot affected |
is not affected |
| 108 | dependancy |
dependency |
| 137 | migartion, refelcted |
migration, reflected |
| 138 | environemnt |
environment |
| 149 | migratin |
migration |
Also applies to: 47-47, 83-83, 85-85, 87-87, 91-91, 108-108, 137-137, 138-138, 149-149
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@stps/sig-network/hotpluggable-nad-ref-stp.md` at line 32, Fix the typos
across the document by replacing the incorrect tokens with the corrected
spellings: change "NAd reference" to "NAD reference", "dependecny" ->
"dependency", "implmenetation" -> "implementation", "backgrond" -> "background",
"letncy cannot be measured,for example" -> "latency cannot be measured, for
example", "evetually" -> "eventually", "is mot affected" -> "is not affected",
"dependancy" -> "dependency", "migartion" -> "migration", "refelcted" ->
"reflected", "environemnt" -> "environment", and "migratin" -> "migration";
ensure you update each occurrence of the quoted phrases (e.g., the header/text
containing "NAd reference", the paragraphs with "dependecny"/"implmenetation",
the sentence with "letncy...", etc.) so the document reads correctly and
consistently.
| | Requirement ID | Requirement Summary | Test Scenario(s) | Tier | Priority | | ||
| |:-----------------|:------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------|:-------|:---------| | ||
| | VEP-140 (Goal) | Change NAD reference on running VM (bridge) via migration; VM on new network and has connectivity | Change networkName on running VM → wait for migration → verify VM attached to new NAD and connectivity | Tier 2 | P0 | | ||
| | VEP-140 (Design) | Migration triggered and completes after networkName update | Update networkName → verify migration requested and completes | Tier 1 | P0 | | ||
| | VEP-140 (Design) | RestartRequired not set when only networkName changes | Update only networkName → assert RestartRequired not added | Tier 1 | P1 | | ||
| | VEP-140 (NFR) | No regression in secondary network and migration | Existing secondary network and migration tests pass | Tier 2 | P1 | | ||
|
|
There was a problem hiding this comment.
Traceability table is missing rows for the P1 feature-gate-disabled goal and the MAC preservation in-scope item.
Two items explicitly listed as in-scope or as testing goals have no corresponding traceability row:
- P1 feature-gate disabled behavior (line 63): "Verify behavior when LiveUpdateNADRef feature-gate is disabled (e.g. attempting to change NAD reference produces failure/error)" — there is no row in the table for this scenario.
- MAC preservation (line 58): "Verifying that guest interface identity (e.g. MAC) is preserved where applicable" — this is listed as in-scope but is not mapped to any test scenario or requirement row.
Additionally, the VEP-140 (NFR) regression row (line 161) does not explicitly call out interface hot-plug regression, which is the specific regression concern named on line 64.
📋 Suggested additions to the traceability table
| VEP-140 (NFR) | No regression in secondary network and migration | Existing secondary network and migration tests pass | Tier 2 | P1 |
+| | No regression in interface hot-plug flows | Existing interface hot-plug E2E tests pass unchanged | Tier 2 | P1 |
+| VEP-140 (Design) | LiveUpdateNADRef feature-gate disabled: NAD ref change is rejected | Disable LiveUpdateNADRef gate → update networkName → assert error/rejection returned | Tier 1 | P1 |
+| VEP-140 (Design) | Guest interface identity (MAC) preserved after NAD ref update and migration | Update networkName → migration completes → verify MAC unchanged on guest interface | Tier 2 | P1 |📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| | Requirement ID | Requirement Summary | Test Scenario(s) | Tier | Priority | | |
| |:-----------------|:------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------|:-------|:---------| | |
| | VEP-140 (Goal) | Change NAD reference on running VM (bridge) via migration; VM on new network and has connectivity | Change networkName on running VM → wait for migration → verify VM attached to new NAD and connectivity | Tier 2 | P0 | | |
| | VEP-140 (Design) | Migration triggered and completes after networkName update | Update networkName → verify migration requested and completes | Tier 1 | P0 | | |
| | VEP-140 (Design) | RestartRequired not set when only networkName changes | Update only networkName → assert RestartRequired not added | Tier 1 | P1 | | |
| | VEP-140 (NFR) | No regression in secondary network and migration | Existing secondary network and migration tests pass | Tier 2 | P1 | | |
| | Requirement ID | Requirement Summary | Test Scenario(s) | Tier | Priority | | |
| |:-----------------|:------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------|:-------|:---------| | |
| | VEP-140 (Goal) | Change NAD reference on running VM (bridge) via migration; VM on new network and has connectivity | Change networkName on running VM → wait for migration → verify VM attached to new NAD and connectivity | Tier 2 | P0 | | |
| | VEP-140 (Design) | Migration triggered and completes after networkName update | Update networkName → verify migration requested and completes | Tier 1 | P0 | | |
| | VEP-140 (Design) | RestartRequired not set when only networkName changes | Update only networkName → assert RestartRequired not added | Tier 1 | P1 | | |
| | VEP-140 (NFR) | No regression in secondary network and migration | Existing secondary network and migration tests pass | Tier 2 | P1 | | |
| | | No regression in interface hot-plug flows | Existing interface hot-plug E2E tests pass unchanged | Tier 2 | P1 | | |
| | VEP-140 (Design) | LiveUpdateNADRef feature-gate disabled: NAD ref change is rejected | Disable LiveUpdateNADRef gate → update networkName → assert error/rejection returned | Tier 1 | P1 | | |
| | VEP-140 (Design) | Guest interface identity (MAC) preserved after NAD ref update and migration | Update networkName → migration completes → verify MAC unchanged on guest interface | Tier 2 | P1 | |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@stps/sig-network/hotpluggable-nad-ref-stp.md` around lines 156 - 162, Update
the traceability table to add a Requirement row for the "P1 feature-gate
disabled behavior" mapping the in-scope item "Verify behavior when
LiveUpdateNADRef feature-gate is disabled (attempting to change NAD reference
produces failure/error)" to a Test Scenario that attempts a networkName update
with the feature-gate off and asserts failure/error, and add another Requirement
row for "MAC preservation" mapping the in-scope item "Verifying that guest
interface identity (e.g. MAC) is preserved where applicable" to a Test Scenario
that changes NAD reference and verifies the guest MAC is unchanged; also update
the existing VEP-140 (NFR) regression row to explicitly mention "interface
hot-plug regression" in the Requirement Summary/Test Scenario to reflect the
specific regression concern named elsewhere in the document.
frenzyfriday
left a comment
There was a problem hiding this comment.
Thanks for the STP! Lgtm in general, just a few questions/comments
|
|
||
| ### **Feature Overview** | ||
|
|
||
| This feature enables VM owners to change the NetworkAttachmentDefinition reference (`networkName`) of a secondary network on a **running** VM without rebooting. When the user updates `spec.networks[].multus.networkName` in the VM spec, the system triggers a Live Migration so the VM’s pod is re-plumbed to the new network (e.g., a different VLAN). Today, changing the NAD reference requires a VM restart. The enhancement uses the existing migration path: no reboot, and guest interface properties (e.g., MAC) stay the same. A new feature gate **LiveUpdateNADRef** controls the behavior. Supported scenario in scope is **bridge binding** on live-migratable VMs. |
There was a problem hiding this comment.
VM’s pod is re-plumbed to the new network
Technically there is no re-plumbing as the VM will be live migrated and its source pod with the old plumbing will be destroyed and a new pod with new plumbing will be created. But then again from a user POV it probably is replumbing.. so I am not sure if it is safe to say that the pod will be replumbed.
|
|
||
| | Check | Done | Details/Notes | Comments | | ||
| |:---------------------------------------|:-----|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------| | ||
| | **Review Requirements** | [V] | VEP #140: Change NAD reference on running VM via migration; bridge binding; no RestartRequired when only networkName (i.e. the NAd reference) changes; migration triggered when networkName changes. | | |
There was a problem hiding this comment.
[nit]
the NAd reference
the NAD reference
| | Check | Done | Details/Notes | Comments | | ||
| |:---------------------------------------|:-----|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------| | ||
| | **Review Requirements** | [V] | VEP #140: Change NAD reference on running VM via migration; bridge binding; no RestartRequired when only networkName (i.e. the NAd reference) changes; migration triggered when networkName changes. | | | ||
| | **Understand Value** | [V] | VM owners can re-assign new networks (e.g. a different VLAN) to VMs without reboot, avoiding workload impact and preserving guest interface identity (e.g. MAC). | | |
There was a problem hiding this comment.
VM owners can re-assign new networks
new networks but of the same binding type (but maybe that is already clear)
| | Check | Done | Details/Notes | Comments | | ||
| |:---------------------------------|:-----|:------------------------------------------------------------------------------------------------------------------------------------------|:---------| | ||
| | **Developer Handoff/QE Kickoff** | [V] | Meeting where Dev introduced QE through NAD change in VM, migration condition handling, and DNC compatibility (Option 1 vs 2). | | | ||
| | **Technology Challenges** | [V] | Requires live-migratable VM and bridge binding; multus secondary networks. | | |
There was a problem hiding this comment.
Does it also required inter node connectivity or something like that? I mention this because while writing the T1 tests just to avoid setting up inter node connectivity we were creating the client VM (that will be also on the new NAD and will ping our migrating VM from) on the same node that the vm under test migrated to. Not sure if this applies here
| |:---------------------------------|:-----|:------------------------------------------------------------------------------------------------------------------------------------------|:---------| | ||
| | **Developer Handoff/QE Kickoff** | [V] | Meeting where Dev introduced QE through NAD change in VM, migration condition handling, and DNC compatibility (Option 1 vs 2). | | | ||
| | **Technology Challenges** | [V] | Requires live-migratable VM and bridge binding; multus secondary networks. | | | ||
| | **Test Environment Needs** | [V] | Cluster nodes with secondary interfaces, bridge binding, live migration enabled, NMState operator; LiveUpdateNADRef feature-gate enabled. | | |
There was a problem hiding this comment.
Is it worth mentioning that it needs to be run on a cluster with >=2 nodes (because it will be migrating)? I mention this because we added a similar decorator in the T1 tests, not sure that is applicable here though
|
|
||
| **Testing Goals:** | ||
| - **[P0]** Change NAD reference on a running VM (bridge binding) and verify the VM is connected to the new network and has connectivity (E2E). | ||
| - **[P1]** Verify behavior when LiveUpdateNADRef feature-gate is disabled (e.g. attempting to change NAD reference produces failure/error). |
There was a problem hiding this comment.
With FG off it just adds a restartRequired condition to the VM I think
| | Seamless network connectivity until action is completed | Not guaranteed by design | [ ] | | ||
| | Changing NAD reference on non-migratable VMs | Not supported | [ ] | | ||
| | Changing guest network configuration | Not in scope; guest may need separate reconfig | [ ] | | ||
| | In-place NAD swap with Dynamic Networks Controller | User's scenarios are oblivious to the way DNC is interacted | [ ] | |
There was a problem hiding this comment.
User's scenarios are oblivious to the way DNC is interacted
Yes, but as per the current implementation In-place NAD swap (without migration) will not happen both with/without DNC
| |:-------------------------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------|:------------------------|:-------------------------------------------------------------------------------------------------------------------------------------------| | ||
| | Functional Testing | Validates that the feature works according to specified requirements and user stories | Y | Core: NAD ref update → (migration in backgrond) → new network connectivity | | ||
| | Automation Testing | Ensures test cases are automated for continuous integration and regression coverage | Y | E2E scenarios must be automated for GA | | ||
| | Performance Testing | Validates feature performance meets requirements (latency, throughput, resource usage) | N | Uses existing migration performance; no new SLA in VEP; letncy cannot be measured,for example if NAD is changed from disabled network | |
| | Regression Testing | Verifies that new changes do not break existing functionality | Y | Connectivity post NAD-change | | ||
| | Upgrade Testing | Validates upgrade paths from previous versions, data migration, and configuration preservation | Y | Expected to function on upgraded clusters after feature-gate enablement | | ||
| | Backward Compatibility Testing | Ensures feature maintains compatibility with previous API versions and configurations | Y | Verify the existing interface hot-plug feature is mot affected | | ||
| | Dependencies | Dependent on deliverables from other components/products? | Y | KubeVirt virt-controller; Multus; optional DNC | |
There was a problem hiding this comment.
There is a dependency on Multus? Do you mean we want some change from them or just that it is using Multus (and so it is dependent on Multus not breaking stuff)?
STP Metadata
VEP issue: https://github.com/kubevirt/enhancements/blob/main/veps/sig-network/hotpluggable-nad-ref.md
What this PR does
STP for covering updating network (NAD) reference on live VMs.
Special notes for your reviewer
Summary by CodeRabbit