net: Introduce vmi ip persistence test#3930
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds optional IPv6 DHCP field to cloud-init EthernetDevice, and introduces fixtures, helper utilities, and tests to validate IP persistence for VMs attached to a Linux bridge, including verification before and after migration. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes 🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Report bugs in Issues Welcome! 🎉This pull request will be automatically processed with the following features: 🔄 Automatic Actions
📋 Available CommandsPR Status Management
Review & Approval
Testing & Validation
Container Operations
Cherry-pick Operations
Label Management
✅ Merge RequirementsThis PR will be automatically approved when the following conditions are met:
📊 Review ProcessApprovers and ReviewersApprovers:
Reviewers:
Available Labels
💡 Tips
For more information, please refer to the project documentation or contact the maintainers. |
There was a problem hiding this comment.
Actionable comments posted: 8
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/network/vmi/conftest.py`:
- Line 1: Replace the deprecated typing.Generator import with the modern one
from collections.abc: update the import that currently references
typing.Generator to import Generator from collections.abc, and ensure any type
annotations that use typing.Generator (e.g., in fixtures or functions in this
module) now refer to collections.abc.Generator so the code is compatible with
Python 3.12+.
- Around line 17-19: The fixture ip_persistence_bridge_device_name incorrectly
uses yield without teardown; change it to return the string instead of yielding
and update its type annotation from Generator[str, None, None] to just str
(i.e., def ip_persistence_bridge_device_name(index_number: Generator[int, None,
None]) -> str: return f"br{next(index_number)}test"). Also remove any now-unused
typing Generator import or adjust the index_number parameter type if you prefer,
ensuring no teardown is implied.
- Around line 66-74: Remove the unnecessary defensive check around
spec.template.spec.domain.devices added after calling base_vmspec();
base_vmspec() always initializes devices=Devices(), so delete the if statement
and assign spec.template.spec.domain.devices.interfaces unconditionally to the
list of Interface(...) values, then set spec.template.spec.networks to the
Network(...) list (using Multus(networkName=bridge_network.name)) — update the
block that references base_vmspec(),
spec.template.spec.domain.devices.interfaces, Interface, Network, Multus, and
bridge_network accordingly.
In `@tests/network/vmi/libippersistence.py`:
- Around line 14-17: The function vm_cloud_init_data currently accepts two
boolean positional parameters (ipv4_supported_cluster, ipv6_supported_cluster);
make them keyword-only by adding a bare * in the parameter list (i.e., change
the signature of vm_cloud_init_data to enforce keyword-only args) so callers
must use named arguments and avoid ambiguous positional calls like
vm_cloud_init_data(True, False).
- Line 46: The log message hardcodes "5-minute" and can drift from the single
source of truth TIMEOUT_SECONDS; change the message in the LOGGER.info call to
compute the human-readable duration from TIMEOUT_SECONDS (e.g., derive minutes
using integer division or rounding) and interpolate that computed value instead
of the literal "5-minute" while keeping vmi_name and context_msg unchanged.
- Around line 41-43: The parameter name on monitor_vmi_interfaces is misleading:
rename the parameter from vmi to vm to match its BaseVirtualMachine type, update
all internal references (replace vmi.vmi.name and vmi.namespace with vm.vmi.name
and vm.namespace), and adjust any local references in monitor_vmi_interfaces
accordingly; also update call sites that pass the argument by position or name
(e.g., test_ip_persistence.py where callers use vmi=...) to use the new
parameter name or continue using the named arg as appropriate.
- Around line 51-70: The watch iterator created via watcher = watch.Watch() and
started by vmi_resource.watch(...) can leak the underlying HTTP stream if you
break early; wrap the for loop in a try/finally and call watcher.stop() in the
finally block to ensure the watch is stopped deterministically when the timeout
branch (time.time() - start_time >= TIMEOUT_SECONDS) triggers or any other early
exit occurs, leaving the rest of the loop logic (event_type, vmi_obj, iteration
increment and logging) unchanged.
In `@tests/network/vmi/test_ip_persistence.py`:
- Around line 11-21: Add the tier3 pytest marker to both tests by decorating
test_ip_address_persistence and test_ip_address_persistence_after_migration with
`@pytest.mark.tier3` (in addition to the existing `@pytest.mark.polarion`
decorators) so the functions monitor_vmi_interfaces and migrate_vm_and_verify
runs are classified as tier3; ensure the decorator appears immediately above
each def to match existing style.
| @pytest.mark.polarion("CNV-14339") | ||
| def test_ip_address_persistence(vm_single_nic_with_pod): | ||
| monitor_vmi_interfaces(vmi=vm_single_nic_with_pod) | ||
|
|
||
|
|
||
| @pytest.mark.polarion("CNV-14340") | ||
| def test_ip_address_persistence_after_migration(vm_single_nic_with_pod): | ||
| migrate_vm_and_verify(vm=vm_single_nic_with_pod) | ||
| LOGGER.info(f"VM {vm_single_nic_with_pod.vmi.name} migrated to node {vm_single_nic_with_pod.vmi.node.name}") | ||
|
|
||
| monitor_vmi_interfaces(vmi=vm_single_nic_with_pod, context="after migration") |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check pytest.ini, pyproject.toml, setup.cfg for marker registration
echo "=== Checking pytest.ini ==="
if [ -f pytest.ini ]; then
cat pytest.ini
else
echo "pytest.ini not found"
fi
echo -e "\n=== Checking pyproject.toml (markers section) ==="
if [ -f pyproject.toml ]; then
grep -A 20 "\[tool.pytest" pyproject.toml || echo "No pytest tool section"
else
echo "pyproject.toml not found"
fi
echo -e "\n=== Checking setup.cfg ==="
if [ -f setup.cfg ]; then
cat setup.cfg
else
echo "setup.cfg not found"
fiRepository: RedHatQE/openshift-virtualization-tests
Length of output: 4797
🏁 Script executed:
#!/bin/bash
# Verify the actual test file content
echo "=== Current state of tests/network/vmi/test_ip_persistence.py ==="
cat -n tests/network/vmi/test_ip_persistence.py | head -30Repository: RedHatQE/openshift-virtualization-tests
Length of output: 991
🏁 Script executed:
#!/bin/bash
# Count tier3 usage across the repository to confirm prevalence
echo "=== Tier3 marker usage count ==="
rg "pytest\.mark\.tier3" --type=py | wc -l
echo -e "\n=== Example tier3 usage (first 5) ==="
rg "pytest\.mark\.tier3" --type=py -B1 | head -20
echo -e "\n=== Check if vm_single_nic_with_pod fixture exists and what it does ==="
rg "def vm_single_nic_with_pod" --type=py -A5Repository: RedHatQE/openshift-virtualization-tests
Length of output: 1550
🏁 Script executed:
#!/bin/bash
# Examine monitor_vmi_interfaces to understand test complexity
echo "=== monitor_vmi_interfaces implementation ==="
cat -n tests/network/vmi/libippersistence.py | head -50
echo -e "\n=== Check for timeout/sleep behavior in monitor function ==="
rg "timeout|sleep|sampler" tests/network/vmi/libippersistence.py -C3Repository: RedHatQE/openshift-virtualization-tests
Length of output: 2490
🏁 Script executed:
#!/bin/bash
# Check migrate_vm_and_verify function
echo "=== migrate_vm_and_verify implementation ==="
rg "def migrate_vm_and_verify" utilities/virt.py -A 15
echo -e "\n=== Check timeout in migrate_vm_and_verify ==="
rg "timeout|migrate" utilities/virt.py | grep -i "migrate_vm_and_verify" -A 5 | head -20Repository: RedHatQE/openshift-virtualization-tests
Length of output: 1149
🏁 Script executed:
#!/bin/bash
# Examine vm_single_nic_with_pod fixture setup
echo "=== vm_single_nic_with_pod fixture ==="
cat -n tests/network/vmi/conftest.py | sed -n '1,100p'Repository: RedHatQE/openshift-virtualization-tests
Length of output: 4115
Add required tier3 markers to both tests.
These tests qualify as tier3: they provision bridge devices and a Fedora VM with cloud-init, wait for guest agent readiness and IP configuration (via lookup_iface_status), then run a 5-minute monitoring loop. One test additionally triggers a live migration (default 12-minute timeout). This combination of complexity, hardware-specific setup, and long runtime (5–17 minutes) makes them unambiguously tier3-warranted per coding guidelines.
Proposed fix
+@pytest.mark.tier3
`@pytest.mark.polarion`("CNV-14339")
def test_ip_address_persistence(vm_single_nic_with_pod):
monitor_vmi_interfaces(vmi=vm_single_nic_with_pod)
+@pytest.mark.tier3
`@pytest.mark.polarion`("CNV-14340")
def test_ip_address_persistence_after_migration(vm_single_nic_with_pod):
migrate_vm_and_verify(vm=vm_single_nic_with_pod)🧰 Tools
🪛 Ruff (0.15.1)
[warning] 19-19: Logging statement uses f-string
(G004)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/network/vmi/test_ip_persistence.py` around lines 11 - 21, Add the tier3
pytest marker to both tests by decorating test_ip_address_persistence and
test_ip_address_persistence_after_migration with `@pytest.mark.tier3` (in addition
to the existing `@pytest.mark.polarion` decorators) so the functions
monitor_vmi_interfaces and migrate_vm_and_verify runs are classified as tier3;
ensure the decorator appears immediately above each def to match existing style.
|
D/S test |
5a90b5b to
71b60a6
Compare
|
|
||
|
|
||
| @pytest.fixture(scope="module") | ||
| def bridge_devices( |
There was a problem hiding this comment.
nit
usually we call it nncp, not devices. But if other folks are ok with it, I am ok as well. It is related to nad resource as well.
There was a problem hiding this comment.
+1
I also thought about it in the first draft, when network_device was used before it was replaced with the libnncp.NodeNetworkConfigurationPolicy, but now it is even more blunt.
You created an NNCP object, so this is what you have to yield, and not a name that implies something else (the bridge device itself).
There was a problem hiding this comment.
And now I see you already fixed it.
Thank you.
| ] | ||
| ), | ||
| node_selector={WORKER_NODE_LABEL_KEY: ""}, | ||
| ) as br: |
There was a problem hiding this comment.
nit
in fact this is nncp object, not br, can be combined as nncp_br, for example. I think it is good to align it with our fresh recent usages.
| ) as vm: | ||
| vm.start(wait=True) | ||
| vm.wait_for_agent_connected() | ||
| lookup_iface_status( |
There was a problem hiding this comment.
again this. There is still no explanation if we should test primary, secondary, both. The scenario that we have to test primary and secondary, right? I just want to understand it finally, the description is still unclear.
|
|
||
|
|
||
| @pytest.fixture(scope="module") | ||
| def linux_bridge_vm_for_ip_persist( |
There was a problem hiding this comment.
nit
as I mentioned already here, I am not in favor of this _for_ip_persist. But I don't want to fight a lot. If all others are ok, let it be.
There was a problem hiding this comment.
Okay, I'll keep it then
| @pytest.mark.polarion("CNV-14340") | ||
| def test_ip_address_persistence_after_migration(linux_bridge_vm_for_ip_persist): | ||
| migrate_vm_and_verify(vm=linux_bridge_vm_for_ip_persist) | ||
| LOGGER.info( |
There was a problem hiding this comment.
again this. Moreover, I don't figure why this log is needed. You have explicit migration action that will be presented in the log, why are you adding this log message?
There was a problem hiding this comment.
I dropped the logger here
| return cloudinit.EthernetDevice(addresses=addresses) | ||
|
|
||
|
|
||
| def ip_persist_vm( |
There was a problem hiding this comment.
what does it mean ip_persist_vm? what state this naming is describing? how ip_persist_vm is different from just a VM with l2 bridge secondary iface? BTW, this VM fabric can be moved in network/libs/vm_factory.py and used then in other places (as a follow up).
There was a problem hiding this comment.
Renamed and will add to a follow-up
| from tests.network.libs.ip import random_ipv4_address, random_ipv6_address | ||
|
|
||
| LOGGER = logging.getLogger(__name__) | ||
| WATCH_TEST_TIMEOUT_SECONDS = 300 |
There was a problem hiding this comment.
_TEST_ should be removed, as for me.
| interfaces = vmi.instance.status.interfaces | ||
|
|
||
| for event in vmi.watcher(timeout=WATCH_TEST_TIMEOUT_SECONDS): | ||
| if time.time() - start_time >= WATCH_TEST_TIMEOUT_SECONDS: | ||
| LOGGER.info( | ||
| f"VMI {vmi.name} maintained stable ipAddress fields for" | ||
| f" {WATCH_TEST_TIMEOUT_SECONDS} seconds{context_msg}" | ||
| ) | ||
| break | ||
|
|
||
| if event["type"] != "MODIFIED": | ||
| continue | ||
|
|
||
| interfaces = event["object"].status.interfaces | ||
| LOGGER.info(f"Event: VMI {vmi.name} status updated{context_msg}, interfaces: {interfaces}") | ||
|
|
||
| yield interfaces |
tests/network/non_functional/ip_persistence/test_ip_persistence.py
Outdated
Show resolved
Hide resolved
Mulitple users have encountered events in which vmi interfaces or ipAddress field disappeared. On setup we wait for all VM interfaces to stabilize before verifying IP persistence. During tests, using watch kuberenetes API, which is used by wrapper's watcher, we monitor for such events and verify all interfaces are persistent. Signed-off-by: Asia Khromov <azhivovk@redhat.com>
5e1a1b7 to
6a6cddb
Compare
|
|
|
||
|
|
||
| @pytest.fixture(scope="module") | ||
| def bridge_devices( |
There was a problem hiding this comment.
+1
I also thought about it in the first draft, when network_device was used before it was replaced with the libnncp.NodeNetworkConfigurationPolicy, but now it is even more blunt.
You created an NNCP object, so this is what you have to yield, and not a name that implies something else (the bridge device itself).
Mulitple users have encountered events in which
vmi interfaces or ipAddress field disappeared.
VMI interface status is unstable during boot - interfaces may
temporarily disappear from the status until all are fully up.
Asserting before interfaces stabilize risks a false negative - during setup
we wait for all VM interfaces to stabilize before verifying IP persistence.
During tests, using watch kuberenetes API, which
is used by wrapper's watcher, we monitor for such
events and verify all interfaces are persistent.
jira-ticket: https://issues.redhat.com/browse/CNV-79267
Summary by CodeRabbit