Skip to content

Comments

net: Introduce vmi ip persistence test#3930

Open
azhivovk wants to merge 1 commit intoRedHatQE:mainfrom
azhivovk:test_ip_flapping
Open

net: Introduce vmi ip persistence test#3930
azhivovk wants to merge 1 commit intoRedHatQE:mainfrom
azhivovk:test_ip_flapping

Conversation

@azhivovk
Copy link
Contributor

@azhivovk azhivovk commented Feb 19, 2026

Mulitple users have encountered events in which
vmi interfaces or ipAddress field disappeared.

VMI interface status is unstable during boot - interfaces may
temporarily disappear from the status until all are fully up.
Asserting before interfaces stabilize risks a false negative - during setup
we wait for all VM interfaces to stabilize before verifying IP persistence.

During tests, using watch kuberenetes API, which
is used by wrapper's watcher, we monitor for such
events and verify all interfaces are persistent.

jira-ticket: https://issues.redhat.com/browse/CNV-79267

Summary by CodeRabbit

  • Tests
    • Added optional IPv6 DHCP support in network interface test data.
    • Added IP persistence validation for Linux-bridge network setups.
    • Added tests ensuring VM network interfaces retain IP addresses after migration.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 19, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds optional IPv6 DHCP field to cloud-init EthernetDevice, and introduces fixtures, helper utilities, and tests to validate IP persistence for VMs attached to a Linux bridge, including verification before and after migration.

Changes

Cohort / File(s) Summary
Cloud-init dataclass
tests/network/libs/cloudinit.py
Adds optional dhcp6: bool | None = None field to EthernetDevice dataclass to represent IPv6 DHCP configuration.
IP persistence fixtures
tests/network/non_functional/ip_persistence/conftest.py
Adds module-scoped fixtures: bridge_devices (creates NodeNetworkConfigurationPolicy for Linux bridge), bridge_network (NetworkAttachmentDefinition for Multus bridge), and linux_bridge_vm_for_ip_persist (provisions VM attached to bridge with IPv4/IPv6 flags).
IP persistence helpers
tests/network/non_functional/ip_persistence/libippersistence.py
New helper module: cloud-init builders (primary_iface_cloud_init, secondary_iface_cloud_init), VM builder ip_persist_vm producing a Fedora VM with two interfaces/networks and cloud-init disk, and monitor_interfaces context manager to track VMI interface status changes (300s timeout).
Integration tests
tests/network/non_functional/ip_persistence/test_ip_persistence.py
Adds tests test_ip_address_persistence and test_ip_address_persistence_after_migration to assert two interfaces have IPs and that addresses persist after migration; includes module-level logger.
Project metadata
manifest_file, requirements.txt, pyproject.toml
Minor metadata lines changed (single-line updates reported).

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 5.56% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Description check ❓ Inconclusive The PR description lacks several required template sections and is vague about scope. Missing sections: Short description, Why we need it, Which issue(s) this PR fixes (partially addressed), and Special notes for reviewer. Complete the missing template sections. Add a concise short description, explain the problem and solution rationale, clarify scope boundaries, and add relevant reviewer notes about the 94+ lines of test fixtures and monitoring logic.
✅ Passed checks (1 passed)
Check name Status Explanation
Title check ✅ Passed The title 'net: Introduce vmi ip persistence test' accurately summarizes the main change—adding new IP persistence tests for VMI interfaces.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-virtualization-qe-bot-4

Report bugs in Issues

Welcome! 🎉

This pull request will be automatically processed with the following features:

🔄 Automatic Actions

  • Reviewer Assignment: Reviewers are automatically assigned based on the OWNERS file in the repository root
  • Size Labeling: PR size labels (XS, S, M, L, XL, XXL) are automatically applied based on changes
  • Issue Creation: A tracking issue is created for this PR and will be closed when the PR is merged or closed
  • Branch Labeling: Branch-specific labels are applied to track the target branch
  • Auto-verification: Auto-verified users have their PRs automatically marked as verified
  • Labels: Enabled categories: branch, can-be-merged, cherry-pick, has-conflicts, hold, needs-rebase, size, verified, wip

📋 Available Commands

PR Status Management

  • /wip - Mark PR as work in progress (adds WIP: prefix to title)
  • /wip cancel - Remove work in progress status
  • /hold - Block PR merging (approvers only)
  • /hold cancel - Unblock PR merging
  • /verified - Mark PR as verified
  • /verified cancel - Remove verification status
  • /reprocess - Trigger complete PR workflow reprocessing (useful if webhook failed or configuration changed)
  • /regenerate-welcome - Regenerate this welcome message

Review & Approval

  • /lgtm - Approve changes (looks good to me)
  • /approve - Approve PR (approvers only)
  • /assign-reviewers - Assign reviewers based on OWNERS file
  • /assign-reviewer @username - Assign specific reviewer
  • /check-can-merge - Check if PR meets merge requirements

Testing & Validation

  • /retest tox - Run Python test suite with tox
  • /retest build-container - Rebuild and test container image
  • /retest verify-bugs-are-open - verify-bugs-are-open
  • /retest all - Run all available tests

Container Operations

  • /build-and-push-container - Build and push container image (tagged with PR number)
    • Supports additional build arguments: /build-and-push-container --build-arg KEY=value

Cherry-pick Operations

  • /cherry-pick <branch> - Schedule cherry-pick to target branch when PR is merged
    • Multiple branches: /cherry-pick branch1 branch2 branch3

Label Management

  • /<label-name> - Add a label to the PR
  • /<label-name> cancel - Remove a label from the PR

✅ Merge Requirements

This PR will be automatically approved when the following conditions are met:

  1. Approval: /approve from at least one approver
  2. LGTM Count: Minimum 2 /lgtm from reviewers
  3. Status Checks: All required status checks must pass
  4. No Blockers: No WIP, hold, conflict labels
  5. Verified: PR must be marked as verified (if verification is enabled)

📊 Review Process

Approvers and Reviewers

Approvers:

  • EdDev

Reviewers:

  • Anatw
  • EdDev
  • azhivovk
  • servolkov
  • yossisegev
Available Labels
  • hold
  • verified
  • wip
  • lgtm
  • approve

💡 Tips

  • WIP Status: Use /wip when your PR is not ready for review
  • Verification: The verified label is automatically removed on each new commit
  • Cherry-picking: Cherry-pick labels are processed when the PR is merged
  • Container Builds: Container images are automatically tagged with the PR number
  • Permission Levels: Some commands require approver permissions
  • Auto-verified Users: Certain users have automatic verification and merge privileges

For more information, please refer to the project documentation or contact the maintainers.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 8

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/network/vmi/conftest.py`:
- Line 1: Replace the deprecated typing.Generator import with the modern one
from collections.abc: update the import that currently references
typing.Generator to import Generator from collections.abc, and ensure any type
annotations that use typing.Generator (e.g., in fixtures or functions in this
module) now refer to collections.abc.Generator so the code is compatible with
Python 3.12+.
- Around line 17-19: The fixture ip_persistence_bridge_device_name incorrectly
uses yield without teardown; change it to return the string instead of yielding
and update its type annotation from Generator[str, None, None] to just str
(i.e., def ip_persistence_bridge_device_name(index_number: Generator[int, None,
None]) -> str: return f"br{next(index_number)}test"). Also remove any now-unused
typing Generator import or adjust the index_number parameter type if you prefer,
ensuring no teardown is implied.
- Around line 66-74: Remove the unnecessary defensive check around
spec.template.spec.domain.devices added after calling base_vmspec();
base_vmspec() always initializes devices=Devices(), so delete the if statement
and assign spec.template.spec.domain.devices.interfaces unconditionally to the
list of Interface(...) values, then set spec.template.spec.networks to the
Network(...) list (using Multus(networkName=bridge_network.name)) — update the
block that references base_vmspec(),
spec.template.spec.domain.devices.interfaces, Interface, Network, Multus, and
bridge_network accordingly.

In `@tests/network/vmi/libippersistence.py`:
- Around line 14-17: The function vm_cloud_init_data currently accepts two
boolean positional parameters (ipv4_supported_cluster, ipv6_supported_cluster);
make them keyword-only by adding a bare * in the parameter list (i.e., change
the signature of vm_cloud_init_data to enforce keyword-only args) so callers
must use named arguments and avoid ambiguous positional calls like
vm_cloud_init_data(True, False).
- Line 46: The log message hardcodes "5-minute" and can drift from the single
source of truth TIMEOUT_SECONDS; change the message in the LOGGER.info call to
compute the human-readable duration from TIMEOUT_SECONDS (e.g., derive minutes
using integer division or rounding) and interpolate that computed value instead
of the literal "5-minute" while keeping vmi_name and context_msg unchanged.
- Around line 41-43: The parameter name on monitor_vmi_interfaces is misleading:
rename the parameter from vmi to vm to match its BaseVirtualMachine type, update
all internal references (replace vmi.vmi.name and vmi.namespace with vm.vmi.name
and vm.namespace), and adjust any local references in monitor_vmi_interfaces
accordingly; also update call sites that pass the argument by position or name
(e.g., test_ip_persistence.py where callers use vmi=...) to use the new
parameter name or continue using the named arg as appropriate.
- Around line 51-70: The watch iterator created via watcher = watch.Watch() and
started by vmi_resource.watch(...) can leak the underlying HTTP stream if you
break early; wrap the for loop in a try/finally and call watcher.stop() in the
finally block to ensure the watch is stopped deterministically when the timeout
branch (time.time() - start_time >= TIMEOUT_SECONDS) triggers or any other early
exit occurs, leaving the rest of the loop logic (event_type, vmi_obj, iteration
increment and logging) unchanged.

In `@tests/network/vmi/test_ip_persistence.py`:
- Around line 11-21: Add the tier3 pytest marker to both tests by decorating
test_ip_address_persistence and test_ip_address_persistence_after_migration with
`@pytest.mark.tier3` (in addition to the existing `@pytest.mark.polarion`
decorators) so the functions monitor_vmi_interfaces and migrate_vm_and_verify
runs are classified as tier3; ensure the decorator appears immediately above
each def to match existing style.

Comment on lines 11 to 21
@pytest.mark.polarion("CNV-14339")
def test_ip_address_persistence(vm_single_nic_with_pod):
monitor_vmi_interfaces(vmi=vm_single_nic_with_pod)


@pytest.mark.polarion("CNV-14340")
def test_ip_address_persistence_after_migration(vm_single_nic_with_pod):
migrate_vm_and_verify(vm=vm_single_nic_with_pod)
LOGGER.info(f"VM {vm_single_nic_with_pod.vmi.name} migrated to node {vm_single_nic_with_pod.vmi.node.name}")

monitor_vmi_interfaces(vmi=vm_single_nic_with_pod, context="after migration")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check pytest.ini, pyproject.toml, setup.cfg for marker registration
echo "=== Checking pytest.ini ==="
if [ -f pytest.ini ]; then
  cat pytest.ini
else
  echo "pytest.ini not found"
fi

echo -e "\n=== Checking pyproject.toml (markers section) ==="
if [ -f pyproject.toml ]; then
  grep -A 20 "\[tool.pytest" pyproject.toml || echo "No pytest tool section"
else
  echo "pyproject.toml not found"
fi

echo -e "\n=== Checking setup.cfg ==="
if [ -f setup.cfg ]; then
  cat setup.cfg
else
  echo "setup.cfg not found"
fi

Repository: RedHatQE/openshift-virtualization-tests

Length of output: 4797


🏁 Script executed:

#!/bin/bash
# Verify the actual test file content
echo "=== Current state of tests/network/vmi/test_ip_persistence.py ==="
cat -n tests/network/vmi/test_ip_persistence.py | head -30

Repository: RedHatQE/openshift-virtualization-tests

Length of output: 991


🏁 Script executed:

#!/bin/bash
# Count tier3 usage across the repository to confirm prevalence
echo "=== Tier3 marker usage count ==="
rg "pytest\.mark\.tier3" --type=py | wc -l

echo -e "\n=== Example tier3 usage (first 5) ==="
rg "pytest\.mark\.tier3" --type=py -B1 | head -20

echo -e "\n=== Check if vm_single_nic_with_pod fixture exists and what it does ==="
rg "def vm_single_nic_with_pod" --type=py -A5

Repository: RedHatQE/openshift-virtualization-tests

Length of output: 1550


🏁 Script executed:

#!/bin/bash
# Examine monitor_vmi_interfaces to understand test complexity
echo "=== monitor_vmi_interfaces implementation ==="
cat -n tests/network/vmi/libippersistence.py | head -50

echo -e "\n=== Check for timeout/sleep behavior in monitor function ==="
rg "timeout|sleep|sampler" tests/network/vmi/libippersistence.py -C3

Repository: RedHatQE/openshift-virtualization-tests

Length of output: 2490


🏁 Script executed:

#!/bin/bash
# Check migrate_vm_and_verify function
echo "=== migrate_vm_and_verify implementation ==="
rg "def migrate_vm_and_verify" utilities/virt.py -A 15

echo -e "\n=== Check timeout in migrate_vm_and_verify ==="
rg "timeout|migrate" utilities/virt.py | grep -i "migrate_vm_and_verify" -A 5 | head -20

Repository: RedHatQE/openshift-virtualization-tests

Length of output: 1149


🏁 Script executed:

#!/bin/bash
# Examine vm_single_nic_with_pod fixture setup
echo "=== vm_single_nic_with_pod fixture ==="
cat -n tests/network/vmi/conftest.py | sed -n '1,100p'

Repository: RedHatQE/openshift-virtualization-tests

Length of output: 4115


Add required tier3 markers to both tests.

These tests qualify as tier3: they provision bridge devices and a Fedora VM with cloud-init, wait for guest agent readiness and IP configuration (via lookup_iface_status), then run a 5-minute monitoring loop. One test additionally triggers a live migration (default 12-minute timeout). This combination of complexity, hardware-specific setup, and long runtime (5–17 minutes) makes them unambiguously tier3-warranted per coding guidelines.

Proposed fix
+@pytest.mark.tier3
 `@pytest.mark.polarion`("CNV-14339")
 def test_ip_address_persistence(vm_single_nic_with_pod):
     monitor_vmi_interfaces(vmi=vm_single_nic_with_pod)


+@pytest.mark.tier3
 `@pytest.mark.polarion`("CNV-14340")
 def test_ip_address_persistence_after_migration(vm_single_nic_with_pod):
     migrate_vm_and_verify(vm=vm_single_nic_with_pod)
🧰 Tools
🪛 Ruff (0.15.1)

[warning] 19-19: Logging statement uses f-string

(G004)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/network/vmi/test_ip_persistence.py` around lines 11 - 21, Add the tier3
pytest marker to both tests by decorating test_ip_address_persistence and
test_ip_address_persistence_after_migration with `@pytest.mark.tier3` (in addition
to the existing `@pytest.mark.polarion` decorators) so the functions
monitor_vmi_interfaces and migrate_vm_and_verify runs are classified as tier3;
ensure the decorator appears immediately above each def to match existing style.

@openshift-virtualization-qe-bot

D/S test tox -e verify-tc-requirement-polarion failed: cnv-tests-tox-executor/25716



@pytest.fixture(scope="module")
def bridge_devices(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit
usually we call it nncp, not devices. But if other folks are ok with it, I am ok as well. It is related to nad resource as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1
I also thought about it in the first draft, when network_device was used before it was replaced with the libnncp.NodeNetworkConfigurationPolicy, but now it is even more blunt.
You created an NNCP object, so this is what you have to yield, and not a name that implies something else (the bridge device itself).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And now I see you already fixed it.
Thank you.

]
),
node_selector={WORKER_NODE_LABEL_KEY: ""},
) as br:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit
in fact this is nncp object, not br, can be combined as nncp_br, for example. I think it is good to align it with our fresh recent usages.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

) as vm:
vm.start(wait=True)
vm.wait_for_agent_connected()
lookup_iface_status(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again this. There is still no explanation if we should test primary, secondary, both. The scenario that we have to test primary and secondary, right? I just want to understand it finally, the description is still unclear.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added



@pytest.fixture(scope="module")
def linux_bridge_vm_for_ip_persist(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit
as I mentioned already here, I am not in favor of this _for_ip_persist. But I don't want to fight a lot. If all others are ok, let it be.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I'll keep it then

@pytest.mark.polarion("CNV-14340")
def test_ip_address_persistence_after_migration(linux_bridge_vm_for_ip_persist):
migrate_vm_and_verify(vm=linux_bridge_vm_for_ip_persist)
LOGGER.info(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again this. Moreover, I don't figure why this log is needed. You have explicit migration action that will be presented in the log, why are you adding this log message?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dropped the logger here

return cloudinit.EthernetDevice(addresses=addresses)


def ip_persist_vm(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does it mean ip_persist_vm? what state this naming is describing? how ip_persist_vm is different from just a VM with l2 bridge secondary iface? BTW, this VM fabric can be moved in network/libs/vm_factory.py and used then in other places (as a follow up).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed and will add to a follow-up

from tests.network.libs.ip import random_ipv4_address, random_ipv6_address

LOGGER = logging.getLogger(__name__)
WATCH_TEST_TIMEOUT_SECONDS = 300
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_TEST_ should be removed, as for me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines 100 to 116
interfaces = vmi.instance.status.interfaces

for event in vmi.watcher(timeout=WATCH_TEST_TIMEOUT_SECONDS):
if time.time() - start_time >= WATCH_TEST_TIMEOUT_SECONDS:
LOGGER.info(
f"VMI {vmi.name} maintained stable ipAddress fields for"
f" {WATCH_TEST_TIMEOUT_SECONDS} seconds{context_msg}"
)
break

if event["type"] != "MODIFIED":
continue

interfaces = event["object"].status.interfaces
LOGGER.info(f"Event: VMI {vmi.name} status updated{context_msg}, interfaces: {interfaces}")

yield interfaces
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@azhivovk does it make sense?

Mulitple users have encountered events in which
vmi interfaces or ipAddress field disappeared.

On setup we wait for all VM interfaces to stabilize
before verifying IP persistence.
During tests, using watch kuberenetes API, which
is used by wrapper's watcher, we monitor for such
events and verify all interfaces are persistent.

Signed-off-by: Asia Khromov <azhivovk@redhat.com>
@azhivovk
Copy link
Contributor Author

Changes:

  • Renaming
  • Added @pytest.mark.incremental



@pytest.fixture(scope="module")
def bridge_devices(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1
I also thought about it in the first draft, when network_device was used before it was replaced with the libnncp.NodeNetworkConfigurationPolicy, but now it is even more blunt.
You created an NNCP object, so this is what you have to yield, and not a name that implies something else (the bridge device itself).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

10 participants