Skip to content

Conversation

@sunzhaohua2
Copy link
Contributor

@sunzhaohua2 sunzhaohua2 commented Oct 14, 2025

Summary by CodeRabbit

  • Tests

    • Added extensive end-to-end tests for bidirectional status translation between Machine API and Cluster API (versions, conditions, provider status, error propagation, edge cases) with additional assertions, cleanup, and same-name resource scenarios.
    • Enhanced test suites to use the Ginkgo/Gomega DSL more broadly for clearer, scoped test behavior.
  • Chores

    • Increased e2e test timeout to accommodate longer-running scenarios.

✏️ Tip: You can customize this high-level summary in your review settings.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Oct 14, 2025
@openshift-ci-robot
Copy link

openshift-ci-robot commented Oct 14, 2025

@sunzhaohua2: This pull request references OCPCLOUD-3017 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set.

Details

In response to this:

/hold
For machineset migration split pr raise and merge

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@sunzhaohua2 sunzhaohua2 marked this pull request as draft October 14, 2025 08:09
@coderabbitai
Copy link

coderabbitai bot commented Oct 14, 2025

📝 Walkthrough

Walkthrough

Adds a large end-to-end test suite exercising bidirectional status conversion between Machine API (MAPI) and Cluster API (CAPI), introduces YAML-based providerStatus parsing and a providerStatus verification helper, inserts GinkgoHelper() calls in multiple e2e helpers, and increases the e2e Makefile timeout.

Changes

Cohort / File(s) Summary
Status conversion e2e tests
e2e/status_conversion_test.go
New comprehensive E2E test covering MAPI ↔ CAPI status translation for MachineSets and Machines: replicas, ready/available/observedGeneration, selector translation, providerStatus (AWSMachine) parsing, condition synchronization (v1beta1/v1beta2), error/failure propagation, nodeRef/addresses mapping, and multiple edge/cross-name scenarios.
Machine migration helpers
e2e/machine_migration_helpers.go
Added sigs.k8s.io/yaml import and two helpers: getAWSProviderStatusFromMachine (extract/unmarshal AWSMachineProviderStatus from Machine.Status.ProviderStatus.Raw) and verifyMAPIMachineProviderStatus (Gomega transformer comparing MAPI providerStatus to AWSMachine status). Inserted GinkgoHelper() calls at multiple helper call sites; nil checks and YAML unmarshalling added.
Machineset migration helpers
e2e/machineset_migration_helpers.go
Inserted GinkgoHelper() at the top of numerous helper functions to enable Ginkgo v2 DSL use; no signature or behavioral changes.
CI / Makefile
Makefile
Increased e2e target timeout from 120m to 180m.

Sequence Diagram(s)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰
I nibble YAML, hopping through each test,
GinkgoHelper keeps my whiskers blessed,
Machines and Sets trade status in time,
I stamp a paw and mark each passing line,
A tiny rabbit cheers the status rhyme.

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 74.07% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly summarizes the main change: implementing e2e testing automation for status syncing between MAPI and CAPI, which matches the comprehensive test suite additions and helper functions throughout the changeset.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.5.0)

Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions
The command is terminated due to an error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions


Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci openshift-ci bot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. labels Oct 14, 2025
@openshift-ci openshift-ci bot requested review from RadekManak and mdbooth October 14, 2025 08:11
@openshift-ci-robot
Copy link

openshift-ci-robot commented Oct 21, 2025

@sunzhaohua2: This pull request references OCPCLOUD-3017 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set.

Details

In response to this:

/hold
Waiting for machineset migration split pr to merge

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot
Copy link

openshift-ci-robot commented Dec 22, 2025

@sunzhaohua2: This pull request references OCPCLOUD-3017 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@sunzhaohua2 sunzhaohua2 marked this pull request as ready for review December 22, 2025 15:04
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 22, 2025
@openshift-ci openshift-ci bot requested a review from damdo December 22, 2025 15:04
@openshift-ci-robot
Copy link

openshift-ci-robot commented Dec 22, 2025

@sunzhaohua2: This pull request references OCPCLOUD-3017 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

Summary by CodeRabbit

  • Tests
  • Added comprehensive end-to-end test suite validating status field translation and synchronization across API versions.
  • Introduced testing helpers for enhanced provider status verification and error scenario handling.
  • Extended test coverage for both success paths and failure scenarios in resource conversion workflows.

✏️ Tip: You can customize this high-level summary in your review settings.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
e2e/machine_migration_helpers.go (1)

346-354: Using Expect inside a transform function defeats Eventually retry semantics.

When getAWSProviderStatusFromMachine is used as a transform function within Eventually, any Expect failure will cause an immediate test failure instead of allowing Eventually to retry. If ProviderStatus.Raw is nil or YAML unmarshalling fails on an early attempt (before status is populated), the test fails prematurely.

Consider returning an error-indicating value that the matcher can handle, or restructure to allow retries.

🔎 Proposed fix
 // getAWSProviderStatusFromMachine extracts and unmarshals the AWSMachineProviderStatus from a MAPI Machine.
 func getAWSProviderStatusFromMachine(mapiMachine *mapiv1beta1.Machine) *mapiv1beta1.AWSMachineProviderStatus {
-	Expect(mapiMachine.Status.ProviderStatus.Raw).ToNot(BeNil())
+	if mapiMachine.Status.ProviderStatus == nil || mapiMachine.Status.ProviderStatus.Raw == nil {
+		return nil
+	}
 
 	providerStatus := &mapiv1beta1.AWSMachineProviderStatus{}
-	Expect(yaml.Unmarshal(mapiMachine.Status.ProviderStatus.Raw, providerStatus)).To(Succeed())
+	if err := yaml.Unmarshal(mapiMachine.Status.ProviderStatus.Raw, providerStatus); err != nil {
+		return nil
+	}
 
 	return providerStatus
 }

Then update the matcher usage to handle nil:

verifyMAPIMachineProviderStatus(mapiMachine,
    SatisfyAll(
        Not(BeNil()),
        HaveField("InstanceState", HaveValue(Equal(...))),
    ),
)
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between 10f36ad and b56da67.

📒 Files selected for processing (2)
  • e2e/machine_migration_helpers.go
  • e2e/status_conversion_test.go
🔇 Additional comments (12)
e2e/machine_migration_helpers.go (1)

22-22: LGTM!

Import addition is appropriate for the new YAML unmarshalling functionality.

e2e/status_conversion_test.go (11)

1-21: LGTM!

Imports are well-organized and appropriate for the comprehensive status conversion testing.


23-32: LGTM!

Proper platform and feature gate guards ensure tests only run in appropriate environments.


34-57: LGTM!

Well-structured test setup with proper cleanup via DeferCleanup.


66-84: Consider potential stale reference issue.

Line 75 compares against mapiMachineSet.Status.ReadyReplicas which is captured when the Eventually on line 68-71 completes. However, since komega.Object(mapiMachineSet) updates the object in place, this should work correctly. Just noting this pattern for awareness.


138-150: Good documentation.

The comment clearly explains why only the Paused condition is verified for paused MachineSets.


461-491: Good practice: Pending tests with bug references.

Using PIt with explicit bug URLs (OCPBUGS-63340) is the right approach for known issues that are being tracked.


557-568: Address comparison assumes ordering.

The index-based address comparison (cm.Status.Addresses[i] vs mapiMachine.Status.Addresses[i]) will fail if addresses are returned in different orders between MAPI and CAPI. If ordering isn't guaranteed by the conversion logic, consider using ContainElements or sorting before comparison.

This is acceptable if the conversion preserves order, but worth verifying.


647-667: LGTM!

Good coverage of providerStatus conversion including instanceState verification.


765-791: Pending test appropriately marked with bug reference.

The PIt with OCPBUGS-70136 reference correctly marks this as pending until the underlying bug is fixed.


777-790: Note: This test uses the helper function with the Expect issue.

When this pending test is enabled, be aware that verifyMAPIMachineProviderStatus uses getAWSProviderStatusFromMachine which has Expect assertions inside a transform function. Applying the fix suggested in machine_migration_helpers.go will be needed.


805-816: Good use of PIt for pending tests with bug tracking.

Multiple tests appropriately marked as pending with OCPBUGS-63183 reference for the same-name scenario.

Also applies to: 832-842, 844-869

@sunzhaohua2
Copy link
Contributor Author

/test e2e-aws-capi-techpreview

@openshift-ci-robot
Copy link

openshift-ci-robot commented Dec 23, 2025

@sunzhaohua2: This pull request references OCPCLOUD-3017 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

Summary by CodeRabbit

  • Tests
  • Added a comprehensive end-to-end test suite validating status translation and synchronization between Machine API and Cluster API across many scenarios (replicas, conditions, phases, provider status, error propagation).
  • Introduced testing helpers to verify provider status transformations and YAML-based providerStatus decoding.
  • Expanded coverage for success paths, failure scenarios, cross-namespace and cross-authority conversions.

✏️ Tip: You can customize this high-level summary in your review settings.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
e2e/status_conversion_test.go (2)

557-570: Consider order-independent address comparison.

The current comparison assumes addresses are in the same order in both MAPI and CAPI. If address ordering is not guaranteed to be preserved during conversion, this could cause flaky test failures.

🔎 Suggested order-independent comparison
 Eventually(komega.Object(capiMachine), capiframework.WaitMedium, capiframework.RetryMedium).Should(
     WithTransform(func(cm *clusterv1.Machine) bool {
         if len(cm.Status.Addresses) != len(mapiMachine.Status.Addresses) {
             return false
         }
-        for i, mapiAddr := range mapiMachine.Status.Addresses {
-            capiAddr := cm.Status.Addresses[i]
-            if string(capiAddr.Type) != string(mapiAddr.Type) || capiAddr.Address != mapiAddr.Address {
-                return false
+        for _, mapiAddr := range mapiMachine.Status.Addresses {
+            found := false
+            for _, capiAddr := range cm.Status.Addresses {
+                if string(capiAddr.Type) == string(mapiAddr.Type) && capiAddr.Address == mapiAddr.Address {
+                    found = true
+                    break
+                }
+            }
+            if !found {
+                return false
             }
         }
         return true
     }, BeTrue()),

736-749: Consider order-independent address comparison here as well.

Same concern as the MAPI→CAPI address comparison - the index-based comparison assumes preserved ordering.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between b56da67 and 44fc78c.

📒 Files selected for processing (2)
  • e2e/machine_migration_helpers.go
  • e2e/status_conversion_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • e2e/machine_migration_helpers.go
🧰 Additional context used
🧬 Code graph analysis (1)
e2e/status_conversion_test.go (5)
e2e/framework/util.go (1)
  • IsMachineAPIMigrationEnabled (41-76)
pkg/conversion/mapi2capi/interface.go (2)
  • MachineSet (29-31)
  • Machine (24-26)
e2e/framework/framework.go (7)
  • WaitLong (25-25)
  • RetryLong (19-19)
  • WaitMedium (24-24)
  • RetryMedium (18-18)
  • WaitShort (23-23)
  • RetryShort (17-17)
  • CAPINamespace (14-14)
e2e/framework/machine.go (2)
  • GetMachine (75-86)
  • GetAWSMachine (61-72)
e2e/framework/machineset.go (3)
  • WaitForMachineSet (137-173)
  • ScaleCAPIMachineSet (224-235)
  • GetNewestMachineFromMachineSet (204-221)
🔇 Additional comments (8)
e2e/status_conversion_test.go (8)

1-21: LGTM!

Imports are well-organized and appropriate for the e2e test suite. The use of dot imports for Ginkgo/Gomega is a common pattern in Kubernetes e2e tests.


23-32: LGTM!

Good use of skip conditions to ensure tests only run on AWS platforms with the MachineAPIMigration feature enabled. The Ordered decorator ensures deterministic test execution.


66-84: LGTM!

The pattern of waiting for MAPI status to stabilize and then verifying CAPI synchronization is correct. The komega.Object() matcher refreshes the object on each poll, ensuring fresh values are used.


205-230: LGTM!

The error status conversion test correctly handles the case where either ErrorReason or ErrorMessage (or both) may be set. The conditional assertions ensure proper matching without failing if only one field is populated.


461-490: LGTM on pending tests with bug tracking.

Good practice to mark tests as pending (PIt) with inline bug references (OCPBUGS-63340). This maintains test coverage visibility while avoiding false negatives until the underlying issues are resolved.


794-803: LGTM!

The cross-authority scenario (CAPI Machine created first, then MAPI with ClusterAPI authority) is properly set up. Individual test cases appropriately wait for the required status fields to be populated.


34-246: Well-structured test suite for MAPI→CAPI MachineSet conversions.

The test organization is clean with:

  • Proper resource lifecycle management (BeforeAll setup, DeferCleanup teardown)
  • Clear separation between normal operation and error condition tests
  • Comprehensive coverage of status fields, conditions, and selector conversion

304-309: Access to Deprecated.V1Beta1 should use defensive patterns or explicit nil checks.

Direct access to capiMachineSet.Status.Deprecated.V1Beta1.FullyLabeledReplicas at lines 306 and 397 may panic if the pointer is not initialized. Use HaveValue() matcher (as line 81 does for ReadyReplicas) or add a nil guard in test setup to ensure the field is initialized before the test assertions.

Comment on lines +658 to +645
By("Verifying AWSMachine status.instanceState matches MAPI Machine providerStatus.instanceState")
var mapiProviderStatus mapiv1beta1.AWSMachineProviderStatus
err = yaml.Unmarshal(mapiMachine.Status.ProviderStatus.Raw, &mapiProviderStatus)
Expect(err).ToNot(HaveOccurred(), "failed to unmarshal MAPI Machine providerStatus")

Eventually(komega.Object(awsMachine), capiframework.WaitMedium, capiframework.RetryMedium).Should(
HaveField("Status.InstanceState", HaveValue(Equal(awsv1.InstanceState(ptr.Deref(mapiProviderStatus.InstanceState, ""))))),
"Should have AWSMachine status.instanceState match MAPI providerStatus.instanceState",
)
Copy link

@coderabbitai coderabbitai bot Dec 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Add nil check for ProviderStatus before accessing .Raw.

Accessing mapiMachine.Status.ProviderStatus.Raw without verifying ProviderStatus is non-nil could cause a panic if the MAPI Machine lacks provider status.

🔎 Proposed fix
 By("Verifying AWSMachine status.instanceState matches MAPI Machine providerStatus.instanceState")
 var mapiProviderStatus mapiv1beta1.AWSMachineProviderStatus
+Expect(mapiMachine.Status.ProviderStatus).NotTo(BeNil(), "MAPI Machine should have providerStatus")
 err = yaml.Unmarshal(mapiMachine.Status.ProviderStatus.Raw, &mapiProviderStatus)
 Expect(err).ToNot(HaveOccurred(), "failed to unmarshal MAPI Machine providerStatus")
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
By("Verifying AWSMachine status.instanceState matches MAPI Machine providerStatus.instanceState")
var mapiProviderStatus mapiv1beta1.AWSMachineProviderStatus
err = yaml.Unmarshal(mapiMachine.Status.ProviderStatus.Raw, &mapiProviderStatus)
Expect(err).ToNot(HaveOccurred(), "failed to unmarshal MAPI Machine providerStatus")
Eventually(komega.Object(awsMachine), capiframework.WaitMedium, capiframework.RetryMedium).Should(
HaveField("Status.InstanceState", HaveValue(Equal(awsv1.InstanceState(ptr.Deref(mapiProviderStatus.InstanceState, ""))))),
"Should have AWSMachine status.instanceState match MAPI providerStatus.instanceState",
)
By("Verifying AWSMachine status.instanceState matches MAPI Machine providerStatus.instanceState")
var mapiProviderStatus mapiv1beta1.AWSMachineProviderStatus
Expect(mapiMachine.Status.ProviderStatus).NotTo(BeNil(), "MAPI Machine should have providerStatus")
err = yaml.Unmarshal(mapiMachine.Status.ProviderStatus.Raw, &mapiProviderStatus)
Expect(err).ToNot(HaveOccurred(), "failed to unmarshal MAPI Machine providerStatus")
Eventually(komega.Object(awsMachine), capiframework.WaitMedium, capiframework.RetryMedium).Should(
HaveField("Status.InstanceState", HaveValue(Equal(awsv1.InstanceState(ptr.Deref(mapiProviderStatus.InstanceState, ""))))),
"Should have AWSMachine status.instanceState match MAPI providerStatus.instanceState",
)
🤖 Prompt for AI Agents
e2e/status_conversion_test.go around lines 658-666: add a nil-check for
mapiMachine.Status.ProviderStatus before accessing .Raw to avoid a panic;
specifically, insert an assertion like
Expect(mapiMachine.Status.ProviderStatus).ToNot(BeNil(), "expected MAPI Machine
providerStatus to be set") immediately before the yaml.Unmarshal call (or guard
with an if that fails the test with a clear message) and only call
yaml.Unmarshal when ProviderStatus is non-nil so the test fails cleanly instead
of panicking.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should never get here, but the check does not hurt :)

cc @sunzhaohua2

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

@sunzhaohua2
Copy link
Contributor Author

/test unit

@openshift-ci-robot
Copy link

openshift-ci-robot commented Dec 23, 2025

@sunzhaohua2: This pull request references OCPCLOUD-3017 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

Summary by CodeRabbit

  • Tests
  • Added a comprehensive end-to-end test suite validating status translation and synchronization between Machine API and Cluster API across many scenarios (replicas, conditions, phases, provider status, error propagation).
  • Introduced helpers to verify provider-status transformations and YAML-based providerStatus decoding; expanded coverage for success, failure, cross-namespace and cross-authority conversions.
  • Chores
  • Extended e2e test timeout to accommodate longer-running scenarios.

✏️ Tip: You can customize this high-level summary in your review settings.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
Makefile (1)

60-60: Timeout increase appears reasonable given the new test coverage.

The 50% increase in e2e test timeout (120m → 180m) aligns with the comprehensive status conversion test suite being added. However, consider verifying that the full runtime approaches this limit, as unnecessarily long timeouts can delay failure feedback in CI.

Optional: If the new tests are the primary driver of increased runtime, consider whether any tests could be parallelized or optimized to reduce overall execution time.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between 44fc78c and ff70d3f.

📒 Files selected for processing (1)
  • Makefile

@sunzhaohua2
Copy link
Contributor Author

/test e2e-aws-capi-techpreview

@shellyyang1989
Copy link

@sunzhaohua2 is it ready for review?

@sunzhaohua2
Copy link
Contributor Author

@sunzhaohua2 is it ready for review?

yes

})

It("should convert status.availableReplicas from MAPI to CAPI status and status.Deprecated.V1Beta1", func() {
By("Verifying CAPI MachineSet status.availableReplicas is synchronized")
Copy link
Contributor

@theobarberbany theobarberbany Jan 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious about this (thinking out loud) - we're waiting for the field to be populated, and once it is we are stating it's synchronised. This works because we start with no CAPI machine set, right, so population == sync?

Should we be looking at SynchronizedGeneration too? Given that the By( says we're waiting for synchronisation (and that's a field in status)? We likely only need to do it once, in the BeforeAll? and it may speed up the tests a little?

})

// bug https://issues.redhat.com/browse/OCPBUGS-63183
It("should convert CAPI Machine phase to MAPI Machine phase", func() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be a PIt() ? Or has the bug been resolved?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:) I removed the comment, we hit the bug only when the capi machine exists

.PHONY: e2e
e2e: ## Run e2e tests against active kubeconfig
./hack/test.sh "./e2e/..." 120m
./hack/test.sh "./e2e/..." 180m
Copy link
Contributor

@theobarberbany theobarberbany Jan 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:'(

Should we consider splitting the e2es / sharing early?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@theobarberbany I raised a pr here to add shard_count openshift/release#73182

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@theobarberbany I'd like to keep it as is for now to align with the job's time setting, I'll add the shard_count parameter to the job after the OTE pr is merged.

@openshift-ci-robot
Copy link

openshift-ci-robot commented Jan 7, 2026

@sunzhaohua2: This pull request references OCPCLOUD-3017 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

Summary by CodeRabbit

  • Tests
  • Added comprehensive end-to-end test coverage for Machine API to Cluster API status conversion scenarios, including status field validation, condition synchronization, error handling, and multi-version compatibility verification.
  • Enhanced test helper utilities to improve test reliability and coverage for migration scenarios.

✏️ Tip: You can customize this high-level summary in your review settings.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
e2e/machineset_migration_helpers.go (1)

356-382: Consider adding GinkgoHelper() to cleanup function for consistency.

While cleanupMachineSetTestResources doesn't make assertions, it uses Ginkgo's By() DSL. For consistency with other helper functions in this file that use the Ginkgo DSL, consider adding GinkgoHelper() at the start of this function.

📝 Proposed addition
 func cleanupMachineSetTestResources(ctx context.Context, cl client.Client, capiMachineSets []*clusterv1.MachineSet, awsMachineTemplates []*awsv1.AWSMachineTemplate, mapiMachineSets []*mapiv1beta1.MachineSet) {
+	GinkgoHelper()
 	for _, ms := range capiMachineSets {
e2e/machine_migration_helpers.go (1)

289-306: Consider adding GinkgoHelper() to cleanup function for consistency.

While cleanupMachineResources doesn't make assertions, it uses Ginkgo's By() DSL. For consistency with other helper functions in this file that use the Ginkgo DSL, consider adding GinkgoHelper() at the start of this function.

📝 Proposed addition
 func cleanupMachineResources(ctx context.Context, cl client.Client, capiMachines []*clusterv1.Machine, mapiMachines []*mapiv1beta1.Machine) {
+	GinkgoHelper()
 	for _, m := range capiMachines {
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between ff70d3f and 1925b72.

📒 Files selected for processing (3)
  • e2e/machine_migration_helpers.go
  • e2e/machineset_migration_helpers.go
  • e2e/status_conversion_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • e2e/status_conversion_test.go
🔇 Additional comments (4)
e2e/machineset_migration_helpers.go (1)

26-26: LGTM: GinkgoHelper() additions improve test failure reporting.

The systematic addition of GinkgoHelper() calls to these helper functions aligns with Ginkgo v2 best practices and ensures that test failures are reported at the calling site rather than inside the helper, making debugging easier.

Also applies to: 57-57, 89-89, 99-99, 109-109, 176-176, 197-197, 230-230, 239-239, 250-250, 268-268, 292-292, 307-307, 324-324, 344-344

e2e/machine_migration_helpers.go (3)

22-22: LGTM: YAML import added for providerStatus unmarshaling.

The addition of the sigs.k8s.io/yaml import supports the new getAWSProviderStatusFromMachine helper function.


26-26: LGTM: GinkgoHelper() additions improve test failure reporting.

The systematic addition of GinkgoHelper() calls to these helper functions aligns with Ginkgo v2 best practices and ensures that test failures are reported at the calling site rather than inside the helper, making debugging easier.

Also applies to: 101-101, 146-146, 169-169, 178-178, 211-211, 224-224, 309-309, 316-316


347-354: LGTM: New helper function follows established patterns.

The verifyMAPIMachineProviderStatus function correctly uses GinkgoHelper(), Eventually with komega.Object, and a transformer pattern consistent with similar helpers in the codebase (e.g., verifyMAPIMachineSetProviderSpec in machineset_migration_helpers.go).

@damdo
Copy link
Member

damdo commented Jan 8, 2026

/test okd-scos-images

@openshift openshift deleted a comment from openshift-ci bot Jan 8, 2026
@damdo damdo requested review from nrb and theobarberbany January 8, 2026 13:18
@damdo
Copy link
Member

damdo commented Jan 8, 2026

@nrb @theobarberbany PTAL

@openshift-ci-robot
Copy link

openshift-ci-robot commented Jan 23, 2026

@sunzhaohua2: This pull request references OCPCLOUD-3017 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

Summary by CodeRabbit

  • Tests

  • Added extensive end-to-end tests covering bidirectional status translation between Machine API and Cluster API across versions, conditions, provider status, error propagation, and edge cases.

  • Expanded and improved test helpers to enhance reliability and coverage for migration and synchronization scenarios.

  • Chores

  • Increased e2e test timeout to accommodate longer-running scenarios.

✏️ Tip: You can customize this high-level summary in your review settings.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@e2e/status_conversion_test.go`:
- Around line 317-322: The test captures
capiMachineSet.Status.Deprecated.V1Beta1.FullyLabeledReplicas outside the
Eventually which can panic if Deprecated or V1Beta1 is nil; modify the spec that
uses Eventually(komega.Object(mapiMachineSet)...) / HaveField to instead perform
the comparison inside the Eventually closure so both values are read at
assertion time with nil checks (e.g. replace the HaveField call with an
Eventually(func() (actual, expected interface{}) { read mapiMachineSet.Status
and capiMachineSet.Status safely, return sentinel or zero if Deprecated/V1Beta1
is nil }, ...).Should(Equal(...)) or use WithTransform on
komega.Object(mapiMachineSet) with a transform that defensively extracts
Status.Deprecated.V1Beta1.FullyLabeledReplicas and compares to a value fetched
inside the Eventually; apply the same change for the similar assertion at line
418.
♻️ Duplicate comments (1)
e2e/status_conversion_test.go (1)

627-635: Add nil check for ProviderStatus before accessing .Raw.

Accessing mapiMachine.Status.ProviderStatus.Raw without verifying ProviderStatus is non-nil could cause a panic if the MAPI Machine lacks provider status at this point.

🐛 Proposed fix
 By("Verifying AWSMachine status.instanceState matches MAPI Machine providerStatus.instanceState")
 var mapiProviderStatus mapiv1beta1.AWSMachineProviderStatus
+Expect(mapiMachine.Status.ProviderStatus).NotTo(BeNil(), "MAPI Machine should have providerStatus")
 err = yaml.Unmarshal(mapiMachine.Status.ProviderStatus.Raw, &mapiProviderStatus)
 Expect(err).ToNot(HaveOccurred(), "failed to unmarshal MAPI Machine providerStatus")
🧹 Nitpick comments (1)
e2e/status_conversion_test.go (1)

34-34: Consider removing redundant var _ = in nested Describe blocks.

The var _ = Describe(...) pattern is typically used at package scope to register specs. Inside an existing Describe block (line 23), you can use Describe(...) directly without the variable assignment.

♻️ Suggested simplification
-	var _ = Describe("MAPI to CAPI MachineSet Status Conversion", Ordered, func() {
+	Describe("MAPI to CAPI MachineSet Status Conversion", Ordered, func() {

This applies to lines 34, 257, 455, and 640.

@sunzhaohua2
Copy link
Contributor Author

@nrb @theobarberbany I have addressed all the comments, please help to take a look again, thanks!

@sunzhaohua2
Copy link
Contributor Author

/test e2e-aws-capi-techpreview

@sunzhaohua2
Copy link
Contributor Author

/test e2e-aws-capi-techpreview

@openshift-ci-robot
Copy link

openshift-ci-robot commented Jan 27, 2026

@sunzhaohua2: This pull request references OCPCLOUD-3017 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

Summary by CodeRabbit

  • Tests

  • Added extensive end-to-end tests for bidirectional status translation between Machine API and Cluster API (versions, conditions, provider status, error propagation, edge cases) with additional assertions, cleanup, and same-name resource scenarios.

  • Enhanced test suites to use the Ginkgo/Gomega DSL more broadly for clearer, scoped test behavior.

  • Chores

  • Increased e2e test timeout to accommodate longer-running scenarios.

✏️ Tip: You can customize this high-level summary in your review settings.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@e2e/machine_migration_helpers.go`:
- Around line 347-364: The transform getAWSProviderStatusFromMachine currently
contains hard Expect assertions which can panic if
mapiMachine.Status.ProviderStatus is nil and thus break Eventually retries;
update getAWSProviderStatusFromMachine to guard against a nil ProviderStatus or
nil ProviderStatus.Raw by returning nil early (instead of calling Expect),
attempt yaml.Unmarshal only when Raw is non-nil and if unmarshal fails return
nil (do not call Expect), and let verifyMAPIMachineProviderStatus continue to
call Eventually(WithTransform(getAWSProviderStatusFromMachine, matcher)) so
Eventually can retry until a non-nil, successfully unmarshalled
*mapiv1beta1.AWSMachineProviderStatus is returned; reference functions:
getAWSProviderStatusFromMachine and verifyMAPIMachineProviderStatus.

In `@e2e/status_conversion_test.go`:
- Around line 520-528: Replace the immediate Expect check on
mapiMachine.Status.LastUpdated with a polling wait so the test does not race;
specifically, in the test function that checks MAPI→CAPI conversion replace the
Expect(mapiMachine.Status.LastUpdated).NotTo(BeNil()) with an Eventually that
polls k8s for mapiMachine (or uses komega.Object(mapiMachine)) and asserts
HaveField("Status.LastUpdated", Not(BeNil())) before asserting the CAPI Machine
lastUpdated; keep the subsequent Eventually for capiMachine.Status.LastUpdated
but ensure it compares Equal(*mapiMachine.Status.LastUpdated) only after the
mapiMachine lastUpdated has been observed.

Copy link
Contributor

@nrb nrb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests look good to me! I have one minor comment, but no objections to merging.

/lgtm
/approve


By("Updating MAPI Machine with invalid instanceType to trigger error")
updateAWSMachineSetProviderSpec(ctx, cl, mapiMachineSet, func(providerSpec *mapiv1beta1.AWSMachineProviderConfig) {
providerSpec.InstanceType = "invalid"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth having another context that tests other invalid configurations?

I don't think we need to add anything else in this PR, but thinking out loud, it might be good to capture issues as e2e cases like this as we encounter them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @nrb good idea, I created a card to trace the invalid configurations testing.https://issues.redhat.com/browse/OCPCLOUD-3353

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jan 29, 2026
@openshift-ci-robot
Copy link

Scheduling tests matching the pipeline_run_if_changed or not excluded by pipeline_skip_if_only_changed parameters:
/test e2e-aws-capi-techpreview
/test e2e-aws-ovn
/test e2e-aws-ovn-serial-1of2
/test e2e-aws-ovn-serial-2of2
/test e2e-aws-ovn-techpreview
/test e2e-aws-ovn-techpreview-upgrade
/test e2e-azure-capi-techpreview
/test e2e-azure-ovn-techpreview
/test e2e-azure-ovn-techpreview-upgrade
/test e2e-gcp-capi-techpreview
/test e2e-gcp-ovn-techpreview
/test e2e-metal3-capi-techpreview
/test e2e-openstack-capi-techpreview
/test e2e-openstack-ovn-techpreview
/test e2e-vsphere-capi-techpreview
/test regression-clusterinfra-aws-ipi-techpreview-capi

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jan 29, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: nrb

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 29, 2026
@sunzhaohua2
Copy link
Contributor Author

/retest

@damdo
Copy link
Member

damdo commented Jan 30, 2026

It looks like this increases ci/prow/e2e-aws-capi-techpreview e2e duration from ~2h55m to ~4h10m.

I wonder if there is anything we can do to reduce this duration a bit

@sunzhaohua2
Copy link
Contributor Author

It looks like this increases ci/prow/e2e-aws-capi-techpreview e2e duration from ~2h55m to ~4h10m.

I wonder if there is anything we can do to reduce this duration a bit

let me try to refactor this, such as create multi machinesets/machines in parallel to reduce time

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jan 30, 2026

@sunzhaohua2: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/okd-scos-e2e-aws-ovn b493e38 link false /test okd-scos-e2e-aws-ovn
ci/prow/e2e-openstack-ovn-techpreview 49a6d93 link true /test e2e-openstack-ovn-techpreview

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants