Add EtcdBackup CRD enhancement for OADP integration #1945

Open
jparrill wants to merge 1 commit into openshift:master from jparrill:CNTRLPLANE-2676

@jparrill

Summary

  • Introduces a new EtcdBackup CRD in the hypershift.openshift.io/v1beta1 API group
  • Serves as the contract between the OADP HyperShift plugin and the HyperShift Operator (HO) for triggering etcd backups
  • Controller in the HO orchestrates snapshot and upload Jobs while keeping management credentials isolated from HCP namespaces

Details

  • Tracking: https://issues.redhat.com/browse/CNTRLPLANE-2676
  • Status: provisional
  • CRD: EtcdBackup (namespaced, in HCP namespace)
  • Key design decision: Backup Jobs run in the HO namespace to keep AWS/S3 credentials isolated from customer-scoped HCP namespaces

Test plan

  • Unit tests: Controller reconcile logic, Job manifest construction, Secret copy/cleanup, status condition updates
  • Integration tests: Create EtcdBackup CR, verify Job creation, simulate completion, verify status
  • E2E tests: Full backup flow with real HostedCluster

🤖 Generated with Claude Code

@openshift-ci
Contributor

openshift-ci bot commented Feb 19, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign sjenning for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@jparrill
Author

If you have time, @sjenning @enxebre @csrwng @muraee @bryan-cox, please review 🙏.

Introduces a new EtcdBackup CRD in the hypershift.openshift.io/v1beta1
API group that serves as the contract between the OADP HyperShift plugin
and the Hypershift Operator for triggering etcd backups. A controller in
the HO orchestrates snapshot and upload Jobs while keeping management
credentials isolated from HCP namespaces.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Juan Manuel Parrilla Madrid <jparrill@redhat.com>
@openshift-ci
Contributor

openshift-ci bot commented Feb 19, 2026

@jparrill: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.


### Workflow Description

1. The OADP plugin (running as a Velero pre-hook or standalone pod) creates an `EtcdBackup` CR in the HCP namespace. The CR spec includes S3 bucket configuration and a reference to an AWS credentials Secret in the HO namespace.
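To make step 1 concrete, here is a minimal Go sketch of what such a spec might carry and how it could be validated before any Job is created. Everything named here (`EtcdBackupSpec`, `S3Storage`, `CredentialsSecretName`, `ValidateSpec`) is a hypothetical illustration, not the API proposed in the enhancement, and it covers only the S3/AWS case that the step describes.

```go
package main

import (
	"errors"
	"fmt"
)

// S3Storage is a hypothetical grouping of the S3 bucket configuration
// mentioned in step 1. Field names are assumptions for illustration.
type S3Storage struct {
	Bucket    string
	Region    string
	KeyPrefix string // optional path prefix inside the bucket
}

// EtcdBackupSpec is a hypothetical shape for the CR spec.
type EtcdBackupSpec struct {
	S3 S3Storage
	// CredentialsSecretName names the AWS credentials Secret, which step 1
	// places in the HO namespace rather than in the HCP namespace.
	CredentialsSecretName string
}

// ValidateSpec reports every missing required field in one error.
func ValidateSpec(spec EtcdBackupSpec) error {
	var errs []error
	if spec.S3.Bucket == "" {
		errs = append(errs, errors.New("s3.bucket is required"))
	}
	if spec.S3.Region == "" {
		errs = append(errs, errors.New("s3.region is required"))
	}
	if spec.CredentialsSecretName == "" {
		errs = append(errs, errors.New("credentialsSecretName is required"))
	}
	// errors.Join returns nil when no errors were collected.
	return errors.Join(errs...)
}

func main() {
	spec := EtcdBackupSpec{
		S3:                    S3Storage{Bucket: "example-backups", Region: "us-east-1"},
		CredentialsSecretName: "example-aws-creds",
	}
	if err := ValidateSpec(spec); err != nil {
		fmt.Println("invalid spec:", err)
		return
	}
	fmt.Println("spec is valid")
}
```

In a real CRD, these checks would more likely be expressed as schema validation on the type itself; the function form is used here only to keep the sketch self-contained.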
Member

The ticket says the solution should work for both Azure and AWS. This should mention the Azure bits as well.

Member

This applies to the rest of the proposal where AWS only language is used.

Member

The enhancement doesn't address the interaction with KMS encryption at rest. When a hosted cluster has KMS encryption configured, the etcd snapshot will contain DEKs wrapped by the KMS key. The snapshot and upload process should work fine since etcdctl snapshot save is encryption-agnostic, but the resulting backup is only restorable if the KMS key remains available and accessible.

A few questions:

  1. Should the EtcdBackupStatus capture metadata about the encryption state (e.g., whether KMS was active, which key ID was used)? This would let the restore path validate key availability before attempting a restore, rather than failing opaquely.
  2. Should there be a note in the Risks and Mitigations table about the dependency between KMS key lifecycle and backup usability?
  3. Even though restore is out of scope here, does the existing RestoreSnapshotURL mechanism already account for KMS key availability, or is that a gap that needs to be tracked separately?
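As a rough illustration of question 1, the sketch below shows how encryption metadata recorded at backup time could let a restore path check key availability up front. The names `EncryptionInfo`, `KeyChecker`, and `PrecheckRestore` are hypothetical and appear in neither the enhancement nor the existing HyperShift API; how key availability would actually be probed is left as an assumption behind the `KeyChecker` callback.

```go
package main

import (
	"errors"
	"fmt"
)

// EncryptionInfo is a hypothetical status fragment recording the
// encryption state of the cluster at the time the snapshot was taken.
type EncryptionInfo struct {
	KMSActive bool   // whether KMS encryption at rest was configured
	KeyID     string // identifier of the KMS key that wrapped the DEKs
}

// KeyChecker reports whether a KMS key is currently available.
// The real check is provider-specific and is only assumed here.
type KeyChecker func(keyID string) bool

// PrecheckRestore fails early when the backup depends on a KMS key
// that is unknown or no longer available.
func PrecheckRestore(info EncryptionInfo, keyAvailable KeyChecker) error {
	if !info.KMSActive {
		return nil // snapshot does not depend on a KMS key
	}
	if info.KeyID == "" {
		return errors.New("backup was taken with KMS active but records no key ID")
	}
	if !keyAvailable(info.KeyID) {
		return fmt.Errorf("KMS key %q is not available; the snapshot would not be restorable", info.KeyID)
	}
	return nil
}

func main() {
	info := EncryptionInfo{KMSActive: true, KeyID: "example-key-id"}
	// Simulate a key that has been deleted or is otherwise unreachable.
	err := PrecheckRestore(info, func(string) bool { return false })
	fmt.Println("precheck result:", err)
}
```

A check like this would turn an opaque restore failure into an explicit, early error, which is the behavior the question asks about.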

2 participants