OPRUN-4099: OLMv1 Deployment Configuration API#1915

Open
oceanc80 wants to merge 5 commits intoopenshift:masterfrom
oceanc80:olmv1-subscription-config

Conversation

@oceanc80

@oceanc80 oceanc80 commented Jan 2, 2026

Enhancement extending OLMv1's ClusterExtension API to support deployment configuration in order to provide feature parity with OLMv0's SubscriptionConfig.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Jan 2, 2026
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 2, 2026
@openshift-ci-robot

openshift-ci-robot commented Jan 2, 2026

@oceanc80: This pull request references OPRUN-4099 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.22.0" version, but no target version was set.


In response to this:

Enhancement extending OLMv1's ClusterExtension API to support deployment configuration in order to provide feature parity with OLMv0's SubscriptionConfig.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci
Contributor

openshift-ci bot commented Jan 2, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

- As a cluster extension admin, I want to attach custom storage volumes to operator pods, so that I can provide persistent storage or configuration files to operators.
- As a cluster extension admin, I want to configure pod affinity rules for operator deployments, so that I can control how operator pods are distributed across cluster nodes.
- As a cluster extension admin, I want to add custom annotations to operator deployments, so that I can integrate with monitoring and observability tools.


I wonder what the story for the Selector is. Perhaps it's there to handle changes to the pod label selector in the operator's controller deployment between versions (the label selector in the deployment spec is immutable). This configuration could provide upgrade continuity across that kind of breaking change.

Author


I could also see it being used for blue/green deployments or other similar deployment strategies.

Member


The selector of a deployment is immutable, iirc. I dove into the history of this field, and it looks like it was basically there from the beginning with no real explanation that I could find, and it has never been honored as far as I can tell.

Chalk it up to how fast and loose the early days of OLM were.

@kuiwang02

@oceanc80 I know the PR is still a WIP, and I'm not sure whether the following comments are in scope for this EP.

If they are out of scope, or if this isn't the right time to raise them, feel free to ignore them:


The deploymentConfig API design looks well thought out. I noticed that the proposal currently focuses on initial installation scenarios, and I was wondering if you could clarify the behavior for runtime configuration updates, which I expect will be a common operational workflow.

Question 1: Modifying Existing deploymentConfig Values

Scenario: After creating a ClusterExtension with deploymentConfig, a user wants to update some values (e.g., changing memory limits from 256Mi to 512Mi, or adding a new nodeSelector).

Could you clarify:

  1. Will users be able to modify spec.config.inline.deploymentConfig values after ClusterExtension creation?
  2. If supported, will the changes automatically reconcile and apply to the existing Deployment?
  3. Which configuration field changes will trigger a pod rolling update?
  4. Are there any fields that don't support runtime updates?

Example:

  # Initial configuration
  deploymentConfig:
    resources:
      limits:
        memory: "256Mi"

  # User updates to:
  deploymentConfig:
    resources:
      limits:
        memory: "512Mi"      # ← modified
    nodeSelector:            # ← added
      infrastructure: "dedicated"

Question 2: Adding deploymentConfig After Creation

Scenario: A user creates a ClusterExtension without defining deploymentConfig initially, then later wants to add deployment configuration.

Could you clarify:

  1. Is it supported to add deploymentConfig to an existing ClusterExtension that was created without it?
  2. If supported, will the changes automatically reconcile and apply to the existing Deployment?
  3. Which newly added configuration fields will trigger a pod rolling update?
  4. Are there any fields that can't be added at runtime?
  5. Similarly, what happens if a user removes deploymentConfig entirely?

Example:

  # Initial creation (no deploymentConfig)
  apiVersion: olm.operatorframework.io/v1
  kind: ClusterExtension
  metadata:
    name: my-operator
  spec:
    source:
      sourceType: Catalog
      catalog:
        packageName: my-operator
    # Note: no config.inline.deploymentConfig

  ---

  # Later, user adds deploymentConfig
  spec:
    config:
      inline:
        deploymentConfig:    # ← newly added
          nodeSelector:
            infrastructure: "dedicated"

@openshift-ci
Contributor

openshift-ci bot commented Jan 28, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign jmguzik for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details: Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@perdasilva

@kuiwang02 let me try to reply to your questions

Question 1: Modifying Existing deploymentConfig Values

  1. Will users be able to modify spec.config.inline.deploymentConfig values after ClusterExtension creation?

From the perspective of OLMv1, bundle configuration is opaque. It will take user input, validate it against the configuration schema provided by the bundle, and apply it to generate the final manifests. So, any configuration can be changed at runtime. This does mean that some user configurations might generate manifests that cannot be applied, or could lead to unintended or bad consequences. If there are errors, the extension will be in a broken state until the configuration is fixed.

  1. If supported, will the changes automatically reconcile and apply to the existing Deployment?

Yes. The Deployment will be regenerated with the new values and applied to the cluster.

  1. Which configuration field changes will trigger a pod rolling update?

Generally, yes. Any change to the pod template should trigger a new ReplicaSet, and the Deployment will roll towards it.

  1. Are there any fields that don't support runtime updates?

This is a good question. I know there are fields in the Deployment spec that are immutable (e.g. the label selector). That's the only one I can think of. I believe the configuration options under the deployment config are all mutable.

Question 2: Adding deploymentConfig After Creation

  1. Is it supported to add deploymentConfig to an existing ClusterExtension that was created without it?

Yes. For the same reasons in Q1.1

  1. If supported, will the changes automatically reconcile and apply to the existing Deployment?

Yes.

  1. Which newly added configuration fields will trigger a pod rolling update?

Yes.

  1. Are there any fields that can't be added at runtime?

I don't think so.

  1. Similarly, what happens if a user removes deploymentConfig entirely?

Then we are back to the Deployment spec defined in the bundle by the author.

The mental model here is really no different than:

  1. Create a Deployment
  2. Modify the deployment

AFAIK only the pod label selector is immutable once set.

@kuiwang02

@perdasilva Thanks for your great reply. I got it.

@oceanc80 oceanc80 marked this pull request as ready for review February 17, 2026 18:07
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 17, 2026
@anik120
Contributor

anik120 commented Feb 17, 2026

@JoelSpeed PTAL, thanks!

@openshift-ci
Contributor

openshift-ci bot commented Feb 17, 2026

@oceanc80: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/markdownlint | fd68cdf | link | true | /test markdownlint |

Full PR test history. Your PR dashboard.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.


## Proposal

This proposal extends the existing ClusterExtension inline configuration structure to support deployment customization by using [OLMv0's](https://github.com/operator-framework/api/blob/master/pkg/operators/v1alpha1/subscription_types.go#L42-L100) `v1alpha1.SubscriptionConfig`. For clarity in the OLMv1 codebase, a type alias, `DeploymentConfig`, will be introduced, since Subscription is a v0 concept only. OLMv1 will directly use the `SubscriptionConfig` type from operator-framework/api to define the `DeploymentConfig` type. This ensures feature parity with OLMv0 by using the same type definition and reduces our maintenance overhead while we navigate a period of simultaneous maintenance of both OLMv0 and OLMv1.
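A minimal sketch of what the proposed alias could look like. The `SubscriptionConfig` stand-in below is illustrative only; the real struct lives in github.com/operator-framework/api and carries many more fields (Env, EnvFrom, Volumes, VolumeMounts, Tolerations, Resources, NodeSelector, Affinity, Annotations, Selector):

```go
package main

import "fmt"

// Hypothetical stand-in for v1alpha1.SubscriptionConfig from
// github.com/operator-framework/api (illustrative field only).
type SubscriptionConfig struct {
	NodeSelector map[string]string `json:"nodeSelector,omitempty"`
}

// DeploymentConfig is a type alias (not a new named type): it is the
// identical type at compile time, so it tracks SubscriptionConfig
// automatically when the dependency is bumped.
type DeploymentConfig = SubscriptionConfig

func main() {
	dc := DeploymentConfig{NodeSelector: map[string]string{"infrastructure": "dedicated"}}
	fmt.Println(dc.NodeSelector["infrastructure"])
}
```

Because it is an alias rather than a distinct defined type, values convert freely in both directions, which is what makes a straight copy from OLMv0 configuration possible.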
Contributor


Sharing types between APIs like this is generally discouraged. Someone may make a change to the SubscriptionConfig that you didn't realise and then you end up either supporting (or breaking) functionality inadvertently in the DeploymentConfig usage.

Is the code shared between v0 and v1 for this part of the deployment strategy, or are you just trying to make sure you support the same set of customizations? Is there anything in the alpha API that you might want to make better from lessons learned as you promote this to v1?

Contributor


@JoelSpeed we had this discussion upstream where I'd proposed a new, completely separate structure for v1, but it was vetoed in favor of keeping v0 and v1 in sync.

Member


The purpose of this enhancement is to provide a feature that is needed for the migration pathway from OLMv0-installed operators to OLMv1 ClusterExtensions. We explicitly need all subscription.spec.config fields and behaviors to be supported in OLMv1 to ensure migration from OLMv0 to OLMv1 is not blocked on a diff between the configuration APIs.

The idea here is that a migration tool would be able to take the subscription.spec.config and copy it into the clusterextension.spec.config.inline.deploymentConfig.

So one way or another the OLMv1 bundle-schema for deploymentConfig needs to include everything that OLMv0 includes for equivalent OCP versions.
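For illustration, the copy such a migration tool might perform could look like this (the package name and label are hypothetical):

```yaml
# OLMv0 Subscription (source)
spec:
  config:
    nodeSelector:
      infrastructure: "dedicated"
---
# OLMv1 ClusterExtension (target): the same object, copied verbatim
# under config.inline.deploymentConfig
spec:
  config:
    inline:
      deploymentConfig:
        nodeSelector:
          infrastructure: "dedicated"
```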


#### Validation Failure

If the deployment configuration fails JSON schema validation:
Contributor


What do you mean by json schema validation here?


Comment on lines +87 to +90
1. The ClusterExtension controller rejects the configuration during admission or runtime validation
2. The ClusterExtension status is updated with a detailed error message indicating which fields failed validation
3. The cluster extension admin corrects the configuration based on the error message
4. The controller retries the installation with the corrected configuration
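As a sketch of step 2, a failed validation might surface on the ClusterExtension roughly like this (condition type, reason, and message wording are illustrative, not the actual API contract):

```yaml
status:
  conditions:
  - type: Progressing
    status: "True"
    reason: Retrying
    message: >-
      error validating configuration against bundle schema:
      deploymentConfig.resources.limits.memory: expected string, got number
```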
Contributor


Why can't you reject invalid configuration at admission time instead of leaving a controller to check this asynchronously?


Hey Joel, this enhancement is related to this previous one, where all of this was discussed. The TL;DR is: OLMv1 aims to get out of the bundle configuration game, so there won't be a one-size-fits-all configuration schema to validate against at ingress time. We want to shift the configuration surface definition to the authors and allow them to provide content-specific knobs. OLMv1 will only validate configuration against a bundle-provided schema.


### API Extensions

The enhancement does not introduce new APIs, CRDs, webhooks, or aggregated API servers. As the inline configuration structure in the ClusterExtension API accepts any valid JSON object, the API will not be changed. This enhancement modifies the existing configuration schema for registry+v1 bundles to accept a `deploymentConfig` field.
Contributor


Yes it does? You're adding a new field with significant structure to the ClusterExtension API as far as I can tell from reading this?

Even if this is a new structure within something that is effectively unstructured data, you are still introducing a new API contract, this should be explained here and reviewed

Can you link to the existing schema and show where it is updated to allow this deploymentConfig field?

Member


The CRD doesn't change. inline remains as is:

https://github.com/operator-framework/operator-controller/blob/1ef820f0ca56126586fca2dc7a422c71edd7deef/api/v1/clusterextension_types.go#L193

There are two lenses to this (and I feel like a lot of this was already discussed in the original EP):

  1. OLMv1 conceptually claims no control over schemas provided by bundles. Bundles define the schemas of their APIs, OLM validates the provided configurations using the bundle-provided schema.
  2. Technically, the registry+v1 schema definition is driven by two things:
    • The install modes of the operator (this dictates the schema of the watch namespace configuration)
    • OLMv0's subscription.spec.config

So this EP is not technically a new contract. It is establishing the OLMv1 implementation of an existing contract (OLMv0's subscription.spec.config)

Follow-up question: Is OCP okay with every single layered product being able to define their own configuration schema in the future? If so, will every layered product be expected to bring their configuration schema to OpenShift's API review? This API change is essentially the equivalent of that.

If that seems untenable, perhaps we should drop the configuration API from OLMv1 entirely and require that layered products have a one-size-fits-all deployment (similar to core payload operators). But that would have major implications on migration from OLMv0 to OLMv1 that PM would have to weigh in on.

The configuration is validated using JSON schema generated from Kubernetes core v1 and apps v1 OpenAPI specifications.
Contributor


These generated schemas are not particularly reliable sources of information, why can you not validate using the internal validation packages within these APIs?

Contributor


They are reliable sources of information in the sense that they're generated from the source itself. ref: operator-framework/operator-controller#2454

I'm not sure what you mean by internal validation packages. I'm guessing the PR makes things more clear?


The `Selector` field in the `SubscriptionConfig` is present but never extracted or used by OLMv0. OLMv1 will maintain this behavior: the field will be accepted but ignored.
Contributor


Accepting but ignoring a field is bad practice. Why not create a new type for the deployment config? It doesn't look like it'll be particularly complex to implement

Contributor


Same answer as above:

we had this discussion upstream where I'd proposed a new, completely separate structure for v1, but it was vetoed in favor of keeping v0 and v1 in sync.

It was discussed that reusing the v0 structure would mean carrying over debts, but the cost of it was assessed to be acceptable for long term maintainability

Member


I agree with @JoelSpeed 's point here. Even if we re-use the type, we are also in control of the schema generation for that type, right? So at a minimum we could specifically exclude that field from the generated schema.

The inline configuration will be validated using JSON schema-based validation. The JSON schema for `DeploymentConfig` will be generated by introspecting the `v1alpha1.SubscriptionConfig` struct. This approach:

- **Ensures parity with OLMv0**: The schema is derived from the exact same type definition used in OLMv0
- **Automatic updates**: When the github.com/operator-framework/api dependency is updated, the schema generation can be re-run to incorporate any new fields
Contributor


How do you know that the schema generated by your v1 tooling is the same as the v0 tooling? What if there's drift between the generators?

Contributor


The generators in v0 and v1 are completely different in nature. In v0 they were part of the API surface; in v1 they are not, i.e. the validation structure differs between v1 and v0, but both are tied together by the underlying SubscriptionConfig structure.

And the v1 generator uses SubscriptionConfig struct to generate the schema.

In other words, this is why it was decided to use the v0 SubscriptionConfig directly, so that any drift in v0 automatically gets tracked in v1.

Note that the schema is not expected to change in v1 (thereby requiring a tracking back of change to v0), as this effort is solely to support old registry+v1 bundles in v1 so that they can be migrated over.

```shell
curl -sSL https://raw.githubusercontent.com/kubernetes/kubernetes/refs/tags/$(OPENAPI_VERSION)/api/openapi-spec/v3/apis__apps__v1_openapi.json > ./pkg/schema/apis__apps__v1_openapi.json
```

This ensures that while the first iteration uses a static snapshot, there is an established, low-effort path to update the schema manifests whenever the project's Kubernetes dependencies are bumped.
Contributor


Will you have checks in place so that people keep this up to date?

Contributor


Yes upstream runs verify every time and this check has been included in that.

## Open Questions / Considerations

### Track changes to underlying kubernetes corev1 structures?
SubscriptionConfig uses many Kubernetes corev1 structures from the standard kube libs. This means the OLMv0 Subscription API tracks changes to those structures (e.g. if a new Volume type is added to the API). We need to decide whether we want the same behavior here, and if so, how to implement it. E.g. we could have a process that downloads and mines the OpenAPI specs for the kube lib version pinned in go.mod, and have `make verify` fail when they change. We'd also want to think about how to handle any CEL expressions in those corev1 structures when validating (and whether we want to handle them at all).
Contributor


Are you doing any processing of these fields, or just setting them directly on the deployment that you're rendering and applying? If you aren't processing them and are just passing them through, then this is probably fine


