OPRUN-4445: add lifecycle-controller and lifecycle-server#1213
OPRUN-4445: add lifecycle-controller and lifecycle-server#1213joelanford wants to merge 5 commits intoopenshift:mainfrom
Conversation
|
@joelanford: This pull request references OPRUN-4445 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
Skipping CI for Draft Pull Request. |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: joelanford The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/test all |
There was a problem hiding this comment.
Pull request overview
This PR introduces a lifecycle-controller and lifecycle-server to manage and serve FBC (File-Based Catalog) lifecycle data over HTTP. The controller watches CatalogSources and dynamically creates lifecycle-server deployments per catalog, while the server provides an authenticated HTTP API for accessing lifecycle information.
Changes:
- Added lifecycle-server package that loads and serves FBC lifecycle data via HTTP with authentication/authorization
- Added lifecycle-controller that watches CatalogSources and manages lifecycle-server deployments, services, and network policies
- Integrated TLS configuration dynamically from OpenShift APIServer resource
- Added build targets, container image updates, and comprehensive RBAC manifests
Reviewed changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
| pkg/lifecycle-server/server.go | HTTP handler for serving lifecycle data with version/package lookup |
| pkg/lifecycle-server/fbc.go | FBC data loader with regex-based schema matching |
| pkg/lifecycle-controller/controller.go | Main controller reconciling CatalogSources and managing lifecycle-server resources |
| cmd/lifecycle-server/main.go, start.go, util.go | CLI for lifecycle-server with TLS, authn/authz, and health endpoints |
| cmd/lifecycle-controller/main.go, start.go, util.go | CLI for lifecycle-controller with leader election and TLS config watching |
| Makefile | Build targets for new lifecycle-controller and lifecycle-server binaries |
| operator-lifecycle-manager.Dockerfile | Container image updates to include new binaries |
| manifests/0000_50_olm_09-lifecycle-server.rbac.yaml | ClusterRole for lifecycle-server (tokenreviews/subjectaccessreviews) |
| manifests/0000_50_olm_08-lifecycle-controller.rbac.yaml | RBAC for lifecycle-controller including cluster-wide resource management |
| manifests/0000_50_olm_08-lifecycle-controller.networkpolicy.yaml | NetworkPolicy restricting lifecycle-controller ingress/egress |
| manifests/0000_50_olm_08-lifecycle-controller.deployment.yaml | Deployment manifest for lifecycle-controller with security constraints |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| catalogDir := "/configs" // default for standard catalog images | ||
| if cs.Spec.GrpcPodConfig != nil && cs.Spec.GrpcPodConfig.ExtractContent != nil && cs.Spec.GrpcPodConfig.ExtractContent.CatalogDir != "" { | ||
| catalogDir = cs.Spec.GrpcPodConfig.ExtractContent.CatalogDir | ||
| } |
There was a problem hiding this comment.
The fbcPath construction concatenates catalogMountPath and catalogDir without ensuring proper path formatting. If catalogDir starts with a slash (e.g., "/configs"), the result would be "/catalog/configs" which is correct, but if it doesn't start with a slash, or if it has a trailing slash, it could lead to malformed paths. Consider using path.Join or ensuring consistent path formatting to avoid potential path construction issues.
| } | |
| } | |
| // Ensure catalogDir has a leading slash so concatenation produces a valid path. | |
| if !strings.HasPrefix(catalogDir, "/") { | |
| catalogDir = "/" + catalogDir | |
| } |
cmd/lifecycle-controller/start.go
Outdated
| func (p *TLSConfigProvider) GetCipherSuites() []string { | ||
| p.mu.RLock() | ||
| defer p.mu.RUnlock() | ||
| return p.config.cipherSuiteStrings |
There was a problem hiding this comment.
The GetCipherSuites method returns a slice that shares the underlying array with the config. If this slice is modified by the caller, it could lead to unexpected behavior or race conditions. Consider returning a defensive copy of the slice to ensure thread safety.
| return p.config.cipherSuiteStrings | |
| if p.config == nil || p.config.cipherSuiteStrings == nil { | |
| return nil | |
| } | |
| ciphers := make([]string, len(p.config.cipherSuiteStrings)) | |
| copy(ciphers, p.config.cipherSuiteStrings) | |
| return ciphers |
| log.Error(err, "failed to delete serviceaccount") | ||
| return err | ||
| } | ||
|
|
There was a problem hiding this comment.
The cleanupResources function does not delete the NetworkPolicy that is created in ensureResources. This will result in orphaned NetworkPolicy resources when a CatalogSource is deleted or no longer matches the selectors. The NetworkPolicy should be deleted along with the Deployment, Service, and ServiceAccount to ensure complete cleanup.
| // Delete NetworkPolicy | |
| np := &networkingv1.NetworkPolicy{ | |
| ObjectMeta: metav1.ObjectMeta{ | |
| Name: name, | |
| Namespace: csNamespace, | |
| }, | |
| } | |
| if err := r.Delete(ctx, np); err != nil && !errors.IsNotFound(err) { | |
| log.Error(err, "failed to delete networkpolicy") | |
| return err | |
| } |
| name = strings.ReplaceAll(name, "_", "-") | ||
| if len(name) > 63 { | ||
| name = name[:63] | ||
| } |
There was a problem hiding this comment.
The resourceName function truncates names to 63 characters but doesn't handle the case where truncation results in a trailing hyphen, which is invalid for Kubernetes resource names. Consider trimming trailing hyphens after truncation to ensure the generated name is always valid.
| } | |
| } | |
| name = strings.TrimRight(name, "-") |
| // LoadLifecycleData loads lifecycle blobs from FBC files at the given path | ||
| func LoadLifecycleData(fbcPath string) (LifecycleIndex, error) { | ||
| result := make(LifecycleIndex) | ||
| var mu sync.Mutex | ||
|
|
||
| // Check if path exists | ||
| if _, err := os.Stat(fbcPath); os.IsNotExist(err) { | ||
| return result, nil | ||
| } | ||
|
|
||
| root := os.DirFS(fbcPath) | ||
| err := declcfg.WalkMetasFS(context.Background(), root, func(path string, meta *declcfg.Meta, err error) error { | ||
| if err != nil { | ||
| return nil // Skip errors, continue walking | ||
| } | ||
| if meta == nil { | ||
| return nil | ||
| } | ||
|
|
||
| // Check if schema matches our pattern | ||
| matches := schemaVersionRegex.FindStringSubmatch(meta.Schema) | ||
| if matches == nil { | ||
| return nil | ||
| } | ||
| schemaVersion := matches[1] // e.g., "v1alpha1" | ||
|
|
||
| if meta.Package == "" { | ||
| return nil | ||
| } | ||
|
|
||
| // Store in index (thread-safe) | ||
| mu.Lock() | ||
| if result[schemaVersion] == nil { | ||
| result[schemaVersion] = make(map[string]json.RawMessage) | ||
| } | ||
| result[schemaVersion][meta.Package] = meta.Blob | ||
| mu.Unlock() | ||
|
|
||
| return nil | ||
| }) | ||
|
|
||
| if err != nil { | ||
| return nil, err | ||
| } | ||
|
|
||
| return result, nil | ||
| } |
There was a problem hiding this comment.
The LoadLifecycleData function has several code paths that should be covered by tests: successful loading, non-existent path, schema matching, and concurrent access. The repository has test coverage for pkg/package-server-manager (e.g., pkg/package-server-manager/controller_test.go), so tests should be added for this new functionality as well.
| // NewHandler creates a new HTTP handler for the lifecycle API | ||
| func NewHandler(data LifecycleIndex, log logr.Logger) http.Handler { | ||
| mux := http.NewServeMux() | ||
|
|
||
| // GET /api/{version}/lifecycles/{package} | ||
| mux.HandleFunc("GET /api/{version}/lifecycles/{package}", func(w http.ResponseWriter, r *http.Request) { | ||
| version := r.PathValue("version") | ||
| pkg := r.PathValue("package") | ||
|
|
||
| // If no lifecycle data is available, return 503 Service Unavailable | ||
| if len(data) == 0 { | ||
| log.V(1).Info("no lifecycle data available, returning 503") | ||
| http.Error(w, "No lifecycle data available", http.StatusServiceUnavailable) | ||
| return | ||
| } | ||
|
|
||
| // Look up version in index | ||
| versionData, ok := data[version] | ||
| if !ok { | ||
| log.V(1).Info("version not found", "version", version, "package", pkg) | ||
| http.NotFound(w, r) | ||
| return | ||
| } | ||
|
|
||
| // Look up package in version | ||
| rawJSON, ok := versionData[pkg] | ||
| if !ok { | ||
| log.V(1).Info("package not found", "version", version, "package", pkg) | ||
| http.NotFound(w, r) | ||
| return | ||
| } | ||
|
|
||
| log.V(1).Info("returning lifecycle data", "version", version, "package", pkg) | ||
|
|
||
| w.Header().Set("Content-Type", "application/json") | ||
| if _, err := w.Write(rawJSON); err != nil { | ||
| log.V(1).Error(err, "failed to write response") | ||
| } | ||
| }) | ||
|
|
||
| return mux | ||
| } |
There was a problem hiding this comment.
The HTTP handler has multiple code paths (no data, version not found, package not found, successful response) that should be covered by tests. The repository has test coverage for pkg/package-server-manager (e.g., pkg/package-server-manager/controller_test.go), so tests should be added for this new functionality as well.
| // resourceName generates a DNS-compatible name for lifecycle-server resources | ||
| func resourceName(csName string) string { | ||
| name := fmt.Sprintf("%s-%s", csName, resourceBaseName) | ||
| name = strings.ReplaceAll(name, ".", "-") | ||
| name = strings.ReplaceAll(name, "_", "-") | ||
| if len(name) > 63 { | ||
| name = name[:63] | ||
| } | ||
| return strings.ToLower(name) | ||
| } |
There was a problem hiding this comment.
The resourceName function has several edge cases (names with dots, underscores, long names, truncation) that should be covered by tests to ensure it always generates valid Kubernetes resource names. The repository has test coverage for pkg/package-server-manager (e.g., pkg/package-server-manager/controller_test.go), so tests should be added for this new functionality as well.
| - name: GOMEMLIMIT | ||
| value: "5MiB" | ||
| resources: | ||
| requests: | ||
| cpu: 10m | ||
| memory: 10Mi |
There was a problem hiding this comment.
The GOMEMLIMIT is set to "5MiB" but the memory request is set to "10Mi". These should be aligned for optimal performance. Typically, GOMEMLIMIT should be set to 90% of the memory limit, or if no limit is set, slightly below the memory request to account for non-Go memory usage. Consider either increasing GOMEMLIMIT closer to 10Mi or adjusting the memory request.
|
/test all |
1 similar comment
|
/test all |
|
@joelanford: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Add openshift/controller-runtime-common for shared TLS controller support, and update openshift/library-go crypto package. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add the lifecycle-server component that serves File-Based Catalog content over HTTPS. The server watches for catalog updates and serves them to OLM for operator discovery. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add the lifecycle-controller that reconciles CatalogSource resources to manage lifecycle-server pods. Includes dynamic TLS configuration via controller-runtime-common, NetworkPolicy management, leader election, and server-side apply for owned resources. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add deployment manifests for lifecycle-controller and lifecycle-server including RBAC, NetworkPolicy, Service, and IBM Cloud managed variants. Update Dockerfile with lifecycle binary targets, Makefile with build rules, CRD manifests with lifecycle annotations, and the manifest generation script for downstream handling. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add comprehensive unit tests covering controller reconciliation, TLS configuration, FBC catalog building, and HTTP server behavior. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
5010575 to
e3064e7
Compare
Add controller that watches CatalogSources and manages lifecycle-server deployments. The server serves FBC catalog content over HTTP.
Includes build targets, container image updates, and RBAC manifests.