Conversation
How to use the Graphite Merge QueueAdd the label main-merge-queue to this PR to add it to the merge queue. You must have a Graphite account in order to use the merge queue. Sign up using this link. An organization admin has required the Graphite Merge Queue in this repository. Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue. This stack of pull requests is managed by Graphite. Learn more about stacking. |
Graphite Automations"Add anton/matt/sergey/kristina as reviwers on operator PRs" took an action on this PR • (01/22/26)2 reviewers were added to this PR based on Anton Bykov's automation. |
6660290 to
295e9ec
Compare
| smbwCluster, err := wekaService.GetSmbwCluster(ctx) | ||
| if err != nil { | ||
| err = errors.Wrap(err, "Failed to get SMB-W cluster") | ||
| return err | ||
| } | ||
|
|
||
| if smbwCluster.Active { | ||
| logger.Info("SMB-W cluster already exists") | ||
| return nil | ||
| } |
There was a problem hiding this comment.
it looks like EnsureSmbwCluster expects smbw cluster to be already created and will otherwise
| func contains(s, substr string) bool { | ||
| return len(s) >= len(substr) && (s == substr || len(s) > 0 && containsIgnoreCase(s, substr)) | ||
| } |
There was a problem hiding this comment.
ah, it's for containsIgnoreCase
then maybe would be good to remove this "contains" func and only keep "containsIgnoreCase"
295e9ec to
5516b89
Compare
dd13517 to
607f4de
Compare
3ecc1ad to
ff27dc2
Compare
ff27dc2 to
4b43eff
Compare
4b43eff to
474d01f
Compare
There was a problem hiding this comment.
Pull request overview
Adds first-class SMB-W (“smbw”) container/cluster support across the operator, including lifecycle reconciliation steps, Weka CLI integrations, and runtime/pod configuration needed for SMB-W.
Changes:
- Extend
WekaService+ CLI implementation with SMB-W cluster operations (create/destroy/join/remove/list + floating IP ranges). - Add SMB-W reconciliation steps in both
WekaCluster(post-create, deletion) andWekaContainer(active join, deletion remove) flows. - Update runtime/pod/helm configuration for SMB-W specifics (mode support in runtime script,
/dev/shmtmpfs sizing, templates/resources).
Reviewed changes
Copilot reviewed 17 out of 17 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| internal/services/weka.go | Adds SMB-W service types and CLI methods for cluster lifecycle + IP ranges. |
| internal/controllers/wekacontainer/metrics_steps.go | Includes SMB-W (and data-services modes) in status metrics polling predicates. |
| internal/controllers/wekacontainer/funcs_active_state_smbw.go | New container-loop helpers for joining/removing SMB-W cluster membership. |
| internal/controllers/wekacontainer/flow_deleting_state.go | Adds deletion step to remove SMB-W containers from the SMB-W cluster; adjusts S3 removal exec selection. |
| internal/controllers/wekacontainer/flow_active_state.go | Adds active-state step to join SMB-W cluster when ready. |
| internal/controllers/wekacluster/steps_post_cluster.go | Adds post-cluster SMB-W steps: destroy/ensure cluster, domain join, configure IP ranges. |
| internal/controllers/wekacluster/steps_deletion.go | Ensures SMB-W containers are paused and absent during cluster finalization. |
| internal/controllers/wekacluster/steps_cluster_creation.go | Includes “smbw” in missing-container build logic (template-driven). |
| internal/controllers/wekacluster/funcs_smbw.go | Implements SMB-W cluster create/destroy, domain join, and floating IP range reconciliation. |
| internal/controllers/wekacluster/funcs_helpers.go | Adds SMB-W container selectors/helpers for cluster loop. |
| internal/controllers/resources/pod.go | Adds /dev/shm tmpfs volume/mount for SMB-W pods; adjusts CPU/memory sizing to account for SMB-W mode. |
| internal/controllers/factory/container_factory.go | Plumbs template knobs into SMB-W container construction (cores/hugepages/extra cores/memory). |
| internal/controllers/allocator/templates.go | Extends templates + dynamic defaults for SMB-W cores/extra cores/hugepages/containers. |
| internal/config/env.go | Adds operator env-config for SMB-W shm sizing + cluster formation constants. |
| charts/weka-operator/values.yaml | Adds helm values for SMB-W shm sizing. |
| charts/weka-operator/templates/manager.yaml | Wires SMBW_SHM_SIZE env var into the manager deployment. |
| charts/weka-operator/resources/weka_runtime.py | Enables “smbw” mode in runtime script paths (creation, resources, shutdown, etc.). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| cmd := fmt.Sprintf("weka smb domain join %s %s", username, password) | ||
| _, stderr, err := executor.ExecSensitive(ctx, "JoinSmbwDomain", []string{"bash", "-ce", cmd}) |
There was a problem hiding this comment.
JoinSmbwDomain builds a shell command string containing the domain-join password and executes it via bash -ce. This is vulnerable to shell injection/breakage if the username/password contain spaces or shell metacharacters, and it also increases the risk of accidental secret exposure through process args/telemetry. Prefer executing the CLI with an argv slice (no shell) or otherwise properly escaping/quoting the credentials so they are not interpreted by a shell.
| cmd := fmt.Sprintf("weka smb domain join %s %s", username, password) | |
| _, stderr, err := executor.ExecSensitive(ctx, "JoinSmbwDomain", []string{"bash", "-ce", cmd}) | |
| _, stderr, err := executor.ExecSensitive(ctx, "JoinSmbwDomain", []string{"weka", "smb", "domain", "join", username, password}) |
|
|
||
| containerId := r.container.Status.ClusterContainerID | ||
| if containerId == nil { | ||
| return errors.New("Container ID is not set") |
There was a problem hiding this comment.
RemoveFromSmbwCluster returns an error when ClusterContainerID is nil. During deletion this can happen for SMB-W containers that never joined the cluster, and this step will block finalization unnecessarily. Align behavior with RemoveFromS3Cluster by treating a nil container ID as a no-op (return nil).
| return errors.New("Container ID is not set") | |
| logger.Info("SMB-W container has no ClusterContainerID, skipping removal") | |
| return nil |
474d01f to
df2b9a0
Compare
df2b9a0 to
981e169
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 21 out of 21 changed files in this pull request and generated 8 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| } | ||
|
|
||
| cmd := []string{ | ||
| "weka", "smb", "cluster", "update", |
There was a problem hiding this comment.
UpdateSmbwCluster uses "weka" while the other SMB-W cluster operations here use "wekaauthcli". This inconsistency is likely to break updates in environments where only the auth-wrapped CLI is available/required. Use the same CLI binary consistently for SMB-W cluster operations (or document/justify why update must use a different binary).
| "weka", "smb", "cluster", "update", | |
| "wekaauthcli", "smb", "cluster", "update", |
| shmSizeLimit := resource.MustParse(config.Config.Smbw.ShmSize) | ||
| vols = append(vols, corev1.Volume{ | ||
| Name: "smbw-shm", | ||
| VolumeSource: corev1.VolumeSource{ | ||
| EmptyDir: &corev1.EmptyDirVolumeSource{ | ||
| Medium: corev1.StorageMediumMemory, | ||
| SizeLimit: &shmSizeLimit, | ||
| }, | ||
| }, | ||
| }) |
There was a problem hiding this comment.
resource.MustParse will panic the controller if SMBW_SHM_SIZE (Helm/environment) is misconfigured. Since this value is user-provided, parse it with error handling and return a reconcile error (and/or surface a condition/event) instead of crashing the manager process.
| shmSizeLimit := resource.MustParse(config.Config.Smbw.ShmSize) | |
| vols = append(vols, corev1.Volume{ | |
| Name: "smbw-shm", | |
| VolumeSource: corev1.VolumeSource{ | |
| EmptyDir: &corev1.EmptyDirVolumeSource{ | |
| Medium: corev1.StorageMediumMemory, | |
| SizeLimit: &shmSizeLimit, | |
| }, | |
| }, | |
| }) | |
| shmSizeLimit, err := resource.ParseQuantity(config.Config.Smbw.ShmSize) | |
| if err == nil { | |
| vols = append(vols, corev1.Volume{ | |
| Name: "smbw-shm", | |
| VolumeSource: corev1.VolumeSource{ | |
| EmptyDir: &corev1.EmptyDirVolumeSource{ | |
| Medium: corev1.StorageMediumMemory, | |
| SizeLimit: &shmSizeLimit, | |
| }, | |
| }, | |
| }) | |
| } |
| } | ||
|
|
||
| // EnsureSmbwClusterUpdated updates the SMB-W cluster configuration for fields that support updates. | ||
| // This includes profile, encryption, and IP pools/ranges. |
There was a problem hiding this comment.
The comment states updates include profile, but SmbwUpdateParams and the populated updateParams do not include it. Either implement profile updates end-to-end (spec → params → CLI flag) or update the comment/docs to match actual behavior to avoid misleading operators.
| // This includes profile, encryption, and IP pools/ranges. | |
| // This includes encryption and IP pools/ranges. |
| ClusterName: clusterName, | ||
| Encryption: r.cluster.Spec.SmbwConfig.Encryption, | ||
| IpPools: r.cluster.Spec.SmbwConfig.IpPools, | ||
| IpRanges: r.cluster.Spec.SmbwConfig.IpRanges, |
There was a problem hiding this comment.
The comment states updates include profile, but SmbwUpdateParams and the populated updateParams do not include it. Either implement profile updates end-to-end (spec → params → CLI flag) or update the comment/docs to match actual behavior to avoid misleading operators.
| IpRanges: r.cluster.Spec.SmbwConfig.IpRanges, | |
| IpRanges: r.cluster.Spec.SmbwConfig.IpRanges, | |
| Profile: r.cluster.Spec.SmbwConfig.Profile, |
| } | ||
|
|
||
| func (r *containerReconcilerLoop) RemoveFromSmbwCluster(ctx context.Context) error { | ||
| ctx, logger, end := instrumentation.GetLogSpan(ctx, "") |
There was a problem hiding this comment.
Passing an empty span/logger name makes tracing and log correlation significantly harder and typically results in unnamed spans. Use a stable operation name (e.g., the function name) so SMB-W removal operations are observable in traces/logs.
| ctx, logger, end := instrumentation.GetLogSpan(ctx, "") | |
| ctx, logger, end := instrumentation.GetLogSpan(ctx, "RemoveFromSmbwCluster") |
| if containerId == nil { | ||
| return errors.New("Container ID is not set") | ||
| } |
There was a problem hiding this comment.
Go error strings should generally be lower-case and not start with a capital letter (so they read naturally when wrapped). Consider changing these to container ID is not set / no active container found (and similarly in other new code paths added in this PR).
| if executeInContainer == nil { | ||
| return errors.New("No active container found") | ||
| } |
There was a problem hiding this comment.
Go error strings should generally be lower-case and not start with a capital letter (so they read naturally when wrapped). Consider changing these to container ID is not set / no active container found (and similarly in other new code paths added in this PR).
| // Don't fail if SMB-W cluster doesn't exist | ||
| if err.Error() == "SMB-W cluster is not configured" { | ||
| logger.Info("SMB-W cluster is not configured, skipping removal", "container_id", *containerId) | ||
| return nil | ||
| } |
There was a problem hiding this comment.
Comparing err.Error() to a specific string is brittle and can easily break if the upstream error message changes or if wrapping is introduced. Prefer returning/propagating a typed/sentinel error from RemoveFromSmbwCluster (e.g., var ErrSmbwNotConfigured = ... or a custom error type) and matching it with errors.Is/errors.As.

No description provided.