Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
Welcome @viragvoros! |
|
Thanks @viragvoros 🙂 |
|
@Kumm-Kai So an entirely new retry period variable would be better maybe? |
I would say that both approaches are valid, but at least from our POV, as we are only interested in the |
thiyyakat
left a comment
There was a problem hiding this comment.
Hi! Thanks for the PR. Just a few changes to maintain consistency with existing code. PTAL.
| TargetKubeconfig string | ||
| ControlKubeconfig string | ||
| TargetKubeconfig string | ||
| ResourceExhaustedRetry string |
There was a problem hiding this comment.
This field can be added to MachineControllerManagerConfiguration in pkg/options/types.go instead.
There was a problem hiding this comment.
Hi @thiyyakat, thank you for the review :) I added the changes. In this comment, did you mean the pkg/util/provider/options/types.go MachineControllerConfiguration? The variable was intended to be introduced only in the MachineController not in the outer main.
There was a problem hiding this comment.
Hi! Yes, you're absolutely right. Apologies.
| fs.StringVar(&s.NodeConditions, "node-conditions", s.NodeConditions, "List of comma-separated/case-sensitive node-conditions which when set to True will change machine to a failed state after MachineHealthTimeout duration. It may further be replaced with a new machine if the machine is backed by a machine-set object.") | ||
| fs.StringVar(&s.BootstrapTokenAuthExtraGroups, "bootstrap-token-auth-extra-groups", s.BootstrapTokenAuthExtraGroups, "Comma-separated list of groups to set bootstrap token's \"auth-extra-groups\" field to") | ||
|
|
||
| fs.StringVar(&s.ResourceExhaustedRetry, "resource-exhausted-retry", "", "Retry duration used when machine creation fails with ResourceExhausted. Defaults to LongRetry.") |
There was a problem hiding this comment.
This can be of type fs.DurationVar like so:
fs.DurationVar(&s.ResourceExhaustedRetryDuration.Duration, "resource-exhausted-retry", time.Duration(machineutils.LongRetry), "Retry duration used when machine creation fails with ResourceExhausted. Defaults to LongRetry.")
This way the default is set to machineutils.LongRetry here itself.
Please also add validation for this option below in the Validate() method.
if s.ResourceExhaustedRetryDuration.Duration < 0 { errs = append(errs, fmt.Errorf("resource exhausted retry duration should be a non negative value: got: %v", s.ResourceExhaustedRetryDuration.Duration)) }
This PR makes the retry period used when machine creation fails with a
codes.ResourceExhaustederror configurable.PR #981 introduced the
LongRetryretry period (10 minutes) to handle situations where machine creation fails due to exhausted resources. However, in certain environments this retry duration may still be insufficient.To improve flexibility, this PR allows operators to adjust the retry period used specifically for the
ResourceExhaustederror case. The default behavior stays unchanged: if no value is provided as a a command-line flag (--resource-exhausted-retry) to the machine-controller-manager-provider, the controller continues to usemachineutils.LongRetry.Which issue(s) this PR fixes:
Fixes #977
Extends logic introduced in PR #981 to make the retry duration configurable.
Special notes for your reviewer:
The implementation keeps the existing default behavior unchanged:
ResourceExhaustedretry period defaults tomachineutils.LongRetry.--resource-exhausted-retryCLI flag in machine-controller-manager-provider.No changes are required for existing deployments unless operators want to configure the retry duration.
Release note:
@Kumm-Kai @hasit97