OCPBUGS-65896: controller/workload: Rework Degraded/Progressing conditions #2128
tchap wants to merge 1 commit into openshift:master
Conversation
@tchap: This pull request references Jira Issue OCPBUGS-65896, which is valid. 3 validations were run on this bug.
No GitHub users were found matching the public email listed for the QA contact in Jira (ksiddiqu@redhat.com), skipping review request. The bug has been updated to refer to the pull request using the external bug tracker.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
/cc @atiratree
```go
// Update is done when all pods have been updated to the latest revision
// and the deployment controller has reported NewReplicaSetAvailable
workloadIsBeingUpdated := !workloadAtHighestGeneration || !hasDeploymentProgressed(workload.Status)
workloadIsBeingUpdatedTooLong := v1helpers.IsUpdatingTooLong(previousStatus, *deploymentProgressingCondition.Type)
```
We don't use IsUpdatingTooLong any more; we base the check on ProgressDeadlineExceeded from the deployment controller.
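A check like the one described in this comment can be sketched as follows. This is a minimal illustration, not the PR's actual code: the types are trimmed stand-ins for the `k8s.io/api/apps/v1` condition fields, and the helper name `hasDeploymentTimedOutProgressing` is hypothetical.

```go
package main

import "fmt"

// DeploymentCondition is a trimmed stand-in for appsv1.DeploymentCondition
// (assumed shape; only the fields this sketch consults).
type DeploymentCondition struct {
	Type   string // e.g. "Progressing"
	Status string // "True" / "False"
	Reason string // e.g. "NewReplicaSetAvailable", "ProgressDeadlineExceeded"
}

// DeploymentStatus is a trimmed stand-in for appsv1.DeploymentStatus.
type DeploymentStatus struct {
	Conditions []DeploymentCondition
}

// hasDeploymentTimedOutProgressing (hypothetical helper) reports whether the
// deployment controller itself declared the rollout stuck: the Progressing
// condition is False with reason ProgressDeadlineExceeded.
func hasDeploymentTimedOutProgressing(status DeploymentStatus) bool {
	for _, c := range status.Conditions {
		if c.Type == "Progressing" {
			return c.Status == "False" && c.Reason == "ProgressDeadlineExceeded"
		}
	}
	return false
}

func main() {
	stuck := DeploymentStatus{Conditions: []DeploymentCondition{
		{Type: "Progressing", Status: "False", Reason: "ProgressDeadlineExceeded"},
	}}
	fmt.Println(hasDeploymentTimedOutProgressing(stuck)) // true
}
```

Relying on the deployment controller's own verdict avoids re-deriving a timeout from observed timestamps, which is what `IsUpdatingTooLong` previously did.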
Let's also set `.spec.progressDeadlineSeconds` to 15m on the consumer workloads to honor the superseded progressingConditionTimeout value.
I don't think we can do that here. Sync from the delegate already returns the synchronized deployment, so we would have to send another update request, I guess.
We can align this in auth when updating library-go. I will actually open a PR there once this is accepted to run CI anyway.
Yup, I meant that we should update it in the consumer repos like auth.
Might be better to open a proof PR before the library-go changes are merged.
atiratree left a comment:
Just one nit, otherwise LGTM, thanks!
So that scaling does not automatically set Progressing/Degraded, condition handling is reworked to no longer consult the Deployment generation. Degraded is set only when the Deployment is not Available or it has timed out progressing. Progressing is set to True when the Deployment has not reached NewReplicaSetAvailable and has not timed out progressing.
Force-pushed from 9a05b4b to 0b782a2.
@tchap: all tests passed! Full PR test history. Your PR dashboard.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
/lgtm
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: atiratree, tchap
The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files. Approvers can indicate their approval by writing
So that scaling does not automatically set Progressing/Degraded, condition handling is reworked to no longer consult the Deployment generation.
Degraded is set only when the deployment is not Available or it has timed out progressing. This is less strict than before, but checking available replicas would mean we would automatically go Degraded on scaling.
Progressing is set to True when the deployment has not yet reached NewReplicaSetAvailable and has not timed out progressing.
I also improved the overall test case names in a separate commit.
This affects authentication-operator and openshift-apiserver-operator.
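The reworked rules described in this PR can be sketched roughly as follows. This is an illustration of the described semantics only, not the merged code: the `Cond` type is a trimmed stand-in for `appsv1.DeploymentCondition`, and `degradedProgressing` is a hypothetical helper name.

```go
package main

import "fmt"

// Cond is a trimmed stand-in for appsv1.DeploymentCondition
// (assumed shape; only the fields the sketch consults).
type Cond struct {
	Type, Status, Reason string
}

func find(conds []Cond, t string) *Cond {
	for i := range conds {
		if conds[i].Type == t {
			return &conds[i]
		}
	}
	return nil
}

// degradedProgressing (hypothetical helper) applies the reworked rules:
//   Degraded    = the deployment is not Available, or it timed out progressing.
//   Progressing = the deployment has not reached NewReplicaSetAvailable
//                 and has not timed out progressing.
func degradedProgressing(conds []Cond) (degraded, progressing bool) {
	available := find(conds, "Available")
	prog := find(conds, "Progressing")

	isAvailable := available != nil && available.Status == "True"
	timedOut := prog != nil && prog.Status == "False" && prog.Reason == "ProgressDeadlineExceeded"
	rolledOut := prog != nil && prog.Status == "True" && prog.Reason == "NewReplicaSetAvailable"

	degraded = !isAvailable || timedOut
	progressing = !rolledOut && !timedOut
	return degraded, progressing
}

func main() {
	// Mid-rollout: available, but the new replica set is not fully rolled out.
	d, p := degradedProgressing([]Cond{
		{Type: "Available", Status: "True"},
		{Type: "Progressing", Status: "True", Reason: "ReplicaSetUpdated"},
	})
	fmt.Println(d, p) // false true
}
```

Note how plain scaling leaves both conditions untouched under these rules: scaling changes replica counts and the Deployment generation, but neither the Available condition nor the Progressing reason flips to a degraded state.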