Merged
Conversation
Reviewer's GuideReverts the recent fail-fast behavior in pod shell command execution, restores executeWithRetry semantics to match execute (returning when either stdout or stderr is available), adds a bash-specific retry helper, and adjusts retry loop and websocket failure handling to allow startup grace periods. Sequence diagram for executeWithRetry and websocket failure handlingsequenceDiagram
actor Caller
participant PodShell
participant PodWebSocketListener
Caller->>PodShell: executeWithRetry(retryCount, retryDelay, commands)
loop attempts 1..retryCount
PodShell->>PodShell: execute(commands)
alt stdout or stderr present
PodShell-->>Caller: PodShellOutput
else WaiterException thrown
PodShell->>PodShell: log warn and continue
end
alt attempt < retryCount
PodShell->>PodShell: Thread.sleep(retryDelay)
end
end
alt no attempt produced output or error and no success
PodShell-->>Caller: throws WaiterException
end
PodWebSocketListener-->>PodShell: onFailure(throwable, response)
PodShell->>PodShell: ignore failure (do not set executionDone)
Class diagram for updated PodShell shell execution behaviorclassDiagram
class PodShell {
+PodShellOutput executeWithBash(String command)
+PodShellOutput executeWithBashRetry(String command)
+PodShellOutput execute(String commands)
+PodShellOutput executeWithRetry(int retryCount, int retryDelay, String commands)
}
class PodWebSocketListener {
+void onOpen()
+void onFailure(Throwable throwable, Response response)
}
PodShell --> PodWebSocketListener : uses
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Hey - I've found 1 issue, and left some high level feedback:
- In
onFailure, completely suppressing both the log and theexecutionDonechange may make debugging connection/exec issues harder; consider keeping at least a warning log while still avoiding fail-fast behavior. - In
executeWithRetry, the finalRuntimeExceptionthrown onInterruptedExceptionloses the original cause; consider including theInterruptedExceptionas the cause to aid troubleshooting. - The repeated checks for non-empty stdout/stderr in
executeWithRetrycould be factored into a small helper method to keep the retry loop logic simpler and more readable.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `onFailure`, completely suppressing both the log and the `executionDone` change may make debugging connection/exec issues harder; consider keeping at least a warning log while still avoiding fail-fast behavior.
- In `executeWithRetry`, the final `RuntimeException` thrown on `InterruptedException` loses the original cause; consider including the `InterruptedException` as the cause to aid troubleshooting.
- The repeated checks for non-empty stdout/stderr in `executeWithRetry` could be factored into a small helper method to keep the retry loop logic simpler and more readable.
## Individual Comments
### Comment 1
<location> `core/src/main/java/cz/xtf/core/openshift/PodShell.java:87-96` </location>
<code_context>
} catch (WaiterException e) {
</code_context>
<issue_to_address>
**suggestion:** Final WaiterException loses information about the last underlying failure
With `lastSuccessfulResult` you now throw a generic `WaiterException` whenever no attempt succeeded (`lastSuccessfulResult == null`), losing the details of the last underlying `WaiterException`. Consider keeping a `lastException` and either rethrowing it or setting it as the cause of the final `WaiterException` so callers still see the specific failure reason.
Suggested implementation:
```java
} catch (WaiterException e) {
// If this was the last attempt, rethrow the underlying exception so callers
// can see the specific failure reason instead of a generic WaiterException.
if (attempt >= retryCount) {
throw e;
}
log.warn("Attempt {}/{} failed for command execution on pod {}: {}",
attempt, retryCount, podName, e.getMessage());
}
```
This change assumes:
1. The surrounding method either already declares `throws WaiterException` or handles it at a higher level.
2. `attempt` and `retryCount` are 1-based counters such that `attempt >= retryCount` correctly identifies the final attempt. If they are 0-based or calculated differently, adjust the condition accordingly (e.g. `attempt + 1 >= retryCount`).
If there is still a generic `WaiterException` being thrown elsewhere when `lastSuccessfulResult == null`, you may want to remove or simplify that logic since the last underlying `WaiterException` will now be rethrown on the final attempt.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
7e97a5e to
7a21f99
Compare
Contributor
Author
|
FYI i removed outdated |
7a21f99 to
c8668e7
Compare
mnovak
approved these changes
Feb 2, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Change of behavior was introduced in #634. This PR reverts that change and realigns executeWithRetry with the existing behavior of execute, which returns as soon as either stdout or stderr is present.
In the current implementation, setting
executionDone.set(true)inonFailurecauses fail-fast behavior. This breaks startup scenarios (for example, when a pod is still initializing). Previously, there was a one-minute wait window during which the pod could become ready and execute successfully.Summary by Sourcery
Revert the recent change to pod shell execution retry behavior to again wait for meaningful command output instead of failing fast during startup.
New Features:
Bug Fixes:
Enhancements: