Fix WhenAllTask crash when children complete after fail-fast #123

Open
YunchuWang wants to merge 1 commit into main from copilot-finds/bug/fix-whenall-child-completion-after-fail-fast

YunchuWang (Member) commented Mar 3, 2026

Fix #120
WhenAllTask.onChildCompleted() had two bugs:

  1. Threw 'Task is already completed' when a child completed after the WhenAllTask had already failed via fail-fast. This crashed orchestrations when multiple activities were in a WhenAll and one failed while others completed in the same or subsequent event batch.

  2. Fell through from the fail-fast block to the result-collection block when the failing task was the last child to complete, causing getResult() to throw on the failed task.

Fixes:

  • Change throw to return in onChildCompleted when already complete
  • Add return after fail-fast to prevent fall-through to getResult()
  • Add _isComplete guard in RuntimeOrchestrationContext.resume() to prevent attempting to resume a finished generator
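
The first two fixes can be illustrated with a minimal sketch. `SketchTask` and `SketchWhenAll` are hypothetical, simplified stand-ins written for this example (they are not the actual classes in `when-all-task.ts`), and the field names are assumptions; only the early `return` on an already-complete composite and the `return` after fail-fast mirror the changes described above.

```typescript
// Hypothetical, simplified stand-ins for illustration only;
// these are NOT the actual durabletask-js classes.
class SketchTask {
  isComplete = false;
  isFailed = false;
  result: unknown = undefined;
  error: Error | undefined = undefined;
  parent: SketchWhenAll | undefined = undefined;

  complete(result: unknown): void {
    if (this.isComplete) throw new Error("Task is already completed");
    this.result = result;
    this.isComplete = true;
    if (this.parent) this.parent.onChildCompleted(this);
  }

  fail(error: Error): void {
    if (this.isComplete) throw new Error("Task is already completed");
    this.error = error;
    this.isFailed = true;
    this.isComplete = true;
    if (this.parent) this.parent.onChildCompleted(this);
  }

  getResult(): unknown {
    if (this.isFailed) throw this.error;
    return this.result;
  }
}

class SketchWhenAll extends SketchTask {
  children: SketchTask[] = [];
  completedCount = 0;

  constructor(children: SketchTask[]) {
    super();
    this.children = children;
    for (const child of children) child.parent = this;
  }

  onChildCompleted(child: SketchTask): void {
    // Fix 1: a completion arriving after the composite finished is ignored
    // instead of throwing 'Task is already completed'.
    if (this.isComplete) return;

    this.completedCount++;

    if (child.isFailed) {
      // Fail fast: the first failed child fails the composite.
      this.error = child.error;
      this.isFailed = true;
      this.isComplete = true;
      // Fix 2: return here so a failing *last* child never reaches the
      // result-collection block below, where getResult() would throw.
      return;
    }

    if (this.completedCount === this.children.length) {
      this.result = this.children.map((c) => c.getResult());
      this.isComplete = true;
    }
  }
}

const a = new SketchTask();
const b = new SketchTask();
const c = new SketchTask();
const all = new SketchWhenAll([a, b, c]);

a.complete(1);
b.fail(new Error("boom")); // composite fails fast here
c.complete(3);             // late completion is ignored, no crash

console.log(all.isFailed, all.error?.message); // prints: true boom
```

In this sketch the composite keeps the error from the first failing child, and later completions leave its state untouched.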

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 3, 2026 20:40
Copilot AI left a comment

Pull request overview

Fixes a crash in the core task-composition logic (whenAll) by making WhenAllTask resilient to child completions that arrive after the composite has already fail-fast completed, and by preventing the runtime from trying to resume an already-finished orchestrator generator.

Changes:

  • Update WhenAllTask.onChildCompleted() to ignore completions after completion, and to return immediately after fail-fast to prevent fall-through into result collection.
  • Add an _isComplete guard in RuntimeOrchestrationContext.resume() to avoid resuming finished orchestrations.
  • Add regression tests covering fail-fast + late completions and the “last completion is the failing one” scenario.
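
The `_isComplete` guard in `resume()` can be sketched as follows. `SketchContext` and `sketchOrchestrator` are hypothetical names for this example, not the real `RuntimeOrchestrationContext`; the only part taken from the change list is the early return when `_isComplete` is set. The sketch relies on standard generator behavior: calling `next()` on a finished generator returns `{ value: undefined, done: true }`, so without the guard this simplified context would overwrite its recorded output.

```typescript
// Hypothetical, simplified context for illustration only;
// this is NOT the actual RuntimeOrchestrationContext.
function* sketchOrchestrator(): Generator<string, string, unknown> {
  const input = yield "waiting-for-task";
  return `done:${String(input)}`;
}

class SketchContext {
  _isComplete = false;
  output: string | undefined = undefined;
  resumeCount = 0;
  generator: Generator<string, string, unknown>;

  constructor(generator: Generator<string, string, unknown>) {
    this.generator = generator;
  }

  resume(value: unknown): void {
    // Guard: once the orchestration has finished, do not touch the generator.
    if (this._isComplete) return;

    this.resumeCount++;
    const step = this.generator.next(value);
    if (step.done) {
      this._isComplete = true;
      this.output = step.value;
    }
  }
}

const ctx = new SketchContext(sketchOrchestrator());
ctx.resume(undefined); // runs to the first yield
ctx.resume("a");       // generator returns, context is marked complete
ctx.resume("late");    // ignored by the guard

console.log(ctx.output, ctx.resumeCount); // prints: done:a 2
```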

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| packages/durabletask-js/src/task/when-all-task.ts | Prevents fail-fast whenAll from throwing on later child completions and avoids fall-through into getResult() after a failure. |
| packages/durabletask-js/src/worker/runtime-orchestration-context.ts | Prevents attempts to resume a generator after orchestration completion. |
| packages/durabletask-js/test/orchestration_executor.spec.ts | Adds targeted regression coverage for whenAll fail-fast edge cases and caught-failure behavior. |

Development

Successfully merging this pull request may close these issues.

[copilot-finds] Bug: WhenAllTask crashes when children complete after fail-fast

2 participants