Bug: Dual content emission causes frontend truncation — content_block:end + token yield deliver same text twice #225

@michaeljabbour

Description

Summary

The loop-streaming orchestrator emits the same response text through two redundant pipelines, causing downstream consumers that read both channels to accumulate duplicated content.

Reproduction

When extended thinking with interleaved content is enabled (interleaved-thinking-2025-05-14), the orchestrator:

  1. Phase 1 (events): Iterates content blocks and emits content_block:start + content_block:end with full block text
  2. Phase 2 (yield): Re-yields the same response.text token-by-token as content_block:delta events

Frontend consumers that handle both content_block:end (writing the full text to message.content) and delta events (appending tokens to message.content) end up with message.content longer than the actual response. In the observed case, 2,009 chars of actual content were inflated to 2,121 chars in message.content, a surplus of 112 chars.
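
The duplication can be illustrated with a minimal sketch. Everything here is a hypothetical model, not the actual orchestrator or Kepler code: the event dict shape, the `Message` class, `apply_event`, and `dual_emission` are assumptions made for illustration. It shows the worst case, where a consumer that writes on content_block:end and appends on content_block:delta ends up with the full text twice. The exact surplus in a real session depends on how the consumer handles each event and how many blocks are involved.

```python
from dataclasses import dataclass


@dataclass
class Message:
    content: str = ""


def dual_emission(text: str, chunk_size: int = 4) -> list[dict]:
    """Model of the reported behavior (hypothetical event shape):
    phase 1 sends the full text on content_block:end,
    phase 2 re-sends the same text as content_block:delta chunks."""
    events = [
        {"type": "content_block:start"},
        {"type": "content_block:end", "text": text},
    ]
    for i in range(0, len(text), chunk_size):
        events.append({"type": "content_block:delta", "text": text[i:i + chunk_size]})
    return events


def apply_event(message: Message, event: dict) -> None:
    """Hypothetical consumer that reads both channels."""
    if event["type"] == "content_block:end":
        # Channel 1: write the full block text.
        message.content = event["text"]
    elif event["type"] == "content_block:delta":
        # Channel 2: append each streamed chunk.
        message.content += event["text"]


response_text = "Revenue and churn are the two KPIs I would start with."
message = Message()
for event in dual_emission(response_text):
    apply_event(message, event)

# The consumer now holds more text than the response actually contained.
print(len(message.content) > len(response_text))
```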

Impact

The Kepler desktop frontend has a liveTextTail guard in ChatMessage.tsx that detects the mismatch between contentParts text and message.content length, then creates a phantom tail fragment. The user sees only the last ~112 characters of a full response (starting mid-word), with the rest hidden behind tool call timeline entries.

Observed: User asked for dashboard KPI recommendations. Full response was 2,009 chars with tables and analysis. User saw only: "d I'll start there. Also happy to hear if you have other specific KPIs or filters in mind that I haven't listed."

Expected Behavior

Each content block should be delivered through one channel, not both:

  • Streaming mode: Yield tokens (with content_block:delta for observers), then emit content_block:end as a finalization signal that carries metadata only, not a second copy of the content
  • Non-streaming fallback: Emit content_block:start + content_block:end with full content
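
A sketch of the single-channel policy described above, under stated assumptions: `emit_block`, `collect`, the event dict shape, the `index` and `length` fields, and emitting content_block:start in streaming mode are all hypothetical and are not taken from the amplifier-module-loop-streaming source. The point is only that a consumer reading every event sees each character exactly once in either mode.

```python
from typing import Iterable, Iterator


def emit_block(text: str, index: int, streaming: bool, chunk_size: int = 4) -> Iterator[dict]:
    """Deliver one content block through exactly one channel (hypothetical)."""
    # Assumption: start is emitted in both modes.
    yield {"type": "content_block:start", "index": index}
    if streaming:
        # Streaming mode: the text travels only as deltas.
        for i in range(0, len(text), chunk_size):
            yield {"type": "content_block:delta", "index": index, "text": text[i:i + chunk_size]}
        # Finalization signal: metadata only, no copy of the text.
        yield {"type": "content_block:end", "index": index, "length": len(text)}
    else:
        # Non-streaming fallback: the text travels only on end.
        yield {"type": "content_block:end", "index": index, "text": text}


def collect(events: Iterable[dict]) -> str:
    """Naive consumer that concatenates every text payload it sees."""
    return "".join(event.get("text", "") for event in events)


sample = "Each block is delivered once."
print(collect(emit_block(sample, 0, streaming=True)) == sample)
print(collect(emit_block(sample, 0, streaming=False)) == sample)
```

With this shape, even a consumer that handles both content_block:end and content_block:delta cannot accumulate duplicates, because only one of the two event types carries text in a given mode.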

Architectural Note

Per the amplifier-core design philosophy ("Explicit > implicit", "No hidden state"), consumers should not need to deduplicate content from the orchestrator. The orchestrator owns delivery policy and should own delivery correctness.

Workaround

Kepler desktop is applying a temporary frontend guard: only create liveTextTail during active streaming (message.isStreaming === true), not for finalized messages.
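
The guard can be modeled roughly as follows. This is a Python sketch of the logic only; the real guard lives in ChatMessage.tsx and is not reproduced here. The function name `live_text_tail`, its parameters, and the slice-based tail computation are assumptions inferred from the description above.

```python
from typing import Optional


def live_text_tail(content: str, parts_text: str, is_streaming: bool) -> Optional[str]:
    """Hypothetical model of the workaround.

    content     -- accumulated message.content
    parts_text  -- text already rendered from contentParts
    Returns the extra tail to render, or None when no tail should be shown.
    """
    if not is_streaming:
        # Workaround: never create a tail for finalized messages.
        return None
    if len(content) > len(parts_text):
        # Assumed tail computation: whatever content extends past the parts text.
        return content[len(parts_text):]
    return None


# Lengths from the observed case: 2,121 chars in message.content vs 2,009 actual.
inflated = "x" * 2121
actual = "x" * 2009
print(live_text_tail(inflated, actual, is_streaming=False))
```

For a finalized message the function returns None, so no phantom tail is created even when message.content is inflated. During active streaming the same inputs would still produce a tail, which is why this is a temporary guard and not a fix for the duplicate emission itself.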

Environment

  • amplifier-module-loop-streaming (latest via git)
  • amplifier-module-provider-anthropic with interleaved thinking enabled
  • Kepler desktop sidecar (amplifier-distro-kepler)

Metadata

Labels: bug
