
Auto-finalize ResponseStream on iteration completion #4478

Open
giles17 wants to merge 5 commits into microsoft:main from giles17:multi-turn-samples

Conversation

giles17 (Contributor) commented Mar 4, 2026

Summary

Auto-finalize ResponseStream when async iteration completes, so that result hooks (including session history persistence) run without requiring an explicit get_final_response() call.

Previously, streaming multi-turn conversations would lose context between turns because the _post_hook result hook — which saves messages to the session via InMemoryHistoryProvider — only ran inside get_final_response(). Users who simply iterated the stream with async for had no history persisted.

Changes

_types.py: ResponseStream

  • __anext__: Call get_final_response() on StopAsyncIteration (after cleanup hooks) to auto-trigger finalization
  • get_final_response(): Re-check _finalized after iteration to avoid double-finalization when __anext__ already finalized; guard inner stream finalization with _finalized check for wrapped/mapped streams
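
The two bullets above can be illustrated with a minimal, self-contained sketch. This is not the framework's code: `SketchStream`, `fake_chunks`, `demo`, and the `_exhausted` flag are hypothetical names invented for illustration, and the real `ResponseStream` has cleanup hooks, a finalizer, and wrapped/mapped streams that are omitted here. Only the `_finalized` flag and the `get_final_response()` method name come from the description.

```python
import asyncio
from typing import AsyncIterator, Callable, Generic, TypeVar

T = TypeVar("T")


class SketchStream(Generic[T]):
    """Hypothetical stand-in for ResponseStream (illustration only)."""

    def __init__(
        self,
        source: AsyncIterator[T],
        result_hooks: list[Callable[[list[T]], None]],
    ) -> None:
        self._source = source
        self._result_hooks = result_hooks
        self._chunks: list[T] = []
        self._finalized = False
        self._exhausted = False  # assumption: tracks whether the source is drained

    def __aiter__(self) -> "SketchStream[T]":
        return self

    async def __anext__(self) -> T:
        try:
            chunk = await self._source.__anext__()
        except StopAsyncIteration:
            self._exhausted = True
            # Auto-finalize: run result hooks as soon as iteration completes.
            await self.get_final_response()
            raise
        self._chunks.append(chunk)
        return chunk

    async def get_final_response(self) -> list[T]:
        if not self._finalized and not self._exhausted:
            # Drain the remaining chunks; __anext__ finalizes on exhaustion.
            async for _ in self:
                pass
        # Re-check after iteration so hooks never run twice.
        if not self._finalized:
            self._finalized = True
            for hook in self._result_hooks:
                hook(self._chunks)
        return self._chunks


async def fake_chunks() -> AsyncIterator[str]:
    """Hypothetical chunk source standing in for a model response."""
    for text in ("Hello", ", ", "Alice"):
        yield text


async def demo() -> tuple[list[str], list[str]]:
    history: list[str] = []  # stands in for session history
    stream = SketchStream(fake_chunks(), [lambda c: history.append("".join(c))])
    async for _ in stream:
        pass
    # The hook has already run; an explicit call is now a no-op for hooks.
    final = await stream.get_final_response()
    return history, final


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In this sketch the hook runs exactly once whether the caller only iterates, only calls `get_final_response()`, or does both, which is the idempotence the new tests are described as checking.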

test_types.py

  • test_auto_finalize_on_iteration_completion — stream is finalized after async for
  • test_auto_finalize_runs_result_hooks — result hooks run without explicit get_final_response()
  • test_get_final_response_idempotent_after_auto_finalize — finalizer runs only once

test_agents.py

  • test_chat_client_agent_streaming_session_history_saved_without_get_final_response — session history is persisted after streaming iteration without get_final_response()

Closes #4447

- Rename 03_multi_turn.py to 03a_multi_turn.py
- Add 03b_multi_turn_streaming.py showing streaming with session history
- The new sample demonstrates calling get_final_response() after
  iterating the stream to persist conversation history
- Update READMEs to reflect the new file names

Closes microsoft#4447

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 4, 2026 19:54
markwallace-microsoft added the documentation and python labels Mar 4, 2026
github-actions bot changed the title from "Add multi-turn streaming sample (03b)" to "Python: Add multi-turn streaming sample (03b)" Mar 4, 2026
Copilot AI (Contributor) left a comment

Pull request overview

Adds a new getting-started sample that demonstrates multi-turn + streaming with session persistence, and renames/reframes the existing multi-turn sample as 03a to clarify the non-streaming vs streaming patterns.

Changes:

  • Renames the multi-turn getting-started sample to 03a_multi_turn.py (session-based multi-turn, non-streaming).
  • Adds 03b_multi_turn_streaming.py demonstrating streaming across turns and the need to call get_final_response() to persist history.
  • Updates sample READMEs to reference the new filenames.

Reviewed changes

Copilot reviewed 3 out of 4 changed files in this pull request and generated no comments.

| File | Description |
|---|---|
| python/samples/README.md | Updates top-level sample index to point at 03a/03b. |
| python/samples/01-get-started/README.md | Updates getting-started table to include 03a and new 03b. |
| python/samples/01-get-started/03a_multi_turn.py | New/renamed non-streaming multi-turn session sample. |
| python/samples/01-get-started/03b_multi_turn_streaming.py | New multi-turn streaming sample showing get_final_response() for session persistence. |

Comments suppressed due to low confidence (3)

python/samples/README.md:22

  • 3b. won’t be parsed as part of the ordered list in Markdown (ordered list markers must be numeric), so this line will render as plain text and break list numbering/formatting. Consider switching this section to a table (like 01-get-started/README.md) or using a normal numeric list item (e.g., 4.) while keeping the link text/file name as 03b_multi_turn_streaming.py.
3. **[03a_multi_turn.py](./01-get-started/03a_multi_turn.py)** — Multi-turn conversations with `AgentSession`
3b. **[03b_multi_turn_streaming.py](./01-get-started/03b_multi_turn_streaming.py)** — Multi-turn streaming conversations

python/samples/01-get-started/03b_multi_turn_streaming.py:61

  • All other 01-get-started samples include XML-style snippet tags (e.g., <create_agent>...</create_agent>) for docs extraction, but this new sample doesn’t. Please add snippet tags around the client/agent setup and the multi-turn streaming example blocks so it stays consistent with the get-started docs tooling.
async def main() -> None:
    # 1. Create the client and agent.
    credential = AzureCliCredential()
    client = AzureOpenAIResponsesClient(
        project_endpoint=os.environ["AZURE_AI_PROJECT_ENDPOINT"],
        deployment_name=os.environ["AZURE_OPENAI_RESPONSES_DEPLOYMENT_NAME"],
        credential=credential,
    )

    agent = client.as_agent(
        name="ConversationAgent",
        instructions="You are a friendly assistant. Keep your answers brief.",
    )

    # 2. Create a session to maintain conversation history.
    session = agent.create_session()

    # 3. First turn — stream the response, then finalize to save history.
    print("Agent: ", end="")
    stream = agent.run("My name is Alice and I love hiking.", session=session, stream=True)
    async for chunk in stream:
        if chunk.text:
            print(chunk.text, end="", flush=True)
    await stream.get_final_response()  # Persists messages to the session
    print("\n")

    # 4. Second turn — the agent remembers context from the first turn.
    print("Agent: ", end="")
    stream = agent.run("What do you remember about me?", session=session, stream=True)
    async for chunk in stream:
        if chunk.text:
            print(chunk.text, end="", flush=True)
    await stream.get_final_response()

python/samples/01-get-started/03b_multi_turn_streaming.py:57

  • For streaming output, consider adding flush=True to the print("Agent: ", end="") calls so the prefix is displayed immediately before the first streamed chunk arrives (this matches the pattern used in other streaming samples in this folder).
    print("Agent: ", end="")
    stream = agent.run("My name is Alice and I love hiking.", session=session, stream=True)
    async for chunk in stream:
        if chunk.text:
            print(chunk.text, end="", flush=True)
    await stream.get_final_response()  # Persists messages to the session
    print("\n")

    # 4. Second turn — the agent remembers context from the first turn.
    print("Agent: ", end="")
    stream = agent.run("What do you remember about me?", session=session, stream=True)


When a ResponseStream is fully consumed via async iteration,
automatically trigger finalization (finalizer + result hooks).
This ensures session history is persisted in streaming multi-turn
conversations without requiring an explicit get_final_response() call.

- Add auto-finalize call in __anext__ on StopAsyncIteration
- Guard inner stream finalization to prevent double-execution
- Re-check _finalized after iteration in get_final_response()
- Add tests for auto-finalization and streaming session history
- Revert sample file renames from previous commit

Closes microsoft#4447

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
markwallace-microsoft (Member) commented Mar 4, 2026

Python Test Coverage

Python Test Coverage Report

| File | Stmts | Miss | Cover | Missing |
|---|---|---|---|---|
| packages/core/agent_framework/_types.py | 1048 | 96 | 90% | 59, 68–69, 123, 128, 147, 149, 153, 157, 159, 161, 163, 181, 185, 211, 233, 238, 243, 247, 277, 657–658, 1166, 1237, 1254, 1272, 1295, 1305, 1322–1323, 1325, 1343–1344, 1346, 1353–1354, 1356, 1391, 1402–1403, 1405, 1443, 1670, 1722, 1813–1818, 1840, 1845, 2011, 2023, 2275, 2296, 2391, 2620, 2827, 2898, 2909–2912, 2914, 2916–2923, 2932, 3130–3132, 3135–3137, 3141, 3146, 3150, 3234–3236, 3265, 3319, 3338–3339, 3342–3346, 3352 |
| TOTAL | 22607 | 2806 | 87% | |

Python Unit Test Overview

| Tests | Skipped | Failures | Errors | Time |
|---|---|---|---|---|
| 4700 | 25 💤 | 0 ❌ | 0 🔥 | 1m 19s ⏱️ |

giles17 changed the title from "Python: Add multi-turn streaming sample (03b)" to "Auto-finalize ResponseStream on iteration completion" Mar 4, 2026
giles17 and others added 2 commits March 4, 2026 14:58
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Labels

documentation, python

Development

Successfully merging this pull request may close these issues.

Python: Multi-Turn example can't be reproduced

4 participants