Auto-finalize ResponseStream on iteration completion#4478
Open
giles17 wants to merge 5 commits into microsoft:main
Conversation
- Rename 03_multi_turn.py to 03a_multi_turn.py
- Add 03b_multi_turn_streaming.py showing streaming with session history
- The new sample demonstrates calling get_final_response() after iterating the stream to persist conversation history
- Update READMEs to reflect the new file names

Closes microsoft#4447

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Contributor
Pull request overview
Adds a new getting-started sample that demonstrates multi-turn + streaming with session persistence, and renames/reframes the existing multi-turn sample as 03a to clarify the non-streaming vs streaming patterns.
Changes:
- Renames the multi-turn getting-started sample to `03a_multi_turn.py` (session-based multi-turn, non-streaming).
- Adds `03b_multi_turn_streaming.py` demonstrating streaming across turns and the need to call `get_final_response()` to persist history.
- Updates sample READMEs to reference the new filenames.
Reviewed changes
Copilot reviewed 3 out of 4 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| python/samples/README.md | Updates top-level sample index to point at 03a/03b. |
| python/samples/01-get-started/README.md | Updates getting-started table to include 03a and new 03b. |
| python/samples/01-get-started/03a_multi_turn.py | New/renamed non-streaming multi-turn session sample. |
| python/samples/01-get-started/03b_multi_turn_streaming.py | New multi-turn streaming sample showing get_final_response() for session persistence. |
Comments suppressed due to low confidence (3)
python/samples/README.md:22
`3b.` won't be parsed as part of an ordered list in Markdown (ordered list markers must be numeric), so this line will render as plain text and break list numbering/formatting. Consider switching this section to a table (like `01-get-started/README.md`) or using a normal numeric list item (e.g., `4.`) while keeping the link text/file name as `03b_multi_turn_streaming.py`.
```markdown
3. **[03a_multi_turn.py](./01-get-started/03a_multi_turn.py)** — Multi-turn conversations with `AgentSession`
3b. **[03b_multi_turn_streaming.py](./01-get-started/03b_multi_turn_streaming.py)** — Multi-turn streaming conversations
```
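One way to apply the reviewer's suggestion, sketched as a plain numeric list item (the link text and file name are kept unchanged; the table alternative would follow the layout of `01-get-started/README.md`):

```markdown
3. **[03a_multi_turn.py](./01-get-started/03a_multi_turn.py)** — Multi-turn conversations with `AgentSession`
4. **[03b_multi_turn_streaming.py](./01-get-started/03b_multi_turn_streaming.py)** — Multi-turn streaming conversations
```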
python/samples/01-get-started/03b_multi_turn_streaming.py:61
- All other `01-get-started` samples include XML-style snippet tags (e.g., `<create_agent>...</create_agent>`) for docs extraction, but this new sample doesn't. Please add snippet tags around the client/agent setup and the multi-turn streaming example blocks so it stays consistent with the get-started docs tooling.
```python
async def main() -> None:
    # 1. Create the client and agent.
    credential = AzureCliCredential()
    client = AzureOpenAIResponsesClient(
        project_endpoint=os.environ["AZURE_AI_PROJECT_ENDPOINT"],
        deployment_name=os.environ["AZURE_OPENAI_RESPONSES_DEPLOYMENT_NAME"],
        credential=credential,
    )
    agent = client.as_agent(
        name="ConversationAgent",
        instructions="You are a friendly assistant. Keep your answers brief.",
    )

    # 2. Create a session to maintain conversation history.
    session = agent.create_session()

    # 3. First turn — stream the response, then finalize to save history.
    print("Agent: ", end="")
    stream = agent.run("My name is Alice and I love hiking.", session=session, stream=True)
    async for chunk in stream:
        if chunk.text:
            print(chunk.text, end="", flush=True)
    await stream.get_final_response()  # Persists messages to the session
    print("\n")

    # 4. Second turn — the agent remembers context from the first turn.
    print("Agent: ", end="")
    stream = agent.run("What do you remember about me?", session=session, stream=True)
    async for chunk in stream:
        if chunk.text:
            print(chunk.text, end="", flush=True)
    await stream.get_final_response()
```
python/samples/01-get-started/03b_multi_turn_streaming.py:57
- For streaming output, consider adding `flush=True` to the `print("Agent: ", end="")` calls so the prefix is displayed immediately before the first streamed chunk arrives (this matches the pattern used in other streaming samples in this folder).
```python
    print("Agent: ", end="")
    stream = agent.run("My name is Alice and I love hiking.", session=session, stream=True)
    async for chunk in stream:
        if chunk.text:
            print(chunk.text, end="", flush=True)
    await stream.get_final_response()  # Persists messages to the session
    print("\n")

    # 4. Second turn — the agent remembers context from the first turn.
    print("Agent: ", end="")
    stream = agent.run("What do you remember about me?", session=session, stream=True)
```
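The buffering behavior behind this suggestion can be seen with a minimal sketch in plain Python (no agent involved; the token list is invented for illustration). Without `flush=True`, stdout may hold the prefix in its buffer until a newline arrives, so it can appear only after the streamed chunks rather than before them:

```python
import time

# Print the prefix without a trailing newline; flush so it is visible
# immediately instead of waiting in the stdout buffer.
print("Agent: ", end="", flush=True)

shown = "Agent: "
for token in ["Hel", "lo", "!"]:
    time.sleep(0.01)  # simulate streamed chunks arriving over time
    print(token, end="", flush=True)
    shown += token
print()
```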
When a ResponseStream is fully consumed via async iteration, automatically trigger finalization (finalizer + result hooks). This ensures session history is persisted in streaming multi-turn conversations without requiring an explicit get_final_response() call.

- Add auto-finalize call in __anext__ on StopAsyncIteration
- Guard inner stream finalization to prevent double-execution
- Re-check _finalized after iteration in get_final_response()
- Add tests for auto-finalization and streaming session history
- Revert sample file renames from previous commit

Closes microsoft#4447

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
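The mechanism this commit describes can be sketched with a toy stream. This is a simplified stand-in, not the actual agent-framework `ResponseStream`; the `finalizer` callback stands in for the finalizer and result hooks (e.g. session history persistence):

```python
import asyncio

class AutoFinalizingStream:
    """Toy sketch: finalize automatically when async iteration completes,
    and keep get_final_response() idempotent via a _finalized guard."""

    def __init__(self, chunks, finalizer):
        self._iter = iter(chunks)
        self._finalizer = finalizer  # stand-in for finalizer + result hooks
        self._finalized = False
        self._final_response = None

    def __aiter__(self):
        return self

    async def __anext__(self):
        try:
            return next(self._iter)
        except StopIteration:
            # Iteration is complete: trigger finalization before signaling
            # the end, so `async for` alone is enough to persist history.
            await self.get_final_response()
            raise StopAsyncIteration

    async def get_final_response(self):
        # Guard against double execution: __anext__ may have finalized already.
        if not self._finalized:
            self._finalized = True
            self._final_response = self._finalizer()
        return self._final_response

async def demo():
    calls = []
    stream = AutoFinalizingStream(["a", "b"], lambda: calls.append("saved") or "done")
    async for _chunk in stream:
        pass
    # An explicit call is still allowed; it is a no-op the second time.
    result = await stream.get_final_response()
    return calls, result

calls, result = asyncio.run(demo())
print(calls, result)  # finalizer ran exactly once
```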
Member
Python Test Coverage Report
Python Unit Test Overview
Summary
Auto-finalize `ResponseStream` when async iteration completes, so that result hooks (including session history persistence) run without requiring an explicit `get_final_response()` call.

Previously, streaming multi-turn conversations would lose context between turns because the `_post_hook` result hook, which saves messages to the session via `InMemoryHistoryProvider`, only ran inside `get_final_response()`. Users who simply iterated the stream with `async for` had no history persisted.

Changes
- `_types.py` (`ResponseStream`)
  - `__anext__`: call `get_final_response()` on `StopAsyncIteration` (after cleanup hooks) to auto-trigger finalization
  - `get_final_response()`: re-check `_finalized` after iteration to avoid double-finalization when `__anext__` already finalized; guard inner stream finalization with a `_finalized` check for wrapped/mapped streams
- `test_types.py`
  - `test_auto_finalize_on_iteration_completion`: stream is finalized after `async for`
  - `test_auto_finalize_runs_result_hooks`: result hooks run without an explicit `get_final_response()`
  - `test_get_final_response_idempotent_after_auto_finalize`: finalizer runs only once
- `test_agents.py`
  - `test_chat_client_agent_streaming_session_history_saved_without_get_final_response`: session history is persisted after streaming iteration without `get_final_response()`

Closes #4447