feat(chat): Add tool calls and thinking/reasoning support to mo.ui.chat #7467
Conversation
- Add backend support for streaming structured parts with tool calls
- Fix frontend to preserve tool parts when streaming completes
- Add OpenAI example with tool calls (openai_with_tools.py)
- Add Anthropic example with manual streaming conversion (anthropic_with_tools.py)
- Support AI SDK compatible format (toolCallId, state: input-available/output-available)
- Yield deltas for text, full parts array for tool calls

…ory format
- Render tool call accordions on their own line, separate from the message bubble
- Only show the message bubble if there's non-tool content (text, files)
- Fix OpenAI message history to group tool calls properly (single assistant message with all tool_calls)
- This prevents OpenAI from re-calling tools that were already executed
The issue was that we combined tool_calls and the final text in a single assistant message. OpenAI expects:
1. Assistant message with tool_calls (content=null) - the REQUEST
2. Tool message(s) with results
3. Assistant message with the final text - AFTER tools complete
By putting the final text BEFORE the tool results, OpenAI thought the tools weren't properly used and would re-call them on subsequent messages.
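The expected ordering can be sketched as plain chat-completion message dicts (the tool name, IDs, and contents here are illustrative, not taken from the PR):

```python
# Correct ordering of messages for an OpenAI-style chat completion
# after a tool call has been executed. All values are illustrative.
messages = [
    {"role": "user", "content": "What's the weather in Paris?"},
    # 1. Assistant REQUESTS the tool call; content must be null/None.
    {
        "role": "assistant",
        "content": None,
        "tool_calls": [
            {
                "id": "call_1",
                "type": "function",
                "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'},
            }
        ],
    },
    # 2. Tool result(s), keyed back to the request by tool_call_id.
    {"role": "tool", "tool_call_id": "call_1", "content": '{"temp_c": 18}'},
    # 3. Final assistant text comes AFTER the tool results.
    {"role": "assistant", "content": "It's 18°C in Paris right now."},
]
```

Emitting the final text before the tool message, or merging it into the tool_calls message, breaks this contract and triggers the re-calling behavior described above.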
The parts in ChatMessage can be either dicts (from yields) or dataclass objects (ToolInvocationPart, TextPart) after marimo processes them. Added helper functions get_part_type() and get_part_attr() to handle both cases when converting message history to OpenAI format. This fixes the bug where assistant messages with tool calls were being skipped because isinstance(p, dict) returned False for dataclass objects.
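A minimal sketch of such helpers (the `TextPart` shape below is a stand-in for marimo's dataclass, not its actual definition):

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class TextPart:
    """Stand-in for a processed dataclass part; illustrative only."""
    type: str
    text: str


def get_part_type(part: Any) -> Optional[str]:
    """Return the part's type whether it is a raw dict or a dataclass."""
    if isinstance(part, dict):
        return part.get("type")
    return getattr(part, "type", None)


def get_part_attr(part: Any, attr: str, default: Any = None) -> Any:
    """Read an attribute/key from either representation."""
    if isinstance(part, dict):
        return part.get(attr, default)
    return getattr(part, attr, default)
```

With these, the history-conversion code can branch on `get_part_type(p)` instead of `isinstance(p, dict)`, so dataclass parts are no longer silently skipped.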
- Add pydantic_ai class that wraps Pydantic AI Agent for universal LLM support
- Support all Pydantic AI providers (OpenAI, Anthropic, Google, Groq, etc.)
- Automatic tool execution and streaming with proper tool call display
- Add api_key parameter with automatic environment variable setup
- Use new_messages() instead of all_messages() to prevent tool call duplication
- Add convert_messages_to_pydantic_ai for proper message history handling
- Add pydantic_ai dependency to DependencyManager
- Add comprehensive tests for the new class
- Add pydantic_ai_with_tools.py example showing the simplified API
This dramatically reduces the boilerplate for building chat UIs with tools, from ~300 lines to ~20 lines.

Users should use mo.ai.llm.pydantic_ai for tool support instead of implementing the tool call loop manually with raw OpenAI/Anthropic APIs.
Deleted:
- examples/ai/chat/openai_with_tools.py
- examples/ai/chat/anthropic_with_tools.py
- Add enable_thinking parameter to mo.ai.llm.pydantic_ai() for enabling
thinking/reasoning on supported models (Anthropic, OpenAI, Google, Groq)
- Use provider-specific model settings (AnthropicModelSettings, etc.)
to properly handle thinking configuration
- Capture ThinkingPart from pydantic-ai responses and convert to
reasoning parts for display in chat UI
- Update chat-ui.tsx to render reasoning accordions outside the message
bubble, similar to tool call accordions
- Add pydantic_ai_with_thinking.py example demonstrating the feature
Supported providers and settings:
- Anthropic: enable_thinking={"budget_tokens": 1024}
- OpenAI: enable_thinking={"effort": "high", "summary": "detailed"}
- Google: enable_thinking={"include_thoughts": True}
- Groq: enable_thinking={"format": "parsed"}
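The dispatch from a plain `enable_thinking` dict to a provider-prefixed settings dict can be sketched as follows. The helper name `build_thinking_settings` and the settings key names are hypothetical illustrations of the pattern; pydantic-ai's real `AnthropicModelSettings` etc. use their own provider-prefixed fields:

```python
from typing import Any

# Hypothetical provider -> settings-key mapping, mirroring the idea that each
# provider's ModelSettings class carries a provider-prefixed thinking field.
_THINKING_KEYS = {
    "anthropic": "anthropic_thinking",
    "openai": "openai_reasoning",
    "google": "google_thinking_config",
    "groq": "groq_reasoning",
}


def build_thinking_settings(
    provider: str, enable_thinking: dict[str, Any]
) -> dict[str, Any]:
    """Wrap a user-supplied enable_thinking dict under the provider's key."""
    try:
        key = _THINKING_KEYS[provider]
    except KeyError:
        raise ValueError(f"thinking is not supported for provider: {provider}")
    return {key: enable_thinking}
```

The point of the indirection is that user code only supplies the inner options (e.g. `{"budget_tokens": 1024}`), and the wrapper picks the correct provider-specific settings field.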
- Use the run_stream_events() API from pydantic-ai for real-time event streaming
- Handle PartStartEvent for ThinkingPart to show reasoning immediately
- Handle PartDeltaEvent for ThinkingPartDelta and TextPartDelta
- When thinking is present, yield structured parts on each text delta to keep the reasoning accordion visible during text streaming
- Fixes issue where reasoning would disappear during text streaming

- Create pydantic_ai_with_thinking_and_tools.py showing both features
- Demonstrates thinking/reasoning with tool calls in the same chat
- Remove separate pydantic_ai_with_thinking.py (now redundant)
- Keep pydantic_ai_with_tools.py for the simpler tools-only use case

…ith tools
- Store pydantic-ai's native messages in a _pydantic_history part after each run
- Use all_messages() for history storage to accumulate the full conversation
- Use new_messages() for tool extraction to avoid picking up old tools
- Allow unknown part types to pass through in ChatMessage._convert_part
- Handle TextPart in PartStartEvent to capture initial text content
- Fixes tool_use/tool_result pairing errors on subsequent Claude requests
This enables proper multi-turn conversations with thinking and tools, where the message history is correctly maintained across turns.

…wn parts
- Test enable_thinking parameter initialization
- Test _pydantic_history extraction from message parts
- Test message conversion with stored pydantic-ai history
- Test that unknown part types pass through ChatMessage._convert_part
for more information, see https://pre-commit.ci
- Document the mo.ai.llm.pydantic_ai class with tools and thinking support
- Add examples for tool calling and reasoning/thinking
- Update streaming examples list with the new pydantic_ai examples

- Add base_url parameter to pydantic_ai for OpenAI-compatible APIs
- Use OpenAIProvider with a custom base_url for W&B Inference
- Support reasoning/thinking extraction via the enable_thinking flag
- Create model objects directly (no env vars needed when api_key is provided)
- Add wandb_inference_example.py demonstrating the W&B integration
- Consolidate pydantic_ai examples (remove redundant tools/thinking examples)
- Update tests for the new direct model creation approach

- Use the Pydantic AI Provider pattern instead of setting environment variables
- All providers (OpenAI, Anthropic, Google, Groq, Mistral, Cohere) now use their respective Provider classes when api_key is provided
- No global state pollution from environment variable side effects
- Thread-safe credential handling
- Add OpenAIModelProfile for W&B Inference reasoning extraction
- Add tests to verify no env vars are set when api_key is provided
@mscolnick @Light2Dark Everything is working when manually testing this PR. Would love your reviews.
- Add documentation for using base_url with OpenAI-compatible providers
- Add the W&B Inference example to the streaming examples list
- Remove dead link to the deleted pydantic_ai_with_tools.py
marimo/_ai/llm/_impl.py
Outdated
)

def _python_type_to_json_schema(py_type: Any) -> dict[str, Any]:
pydantic AI does this for you
marimo/_ai/llm/_impl.py
Outdated
return self.model.split(":", 1)[1]
return self.model

def _create_model(self) -> Any:
This could live in the user code; that way the user can configure it exactly as they need.
BREAKING CHANGE: mo.ai.llm.pydantic_ai now accepts a pydantic_ai.Agent
object instead of model string and configuration options.
This change:
- Simplifies the API by letting users configure the Agent themselves
- Removes internal model creation logic (_create_model, _build_model_settings)
- Removes JSON schema conversion (Pydantic AI handles this)
- Gives users full control over Agent configuration
- Automatically supports all current and future Pydantic AI features
Migration:
Before:
mo.ai.llm.pydantic_ai(
"anthropic:claude-sonnet-4-5",
tools=[get_weather],
api_key="...",
enable_thinking=True,
)
After:
from pydantic_ai import Agent
agent = Agent("anthropic:claude-sonnet-4-5", tools=[get_weather])
mo.ai.llm.pydantic_ai(agent, model_settings=...)
- Add simple API: mo.ai.llm.pydantic_ai('model', tools=[], instructions='...')
- Support pre-configured Agent for full control: mo.ai.llm.pydantic_ai(agent)
- Pass **kwargs through to Agent for future-proof configuration
- Update examples to use simplified API
- Update docs with both usage patterns
- Update tests for new dual-API approach
@mscolnick addressed your comments:
agent = Agent("anthropic:claude-sonnet-4-5", tools=[...], deps_type=MyDeps, ...)
marimo/_ai/llm/_impl.py
Outdated
# Use run_stream_events() to get all events in real-time
# This includes thinking parts, text parts, tool calls, etc.
# See: https://ai.pydantic.dev/agents/#streaming-events-and-final-output
async for event in agent.run_stream_events(
This is all logic that we don't need to maintain; Pydantic AI should already do this for you.
marimo/_ai/llm/_impl.py
Outdated
self,
model: Any,
*,
tools: Optional[list[Callable[..., Any]]] = None,
Having model and tools separate is still not ideal, because it implies that all providers/models can use tools (which is not true). Allowing just an Agent object helps prevent this.
marimo/_ai/llm/_impl.py
Outdated
return None
return None

def _convert_messages_to_pydantic_ai(
this is also handled by pydantic and does not need to live in marimo
.join("\n");

// Separate tool parts, reasoning parts, and other parts for assistant messages
const toolParts =
We can share this with the chat sidebar's <RenderParts. We may not have this now, but we can do some refactoring so that there is a generic component to render these parts and we don't need to unpack or adapt them.
Addresses PR review feedback:
1. Only accept pydantic_ai.Agent instances (not model strings)
- Removes the model/tools/instructions/model_settings params
- Users configure the Agent externally with full Pydantic AI power
- Prevents misuse (e.g., passing tools to models that don't support them)
2. Simplify _stream_response event handling
- Keep run_stream_events() but reduce internal complexity
- Let Pydantic AI handle event structure
- Just translate to marimo's streaming format
3. Remove _convert_messages_to_pydantic_ai
- Use Pydantic AI's native message history via _pydantic_history
- Rely on all_messages_json() / validate_json() for serialization
- Removes 170+ lines of manual conversion logic
Updated:
- Examples: pydantic_ai_with_thinking_and_tools.py, wandb_inference_example.py
- Tests: updated for the Agent-only API
- Docs: updated chat.md with the new usage patterns
- Show an animated 'Thinking...' while streaming, char count when done
- Keep the accordion collapsed by default (user can manually expand)
- Pass the isStreaming prop to ReasoningAccordion in chat-ui.tsx
final_result: Any = None
pending_tool_calls: dict[str, dict[str, Any]] = {}

def _build_parts() -> list[dict[str, Any]]:
@Light2Dark is adding some features now that should be able to remove this. either we can merge this as-is and delete later or wait (just a few days)
Hi Adam, sorry this took a while; I'm actively looking into this. I agree with some of the approaches and hopefully can push something up over the next few days.
- Remove unused json import
- Remove unused UserPromptPart import
- Remove unused has_thinking parameter from _process_final_result
- Fix biome formatting and template literal issues
Ok @mscolnick and @Light2Dark, I'll leave it to you whether to merge now or wait. I have manually tested and everything looks good to me, and I have a marimo Weave demo that is ready to shine with this new functionality.
Summary
This PR adds comprehensive tool call and thinking/reasoning support to mo.ui.chat, with a new mo.ai.llm.pydantic_ai class that simplifies working with LLM tools.

Features
1. Tool Call Support in Chat UI
2. Thinking/Reasoning Support
- enable_thinking parameter on the pydantic_ai class

3. New mo.ai.llm.pydantic_ai Class
- tools - pydantic-ai handles the rest

4. Multi-turn Conversation Support
- Correct tool_use/tool_result pairing (required by Claude)
- Unknown part types pass through ChatMessage instead of being discarded

5. W&B Inference & OpenAI-Compatible API Support
- base_url parameter for connecting to OpenAI-compatible endpoints
- api_key parameter for direct credential passing (no env vars needed)
- Reasoning extraction via OpenAIModelProfile

Example Usage
With Anthropic (Tools + Thinking)
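A minimal sketch of what such a cell could look like, based on the Agent-only API described in this PR. The model id, the `get_weather` tool, and the thinking settings are illustrative assumptions; check your installed pydantic-ai version for the exact `AnthropicModelSettings` field names.

```python
import marimo as mo
from pydantic_ai import Agent
from pydantic_ai.models.anthropic import AnthropicModelSettings


def get_weather(city: str) -> str:
    """Toy tool: return a canned weather report for the given city."""
    return f"It's sunny in {city}."


# User configures the Agent directly; marimo only wraps it.
agent = Agent("anthropic:claude-sonnet-4-5", tools=[get_weather])

chat = mo.ui.chat(
    mo.ai.llm.pydantic_ai(
        agent,
        # Assumed settings shape for enabling Claude's extended thinking.
        model_settings=AnthropicModelSettings(
            anthropic_thinking={"type": "enabled", "budget_tokens": 1024},
        ),
    )
)
chat
```

Tool call and reasoning accordions then render in the chat UI as the agent streams events.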
With W&B Inference (Reasoning Models)
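A hedged sketch under similar assumptions; the endpoint URL, model id, and API key below are placeholders, not values taken from this PR.

```python
import marimo as mo
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel
from pydantic_ai.providers.openai import OpenAIProvider

# W&B Inference exposes an OpenAI-compatible endpoint, so we point
# OpenAIProvider at it via base_url. Values are illustrative.
model = OpenAIModel(
    "deepseek-ai/DeepSeek-R1",  # assumed reasoning-model id
    provider=OpenAIProvider(
        base_url="https://api.inference.wandb.ai/v1",  # assumed endpoint
        api_key="your-wandb-api-key",
    ),
)

agent = Agent(model)
chat = mo.ui.chat(mo.ai.llm.pydantic_ai(agent))
chat
```

Because the credentials are passed directly to the Provider, no environment variables are set as a side effect.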
Files Changed
Core Changes
- marimo/_ai/llm/_impl.py - New pydantic_ai class with streaming, tools, thinking, and W&B support
- marimo/_ai/_types.py - Allow unknown part types to pass through

Frontend
- frontend/src/plugins/impl/chat/chat-ui.tsx - Render tool call and reasoning accordions

Examples
- examples/ai/chat/pydantic_ai_with_thinking_and_tools.py - Combined example with Anthropic
- examples/ai/chat/wandb_inference_example.py - W&B Inference with reasoning models

Tests
- tests/_ai/llm/test_impl.py - Tests for thinking, history storage, the Provider pattern, and W&B support

Testing
Screenshots
The chat UI now shows: