feat(chat): Add tool calls and thinking/reasoning support to mo.ui.chat #7467
Conversation
- Add backend support for streaming structured parts with tool calls
- Fix frontend to preserve tool parts when streaming completes
- Add OpenAI example with tool calls (openai_with_tools.py)
- Add Anthropic example with manual streaming conversion (anthropic_with_tools.py)
- Support AI SDK compatible format (toolCallId, state: input-available/output-available)
- Yield deltas for text, full parts array for tool calls

…ory format
- Render tool call accordions on their own line, separate from the message bubble
- Only show the message bubble if there's non-tool content (text, files)
- Fix OpenAI message history to group tool calls properly (single assistant message with all tool_calls)
- This prevents OpenAI from re-calling tools that were already executed
The issue was that we combined tool_calls and the final text in a single assistant message. OpenAI expects:
1. Assistant message with tool_calls (content=null) - the REQUEST
2. Tool message(s) with results
3. Assistant message with the final text - AFTER tools complete
By putting the final text BEFORE the tool results, OpenAI thought the tools weren't properly used and would re-call them on subsequent messages.
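The expected ordering can be sketched as plain chat-completion message dicts (the tool name, IDs, and contents here are illustrative, not taken from the PR):

```python
# Correct ordering of messages for an OpenAI-style chat completion
# after a tool call has been executed. All values are illustrative.
messages = [
    {"role": "user", "content": "What's the weather in Paris?"},
    # 1. Assistant REQUESTS the tool call; content must be null/None.
    {
        "role": "assistant",
        "content": None,
        "tool_calls": [
            {
                "id": "call_1",
                "type": "function",
                "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'},
            }
        ],
    },
    # 2. Tool result(s), keyed back to the request by tool_call_id.
    {"role": "tool", "tool_call_id": "call_1", "content": '{"temp_c": 18}'},
    # 3. Final assistant text comes AFTER the tool results.
    {"role": "assistant", "content": "It's 18°C in Paris right now."},
]
```

Emitting the final text before the tool message, or merging it into the tool_calls message, breaks this contract and triggers the re-calling behavior described above.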
The parts in ChatMessage can be either dicts (from yields) or dataclass objects (ToolInvocationPart, TextPart) after marimo processes them. Added helper functions get_part_type() and get_part_attr() to handle both cases when converting message history to OpenAI format. This fixes the bug where assistant messages with tool calls were being skipped because isinstance(p, dict) returned False for dataclass objects.
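A minimal sketch of such helpers (the `TextPart` shape below is a stand-in for marimo's dataclass, not its actual definition):

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class TextPart:
    """Stand-in for a processed dataclass part; illustrative only."""
    type: str
    text: str


def get_part_type(part: Any) -> Optional[str]:
    """Return the part's type whether it is a raw dict or a dataclass."""
    if isinstance(part, dict):
        return part.get("type")
    return getattr(part, "type", None)


def get_part_attr(part: Any, attr: str, default: Any = None) -> Any:
    """Read an attribute/key from either representation."""
    if isinstance(part, dict):
        return part.get(attr, default)
    return getattr(part, attr, default)
```

With these, the history-conversion code can branch on `get_part_type(p)` instead of `isinstance(p, dict)`, so dataclass parts are no longer silently skipped.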
- Add pydantic_ai class that wraps Pydantic AI Agent for universal LLM support
- Support all Pydantic AI providers (OpenAI, Anthropic, Google, Groq, etc.)
- Automatic tool execution and streaming with proper tool call display
- Add api_key parameter with automatic environment variable setup
- Use new_messages() instead of all_messages() to prevent tool call duplication
- Add convert_messages_to_pydantic_ai for proper message history handling
- Add pydantic_ai dependency to DependencyManager
- Add comprehensive tests for the new class
- Add pydantic_ai_with_tools.py example showing the simplified API
This dramatically reduces the boilerplate for building chat UIs with tools, from ~300 lines to ~20 lines.

Users should use mo.ai.llm.pydantic_ai for tool support instead of implementing the tool call loop manually with raw OpenAI/Anthropic APIs.
Deleted:
- examples/ai/chat/openai_with_tools.py
- examples/ai/chat/anthropic_with_tools.py
- Add enable_thinking parameter to mo.ai.llm.pydantic_ai() for enabling
thinking/reasoning on supported models (Anthropic, OpenAI, Google, Groq)
- Use provider-specific model settings (AnthropicModelSettings, etc.)
to properly handle thinking configuration
- Capture ThinkingPart from pydantic-ai responses and convert to
reasoning parts for display in chat UI
- Update chat-ui.tsx to render reasoning accordions outside the message
bubble, similar to tool call accordions
- Add pydantic_ai_with_thinking.py example demonstrating the feature
Supported providers and settings:
- Anthropic: enable_thinking={"budget_tokens": 1024}
- OpenAI: enable_thinking={"effort": "high", "summary": "detailed"}
- Google: enable_thinking={"include_thoughts": True}
- Groq: enable_thinking={"format": "parsed"}
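The dispatch from a plain `enable_thinking` dict to a provider-prefixed settings dict can be sketched as follows. The helper name `build_thinking_settings` and the settings key names are hypothetical illustrations of the pattern; pydantic-ai's real `AnthropicModelSettings` etc. use their own provider-prefixed fields:

```python
from typing import Any

# Hypothetical provider -> settings-key mapping, mirroring the idea that each
# provider's ModelSettings class carries a provider-prefixed thinking field.
_THINKING_KEYS = {
    "anthropic": "anthropic_thinking",
    "openai": "openai_reasoning",
    "google": "google_thinking_config",
    "groq": "groq_reasoning",
}


def build_thinking_settings(
    provider: str, enable_thinking: dict[str, Any]
) -> dict[str, Any]:
    """Wrap a user-supplied enable_thinking dict under the provider's key."""
    try:
        key = _THINKING_KEYS[provider]
    except KeyError:
        raise ValueError(f"thinking is not supported for provider: {provider}")
    return {key: enable_thinking}
```

The point of the indirection is that user code only supplies the inner options (e.g. `{"budget_tokens": 1024}`), and the wrapper picks the correct provider-specific settings field.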
- Use the run_stream_events() API from pydantic-ai for real-time event streaming
- Handle PartStartEvent for ThinkingPart to show reasoning immediately
- Handle PartDeltaEvent for ThinkingPartDelta and TextPartDelta
- When thinking is present, yield structured parts on each text delta to keep the reasoning accordion visible during text streaming
- Fixes issue where reasoning would disappear during text streaming

- Create pydantic_ai_with_thinking_and_tools.py showing both features
- Demonstrates thinking/reasoning with tool calls in the same chat
- Remove separate pydantic_ai_with_thinking.py (now redundant)
- Keep pydantic_ai_with_tools.py for the simpler tools-only use case

…ith tools
- Store pydantic-ai's native messages in a _pydantic_history part after each run
- Use all_messages() for history storage to accumulate the full conversation
- Use new_messages() for tool extraction to avoid picking up old tools
- Allow unknown part types to pass through in ChatMessage._convert_part
- Handle TextPart in PartStartEvent to capture initial text content
- Fixes tool_use/tool_result pairing errors on subsequent Claude requests
This enables proper multi-turn conversations with thinking and tools, where the message history is correctly maintained across turns.

…wn parts
- Test enable_thinking parameter initialization
- Test _pydantic_history extraction from message parts
- Test message conversion with stored pydantic-ai history
- Test that unknown part types pass through ChatMessage._convert_part
for more information, see https://pre-commit.ci
- Document the mo.ai.llm.pydantic_ai class with tools and thinking support
- Add examples for tool calling and reasoning/thinking
- Update streaming examples list with the new pydantic_ai examples

- Add base_url parameter to pydantic_ai for OpenAI-compatible APIs
- Use OpenAIProvider with a custom base_url for W&B Inference
- Support reasoning/thinking extraction via the enable_thinking flag
- Create model objects directly (no env vars needed when api_key is provided)
- Add wandb_inference_example.py demonstrating the W&B integration
- Consolidate pydantic_ai examples (remove redundant tools/thinking examples)
- Update tests for the new direct model creation approach

- Use the Pydantic AI Provider pattern instead of setting environment variables
- All providers (OpenAI, Anthropic, Google, Groq, Mistral, Cohere) now use their respective Provider classes when api_key is provided
- No global state pollution from environment variable side effects
- Thread-safe credential handling
- Add OpenAIModelProfile for W&B Inference reasoning extraction
- Add tests to verify no env vars are set when api_key is provided
@mscolnick @Light2Dark Everything is working when manually testing this PR. Would love your reviews.
- Add documentation for using base_url with OpenAI-compatible providers
- Add the W&B Inference example to the streaming examples list
- Remove dead link to the deleted pydantic_ai_with_tools.py
marimo/_ai/llm/_impl.py
Outdated
)

def _python_type_to_json_schema(py_type: Any) -> dict[str, Any]:
pydantic AI does this for you
marimo/_ai/llm/_impl.py
Outdated
return self.model.split(":", 1)[1]
return self.model

def _create_model(self) -> Any:
This could live in the user code; that way the user can configure it exactly as they need.
BREAKING CHANGE: mo.ai.llm.pydantic_ai now accepts a pydantic_ai.Agent
object instead of model string and configuration options.
This change:
- Simplifies the API by letting users configure the Agent themselves
- Removes internal model creation logic (_create_model, _build_model_settings)
- Removes JSON schema conversion (Pydantic AI handles this)
- Gives users full control over Agent configuration
- Automatically supports all current and future Pydantic AI features
Migration:
Before:
mo.ai.llm.pydantic_ai(
"anthropic:claude-sonnet-4-5",
tools=[get_weather],
api_key="...",
enable_thinking=True,
)
After:
from pydantic_ai import Agent
agent = Agent("anthropic:claude-sonnet-4-5", tools=[get_weather])
mo.ai.llm.pydantic_ai(agent, model_settings=...)
- Add simple API: mo.ai.llm.pydantic_ai('model', tools=[], instructions='...')
- Support pre-configured Agent for full control: mo.ai.llm.pydantic_ai(agent)
- Pass **kwargs through to Agent for future-proof configuration
- Update examples to use simplified API
- Update docs with both usage patterns
- Update tests for new dual-API approach
@mscolnick addressed your comments:
agent = Agent("anthropic:claude-sonnet-4-5", tools=[...], deps_type=MyDeps, ...)
marimo/_ai/llm/_impl.py
Outdated
# Use run_stream_events() to get all events in real-time
# This includes thinking parts, text parts, tool calls, etc.
# See: https://ai.pydantic.dev/agents/#streaming-events-and-final-output
async for event in agent.run_stream_events(
This is all logic that we don't need to maintain; Pydantic AI should already do this for you.
marimo/_ai/llm/_impl.py
Outdated
self,
model: Any,
*,
tools: Optional[list[Callable[..., Any]]] = None,
Having model and tools separate is still not ideal, because it implies that all providers/models can use tools (which is not true). Allowing just an Agent object helps prevent this.
marimo/_ai/llm/_impl.py
Outdated
return None
return None

def _convert_messages_to_pydantic_ai(
this is also handled by pydantic and does not need to live in marimo
.join("\n");

// Separate tool parts, reasoning parts, and other parts for assistant messages
const toolParts =
We can share this with the chat sidebar's <RenderParts. We may not have this now, but we can do some refactoring so that there is a generic component to render these parts and we don't need to unpack or adapt them.
Addresses PR review feedback:
1. Only accept pydantic_ai.Agent instances (not model strings)
- Removes the model/tools/instructions/model_settings params
- Users configure the Agent externally with full Pydantic AI power
- Prevents misuse (e.g., passing tools to models that don't support them)
2. Simplify _stream_response event handling
- Keep run_stream_events() but reduce internal complexity
- Let Pydantic AI handle event structure
- Just translate to marimo's streaming format
3. Remove _convert_messages_to_pydantic_ai
- Use Pydantic AI's native message history via _pydantic_history
- Rely on all_messages_json() / validate_json() for serialization
- Removes 170+ lines of manual conversion logic
Updated:
- Examples: pydantic_ai_with_thinking_and_tools.py, wandb_inference_example.py
- Tests: updated for the Agent-only API
- Docs: updated chat.md with the new usage patterns
- Show an animated 'Thinking...' while streaming, char count when done
- Keep the accordion collapsed by default (user can manually expand)
- Pass the isStreaming prop to ReasoningAccordion in chat-ui.tsx
final_result: Any = None
pending_tool_calls: dict[str, dict[str, Any]] = {}

def _build_parts() -> list[dict[str, Any]]:
@Light2Dark is adding some features now that should be able to remove this. either we can merge this as-is and delete later or wait (just a few days)
Hi Adam, sorry this took a while; I'm actively looking into this. I agree with some of the approaches and hopefully can push something up over the next few days.
- Remove unused json import
- Remove unused UserPromptPart import
- Remove unused has_thinking parameter from _process_final_result
- Fix biome formatting and template literal issues
Ok @mscolnick and @Light2Dark, I'll leave it to you whether to merge now or wait. I have manually tested and everything looks good to me, and I have a marimo Weave demo that is ready to shine with this new functionality.
Summary
This PR adds comprehensive tool call and thinking/reasoning support to mo.ui.chat, with a new mo.ai.llm.pydantic_ai class that simplifies working with LLM tools.

Features
1. Tool Call Support in Chat UI
2. Thinking/Reasoning Support
- enable_thinking parameter on the pydantic_ai class

3. New mo.ai.llm.pydantic_ai Class
- tools - pydantic-ai handles the rest

4. Multi-turn Conversation Support
- Correct tool_use/tool_result pairing (required by Claude)
- Unknown part types pass through ChatMessage instead of being discarded

5. W&B Inference & OpenAI-Compatible API Support
- base_url parameter for connecting to OpenAI-compatible endpoints
- api_key parameter for direct credential passing (no env vars needed)
- Reasoning extraction via OpenAIModelProfile

Example Usage
With Anthropic (Tools + Thinking)
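A minimal sketch of what such a cell could look like, based on the Agent-only API described in this PR. The model id, the `get_weather` tool, and the thinking settings are illustrative assumptions; check your installed pydantic-ai version for the exact `AnthropicModelSettings` field names.

```python
import marimo as mo
from pydantic_ai import Agent
from pydantic_ai.models.anthropic import AnthropicModelSettings


def get_weather(city: str) -> str:
    """Toy tool: return a canned weather report for the given city."""
    return f"It's sunny in {city}."


# User configures the Agent directly; marimo only wraps it.
agent = Agent("anthropic:claude-sonnet-4-5", tools=[get_weather])

chat = mo.ui.chat(
    mo.ai.llm.pydantic_ai(
        agent,
        # Assumed settings shape for enabling Claude's extended thinking.
        model_settings=AnthropicModelSettings(
            anthropic_thinking={"type": "enabled", "budget_tokens": 1024},
        ),
    )
)
chat
```

Tool call and reasoning accordions then render in the chat UI as the agent streams events.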
With W&B Inference (Reasoning Models)
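A hedged sketch under similar assumptions; the endpoint URL, model id, and API key below are placeholders, not values taken from this PR.

```python
import marimo as mo
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel
from pydantic_ai.providers.openai import OpenAIProvider

# W&B Inference exposes an OpenAI-compatible endpoint, so we point
# OpenAIProvider at it via base_url. Values are illustrative.
model = OpenAIModel(
    "deepseek-ai/DeepSeek-R1",  # assumed reasoning-model id
    provider=OpenAIProvider(
        base_url="https://api.inference.wandb.ai/v1",  # assumed endpoint
        api_key="your-wandb-api-key",
    ),
)

agent = Agent(model)
chat = mo.ui.chat(mo.ai.llm.pydantic_ai(agent))
chat
```

Because the credentials are passed directly to the Provider, no environment variables are set as a side effect.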
Files Changed
Core Changes
- marimo/_ai/llm/_impl.py - New pydantic_ai class with streaming, tools, thinking, and W&B support
- marimo/_ai/_types.py - Allow unknown part types to pass through

Frontend
- frontend/src/plugins/impl/chat/chat-ui.tsx - Render tool call and reasoning accordions

Examples
- examples/ai/chat/pydantic_ai_with_thinking_and_tools.py - Combined example with Anthropic
- examples/ai/chat/wandb_inference_example.py - W&B Inference with reasoning models

Tests
- tests/_ai/llm/test_impl.py - Tests for thinking, history storage, the Provider pattern, and W&B support

Testing
Screenshots
The chat UI now shows: