
adamwdraper (Contributor) commented Dec 11, 2025

Summary

This PR adds comprehensive tool call and thinking/reasoning support to mo.ui.chat, with a new mo.ai.llm.pydantic_ai class that simplifies working with LLM tools.

(Screenshot: reasoning and tool call accordions in the chat UI)

Features

1. Tool Call Support in Chat UI

  • Tool calls are displayed as collapsible accordions showing input/output
  • Tools appear before the message bubble (same pattern as reasoning)
  • Real-time streaming: tools show "calling" state, then update with results

2. Thinking/Reasoning Support

  • Added enable_thinking parameter to pydantic_ai class
  • Reasoning appears as a collapsible accordion before the response
  • Streams in real-time as the LLM thinks

3. New mo.ai.llm.pydantic_ai Class

  • Simplified tool integration using pydantic-ai
  • Just pass Python functions as tools - pydantic-ai handles the rest
  • Supports all major providers: OpenAI, Anthropic, Google, Groq, etc.

4. Multi-turn Conversation Support

  • Properly maintains message history across turns
  • Stores pydantic-ai's native messages to ensure correct tool_use/tool_result pairing (required by Claude)
  • Unknown part types now pass through ChatMessage instead of being discarded

5. W&B Inference & OpenAI-Compatible API Support

  • Added base_url parameter for connecting to OpenAI-compatible endpoints
  • Added api_key parameter for direct credential passing (no env vars needed)
  • Uses Pydantic AI Provider pattern for thread-safe, no-side-effect credential handling
  • Automatic reasoning extraction from W&B Inference models via OpenAIModelProfile
  • See W&B Inference docs for available models
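
For reference, a minimal sketch of the Provider pattern described above, assuming pydantic-ai's OpenAIProvider and OpenAIModel; the endpoint URL matches the W&B example below, and the credential value is a placeholder.

from pydantic_ai.models.openai import OpenAIModel
from pydantic_ai.providers.openai import OpenAIProvider

# Credentials are passed directly to the Provider; nothing is written
# to os.environ, so this is thread-safe and free of side effects.
provider = OpenAIProvider(
    base_url="https://api.inference.wandb.ai/v1",  # any OpenAI-compatible endpoint
    api_key="your-api-key",  # placeholder
)
model = OpenAIModel("deepseek-ai/DeepSeek-R1-0528", provider=provider)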

Example Usage

With Anthropic (Tools + Thinking)

import marimo as mo

def get_weather(location: str) -> dict:
    """Get current weather for a location."""
    return {"temperature": 72, "conditions": "sunny"}

chatbot = mo.ui.chat(
    mo.ai.llm.pydantic_ai(
        "anthropic:claude-sonnet-4-5",
        tools=[get_weather],
        enable_thinking=True,
        api_key=api_key,  # assumes api_key is defined elsewhere (e.g., a secret)
    ),
)

With W&B Inference (Reasoning Models)

import marimo as mo

chatbot = mo.ui.chat(
    mo.ai.llm.pydantic_ai(
        "openai:deepseek-ai/DeepSeek-R1-0528",
        base_url="https://api.inference.wandb.ai/v1",
        api_key=wandb_api_key,
        enable_thinking=True,  # Extracts reasoning_content from response
    ),
)

Files Changed

Core Changes

  • marimo/_ai/llm/_impl.py - New pydantic_ai class with streaming, tools, thinking, and W&B support
  • marimo/_ai/_types.py - Allow unknown part types to pass through

Frontend

  • frontend/src/plugins/impl/chat/chat-ui.tsx - Render tool calls and reasoning accordions

Examples

  • examples/ai/chat/pydantic_ai_with_thinking_and_tools.py - Combined example with Anthropic
  • examples/ai/chat/wandb_inference_example.py - W&B Inference with reasoning models

Tests

  • tests/_ai/llm/test_impl.py - Tests for thinking, history storage, Provider pattern, and W&B support

Testing

  • Tested multi-turn conversations with tools and thinking on Claude
  • Tested W&B Inference with DeepSeek R1 reasoning model
  • Verified tool calls display correctly during streaming
  • Verified reasoning appears before response
  • Added unit tests for new functionality

Screenshots

The chat UI now shows:

  1. Reasoning accordion (collapsible) - appears first as LLM thinks
  2. Tool call accordions (collapsible) - shows tool name, inputs, and outputs
  3. Response text - final synthesized answer

- Add backend support for streaming structured parts with tool calls
- Fix frontend to preserve tool parts when streaming completes
- Add OpenAI example with tool calls (openai_with_tools.py)
- Add Anthropic example with manual streaming conversion (anthropic_with_tools.py)
- Support AI SDK compatible format (toolCallId, state: input-available/output-available)
- Yield deltas for text, full parts array for tool calls
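
For illustration, a sketch of the streamed part shapes implied by the AI SDK compatible format above; the tool name, IDs, and values are made up.

# Hypothetical tool part while the call is in flight:
tool_part_calling = {
    "type": "tool-get_weather",
    "toolCallId": "call_1",
    "state": "input-available",   # call issued, result pending
    "input": {"location": "SF"},
}

# The same part once the result arrives:
tool_part_done = {
    **tool_part_calling,
    "state": "output-available",
    "output": {"temperature": 72, "conditions": "sunny"},
}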
…ory format

- Render tool call accordions on their own line, separate from message bubble
- Only show message bubble if there's non-tool content (text, files)
- Fix OpenAI message history to group tool calls properly (single assistant message with all tool_calls)
- This prevents OpenAI from re-calling tools that were already executed

The issue was that we combined tool_calls and final text in a single
assistant message. OpenAI expects:

1. Assistant message with tool_calls (content=null) - the REQUEST
2. Tool message(s) with results
3. Assistant message with final text - AFTER tools complete

By putting final text BEFORE tool results, OpenAI thought the tools
weren't properly used and would re-call them on subsequent messages.
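
Spelled out as OpenAI chat-completions messages, the corrected ordering looks like this (IDs and values are illustrative):

messages = [
    {"role": "user", "content": "What's the weather in SF?"},
    # 1. Assistant REQUESTS the tool; content is null at this point.
    {
        "role": "assistant",
        "content": None,
        "tool_calls": [{
            "id": "call_1",
            "type": "function",
            "function": {"name": "get_weather",
                         "arguments": '{"location": "SF"}'},
        }],
    },
    # 2. Tool result, linked back via tool_call_id.
    {"role": "tool", "tool_call_id": "call_1",
     "content": '{"temperature": 72, "conditions": "sunny"}'},
    # 3. Final assistant text comes AFTER the tool result.
    {"role": "assistant", "content": "It's 72 degrees and sunny in SF."},
]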
The parts in ChatMessage can be either dicts (from yields) or dataclass
objects (ToolInvocationPart, TextPart) after marimo processes them.

Added helper functions get_part_type() and get_part_attr() to handle
both cases when converting message history to OpenAI format.

This fixes the bug where assistant messages with tool calls were being
skipped because isinstance(p, dict) returned False for dataclass objects.
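
A minimal sketch of those helpers (the names come from this commit; the bodies are assumptions):

from typing import Any, Optional

def get_part_type(part: Any) -> Optional[str]:
    # Parts may be dicts (raw yields) or dataclasses (after marimo
    # processes them), so handle both.
    if isinstance(part, dict):
        return part.get("type")
    return getattr(part, "type", None)

def get_part_attr(part: Any, attr: str, default: Any = None) -> Any:
    if isinstance(part, dict):
        return part.get(attr, default)
    return getattr(part, attr, default)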
- Add pydantic_ai class that wraps Pydantic AI Agent for universal LLM support
- Support all Pydantic AI providers (OpenAI, Anthropic, Google, Groq, etc.)
- Automatic tool execution and streaming with proper tool call display
- Add api_key parameter with automatic environment variable setup
- Use new_messages() instead of all_messages() to prevent tool call duplication
- Add convert_messages_to_pydantic_ai for proper message history handling
- Add pydantic_ai dependency to DependencyManager
- Add comprehensive tests for the new class
- Add pydantic_ai_with_tools.py example showing simplified API

This dramatically reduces boilerplate for building chat UIs with tools
from ~300 lines to ~20 lines.
Users should use mo.ai.llm.pydantic_ai for tool support instead of
implementing the tool call loop manually with raw OpenAI/Anthropic APIs.

Deleted:
- examples/ai/chat/openai_with_tools.py
- examples/ai/chat/anthropic_with_tools.py

- Add enable_thinking parameter to mo.ai.llm.pydantic_ai() for enabling
  thinking/reasoning on supported models (Anthropic, OpenAI, Google, Groq)
- Use provider-specific model settings (AnthropicModelSettings, etc.)
  to properly handle thinking configuration
- Capture ThinkingPart from pydantic-ai responses and convert to
  reasoning parts for display in chat UI
- Update chat-ui.tsx to render reasoning accordions outside the message
  bubble, similar to tool call accordions
- Add pydantic_ai_with_thinking.py example demonstrating the feature

Supported providers and settings:
- Anthropic: enable_thinking={"budget_tokens": 1024}
- OpenAI: enable_thinking={"effort": "high", "summary": "detailed"}
- Google: enable_thinking={"include_thoughts": True}
- Groq: enable_thinking={"format": "parsed"}
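
For the Anthropic case, for example, the flag maps roughly onto pydantic-ai's provider-specific settings class (a sketch; the field name is per pydantic-ai's AnthropicModelSettings):

from pydantic_ai.models.anthropic import AnthropicModelSettings

# enable_thinking={"budget_tokens": 1024} becomes, roughly:
settings = AnthropicModelSettings(
    anthropic_thinking={"type": "enabled", "budget_tokens": 1024},
)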
- Use run_stream_events() API from pydantic-ai for real-time event streaming
- Handle PartStartEvent for ThinkingPart to show reasoning immediately
- Handle PartDeltaEvent for ThinkingPartDelta and TextPartDelta
- When thinking is present, yield structured parts on each text delta
  to keep reasoning accordion visible during text streaming
- Fixes issue where reasoning would disappear during text streaming
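
Shape-wise, the event loop looks roughly like this (a sketch against the run_stream_events() API referenced above; the event types are pydantic-ai's):

from pydantic_ai import Agent
from pydantic_ai.messages import (
    PartDeltaEvent,
    PartStartEvent,
    TextPartDelta,
    ThinkingPart,
    ThinkingPartDelta,
)

agent = Agent("anthropic:claude-sonnet-4-5")

async def stream(prompt: str) -> None:
    async for event in agent.run_stream_events(prompt):
        if isinstance(event, PartStartEvent) and isinstance(event.part, ThinkingPart):
            ...  # open the reasoning accordion immediately
        elif isinstance(event, PartDeltaEvent):
            if isinstance(event.delta, ThinkingPartDelta):
                ...  # append to the reasoning accordion
            elif isinstance(event.delta, TextPartDelta):
                # Re-yield the full parts array on each text delta so the
                # reasoning accordion stays visible during text streaming.
                ...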
- Create pydantic_ai_with_thinking_and_tools.py showing both features
- Demonstrates thinking/reasoning with tool calls in same chat
- Remove separate pydantic_ai_with_thinking.py (now redundant)
- Keep pydantic_ai_with_tools.py for simpler tools-only use case
…ith tools

- Store pydantic-ai's native messages in _pydantic_history part after each run
- Use all_messages() for history storage to accumulate full conversation
- Use new_messages() for tool extraction to avoid picking up old tools
- Allow unknown part types to pass through in ChatMessage._convert_part
- Handle TextPart in PartStartEvent to capture initial text content
- Fixes tool_use/tool_result pairing errors on subsequent Claude requests

This enables proper multi-turn conversations with thinking and tools,
where the message history is correctly maintained across turns.
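
A sketch of that round-trip (assuming pydantic-ai's all_messages_json() and ModelMessagesTypeAdapter; the storage location, a _pydantic_history part, is this PR's convention):

from typing import Optional

from pydantic_ai import Agent
from pydantic_ai.messages import ModelMessagesTypeAdapter

agent = Agent("anthropic:claude-sonnet-4-5")

async def run_turn(prompt: str, stored: Optional[bytes]):
    # Restore prior turns with tool_use/tool_result pairs intact.
    history = ModelMessagesTypeAdapter.validate_json(stored) if stored else None
    result = await agent.run(prompt, message_history=history)
    # all_messages_json() serializes the full conversation for storage;
    # new_messages() holds only this turn (e.g., fresh tool calls).
    return result.output, result.all_messages_json()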
…wn parts

- Test enable_thinking parameter initialization
- Test _pydantic_history extraction from message parts
- Test message conversion with stored pydantic-ai history
- Test that unknown part types pass through ChatMessage._convert_part

vercel bot commented Dec 11, 2025

The latest updates on your projects:

marimo-docs: Deployment Ready, Review Ready, updated Dec 19, 2025 6:21pm (UTC)

- Document mo.ai.llm.pydantic_ai class with tools and thinking support
- Add examples for tool calling and reasoning/thinking
- Update streaming examples list with new pydantic_ai examples
github-actions bot added the documentation label Dec 11, 2025

- Add base_url parameter to pydantic_ai for OpenAI-compatible APIs
- Use OpenAIProvider with custom base_url for W&B Inference
- Support reasoning/thinking extraction via enable_thinking flag
- Create model objects directly (no env vars needed when api_key provided)
- Add wandb_inference_example.py demonstrating W&B integration
- Consolidate pydantic_ai examples (remove redundant tools/thinking examples)
- Update tests for new direct model creation approach

- Use Pydantic AI Provider pattern instead of setting environment variables
- All providers (OpenAI, Anthropic, Google, Groq, Mistral, Cohere) now use
  their respective Provider classes when api_key is provided
- No global state pollution from environment variable side effects
- Thread-safe credential handling
- Add OpenAIModelProfile for W&B Inference reasoning extraction
- Add tests to verify no env vars are set when api_key is provided
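
The last bullet describes a guard along these lines (a hypothetical pytest sketch; the test name and model string are made up):

import os

def test_api_key_does_not_touch_environ(monkeypatch):
    import marimo as mo

    monkeypatch.delenv("OPENAI_API_KEY", raising=False)
    mo.ai.llm.pydantic_ai("openai:gpt-4o", api_key="sk-test")
    # The Provider pattern keeps the credential out of global state.
    assert "OPENAI_API_KEY" not in os.environ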
adamwdraper (Contributor, Author) commented:

@mscolnick @Light2Dark Everything works in manual testing of this PR; would love your reviews.

- Add documentation for using base_url with OpenAI-compatible providers
- Add W&B Inference example to streaming examples list
- Remove dead link to deleted pydantic_ai_with_tools.py

def _python_type_to_json_schema(py_type: Any) -> dict[str, Any]:
Contributor comment: pydantic AI does this for you.

return self.model.split(":", 1)[1]
return self.model

def _create_model(self) -> Any:
Contributor comment: this could live in the user code; that way the user can configure it exactly as they need.

BREAKING CHANGE: mo.ai.llm.pydantic_ai now accepts a pydantic_ai.Agent
object instead of model string and configuration options.

This change:
- Simplifies the API by letting users configure the Agent themselves
- Removes internal model creation logic (_create_model, _build_model_settings)
- Removes JSON schema conversion (Pydantic AI handles this)
- Gives users full control over Agent configuration
- Automatically supports all current and future Pydantic AI features

Migration:
Before:
  mo.ai.llm.pydantic_ai(
      "anthropic:claude-sonnet-4-5",
      tools=[get_weather],
      api_key="...",
      enable_thinking=True,
  )

After:
  from pydantic_ai import Agent
  agent = Agent("anthropic:claude-sonnet-4-5", tools=[get_weather])
  mo.ai.llm.pydantic_ai(agent, model_settings=...)

- Add simple API: mo.ai.llm.pydantic_ai('model', tools=[], instructions='...')
- Support pre-configured Agent for full control: mo.ai.llm.pydantic_ai(agent)
- Pass **kwargs through to Agent for future-proof configuration
- Update examples to use simplified API
- Update docs with both usage patterns
- Update tests for new dual-API approach

adamwdraper commented Dec 18, 2025

@mscolnick addressed your comments:

1. Simplified pydantic_ai to accept both a model string AND an Agent. You can now use it two ways.

   Simple (like the other mo.ai.llm classes):

   mo.ai.llm.pydantic_ai(
       "anthropic:claude-sonnet-4-5",
       tools=[get_weather],
       instructions="You are helpful.",
       model_settings=AnthropicModelSettings(...),
   )

   Existing pydantic-ai user (pass a pre-configured Agent):

   from pydantic_ai import Agent

   agent = Agent("anthropic:claude-sonnet-4-5", tools=[...], deps_type=MyDeps, ...)
   mo.ai.llm.pydantic_ai(agent)

2. Removed internal complexity
   ❌ Removed _python_type_to_json_schema - Pydantic AI handles this
   ❌ Removed _create_model, _get_provider, _get_model_name - Agent creation delegated to Pydantic AI
   ❌ Removed _parse_docstring_args, _function_to_openai_tool - not needed

3. Future-proof via **kwargs
   Any new Agent parameters Pydantic AI adds will work automatically, without code changes. The class is now a thin wrapper (~130 lines of logic) that just bridges Pydantic AI to marimo's chat UI.

# Use run_stream_events() to get all events in real-time
# This includes thinking parts, text parts, tool calls, etc.
# See: https://ai.pydantic.dev/agents/#streaming-events-and-final-output
async for event in agent.run_stream_events(
Contributor comment: this is all logic that we don't need to maintain; pydantic-ai should already do this for you.

self,
model: Any,
*,
tools: Optional[list[Callable[..., Any]]] = None,
Contributor comment: having model and tools separate is still not ideal, because it implies all providers/models can use tools (which is not true). Allowing just an Agent object helps prevent this.

return None
return None

def _convert_messages_to_pydantic_ai(
Contributor comment: this is also handled by pydantic-ai and does not need to live in marimo.

.join("\n");

// Separate tool parts, reasoning parts, and other parts for assistant messages
const toolParts =
Contributor comment: we can share this with the chat sidebar's <RenderParts>. We may not have it now, but with some refactoring there could be a generic component that renders these parts, so we don't need to unpack or adapt them.

Addresses PR review feedback:

1. Only accept pydantic_ai.Agent instances (not model strings)
   - Removes model/tools/instructions/model_settings params
   - Users configure Agent externally with full Pydantic AI power
   - Prevents misuse (e.g., passing tools to models that don't support them)

2. Simplify _stream_response event handling
   - Keep run_stream_events() but reduce internal complexity
   - Let Pydantic AI handle event structure
   - Just translate to marimo's streaming format

3. Remove _convert_messages_to_pydantic_ai
   - Use Pydantic AI's native message history via _pydantic_history
   - Rely on all_messages_json() / validate_json() for serialization
   - Remove 170+ lines of manual conversion logic

Updated:
- Examples: pydantic_ai_with_thinking_and_tools.py, wandb_inference_example.py
- Tests: Updated for Agent-only API
- Docs: Updated chat.md with new usage patterns

- Show animated 'Thinking...' while streaming, char count when done
- Keep accordion collapsed by default (user can manually expand)
- Pass isStreaming prop to ReasoningAccordion in chat-ui.tsx
final_result: Any = None
pending_tool_calls: dict[str, dict[str, Any]] = {}

def _build_parts() -> list[dict[str, Any]]:
Contributor comment: @Light2Dark is adding some features now that should make it possible to remove this. Either we can merge this as-is and delete it later, or wait (just a few days).

Light2Dark (Contributor) commented:

Hi Adam, sorry this took a while; I'm actively looking into it. I agree with some of the approaches and hope to push something up over the next few days.

- Remove unused json import
- Remove unused UserPromptPart import
- Remove unused has_thinking parameter from _process_final_result
- Fix biome formatting and template literal issues
adamwdraper (Contributor, Author) commented:

OK @mscolnick and @Light2Dark, I'll leave it to you whether to merge now or wait. I've tested manually and everything looks good to me, and I have a marimo Weave demo that is ready to shine with this new functionality.
