Open
Labels
enhancement, good first issue, help wanted, visualization
Description
Summary
The HTML visual report (evalview report) generates a Mermaid sequence diagram for each test. For multi-turn tests, all turns' tool calls are merged into one flat diagram with no indication of where each turn starts and ends.
Current behaviour
The Mermaid diagram shows all steps in one flat sequence — for a 3-turn booking test you get 8 tool calls with no turn boundaries visible.
Desired behaviour
Add visual separators between turns in the Mermaid diagram:
sequenceDiagram
participant User
participant Agent
Note over User,Agent: Turn 1 — "Find flights NYC to Paris"
User->>Agent: find flights NYC to Paris
Agent->>Flights API: search_flights(origin=NYC, dest=CDG)
Flights API-->>Agent: [results]
Note over User,Agent: Turn 2 — "Book the cheapest one"
User->>Agent: book the cheapest one
Agent->>Bookings API: book_flight(flight_id=AA123)
...
Also add a "Turns" section in the test result card showing each turn's query and tool calls in collapsible <details> blocks.
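As a rough illustration of the "Turns" section, the sketch below renders one collapsible <details> block per turn. The function name render_turns_section, the (query, tool_calls) tuple shape, and the CSS class are assumptions made for this example; the real report template may structure the card differently.

```python
from html import escape
from typing import List, Tuple


def render_turns_section(turns: List[Tuple[str, List[str]]]) -> str:
    """Render each turn as a collapsible <details> block.

    `turns` is a hypothetical shape: (query text, list of tool-call strings).
    """
    parts = ['<section class="turns">', "<h4>Turns</h4>"]
    for number, (query, tool_calls) in enumerate(turns, start=1):
        parts.append("<details>")
        # Escape user-supplied text so it cannot break the report markup.
        parts.append(f"<summary>Turn {number}: {escape(query)}</summary>")
        parts.append("<ul>")
        for call in tool_calls:
            parts.append(f"<li><code>{escape(call)}</code></li>")
        parts.append("</ul>")
        parts.append("</details>")
    parts.append("</section>")
    return "\n".join(parts)
```

Escaping the query and tool-call text matters here because both come from test data and may contain angle brackets or ampersands.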
Implementation hints
- Mermaid generation is in evalview/visualization/generators.py
- The ExecutionTrace has a flat steps list, so you'll need to tag steps with their turn index
  - Option A (simpler): Add a turn_index: Optional[int] field to ExecutionStep in evalview/core/types.py and set it in _execute_multi_turn_trace
  - Option B (no schema change): Use step timestamps to infer boundaries from turn_traces end times
- Mermaid Note over syntax for turn headers: Note over User,Agent: Turn 1 — query text
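A minimal sketch of Option A follows. It assumes a simplified Step dataclass standing in for ExecutionStep and a hypothetical build_sequence_diagram helper; the actual generator in generators.py likely has a different signature and emits more detail (responses, participants per tool). The note text uses a plain hyphen as the separator.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Step:
    """Hypothetical stand-in for ExecutionStep with only the fields used here."""
    tool_name: str
    arguments: str
    turn_index: Optional[int] = None  # proposed field from Option A


def build_sequence_diagram(steps: List[Step], turn_queries: List[str]) -> str:
    """Emit Mermaid lines, adding a Note over separator when the turn changes."""
    lines = ["sequenceDiagram", "    participant User", "    participant Agent"]
    multi_turn = len(turn_queries) > 1  # single-turn output gets no notes
    current_turn: Optional[int] = None
    for step in steps:
        turn = step.turn_index
        if multi_turn and turn is not None and turn != current_turn:
            current_turn = turn
            # Guard against a turn index with no recorded query.
            query = turn_queries[turn] if 0 <= turn < len(turn_queries) else ""
            lines.append(f'    Note over User,Agent: Turn {turn + 1} - "{query}"')
            lines.append(f"    User->>Agent: {query}")
        lines.append(f"    Agent->>{step.tool_name}: {step.tool_name}({step.arguments})")
    return "\n".join(lines)
```

Because notes are only emitted when there is more than one turn, a single-turn trace produces the same flat sequence of tool calls as before.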
Files to touch
- evalview/visualization/generators.py: Mermaid generation
- evalview/core/types.py: optionally add turn_index to ExecutionStep
- evalview/cli.py: _execute_multi_turn_trace to tag steps with turn index
Acceptance criteria
- Mermaid diagram shows Note over separators between turns
- Each turn's query appears in the note text (truncated to 60 chars)
- Single-turn tests are unaffected
- HTML report still opens and renders correctly
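The 60-character truncation could look something like the sketch below. The helper name note_text is hypothetical, and it assumes the ellipsis counts toward the 60-character limit, which the criteria above do not specify. It also collapses newlines, since a Mermaid note is written on a single line; other characters with special meaning in Mermaid may need additional handling.

```python
def note_text(query: str, limit: int = 60) -> str:
    """Flatten whitespace and truncate a turn query for use in a note."""
    # Collapse newlines and runs of spaces so the note stays on one line.
    flat = " ".join(query.split())
    if len(flat) <= limit:
        return flat
    # Reserve three characters for the ellipsis (assumed to count toward limit).
    return flat[: limit - 3].rstrip() + "..."
```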