Open
Labels
cli, enhancement, good first issue, help wanted
Description
Summary
evalview capture is a proxy that records real agent traffic as test YAML files — but it creates single-turn tests (input: query: ...). If a user's app makes 3 consecutive calls to the agent during one session, capture should group them into a turns: block instead of 3 separate files.
Current behaviour
3 requests to the agent → 3 separate single-turn YAML files:
tests/captured-1.yaml (input: query: "find flights NYC-Paris")
tests/captured-2.yaml (input: query: "book the cheapest")
tests/captured-3.yaml (input: query: "send me a confirmation")
Desired behaviour
Add a --session-timeout option (default: 30 s). Requests arriving within the timeout of each other are grouped into one turns: YAML:
name: captured-session-1
turns:
  - query: "find flights NYC-Paris"
  - query: "book the cheapest"
  - query: "send me a confirmation"
expected:
  tools: []
thresholds:
  min_score: 70
Implementation hints
- Capture logic lives in evalview/cli.py in the capture command
- Each captured request is currently written immediately; buffer them per session instead
- A session boundary = gap > --session-timeout seconds between requests, or Ctrl-C
- On session close, if len(requests) >= 2 → write turns: YAML; otherwise write single-turn as today
- The ConversationTurn model in evalview/core/types.py is the schema to use
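The boundary rule in the hints above could be sketched roughly as follows. This is only an illustration: CapturedRequest and group_into_sessions are hypothetical names, not existing evalview code, and the sketch groups an already-collected list rather than buffering live proxy traffic. A real implementation would also need to flush the open session on Ctrl-C.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CapturedRequest:
    """Hypothetical record of one proxied request (not an evalview type)."""
    query: str
    received_at: float  # seconds, e.g. taken from time.monotonic()


def group_into_sessions(
    requests: List[CapturedRequest], session_timeout: float
) -> List[List[CapturedRequest]]:
    """Split requests into sessions wherever the gap exceeds session_timeout.

    A session_timeout of 0 (or less) disables grouping, so every request
    ends up in its own single-request session.
    """
    sessions: List[List[CapturedRequest]] = []
    for req in sorted(requests, key=lambda r: r.received_at):
        same_session = (
            bool(sessions)
            and session_timeout > 0
            and req.received_at - sessions[-1][-1].received_at <= session_timeout
        )
        if same_session:
            sessions[-1].append(req)
        else:
            sessions.append([req])
    return sessions
```

Measuring the gap from the previous request, not from the start of the session, matches "requests arriving within the timeout of each other": a long conversation stays in one session as long as no single pause exceeds the timeout.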
Files to touch
- evalview/cli.py: capture command
- evalview/core/types.py: already has ConversationTurn
Acceptance criteria
- evalview capture --session-timeout 30 groups requests into turns
- Single requests still produce single-turn YAML (backward compatible)
- Generated YAML is valid and passes evalview run
- --session-timeout 0 disables grouping (always single-turn)
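For the first two criteria, the choice between the two output shapes might look something like the sketch below. session_to_test_case is a hypothetical helper, and it builds plain dicts for illustration; the actual change would presumably go through ConversationTurn and the existing YAML writer. Reusing the expected and thresholds defaults for the single-turn case is an assumption based on the example in this issue.

```python
from typing import Any, Dict, List


def session_to_test_case(name: str, queries: List[str]) -> Dict[str, Any]:
    """Build the test-case mapping for one closed session (hypothetical helper).

    Two or more queries produce a turns: block; exactly one query keeps
    the single-turn input: query: shape that capture writes today.
    """
    if not queries:
        raise ValueError("a session must contain at least one request")

    test_case: Dict[str, Any] = {"name": name}
    if len(queries) >= 2:
        test_case["turns"] = [{"query": q} for q in queries]
    else:
        test_case["input"] = {"query": queries[0]}

    # Defaults copied from the example in this issue (assumed, not confirmed).
    test_case["expected"] = {"tools": []}
    test_case["thresholds"] = {"min_score": 70}
    return test_case
```

A unit test could then assert that a three-query session yields a turns key and no input key, and that a one-query session yields the reverse.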