Conversation

@drewdrewthis
Contributor

Summary

Adds documentation for the new On-Platform Scenarios feature (M1) that allows users to create and run agent simulations directly in the LangWatch UI without writing code.

Changes

New Pages

  • scenarios/overview.mdx - Introduction and key concepts
  • scenarios/creating-scenarios.mdx - Creating and editing scenarios
  • scenarios/running-scenarios.mdx - Executing and analyzing runs
  • agents/overview.mdx - Overview of agent types and concepts
  • agents/http-agents.mdx - HTTP agent configuration guide

Navigation Updates

Reorganized Agent Simulations section into two sub-groups:

  • On-Platform Scenarios (new visual authoring)
  • Scenario SDK (existing code-based approach)
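A regrouping like this would typically live in the Mintlify navigation config. As a rough sketch only (the exact file name and schema depend on the Mintlify version in use; the page slugs are taken from this PR's commit messages):

```json
{
  "group": "Agent Simulations",
  "pages": [
    {
      "group": "On-Platform Scenarios",
      "pages": [
        "scenarios/overview",
        "scenarios/creating-scenarios",
        "scenarios/running-scenarios"
      ]
    },
    {
      "group": "Scenario Library",
      "pages": ["agent-simulations/introduction"]
    }
  ]
}
```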

Screenshots Needed

The documentation references placeholder images that need to be captured:

  • scenario-library.png - Scenario library page
  • scenario-editor.png - Editor with Situation and Criteria
  • criteria-list.png - Criteria list in editor
  • target-drawer.png - Target configuration drawer
  • http-target-form.png - HTTP target configuration
  • llm-target-form.png - LLM target configuration
  • prompt-config-form.png - Prompt Config target configuration
  • quick-run.png - Quick run button/execution
  • run-visualizer.png - Full run visualizer view
  • criteria-results.png - Criteria pass/fail breakdown
  • run-history.png - Run history list

Related

Closes langwatch/langwatch#1094

Test plan

  • Verify all navigation links work in Mintlify preview
  • Add screenshots once M1 UI is complete
  • Review with stakeholders for accuracy

🤖 Generated with Claude Code

drewdrewthis and others added 4 commits January 14, 2026 22:38
Add documentation for the new Scenarios feature (M1) that allows users to
create and run agent simulations directly on the LangWatch platform.

New pages:
- scenarios/overview.mdx - Introduction and key concepts
- scenarios/creating-scenarios.mdx - Creating and editing scenarios
- scenarios/targets.mdx - HTTP, LLM, and Prompt Config targets
- scenarios/running-scenarios.mdx - Executing and analyzing runs

Updates navigation to organize Agent Simulations into:
- On-Platform Scenarios (new visual authoring)
- Scenario SDK (existing code-based approach)

Note: Screenshots needed - placeholder image references included.

Closes langwatch/langwatch#1094

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Key changes:
- Remove separate targets.mdx - consolidate into running-scenarios.mdx
- Fix target types: HTTP Agent and Prompt (not LLM + Prompt Config)
- Clarify scenarios vs simulations terminology
- Add comprehensive "Writing Good Criteria" guidance
- Update agent-simulations introduction with On-Platform vs SDK comparison
- Reference target selector as unified dropdown, not separate forms
- Document Save and Run flow with target memory

The documentation now accurately reflects the M1 implementation where:
- Targets are HTTP Agents or Prompts (not a separate "LLM" type)
- Target selection is via unified TargetSelector dropdown
- Results flow to existing Simulations visualizer

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Terminology fixes:
- Change "SDK" to "library" or "testing library" throughout
- Scenario is a "testing framework/library", not an SDK
- Update navigation group from "Scenario SDK" to "Scenario Library"

Content improvements:
- Clarify platform vs code-based evaluation capabilities
- On-Platform: evaluates conversation transcript only
- Code-based: can access execution traces via OpenTelemetry
- Add examples of trace-based criteria (tool calls, latency, errors)
- Point users to Scenario library for advanced trace-based evaluation
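To illustrate the distinction above, a code-based check can assert on what the agent actually did, not just what it said. This is a hypothetical sketch, not the Scenario library's real API: the span shape and the `check_trace_criteria` helper are assumptions standing in for criteria evaluated over OpenTelemetry trace data.

```python
# Hypothetical sketch: evaluating trace-based criteria over a list of
# finished OpenTelemetry-style spans. The span dict shape is an assumption,
# not the Scenario library's actual data model.

def check_trace_criteria(spans, max_latency_ms=2000):
    """Return a dict of criterion name -> pass/fail for one simulation run."""
    tool_calls = [s for s in spans if s.get("kind") == "tool"]
    errors = [s for s in spans if s.get("status") == "ERROR"]
    total_ms = sum(s["end_ms"] - s["start_ms"] for s in spans)
    return {
        "called_a_tool": len(tool_calls) > 0,
        "no_errors": len(errors) == 0,
        "under_latency_budget": total_ms <= max_latency_ms,
    }

# Example run: one LLM call and one tool call, no errors, within budget.
spans = [
    {"kind": "llm", "status": "OK", "start_ms": 0, "end_ms": 800},
    {"kind": "tool", "status": "OK", "start_ms": 800, "end_ms": 950},
]
print(check_trace_criteria(spans))
```

On-platform criteria, by contrast, only see the conversation transcript, so checks like "called_a_tool" or "under_latency_budget" are exactly the cases where the docs should point users to the code-based library.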

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Create dedicated Agents documentation section (similar to Prompt Management):
- agents/overview.mdx - Overview of agent types and concepts
- agents/http-agents.mdx - Full HTTP agent configuration guide

Refactor running-scenarios.mdx:
- Remove inline HTTP agent configuration details
- Reference /agents/http-agents for configuration
- Keep focused on the workflow of running scenarios

Add Agents group to navigation under Agent Simulations.

This mirrors how Prompt Management is structured - detailed configuration
in its own section, referenced from scenario running docs.
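For context, an HTTP agent target generally amounts to an endpoint that accepts the conversation so far and returns the agent's next message. A minimal standalone sketch follows; the JSON payload shape (`{"messages": [...]}`) is an assumption for illustration, not LangWatch's documented contract, so the real http-agents.mdx page is the source of truth.

```python
# Hypothetical sketch of an HTTP agent endpoint: receives the conversation
# so far as JSON and returns the agent's next message. The payload shape
# is an assumption, not LangWatch's documented contract.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def next_reply(payload):
    """Produce the agent's reply from the incoming message list."""
    last = payload["messages"][-1]["content"] if payload.get("messages") else ""
    return {"role": "assistant", "content": f"You said: {last}"}

class AgentHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        body = self.rfile.read(int(self.headers["Content-Length"]))
        reply = next_reply(json.loads(body))
        data = json.dumps(reply).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(data)

if __name__ == "__main__":
    # Serve on localhost so the platform's HTTP target can call it.
    HTTPServer(("127.0.0.1", 8000), AgentHandler).serve_forever()
```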

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>