PRD — OpenCode Browser Dual Backend (Native + Agent)

Summary

OpenCode Browser should feel like a single browser automation capability that can run against two backends: the native OpenCode Browser extension (real Chrome profile) and agent-browser (headless Playwright). The default experience must work with agent-browser out of the box, while preserving the native-extension path for authenticated, profile-driven workflows. Users should think: “I pick which browser to run on, and the task runs there.”

Problem

The plugin has two backends with different setup paths and capabilities, but users lack a clear, unified mental model for choosing between them.
Agent-browser currently requires explicit env configuration, so it is not the default path even when installed.
The system lacks a consistent way to signal which backend is active, how to switch, and what capabilities are available.

Goals

Make agent-browser the default backend when available, without requiring env flags.
Preserve the native extension backend for workflows that depend on a real browser profile.
Provide a clean mental model: “Pick the browser target, run tasks there.”
Offer clear UX affordances (CLI + UI + prompt-level) to pick or override the browser.
Gracefully degrade with explicit guidance when a backend isn’t available.

Non-Goals

Implementing new browser automation tools or APIs.
Replacing the Chrome extension architecture.
Perfect parity between backends (some feature gaps are acceptable).

Personas

Power user (desktop): Wants tasks to run in their real Chrome profile (logged in, bookmarks, extensions).
Headless automation user: Wants fast, repeatable headless runs in a clean profile.
Remote operator: Runs automation on a separate host (server or tailnet) and needs stable headless sessions.

Mental Model

There is one Browser toolset with two browser targets:
- Native Browser: Chrome extension + native host, uses the user’s live profile.
- Agent Browser: Playwright-driven, headless, isolated profile.
The user chooses which target to use for a task or session. If they don’t choose, OpenCode picks the best available default.

Key User Stories

As a user, I can choose “Native Browser” when I need my logged-in profile.
As a user, I can choose “Agent Browser” for headless, repeatable automation.
As a user, I can set a default browser once and have it persist.
As a user, I can override the default for a specific task.
As a user, I can see which backend is in use and why.
As a user, I get a clear error if a backend is unavailable and instructions to fix it.

Functional Requirements

Default Selection

Default backend mode is auto with agent-browser priority:
1. If agent-browser is installed and reachable, use agent-browser.
2. Else, if the native extension broker is reachable, use native.
3. Else, surface a “no browser available” error with setup instructions.
Explicit user choice always overrides auto-selection.

Backend Selection Surface Area

Config-level (global): persisted default (browser.backend = agent|native|auto).
Session-level: a per-session override selected in UI or CLI flags.
Task-level: prompt annotation or tool parameter (e.g., @browser agent, @browser native).

Capability Signaling

browser_status returns:
- backend: agent-browser or extension
- capabilities: profile_access, headless, tab_claims, file_uploads, downloads
- active_session: backend session identifier
UI should expose the active backend and show a short rationale if auto-selected.

Graceful Degradation

If a task requests native but extension is unavailable, return a structured error plus setup steps.
If a task requests agent but agent-browser is missing, offer installation instructions and fallback to native if user permits.

Compatibility Requirements

Existing env vars (OPENCODE_BROWSER_BACKEND, agent-browser overrides) continue to work and override auto mode.
No changes required to agent-browser CLI usage.

UX/Flows

Onboarding / First Use

User installs OpenCode Browser plugin.
OpenCode checks backend availability.
Default picks agent-browser if installed, else native extension.
UI or CLI surfaces the choice: “Using Agent Browser (headless). Switch → Native Browser.”

Switching Backends

UI: a “Browser Target” selector in Settings.
CLI: opencode run --browser=agent|native.
Prompt: @browser agent / @browser native.

Error UX

Structured error with:
- Missing backend
- How to install or enable
- Offer fallback (if allowed)

Technical Considerations (Design Only)

Introduce a backend router that encapsulates selection logic and capability reporting.
Maintain feature matrix for parity gaps (e.g., tab claims not supported on agent-browser).
Auto-detection uses lightweight health checks on both backends.

Capability Matrix (Target Behavior)

Feature	Native Browser	Agent Browser	Notes
Real profile (cookies/logins)	✅	❌	Agent is isolated
Headless	❌	✅	Agent only
Tab ownership/claims	✅	❌	Current constraint
File uploads (large)	⚠️	✅	Agent preferred
Remote host	⚠️	✅	Agent gateway
Stability / replays	✅	✅	Both expected

Analytics / Telemetry

Track backend usage (agent vs native) and selection reason (auto vs manual).
Track backend errors and fallback frequency.

Risks & Mitigations

User confusion about “two browsers” → Always show active backend in status/UX.
Backend mismatch with task expectations → Provide explicit “requires profile” hints in docs.
Agent-browser missing dependency → Clear install instructions + native fallback.

Rollout Plan

Ship auto-backend selection with agent-first priority.
Add UI/CLI controls for backend override.
Add capability matrix + status reporting in browser_status.
Collect telemetry + adjust defaults if reliability issues appear.

Open Questions

Should agent-browser be bundled or remain optional dependency?
What is the best UX surface in OpenWork for backend selection?
Do we allow per-tool overrides or only per-session?
How should “tab claims” limitations be surfaced to users?
What is the policy for auto-fallback (silent vs explicit)?

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

PRD — OpenCode Browser Dual Backend (Native + Agent)

Summary

Problem

Goals

Non-Goals

Personas

Mental Model

Key User Stories

Functional Requirements

Default Selection

Backend Selection Surface Area

Capability Signaling

Graceful Degradation

Compatibility Requirements

UX/Flows

Onboarding / First Use

Switching Backends

Error UX

Technical Considerations (Design Only)

Capability Matrix (Target Behavior)

Analytics / Telemetry

Risks & Mitigations

Rollout Plan

Open Questions

FilesExpand file tree

PRD.md

Latest commit

History

PRD.md

File metadata and controls

PRD — OpenCode Browser Dual Backend (Native + Agent)

Summary

Problem

Goals

Non-Goals

Personas

Mental Model

Key User Stories

Functional Requirements

Default Selection

Backend Selection Surface Area

Capability Signaling

Graceful Degradation

Compatibility Requirements

UX/Flows

Onboarding / First Use

Switching Backends

Error UX

Technical Considerations (Design Only)

Capability Matrix (Target Behavior)

Analytics / Telemetry

Risks & Mitigations

Rollout Plan

Open Questions