Add Browser Env Integration #732

filip-michalsky · 2026-01-15T11:46:19Z

Description

Adds BrowserEnv - a unified browser automation integration for the verifiers library supporting two operational modes:

DOM Mode (mode="dom")

Uses the Stagehand Python SDK for natural language browser control
Tools: navigate, observe, act, extract - Stagehand's AI-driven primitives
Ideal for tasks that benefit from semantic understanding of page elements

CUA Mode (mode="cua")

Vision-based primitives for Computer Use Agent workflows
Tools: click, double_click, type_text, keypress, scroll, goto, back, forward, wait, screenshot
Automatic sandbox deployment - CUA server is deployed automatically to sandbox containers
Three server deployment options:
1. Pre-built image (default): Uses deepdream19/cua-server:latest for ~5-10s startup
2. Binary upload: Builds and uploads custom server version (~30-60s startup)
3. Manual server: Connect to locally running CUA server for development

Both modes support local browser execution or Browserbase cloud infrastructure.
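As a quick reference, the tool sets listed for the two modes can be written down as a lookup table. This is only an illustrative sketch: the tool names are copied from the lists above, but `TOOLS_BY_MODE` and `tools_for_mode` are hypothetical names and are not claimed to be part of `BrowserEnv`.

```python
# Tool names are copied from the PR description.
# TOOLS_BY_MODE and tools_for_mode are hypothetical, for illustration only.
TOOLS_BY_MODE = {
    "dom": ("navigate", "observe", "act", "extract"),
    "cua": (
        "click", "double_click", "type_text", "keypress", "scroll",
        "goto", "back", "forward", "wait", "screenshot",
    ),
}


def tools_for_mode(mode: str) -> tuple:
    """Return the tool names listed for a mode, rejecting unknown modes."""
    try:
        return TOOLS_BY_MODE[mode]
    except KeyError:
        expected = sorted(TOOLS_BY_MODE)
        raise ValueError(f"unknown mode {mode!r}; expected one of {expected}") from None
```

Failing fast on an unknown mode string is a design choice of this sketch; how the integration itself validates `mode` is not shown here.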

What's included:

verifiers/envs/integrations/browser_env/ - Core integration (BrowserEnv, DOMMode, CUAMode, CUASandboxMode)
assets/templates/browserbase/cua/ - TypeScript CUA server with Docker build/runtime configs
environments/browser_dom_example/ - Minimal DOM mode example
environments/browser_cua_example/ - Minimal CUA mode example
New [browser] extra: uv add 'verifiers[browser]'

Benchmarks (GAIA, WebVoyager, Mind2Web) have been pushed to Prime Hub under the browserbase/ namespace.

Type of Change

New feature (non-breaking change which adds functionality)

Testing

# DOM mode
prime eval run browser-dom-example -m openai/gpt-4o-mini

# CUA mode (pre-built image - default, fastest)
prime eval run browser-cua-example -m openai/gpt-4o-mini

# CUA mode (binary upload - custom server)
prime eval run browser-cua-example -m openai/gpt-4o-mini -a '{"use_prebuilt_image": false}'

# CUA mode (manual server for development)
cd assets/templates/browserbase/cua && pnpm dev  # In separate terminal
prime eval run browser-cua-example -m openai/gpt-4o-mini -a '{"use_sandbox": false}'
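The commands above pass environment arguments to `-a` as a single JSON object. When scripting several runs, that string can be built with `json.dumps` instead of by hand. The sketch below uses only the CLI flags that appear in the commands above; `build_eval_command` is a hypothetical helper, not part of the `prime` CLI or of verifiers.

```python
import json
import shlex


def build_eval_command(env_id, model, env_args=None):
    """Assemble the argv for `prime eval run` (hypothetical helper)."""
    argv = ["prime", "eval", "run", env_id, "-m", model]
    if env_args:
        # -a receives the environment arguments as one JSON object string
        argv += ["-a", json.dumps(env_args)]
    return argv


# Binary-upload variant of the CUA example
cmd = build_eval_command(
    "browser-cua-example",
    "openai/gpt-4o-mini",
    {"use_prebuilt_image": False},
)
print(shlex.join(cmd))  # shell-quoted form, suitable for logging
```

The returned list can be passed directly to `subprocess.run(cmd, check=True)`, which avoids shell-quoting problems with the JSON argument.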

All existing tests pass when running uv run pytest locally.
New tests have been added to cover the changes.

Checklist

My code follows the style guidelines of this project as outlined in AGENTS.md
I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
My changes generate no new warnings
Any dependent changes have been merged and published

Additional Notes

CUA Server Deployment Options:

| Mode | Flag | Startup | Use Case |
| --- | --- | --- | --- |
| Pre-built image (default) | None | ~5-10s | Production |
| Binary upload | `use_prebuilt_image=false` | ~30-60s | Custom server |
| Manual server | `use_sandbox=false` | Instant | Development |
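The table can be read as a small decision rule over the two flags. The sketch below is an assumption-laden illustration, not the integration's actual code: `CuaDeployment` and `resolve_deployment` are hypothetical names, both flags are assumed to default to true (consistent with the pre-built image being the default), and `use_sandbox=false` is assumed to take precedence when both flags are set.

```python
from enum import Enum


class CuaDeployment(Enum):
    """The three deployment options from the table (hypothetical enum)."""
    PREBUILT_IMAGE = "pre-built image"
    BINARY_UPLOAD = "binary upload"
    MANUAL_SERVER = "manual server"


def resolve_deployment(use_sandbox: bool = True,
                       use_prebuilt_image: bool = True) -> CuaDeployment:
    """Map the two flags to a deployment option.

    Assumption: disabling the sandbox wins over use_prebuilt_image,
    since a manually started server needs no image or binary.
    """
    if not use_sandbox:
        return CuaDeployment.MANUAL_SERVER
    if use_prebuilt_image:
        return CuaDeployment.PREBUILT_IMAGE
    return CuaDeployment.BINARY_UPLOAD
```

With no arguments the function returns `CuaDeployment.PREBUILT_IMAGE`, matching the "None" flag row of the table.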

Future work:

Additional benchmark environments available on Prime Hub under browserbase/ org


---

## Bugbot Summary (updated)


<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> Introduces a unified browser automation environment with two modes and a packaged CUA server.
> 
> - Adds `BrowserEnv` with **DOM** (Stagehand SDK: `navigate/observe/act/extract`) and **CUA** (vision primitives: `click/type/scroll/goto/...`) modes
> - CUA mode supports automatic sandbox deployment with options: pre-built Docker image (default), binary upload, or manual local server
> - Implements TypeScript Fastify CUA server (`assets/templates/browserbase/cua/`) with REST endpoints, Dockerfiles, build/publish scripts, and SEA binary tooling
> - Provides example environments (`browser_dom_example`, `browser_cua_example`), docs updates, and a new `[browser]` extra dependency set
> - Adds tests for validation, modes, prompt defaults, response formatting, and example datasets; updates exports to expose `BrowserEnv`
> 
> <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit 38d8ae167778f2e87b708c3f57dec329d010b337. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

CLAassistant · 2026-01-15T11:46:34Z

All committers have signed the CLA.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
