
[1/3] Add harness foundation types (RFC 005)#389

Open
Darktex wants to merge 4 commits into main from feature/issue-385-harness-types

@Darktex Darktex commented Feb 17, 2026

Summary

Stacked on #387 (RFC 005).

Introduces the core type system for agentic harness integration:

  • HarnessConfig - Pydantic model for configuring harness processes
  • HarnessTransport - Enum: stdio, streamable HTTP, MCP
  • HarnessAdapter - ABC for harness-specific lifecycle (start, stop, inject_tools, send_message, send_message_streaming)
  • HarnessEvent / HarnessEventType - Standard event schema for turn trajectories (8 event types)
  • HarnessResponse - Complete turn response with events list and done signal
  • HarnessAction - Action type extending Action for sending messages to harnesses
  • resolve_tool_conflicts() - Utility for detecting and resolving tool name collisions via env_ prefixing
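
The env_ prefixing described in the last bullet could look roughly like the sketch below. It is an illustration only: the `Tool` dataclass is a stdlib stand-in for the project's model, and treating `harness_builtin_tools` as a list of names is an assumption, not something the PR description states.

```python
from dataclasses import dataclass, field, replace
from typing import Any, Dict, List


@dataclass(frozen=True)
class Tool:
    """Stand-in for the project's Tool model (assumed fields)."""
    name: str
    description: str = ""
    input_schema: Dict[str, Any] = field(default_factory=dict)


def resolve_tool_conflicts(
    env_tools: List[Tool], harness_builtin_tools: List[str]
) -> List[Tool]:
    """Return env tools, renaming any that collide with a harness built-in."""
    builtin_names = set(harness_builtin_tools)  # assumption: names, not Tool objects
    resolved: List[Tool] = []
    for tool in env_tools:
        if tool.name in builtin_names:
            # Collision: keep the harness built-in untouched, prefix the env tool.
            resolved.append(replace(tool, name=f"env_{tool.name}"))
        else:
            resolved.append(tool)
    return resolved


env_tools = [Tool("read_file"), Tool("search")]
resolved = resolve_tool_conflicts(env_tools, ["read_file", "bash"])
```

The input list is left unmodified; only copies of colliding tools are renamed.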

Test plan

  • 43 unit tests covering all types, serialization roundtrips, validation errors, edge cases
  • No regressions in existing MCP and type tests (155 tests pass)
  • Lint clean (ruff format + check)

Part of #385

@meta-cla meta-cla bot added the CLA Signed label Feb 17, 2026
greptile-apps bot commented Feb 17, 2026

Greptile Summary

This PR implements PR 2 from RFC 005's implementation plan: the foundation type system for agentic harness integration. It introduces Pydantic models (HarnessConfig, HarnessEvent, HarnessResponse, HarnessAction), enums (HarnessTransport, HarnessEventType), an abstract base class (HarnessAdapter), and a tool conflict resolution utility (resolve_tool_conflicts). These types define the contract that concrete harness adapters (OpenClaw, Claude Code, etc.) will implement in subsequent PRs.

The changes are additive and self-contained — no existing code is modified. The types align well with the project's existing Pydantic-based type system and follow the same ConfigDict(extra="forbid") pattern used throughout env_server/types.py.

  • All new types live in src/openenv/core/harnesses/, cleanly separated from existing modules
  • HarnessAction correctly extends the base Action class with a type discriminator field, consistent with ListToolsAction and CallToolAction in mcp_types.py
  • HarnessAdapter ABC defines the lifecycle interface (start, stop, inject_tools, send_message, send_message_streaming, is_alive) that concrete adapters must implement
  • resolve_tool_conflicts() handles env_ prefixing for name collisions, though the RFC's "error on ambiguity" behavior (raising errors for same-name-different-schema cases) is not yet implemented — likely deferred to the core implementation PR
  • inject_tools on HarnessAdapter uses a bare List without type parameter, losing type information for implementers
  • Timeout fields on HarnessConfig lack gt=0 validation, unlike existing timeout fields in the codebase

Alignment Review Report

Automated Checks

  • Lint: SKIPPED — uv not available in review environment
  • Debug code: CLEAN — no debugger statements found in new files

Tier 1: Fixes Required

  • None identified

Tier 2: Alignment Discussion

None identified — the types are purely additive, respect the dual API boundary (harness messages go through step(), not exposed via MCP), and follow the RFC 005 design faithfully.

Confidence Score: 4/5

  • This PR is safe to merge — it adds new types with no modifications to existing code and no alignment violations.
  • Score of 4 reflects that this is a clean, additive foundation PR with thorough test coverage (43 tests). The two style suggestions (untyped List, missing timeout validation) are minor and don't affect correctness. No architectural invariants are violated. Deducted 1 point for the type safety gap in the abstract interface that downstream implementers will depend on.
  • src/openenv/core/harnesses/adapter.py: the bare List type on inject_tools should be parameterized before concrete adapters depend on it

Important Files Changed

| Filename | Overview |
|---|---|
| src/openenv/core/harnesses/types.py | Defines core Pydantic models and enums (HarnessTransport, HarnessEventType, HarnessEvent, HarnessResponse, HarnessConfig, HarnessAction). Well-structured and follows existing codebase patterns. Minor: timeout fields lack gt=0 validation. |
| src/openenv/core/harnesses/adapter.py | ABC for harness adapters with well-documented abstract methods. inject_tools uses bare List type without parameterization, losing type safety for implementers. |
| src/openenv/core/harnesses/tools.py | Tool conflict resolution utility. Clean implementation using model_copy. Correctly uses List[Tool]. RFC mentions "error on ambiguity" for same-name-different-schema cases, not yet implemented (may be future work). |
| src/openenv/core/harnesses/__init__.py | Clean module init re-exporting all public types via __all__. No issues. |
| tests/core/test_harnesses/__init__.py | Empty test package init with copyright header. No issues. |
| tests/core/test_harnesses/test_harness_types.py | Thorough test suite covering all types, validation, serialization roundtrips, ABC enforcement, and module exports. Good coverage of edge cases. |

Class Diagram

classDiagram
    class Action {
        +Dict metadata
    }
    class HarnessAction {
        +Literal type = "harness_message"
        +str message
    }
    Action <|-- HarnessAction

    class HarnessTransport {
        <<enum>>
        STDIO = "stdio"
        STREAMABLE_HTTP = "http"
        MCP = "mcp"
    }

    class HarnessEventType {
        <<enum>>
        LLM_REQUEST
        LLM_RESPONSE
        LLM_CHUNK
        TOOL_CALL
        TOOL_RESULT
        TEXT_OUTPUT
        ERROR
        TURN_COMPLETE
    }

    class HarnessEvent {
        +HarnessEventType type
        +float timestamp
        +Dict data
    }
    HarnessEvent --> HarnessEventType

    class HarnessResponse {
        +str response
        +List~HarnessEvent~ events
        +bool done
    }
    HarnessResponse --> HarnessEvent

    class HarnessConfig {
        +str name
        +List~str~ command
        +str working_directory
        +Dict env_vars
        +HarnessTransport transport
        +Optional~str~ mcp_config_path
        +float startup_timeout_s
        +float session_timeout_s
        +Optional~str~ model
        +Optional~str~ api_key_env_var
    }
    HarnessConfig --> HarnessTransport

    class HarnessAdapter {
        <<abstract>>
        +HarnessConfig config
        +start(working_directory) None
        +stop() None
        +inject_tools(tools) None
        +send_message(message) HarnessResponse
        +send_message_streaming(message) AsyncIterator~HarnessEvent~
        +is_alive() bool
    }
    HarnessAdapter --> HarnessConfig
    HarnessAdapter --> HarnessResponse
    HarnessAdapter --> HarnessEvent

    class Tool {
        +str name
        +str description
        +Dict input_schema
    }

    class resolve_tool_conflicts {
        <<function>>
        +resolve(env_tools, harness_builtin_tools) List~Tool~
    }
    resolve_tool_conflicts --> Tool
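
As a rough illustration of how the event and response types in the diagram fit together, the sketch below builds one turn's trajectory. Dataclasses stand in for the Pydantic models, and the lowercase enum values are assumptions (the diagram lists only the member names).

```python
import time
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Dict, List


class HarnessEventType(str, Enum):
    # Member names follow the diagram; the string values are assumed.
    LLM_REQUEST = "llm_request"
    LLM_RESPONSE = "llm_response"
    LLM_CHUNK = "llm_chunk"
    TOOL_CALL = "tool_call"
    TOOL_RESULT = "tool_result"
    TEXT_OUTPUT = "text_output"
    ERROR = "error"
    TURN_COMPLETE = "turn_complete"


@dataclass
class HarnessEvent:
    type: HarnessEventType
    timestamp: float
    data: Dict[str, Any] = field(default_factory=dict)


@dataclass
class HarnessResponse:
    response: str
    events: List[HarnessEvent]
    done: bool


# One turn: a tool call, its result, then the end-of-turn marker.
events = [
    HarnessEvent(HarnessEventType.TOOL_CALL, time.time(), {"name": "search"}),
    HarnessEvent(HarnessEventType.TOOL_RESULT, time.time(), {"output": "3 hits"}),
    HarnessEvent(HarnessEventType.TURN_COMPLETE, time.time()),
]
turn = HarnessResponse(response="Found 3 hits.", events=events, done=False)
```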

Last reviewed commit: 0c5d074

@greptile-apps greptile-apps bot left a comment


6 files reviewed, 2 comments


@Darktex Darktex changed the title from "Add harness foundation types (RFC 005)" to "[2/4] Add harness foundation types (RFC 005)" Feb 17, 2026
Introduces the core type system for agentic harness integration:

- HarnessConfig: Pydantic model for harness process configuration
- HarnessTransport: Enum for communication transport (stdio, http, mcp)
- HarnessAdapter: ABC for harness-specific lifecycle management
- HarnessEvent/HarnessEventType: Standard event schema for trajectories
- HarnessResponse: Complete turn response with events and done signal
- HarnessAction: Action type for sending messages to harnesses
- resolve_tool_conflicts(): Utility for tool name collision handling

43 tests covering all types, serialization, validation, and edge cases.

Part of #385
- Add List[Tool] type parameter to HarnessAdapter.inject_tools()
- Add gt=0 validation to timeout fields in HarnessConfig
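
The gt=0 constraint rejects zero and negative timeouts at construction time. In Pydantic this is normally written as `Field(gt=0)`; the sketch below reproduces the same check with a stdlib dataclass so it runs without dependencies. The class name `TimeoutSettings` is hypothetical, and only the two field names come from the class diagram.

```python
from dataclasses import dataclass


@dataclass
class TimeoutSettings:
    """Stdlib analogue of the two timeout fields on HarnessConfig."""
    startup_timeout_s: float
    session_timeout_s: float

    def __post_init__(self) -> None:
        # Equivalent in spirit to Field(gt=0): strictly positive values only.
        for name in ("startup_timeout_s", "session_timeout_s"):
            value = getattr(self, name)
            if not value > 0:
                raise ValueError(f"{name} must be greater than 0, got {value!r}")


valid = TimeoutSettings(startup_timeout_s=30.0, session_timeout_s=600.0)

try:
    TimeoutSettings(startup_timeout_s=0, session_timeout_s=600.0)
    rejected_zero = False
except ValueError:
    rejected_zero = True
```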
@Darktex Darktex changed the title from "[2/4] Add harness foundation types (RFC 005)" to "[1/3] Add harness foundation types (RFC 005)" Feb 18, 2026
@Darktex Darktex changed the base branch from feature/issue-385-agentic-harnesses to main February 18, 2026 07:05
@Darktex Darktex force-pushed the feature/issue-385-harness-types branch from a993c0c to 2408704 February 18, 2026 07:06
* Add HarnessEnvironment with multi-turn support (RFC 005)

Implements HarnessEnvironment, which wraps an external agentic harness
with OpenEnv's Gym-style API:

- reset() stops any running harness, injects MCP tools, starts fresh
- step(HarnessAction) sends one conversational turn to the harness
- Harness maintains conversation context across step() calls
- done signal propagated from harness to observation
- Trajectory accumulated across turns, accessible via .trajectory
- close() cleans up harness process

26 tests covering reset, multi-turn step, trajectory accumulation,
state management, close behavior, and MCP tool injection.
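
The reset/step/close flow above can be sketched with an in-memory fake harness. Everything here is illustrative: `EchoHarness`, `HarnessEnvironmentSketch`, and `TurnResult` are hypothetical names, the methods are synchronous for brevity, and clearing the trajectory on reset() is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TurnResult:
    response: str
    events: List[str]
    done: bool


class EchoHarness:
    """Hypothetical harness that keeps conversation context between turns."""

    def __init__(self) -> None:
        self.running = False
        self.tools: List[str] = []
        self.history: List[str] = []

    def inject_tools(self, tools: List[str]) -> None:
        self.tools = list(tools)

    def start(self) -> None:
        self.running = True
        self.history = []

    def stop(self) -> None:
        self.running = False

    def send_message(self, message: str) -> TurnResult:
        self.history.append(message)
        reply = f"turn {len(self.history)}: {message}"
        return TurnResult(reply, [f"text_output:{message}"], done=(message == "bye"))


class HarnessEnvironmentSketch:
    def __init__(self, harness: EchoHarness, tools: List[str]) -> None:
        self.harness = harness
        self.tools = tools
        self.trajectory: List[str] = []

    def reset(self) -> None:
        if self.harness.running:
            self.harness.stop()  # stop any running harness first
        self.trajectory = []  # assumption: a new episode starts an empty trajectory
        self.harness.inject_tools(self.tools)
        self.harness.start()

    def step(self, message: str) -> TurnResult:
        result = self.harness.send_message(message)  # one conversational turn
        self.trajectory.extend(result.events)  # accumulate across turns
        return result

    def close(self) -> None:
        self.harness.stop()


env = HarnessEnvironmentSketch(EchoHarness(), tools=["search"])
env.reset()
first = env.step("hello")
second = env.step("bye")
env.close()
```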

Part of #385

* Address review feedback: fix async context manager and logging

- Fix broken async context manager in _get_mcp_tool_definitions:
  use single async function with 'async with' instead of separate
  run_async_safely calls for __aenter__/__aexit__
- Add warning log on MCP tool extraction failure instead of
  silently swallowing exceptions
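
A minimal sketch of the pattern this fix describes: the whole enter/use/exit sequence lives in one coroutine, so the context manager is entered and exited on the same event loop. `open_client` and `get_tool_names` are hypothetical names, and `asyncio.run` stands in for the project's run_async_safely helper.

```python
import asyncio
import logging
from contextlib import asynccontextmanager
from typing import List

logger = logging.getLogger(__name__)


@asynccontextmanager
async def open_client(log: List[str], fail: bool = False):
    """Hypothetical MCP client; records enter/exit so the order can be checked."""
    log.append("enter")
    try:
        if fail:
            raise ConnectionError("server unavailable")
        yield ["tool_a", "tool_b"]
    finally:
        log.append("exit")


async def _fetch_tool_names(log: List[str], fail: bool) -> List[str]:
    # A single coroutine with 'async with': __aenter__ and __aexit__ are paired
    # automatically instead of being driven by two separate blocking calls.
    async with open_client(log, fail) as tools:
        return list(tools)


def get_tool_names(log: List[str], fail: bool = False) -> List[str]:
    try:
        return asyncio.run(_fetch_tool_names(log, fail))
    except Exception as exc:
        # Log a warning rather than silently swallowing the failure.
        logger.warning("MCP tool extraction failed: %s", exc)
        return []


ok_log: List[str] = []
ok_tools = get_tool_names(ok_log)
bad_log: List[str] = []
bad_tools = get_tool_names(bad_log, fail=True)
```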

* [3/3] Add OpenClaw adapter (RFC 005) (#391)

* Add OpenClaw adapter implementation (RFC 005)

Concrete HarnessAdapter for the OpenClaw agentic platform:

- Process lifecycle: start/stop via asyncio subprocess
- MCP tool injection: writes mcpServers config to openclaw.json,
  merges with existing config entries
- Communication: JSON-line protocol over stdin/stdout
- Event extraction: parses tool_calls from responses into HarnessEvents
- Streaming: yields events from turn response
- Error handling: timeout detection, plain-text fallback
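
The config-merge step might look something like the sketch below. The `mcpServers` key and the openclaw.json filename come from the commit message; the shape of each server entry (`{"url": ...}`), the function name, and the choice to start from an empty config when the file is corrupted are assumptions.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict


def inject_mcp_servers(config_path: Path, servers: Dict[str, Any]) -> Dict[str, Any]:
    """Merge MCP server entries into an existing JSON config and write it back."""
    config: Dict[str, Any] = {}
    if config_path.exists():
        try:
            loaded = json.loads(config_path.read_text())
            if isinstance(loaded, dict):
                config = loaded
        except json.JSONDecodeError:
            config = {}  # assumption: an unreadable file is replaced, not fatal
    existing = config.get("mcpServers")
    merged = dict(existing) if isinstance(existing, dict) else {}
    merged.update(servers)  # new entries win on key collisions
    config["mcpServers"] = merged
    config_path.write_text(json.dumps(config, indent=2))
    return config


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "openclaw.json"
    path.write_text(
        json.dumps({"theme": "dark", "mcpServers": {"existing": {"url": "http://a"}}})
    )
    merged_config = inject_mcp_servers(path, {"openenv": {"url": "http://b"}})
    on_disk = json.loads(path.read_text())
```

Unrelated keys in the file are preserved, which is the point of merging rather than overwriting.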

18 tests covering imports, config injection, process lifecycle,
message sending with JSON/plain-text responses, tool call extraction,
streaming, and HarnessEnvironment integration.
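
For the JSON-line handling with plain-text fallback, a sketch of the parsing step is shown below. The `tool_calls` field is named in the commit message; the `response` key, the event dictionary layout, and the function name are assumptions for illustration.

```python
import json
import time
from typing import Any, Dict, List, Tuple


def parse_response_line(line: str) -> Tuple[str, List[Dict[str, Any]]]:
    """Turn one stdout line into (response text, tool-call events)."""
    text = line.strip()
    try:
        payload = json.loads(text)
    except json.JSONDecodeError:
        return text, []  # plain-text fallback: the raw line is the response
    if not isinstance(payload, dict):
        return text, []  # valid JSON but not an object; treat it as plain text
    events = [
        {"type": "tool_call", "timestamp": time.time(), "data": call}
        for call in (payload.get("tool_calls") or [])
    ]
    return str(payload.get("response", "")), events


json_text, json_events = parse_response_line(
    '{"response": "done", "tool_calls": [{"name": "ls", "args": {}}]}\n'
)
plain_text, plain_events = parse_response_line("hello from the harness\n")
```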

Part of #385

* Address review feedback: move import os to top of file

* Fix env variable bug and add missing test coverage

- Fix critical bug: env vars now merge with parent env (os.environ)
  instead of replacing it, preserving PATH, PYTHONPATH, etc.
- When no env_vars or api_key configured, pass None to inherit parent
- Add test for parent env inheritance with custom env vars
- Add test for None env when no overrides configured
- Add test for kill path when terminate times out
- Add test for send_message timeout handling
- Add test for corrupted config file handling
- Add subprocess pipe parameter assertions to start test
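
The env variable fix boils down to merging overrides on top of the parent environment, and passing None when there is nothing to override so the child process simply inherits. A minimal sketch, with a hypothetical function name and the API-key handling left out:

```python
import os
from typing import Dict, Mapping, Optional


def build_subprocess_env(
    overrides: Optional[Mapping[str, str]],
    parent_env: Optional[Mapping[str, str]] = None,
) -> Optional[Dict[str, str]]:
    """Return the env mapping for a subprocess, or None to inherit the parent's."""
    if not overrides:
        return None  # subprocess APIs treat env=None as "inherit everything"
    parent = os.environ if parent_env is None else parent_env
    # Start from the parent so PATH, PYTHONPATH, etc. survive, then apply overrides.
    return {**parent, **overrides}


inherited = build_subprocess_env({})
merged_env = build_subprocess_env(
    {"FOO": "1", "PATH": "/custom"}, parent_env={"PATH": "/bin", "HOME": "/root"}
)
```

Replacing the environment outright (the earlier behavior) would drop PATH and similar variables, which typically makes the harness command fail to resolve.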