[2/3] Add HarnessEnvironment with multi-turn support (RFC 005) by Darktex · Pull Request #390 · meta-pytorch/OpenEnv

Darktex · 2026-02-17T08:38:35Z

Summary

Stacked on #389 (foundation types).

Implements HarnessEnvironment, which wraps an external agentic harness with OpenEnv's Gym-style API.

Key semantics

reset(): Stops any running harness, injects MCP tools, starts a fresh process and conversation
step(HarnessAction): Sends one conversational turn to the harness. The harness does its ReAct loop (potentially many LLM calls and tool invocations) and returns when it has a response
Multi-turn: The harness maintains conversation context across step() calls. Multiple steps form a conversation within one episode
done signal: Propagated from HarnessResponse.done to Observation.done
Trajectory: Events accumulated across all turns, accessible via .trajectory property
close(): Cleans up the harness process
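The semantics above can be illustrated with a minimal, self-contained sketch. It does not use the real OpenEnv classes: `EchoAdapter`, `SketchHarnessEnv`, the dict-shaped observations, and the local `HarnessResponse` dataclass are hypothetical stand-ins, meant only to show how one episode spans several `step()` calls, how `done` is propagated, and how the trajectory accumulates across turns.

```python
from dataclasses import dataclass, field


@dataclass
class HarnessResponse:
    # Hypothetical stand-in; the real type lives in the foundation PR (#389).
    response: str
    events: list = field(default_factory=list)
    done: bool = False


class EchoAdapter:
    """Hypothetical adapter that echoes messages instead of running a harness."""

    def __init__(self):
        self.alive = False
        self.history = []  # conversation context kept across turns

    def start(self):
        self.alive = True
        self.history = []

    def stop(self):
        self.alive = False

    def send_message(self, message):
        self.history.append(message)
        # Pretend the harness finishes once it receives the message "done".
        return HarnessResponse(
            response=f"turn {len(self.history)}: {message}",
            events=[{"type": "message", "text": message}],
            done=(message == "done"),
        )


class SketchHarnessEnv:
    """Assumed shape of the reset/step/close lifecycle, not the real class."""

    def __init__(self, adapter):
        self.adapter = adapter
        self.trajectory = []

    def reset(self):
        # Stop any running harness, then start a fresh conversation.
        if self.adapter.alive:
            self.adapter.stop()
        self.trajectory = []
        self.adapter.start()
        return {"response": "", "done": False}

    def step(self, message):
        # One step is one conversational turn.
        reply = self.adapter.send_message(message)
        self.trajectory.extend(reply.events)
        return {"response": reply.response, "done": reply.done}

    def close(self):
        if self.adapter.alive:
            self.adapter.stop()


env = SketchHarnessEnv(EchoAdapter())
env.reset()
first = env.step("Fix the bug")
second = env.step("done")
env.close()
```

In this sketch the second turn sees the first one through `adapter.history`, which mirrors the multi-turn behaviour described in the list.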

Files

src/openenv/core/harnesses/environment.py - HarnessEnvironment implementation
tests/core/test_harnesses/test_harness_environment.py - 26 tests with MockHarnessAdapter

Test plan

26 tests covering reset, multi-turn step, trajectory, state, close, MCP injection
69 total harness tests pass (types + environment)
No regressions in MCP tests
Lint clean

Part of #385

greptile-apps · 2026-02-17T08:41:53Z

Greptile Summary

This PR implements HarnessEnvironment, a new Environment subclass that wraps external agentic harnesses (OpenClaw, Claude Code, etc.) with OpenEnv's Gym-style API, as specified in RFC 005. Each step() is one conversational turn, and the harness maintains conversation context across turns within an episode.

Core reset()/step()/close() lifecycle is well-implemented with proper trajectory accumulation, state management, and done-signal propagation from the harness
Bug in _get_mcp_tool_definitions: The method splits fastmcp.Client's async context manager (__aenter__, list_tools, __aexit__) across three separate run_async_safely() calls, each of which creates a new event loop — the client state won't persist across these boundaries. The rest of the codebase uses a single async with block (see MCPEnvironment._async_list_tools). This needs to be fixed before merge
Silent except Exception: return [] in _get_mcp_tool_definitions will hide failures during MCP tool injection, making debugging difficult
Test suite is comprehensive (26 tests) with a well-designed MockHarnessAdapter
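The two `_get_mcp_tool_definitions` issues flagged above can be sketched together. This is an illustration under stated assumptions, not the PR's code: `FakeMCPClient` is a hypothetical stand-in for `fastmcp.Client`, and `asyncio.run` stands in for the project's `run_async_safely` helper. The point is that entering the context, listing tools, and exiting all happen inside one coroutine on one event loop, and that a failure is logged before falling back to an empty list.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


class FakeMCPClient:
    """Hypothetical client whose connection only lives inside `async with`."""

    def __init__(self, fail=False):
        self.fail = fail
        self.connected = False

    async def __aenter__(self):
        self.connected = True
        return self

    async def __aexit__(self, exc_type, exc, tb):
        self.connected = False
        return False

    async def list_tools(self):
        if not self.connected or self.fail:
            raise RuntimeError("client is not connected")
        return ["read_file", "run_tests"]


async def _list_tools(client):
    # Enter, call, and exit all run on the same event loop.
    async with client:
        return await client.list_tools()


def get_tool_definitions(client):
    try:
        # A single synchronous entry point drives the whole coroutine.
        return asyncio.run(_list_tools(client))
    except Exception:
        # Keep the empty-list fallback, but leave a trace for debugging.
        logger.warning("MCP tool extraction failed", exc_info=True)
        return []


tools = get_tool_definitions(FakeMCPClient())
fallback = get_tool_definitions(FakeMCPClient(fail=True))
```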

ALIGNMENT FLAG: RFC 005 specifies HarnessEnvironment(MCPEnvironment) but implementation uses HarnessEnvironment(Environment)

Principle at stake: RFC adherence — key decisions are documented in RFCs and should not be changed without discussion
The concern: The base class change may be intentional (since harness handles MCP differently than direct tool serving), but the deviation from RFC 005's design section should be explicitly acknowledged
Suggested reviewer: @Darktex

Confidence Score: 3/5

The core harness lifecycle is sound, but the MCP tool injection method has a bug that will cause runtime failures when MCP tools are present.
Score of 3 reflects that the main environment logic (reset/step/close/trajectory) is correct and well-tested, but _get_mcp_tool_definitions has a concrete bug (async context manager split across event loops) that will break MCP tool injection at runtime. The bug is in a non-default code path (only triggered when mcp is provided), so the core functionality works. Alignment question about base class deviation from RFC is also unresolved.
src/openenv/core/harnesses/environment.py — specifically the _get_mcp_tool_definitions method (lines 154-167)

Important Files Changed

Filename	Overview
src/openenv/core/harnesses/__init__.py	Adds `HarnessEnvironment` to the package exports. Clean and straightforward change.
src/openenv/core/harnesses/environment.py	New HarnessEnvironment implementation. `_get_mcp_tool_definitions` has a bug: it splits an async context manager (`__aenter__`/`list_tools`/`__aexit__`) across separate event loops via `run_async_safely`, which will fail at runtime. Silent exception swallowing hides errors. The rest of the implementation (reset, step, close, trajectory) is clean and well-structured.
tests/core/test_harnesses/test_harness_environment.py	Comprehensive test suite with 26 tests covering reset, step, multi-turn, trajectory, state, close, and MCP integration. Good use of MockHarnessAdapter. The MCP injection test (`test_mcp_tools_injected_on_reset`) only asserts `is not None` rather than checking actual tool contents, which is weak but non-blocking.
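The note about the MCP injection test can be made concrete with a short sketch. The names here are hypothetical (`RecordingAdapter`, `check_injected_tools`, and the dict-shaped tool entries are not taken from the PR's test file); the sketch only contrasts an `is not None` check with an assertion on the injected tool contents.

```python
class RecordingAdapter:
    """Hypothetical mock that records whatever tools are injected."""

    def __init__(self):
        self.injected_tools = None

    def inject_tools(self, tools):
        self.injected_tools = list(tools)


def check_injected_tools(adapter, expected_names):
    # Weak check: passes even if the wrong tools (or none) were injected.
    assert adapter.injected_tools is not None
    # Stronger check: compare the actual tool names against expectations.
    names = [tool["name"] for tool in adapter.injected_tools]
    assert names == expected_names
    return names


adapter = RecordingAdapter()
adapter.inject_tools([{"name": "read_file"}, {"name": "run_tests"}])
names = check_injected_tools(adapter, ["read_file", "run_tests"])
```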

Sequence Diagram

sequenceDiagram
    participant TL as Training Loop
    participant HE as HarnessEnvironment
    participant HA as HarnessAdapter
    participant H as Harness (ReAct Loop)

    TL->>HE: reset(episode_id)
    HE->>HA: is_alive()
    HA-->>HE: true/false
    opt if alive
        HE->>HA: stop()
    end
    opt if MCP server present
        HE->>HE: _get_mcp_tool_definitions()
        HE->>HA: inject_tools(tools)
    end
    HE->>HA: start(working_directory)
    HE-->>TL: Observation(done=false)

    TL->>HE: step(HarnessAction("Fix the bug"))
    HE->>HA: send_message("Fix the bug")
    HA->>H: message
    H->>H: LLM calls + tool invocations
    H-->>HA: HarnessResponse(events, done=false)
    HA-->>HE: HarnessResponse
    HE->>HE: accumulate trajectory
    HE-->>TL: Observation(response, turn_events, done=false)

    TL->>HE: step(HarnessAction("Tests still failing"))
    HE->>HA: send_message("Tests still failing")
    HA->>H: message
    H->>H: LLM calls + tool invocations
    H-->>HA: HarnessResponse(events, done=true)
    HA-->>HE: HarnessResponse
    HE->>HE: accumulate trajectory
    HE-->>TL: Observation(response, turn_events, done=true)

    TL->>HE: close()
    HE->>HA: is_alive()
    HA-->>HE: true
    HE->>HA: stop()

Last reviewed commit: bf89368

greptile-apps

3 files reviewed, 2 comments

src/openenv/core/harnesses/environment.py

Implements HarnessEnvironment, which wraps an external agentic harness with OpenEnv's Gym-style API:
- reset() stops any running harness, injects MCP tools, starts fresh
- step(HarnessAction) sends one conversational turn to the harness
- Harness maintains conversation context across step() calls
- done signal propagated from harness to observation
- Trajectory accumulated across turns, accessible via .trajectory
- close() cleans up harness process

26 tests covering reset, multi-turn step, trajectory accumulation, state management, close behavior, and MCP tool injection.

Part of #385

- Fix broken async context manager in _get_mcp_tool_definitions: use single async function with 'async with' instead of separate run_async_safely calls for __aenter__/__aexit__
- Add warning log on MCP tool extraction failure instead of silently swallowing exceptions

* Add OpenClaw adapter implementation (RFC 005)

  Concrete HarnessAdapter for the OpenClaw agentic platform:
  - Process lifecycle: start/stop via asyncio subprocess
  - MCP tool injection: writes mcpServers config to openclaw.json, merges with existing config entries
  - Communication: JSON-line protocol over stdin/stdout
  - Event extraction: parses tool_calls from responses into HarnessEvents
  - Streaming: yields events from turn response
  - Error handling: timeout detection, plain-text fallback

  18 tests covering imports, config injection, process lifecycle, message sending with JSON/plain-text responses, tool call extraction, streaming, and HarnessEnvironment integration.

  Part of #385

* Address review feedback: move import os to top of file

* Fix env variable bug and add missing test coverage
  - Fix critical bug: env vars now merge with parent env (os.environ) instead of replacing it, preserving PATH, PYTHONPATH, etc.
  - When no env_vars or api_key configured, pass None to inherit parent
  - Add test for parent env inheritance with custom env vars
  - Add test for None env when no overrides configured
  - Add test for kill path when terminate times out
  - Add test for send_message timeout handling
  - Add test for corrupted config file handling
  - Add subprocess pipe parameter assertions to start test

meta-cla bot added the CLA Signed label Feb 17, 2026

Darktex mentioned this pull request Feb 17, 2026

[3/3] Add OpenClaw adapter (RFC 005) #391

Merged

4 tasks

greptile-apps bot reviewed Feb 17, 2026

Darktex force-pushed the feature/issue-385-harness-environment branch 2 times, most recently from 37a0876 to d8b1362 Compare February 17, 2026 21:50

Darktex changed the title ~~Add HarnessEnvironment with multi-turn support (RFC 005)~~ [3/4] Add HarnessEnvironment with multi-turn support (RFC 005) Feb 17, 2026

Darktex added 2 commits February 17, 2026 23:05

Darktex changed the title ~~[3/4] Add HarnessEnvironment with multi-turn support (RFC 005)~~ [2/3] Add HarnessEnvironment with multi-turn support (RFC 005) Feb 18, 2026

Darktex force-pushed the feature/issue-385-harness-types branch from a993c0c to 2408704 Compare February 18, 2026 07:06

Darktex force-pushed the feature/issue-385-harness-environment branch from d8b1362 to f9f77c6 Compare February 18, 2026 07:06

Darktex merged commit 49e9fb6 into feature/issue-385-harness-types Feb 19, 2026

Darktex merged 3 commits into feature/issue-385-harness-types from feature/issue-385-harness-environment