[2/3] Add HarnessEnvironment with multi-turn support (RFC 005) #390
Merged
Darktex merged 3 commits into feature/issue-385-harness-types on Feb 19, 2026
Greptile Summary
This PR implements
ALIGNMENT FLAG: RFC 005 specifies
Confidence Score: 3/5
Important Files Changed
Sequence Diagram
sequenceDiagram
participant TL as Training Loop
participant HE as HarnessEnvironment
participant HA as HarnessAdapter
participant H as Harness (ReAct Loop)
TL->>HE: reset(episode_id)
HE->>HA: is_alive()
HA-->>HE: true/false
opt if alive
HE->>HA: stop()
end
opt if MCP server present
HE->>HE: _get_mcp_tool_definitions()
HE->>HA: inject_tools(tools)
end
HE->>HA: start(working_directory)
HE-->>TL: Observation(done=false)
TL->>HE: step(HarnessAction("Fix the bug"))
HE->>HA: send_message("Fix the bug")
HA->>H: message
H->>H: LLM calls + tool invocations
H-->>HA: HarnessResponse(events, done=false)
HA-->>HE: HarnessResponse
HE->>HE: accumulate trajectory
HE-->>TL: Observation(response, turn_events, done=false)
TL->>HE: step(HarnessAction("Tests still failing"))
HE->>HA: send_message("Tests still failing")
HA->>H: message
H->>H: LLM calls + tool invocations
H-->>HA: HarnessResponse(events, done=true)
HA-->>HE: HarnessResponse
HE->>HE: accumulate trajectory
HE-->>TL: Observation(response, turn_events, done=true)
TL->>HE: close()
HE->>HA: is_alive()
HA-->>HE: true
HE->>HA: stop()
Last reviewed commit: bf89368
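The reset and close ordering in the diagram can be sketched in a few lines. This is a minimal illustration, not the code from environment.py: RecordingAdapter, reset_harness, and close_harness are hypothetical names, the adapter methods are assumed to be synchronous, and "MCP server present" is modeled as a non-None tool list.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RecordingAdapter:
    """Hypothetical stand-in for a HarnessAdapter that records every call."""

    alive: bool = False
    calls: list = field(default_factory=list)

    def is_alive(self) -> bool:
        self.calls.append("is_alive")
        return self.alive

    def start(self, working_directory: str) -> None:
        self.calls.append("start")
        self.alive = True

    def stop(self) -> None:
        self.calls.append("stop")
        self.alive = False

    def inject_tools(self, tools: list) -> None:
        self.calls.append("inject_tools")


def reset_harness(adapter, tools: Optional[list], working_directory: str = ".") -> None:
    """Follow the reset() ordering from the diagram."""
    if adapter.is_alive():       # "opt if alive"
        adapter.stop()
    if tools is not None:        # "opt if MCP server present" (assumed representation)
        adapter.inject_tools(tools)
    adapter.start(working_directory)


def close_harness(adapter) -> None:
    """Follow the close() ordering: stop only a harness that is still running."""
    if adapter.is_alive():
        adapter.stop()


adapter = RecordingAdapter()
reset_harness(adapter, tools=[{"name": "read_file"}])  # first episode, nothing to stop
reset_harness(adapter, tools=None)                     # second episode, old harness stopped first
close_harness(adapter)
print(adapter.calls)
```

Tools are injected before start() in this ordering, which matches the diagram; presumably the harness reads its tool configuration at startup.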
Implements HarnessEnvironment, which wraps an external agentic harness with OpenEnv's Gym-style API:

- reset() stops any running harness, injects MCP tools, starts fresh
- step(HarnessAction) sends one conversational turn to the harness
- Harness maintains conversation context across step() calls
- done signal propagated from harness to observation
- Trajectory accumulated across turns, accessible via .trajectory
- close() cleans up harness process

26 tests covering reset, multi-turn step, trajectory accumulation, state management, close behavior, and MCP tool injection.

Part of #385
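The multi-turn step() semantics listed above can be sketched as follows. This is a simplified illustration under stated assumptions: MiniHarnessEnvironment and ScriptedAdapter are hypothetical, step() takes a plain string rather than a HarnessAction, events are plain strings, and the response field on HarnessResponse is assumed. Only the events/done fields and the Observation(response, turn_events, done) shape come from the PR.

```python
from dataclasses import dataclass, field


@dataclass
class HarnessResponse:
    response: str                       # assumed field name
    events: list = field(default_factory=list)
    done: bool = False


@dataclass
class Observation:
    response: str
    turn_events: list
    done: bool


class ScriptedAdapter:
    """Hypothetical adapter that replays one canned response per turn."""

    def __init__(self, responses):
        self._responses = list(responses)
        self.messages = []

    def send_message(self, message: str) -> HarnessResponse:
        self.messages.append(message)
        return self._responses.pop(0)


class MiniHarnessEnvironment:
    """Sketch of step(): one conversational turn in, one observation out."""

    def __init__(self, adapter):
        self._adapter = adapter
        self._trajectory = []

    @property
    def trajectory(self) -> list:
        # Return a copy so callers cannot mutate the accumulated history.
        return list(self._trajectory)

    def step(self, message: str) -> Observation:
        reply = self._adapter.send_message(message)
        self._trajectory.extend(reply.events)   # accumulate across turns
        # done is propagated unchanged from the harness response.
        return Observation(reply.response, list(reply.events), reply.done)


env = MiniHarnessEnvironment(ScriptedAdapter([
    HarnessResponse("Patched parser.py", ["tool:edit_file", "tool:run_tests"], done=False),
    HarnessResponse("All tests pass", ["tool:run_tests"], done=True),
]))
first = env.step("Fix the bug")
second = env.step("Tests still failing")
print(first.done, second.done, env.trajectory)
```

Each observation carries only the events of its own turn, while the trajectory property exposes the events of the whole episode.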
- Fix broken async context manager in _get_mcp_tool_definitions: use single async function with 'async with' instead of separate run_async_safely calls for __aenter__/__aexit__
- Add warning log on MCP tool extraction failure instead of silently swallowing exceptions
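The shape of this fix can be illustrated with a small sketch. It assumes nothing about the real MCP client: FakeMCPClient and open_client are hypothetical, and asyncio.run stands in for the project's run_async_safely helper, whose signature is not shown here. One plausible reason the split version breaks is that each separate call may run on a different event loop or task, so the context manager is entered in one place and exited in another.

```python
import asyncio
import logging
from contextlib import asynccontextmanager

logger = logging.getLogger(__name__)
lifecycle = []  # records enter/exit order for the demo


class FakeMCPClient:
    """Hypothetical stand-in for an MCP client; not the real client API."""

    async def list_tools(self):
        return [{"name": "read_file"}, {"name": "run_tests"}]


@asynccontextmanager
async def open_client():
    lifecycle.append("enter")
    try:
        yield FakeMCPClient()
    finally:
        lifecycle.append("exit")


async def _fetch_tools():
    # Enter and exit happen inside one coroutine, so they share a task and a loop.
    async with open_client() as client:
        return await client.list_tools()


def get_mcp_tool_definitions():
    try:
        # asyncio.run is a stand-in for run_async_safely (assumption).
        return asyncio.run(_fetch_tools())
    except Exception as exc:
        # Log the failure instead of silently swallowing it.
        logger.warning("MCP tool extraction failed: %s", exc)
        return []


tools = get_mcp_tool_definitions()
print([t["name"] for t in tools], lifecycle)
```

Returning an empty list on failure is a choice made for this sketch; the point taken from the commit is only that the failure is logged rather than hidden.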
* Add OpenClaw adapter implementation (RFC 005)

  Concrete HarnessAdapter for the OpenClaw agentic platform:
  - Process lifecycle: start/stop via asyncio subprocess
  - MCP tool injection: writes mcpServers config to openclaw.json, merges with existing config entries
  - Communication: JSON-line protocol over stdin/stdout
  - Event extraction: parses tool_calls from responses into HarnessEvents
  - Streaming: yields events from turn response
  - Error handling: timeout detection, plain-text fallback

  18 tests covering imports, config injection, process lifecycle, message sending with JSON/plain-text responses, tool call extraction, streaming, and HarnessEnvironment integration.

  Part of #385

* Address review feedback: move import os to top of file

* Fix env variable bug and add missing test coverage

  - Fix critical bug: env vars now merge with parent env (os.environ) instead of replacing it, preserving PATH, PYTHONPATH, etc.
  - When no env_vars or api_key configured, pass None to inherit parent
  - Add test for parent env inheritance with custom env vars
  - Add test for None env when no overrides configured
  - Add test for kill path when terminate times out
  - Add test for send_message timeout handling
  - Add test for corrupted config file handling
  - Add subprocess pipe parameter assertions to start test
Summary
Stacked on #389 (foundation types).
Implements HarnessEnvironment, which wraps an external agentic harness with OpenEnv's Gym-style API.

Key semantics
- reset(): Stops any running harness, injects MCP tools, starts a fresh process and conversation
- step(HarnessAction): Sends one conversational turn to the harness. The harness runs its ReAct loop (potentially many LLM calls and tool invocations) and returns when it has a response
- The harness maintains conversation context across step() calls. Multiple steps form a conversation within one episode
- done signal: Propagated from HarnessResponse.done to Observation.done
- Trajectory: Accumulated across turns, accessible via the .trajectory property
- close(): Cleans up the harness process

Files
- src/openenv/core/harnesses/environment.py - HarnessEnvironment implementation
- tests/core/test_harnesses/test_harness_environment.py - 26 tests with MockHarnessAdapter

Test plan
Part of #385