Description
Problem
With the compiled executable (#77) and cross-episode learning (#76), several episode lifecycle questions are unanswered. Currently:
- SPACE starts/stops simulation (goes away with auto-start)
- R resets the scene
- Episode boundaries are detected ad-hoc by tick resets in EpisodeMemoryManager
- No formal protocol for episode end conditions
- No way to chain episodes automatically
For persistent cross-episode memory (#76) to work, and for the compiled executable (#77) to feel polished, the episode lifecycle needs a clear protocol that both Godot and Python agree on.
Questions to Answer
1. How does an episode END?
Define explicit end conditions per scenario:
- Foraging: All resources collected? Health reaches 0? Tick limit (e.g., 500 ticks)?
- Crafting chain: All recipes crafted? Tick limit?
- Team capture: All points held? Time limit?
Currently foraging.gd has MAX_RESOURCES=7 but no end-on-completion logic.
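As a rough illustration of what "explicit end conditions" could mean for foraging, the check below is written in Python for readability; the real logic would live on the Godot side. The `ForagingState` fields, the reason strings other than `all_collected`, and the 500-tick limit are assumptions taken from the candidate conditions listed above, not existing code.

```python
from dataclasses import dataclass
from typing import Optional

MAX_RESOURCES = 7   # mirrors the MAX_RESOURCES constant mentioned for foraging.gd
TICK_LIMIT = 500    # example value from the list above, not a decided limit


@dataclass
class ForagingState:
    """Hypothetical snapshot of the values the end check needs."""
    resources_collected: int
    health: float
    tick: int


def end_reason(state: ForagingState) -> Optional[str]:
    """Return a reason string if the episode should end, otherwise None."""
    if state.resources_collected >= MAX_RESOURCES:
        return "all_collected"
    if state.health <= 0:
        return "agent_dead"      # hypothetical reason name
    if state.tick >= TICK_LIMIT:
        return "tick_limit"      # hypothetical reason name
    return None
```

Checking the objective before the tick limit means an episode that finishes on the last tick is still reported as a success.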
2. How does Godot signal episode end to Python?
Options:
- Special field in observation: {"episode_complete": true, "reason": "all_collected", "score": 85}
- Dedicated endpoint: POST /episode_end from Godot → Python
- Observation stops arriving (implicit, fragile)
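If the first option (a special field in the observation) were chosen, the Python side could detect the end of an episode roughly as follows. This is a sketch only: `episode_end_info` is a hypothetical helper, and the field names are those from the example observation above.

```python
import json
from typing import Optional


def episode_end_info(raw_observation: str) -> Optional[dict]:
    """Return reason and score if the observation marks the episode as
    complete, otherwise None. Missing fields are treated as 'not complete'."""
    obs = json.loads(raw_observation)
    if not obs.get("episode_complete", False):
        return None
    return {"reason": obs.get("reason"), "score": obs.get("score")}
```

Treating a missing `episode_complete` field as false keeps ordinary mid-episode observations unchanged.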
3. How does a new episode START?
- Auto-restart after N seconds?
- Python requests restart via POST /reset?
- User presses a key in the game window?
- Configurable: --episodes 5 --delay-between 3
4. How are episodes numbered/tracked?
For #76's persistent memory, each episode needs an ID:
- Sequential: episode_1, episode_2, ...
- Timestamped: episode_20260223_143022
- Both Godot and Python must agree on the current episode ID
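For comparison, both ID schemes are easy to generate; the helper names below are hypothetical. Whichever scheme is chosen, one side should generate the ID and send it to the other rather than both computing it independently.

```python
from datetime import datetime


def sequential_id(n: int) -> str:
    """Sequential scheme: episode_1, episode_2, ..."""
    return f"episode_{n}"


def timestamped_id(started_at: datetime) -> str:
    """Timestamped scheme: episode_YYYYMMDD_HHMMSS."""
    return started_at.strftime("episode_%Y%m%d_%H%M%S")
```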
5. Can episodes chain automatically?
For the learning progression demo (#76), users want to run 5+ episodes and watch improvement:
python run.py --scenario foraging --episodes 5
This requires auto-restart, episode boundary signaling, and Python-side orchestration.
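The Python-side orchestration could look something like the sketch below. It assumes the flag names used in this issue (`--scenario`, `--episodes`, `--delay-between`); `run_one` is a hypothetical callable that runs a single episode and returns its summary, standing in for the real IPC loop.

```python
import argparse
import time


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Run chained episodes (sketch)")
    parser.add_argument("--scenario", default="foraging")
    parser.add_argument("--episodes", type=int, default=1)
    parser.add_argument("--delay-between", type=float, default=0.0)
    return parser.parse_args(argv)


def run_episodes(args, run_one):
    """Run args.episodes episodes in sequence and collect their summaries."""
    summaries = []
    for episode_id in range(1, args.episodes + 1):
        summaries.append(run_one(args.scenario, episode_id))
        # Pause between episodes, but not after the last one.
        if episode_id < args.episodes and args.delay_between > 0:
            time.sleep(args.delay_between)
    return summaries
```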
Proposed Protocol
Episode State Machine
WAITING → RUNNING → ENDED → (auto) WAITING
| | |
| tick_advanced |
| observations |
| tool_calls |
| | |
| end_condition |
| met |
| | |
| v |
| ENDED --------+--→ save persistent memory
| | generate episode summary
| v
+---- WAITING (reset scene, increment episode_id)
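One way to read the diagram is as a three-state machine with one event per arrow. The sketch below encodes that reading; the event names (`start`, `end_condition_met`, `reset`) and the starting `episode_id` of 1 are assumptions, not part of the proposal.

```python
from enum import Enum


class EpisodeState(Enum):
    WAITING = "waiting"
    RUNNING = "running"
    ENDED = "ended"


# Allowed transitions, keyed by (current state, event name).
TRANSITIONS = {
    (EpisodeState.WAITING, "start"): EpisodeState.RUNNING,
    (EpisodeState.RUNNING, "end_condition_met"): EpisodeState.ENDED,
    (EpisodeState.ENDED, "reset"): EpisodeState.WAITING,
}


class EpisodeLifecycle:
    def __init__(self):
        self.state = EpisodeState.WAITING
        self.episode_id = 1  # assumed starting value

    def fire(self, event: str) -> EpisodeState:
        """Apply an event, rejecting transitions the diagram does not allow."""
        key = (self.state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"illegal transition: {self.state.name} on {event!r}")
        self.state = TRANSITIONS[key]
        if event == "reset":
            self.episode_id += 1  # reset scene, increment episode_id
        return self.state
```

Saving persistent memory and generating the episode summary would hang off the transition into ENDED, before the reset.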
IPC Messages
Godot → Python (episode end):
Include in final observation or as a separate message:
{
  "episode_ended": true,
  "episode_id": 3,
  "reason": "objective_complete",
  "final_score": 85,
  "ticks_elapsed": 247,
  "metrics": {
    "resources_collected": 7,
    "damage_taken": 15,
    "distance_traveled": 142.5,
    "exploration_pct": 0.78
  }
}
Python → Godot (episode control):
New endpoints or CLI args:
POST /reset — restart current scenario
POST /configure — set episode params (tick_limit, auto_restart)
--episodes N — run N episodes then quit
--tick-limit 500 — max ticks per episode
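The body of POST /configure is not specified yet. A minimal sketch, assuming it is a flat JSON object carrying the two parameters named above (`tick_limit`, `auto_restart`), might be built like this; the `EpisodeConfig` class and its defaults are hypothetical.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class EpisodeConfig:
    """Hypothetical episode parameters; defaults are illustrative only."""
    tick_limit: int = 500
    auto_restart: bool = True


def configure_body(config: EpisodeConfig) -> str:
    """Serialize the config as the JSON body for POST /configure."""
    return json.dumps(asdict(config))
```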
Godot-Side Changes
- Add end conditions to base_scene_controller.gd (configurable per scenario)
- Add episode_id tracking (increment on reset)
- Add auto-restart logic (configurable delay)
- Include episode metadata in observations
Python-Side Changes
- SDK AgentArena class gains episode lifecycle hooks:
  - on_episode_start(episode_id)
  - on_episode_end(episode_id, summary)
- Adapter base class (Add framework adapter system for LangGraph, Claude Agent SDK, and other agent frameworks #74) exposes these hooks to frameworks
- --episodes N flag in run.py for batch execution
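To make the hook proposal concrete, here is a sketch of how a user might override the two hooks. The `AgentArena` class below is a stand-in that models only the proposed hook signatures, not the real SDK class, and `ScoreTracker` is a hypothetical subclass. The summary is assumed to be a dict shaped like the episode-end message.

```python
class AgentArena:
    """Stand-in for the SDK class; only the two proposed hooks are modelled."""

    def on_episode_start(self, episode_id: int) -> None:
        pass

    def on_episode_end(self, episode_id: int, summary: dict) -> None:
        pass


class ScoreTracker(AgentArena):
    """Example subclass that records the final score of each episode."""

    def __init__(self) -> None:
        self.active_episode = None
        self.scores = {}

    def on_episode_start(self, episode_id: int) -> None:
        self.active_episode = episode_id

    def on_episode_end(self, episode_id: int, summary: dict) -> None:
        self.scores[episode_id] = summary.get("final_score")
        self.active_episode = None
```

Persistent memory (#76) would plug in at the same two points: load at episode start, save at episode end.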
Acceptance Criteria
- Foraging scene has explicit end conditions (objective met OR tick limit)
- Godot signals episode end to Python with score and metrics
- Python can request scene reset
- Episodes have unique IDs agreed upon by both sides
- --episodes 5 runs 5 consecutive episodes with automatic restarts
- Episode lifecycle hooks available in SDK for persistent memory (Implement persistent cross-episode memory for agent learning across runs #76)
- Protocol documented for scenario developers
Estimated Effort
1-2 days
Dependencies
- Related to Add tool completion callbacks from Godot to Python #71 (tool callbacks — similar IPC extension pattern)
- Blocks Implement persistent cross-episode memory for agent learning across runs #76 (persistent memory needs episode boundaries)
- Related to Compiled game executable with scenario launcher (no Godot IDE for users) #77 (compiled exe needs auto-start/restart)