Define episode lifecycle protocol (start, end, restart, chaining) #81

@justinmadison

Problem

With the compiled executable (#77) and cross-episode learning (#76), several episode lifecycle questions are unanswered. Currently:

  • SPACE starts and stops the simulation (this control goes away once auto-start lands)
  • R resets the scene
  • Episode boundaries are detected ad hoc, by watching for tick resets in EpisodeMemoryManager
  • No formal protocol for episode end conditions
  • No way to chain episodes automatically

For persistent cross-episode memory (#76) to work, and for the compiled executable (#77) to feel polished, the episode lifecycle needs a clear protocol that both Godot and Python agree on.

Questions to Answer

1. How does an episode END?

Define explicit end conditions per scenario:

  • Foraging: All resources collected? Health reaches 0? Tick limit (e.g., 500 ticks)?
  • Crafting chain: All recipes crafted? Tick limit?
  • Team capture: All points held? Time limit?

Currently foraging.gd has MAX_RESOURCES=7 but no end-on-completion logic.
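To make the candidate end conditions concrete, here is a minimal sketch of an end check for the foraging scenario. It is written in Python for illustration only; the real check would presumably live in foraging.gd. The ForagingState fields, the function name, and the reason strings "agent_died" and "tick_limit" are assumptions, not existing code; "objective_complete" and the 500-tick example come from this issue.

```python
from dataclasses import dataclass
from typing import Optional

MAX_RESOURCES = 7  # mirrors the constant named in foraging.gd


@dataclass
class ForagingState:
    """Hypothetical snapshot of the values an end check would need."""
    tick: int
    resources_collected: int
    health: float


def check_episode_end(state: ForagingState, tick_limit: int = 500) -> Optional[str]:
    """Return a reason string if the episode should end, otherwise None.

    The order decides which reason wins when several conditions hold
    on the same tick: completion first, then death, then the tick limit.
    """
    if state.resources_collected >= MAX_RESOURCES:
        return "objective_complete"
    if state.health <= 0:
        return "agent_died"  # hypothetical reason name
    if state.tick >= tick_limit:
        return "tick_limit"  # hypothetical reason name
    return None
```

Returning a reason string rather than a boolean lets the same value feed the "reason" field of whatever end-of-episode message is chosen.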

2. How does Godot signal episode end to Python?

Options:

  • Special field in observation: {"episode_complete": true, "reason": "all_collected", "score": 85}
  • Dedicated endpoint: POST /episode_end from Godot → Python
  • Observation stops arriving (implicit — fragile)

3. How does a new episode START?

  • Auto-restart after N seconds?
  • Python requests restart via POST /reset?
  • User presses a key in the game window?
  • Configurable: --episodes 5 --delay-between 3

4. How are episodes numbered/tracked?

For #76's persistent memory, each episode needs an ID:

  • Sequential: episode_1, episode_2, ...
  • Timestamped: episode_20260223_143022
  • Both Godot and Python must agree on the current episode ID
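Both ID schemes are cheap to generate. The sketch below shows one possible formatting for each, matching the examples above; the function names are hypothetical, and which side (Godot or Python) owns ID generation is still an open question.

```python
from datetime import datetime
from typing import Optional


def sequential_id(n: int) -> str:
    """Sequential scheme: episode_1, episode_2, ..."""
    return f"episode_{n}"


def timestamped_id(now: Optional[datetime] = None) -> str:
    """Timestamped scheme: episode_YYYYMMDD_HHMMSS (local time if no value is given)."""
    if now is None:
        now = datetime.now()
    return now.strftime("episode_%Y%m%d_%H%M%S")
```

For example, timestamped_id(datetime(2026, 2, 23, 14, 30, 22)) yields "episode_20260223_143022". Whichever scheme is picked, only one side should generate the ID and the other should echo it back, so the two cannot drift.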

5. Can episodes chain automatically?

For the learning progression demo (#76), users want to run 5+ episodes and watch improvement:

python run.py --scenario foraging --episodes 5

This requires: auto-restart, episode boundary signaling, and Python-side orchestration.
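A rough sketch of the Python-side orchestration follows. The flag names come from this issue; parse_args, run_episodes, and the run_one and reset callables are placeholders for whatever run.py and the SDK end up exposing.

```python
import argparse
import time
from typing import Callable, Dict, List, Optional, Sequence


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    """Parse the CLI flags proposed in this issue."""
    parser = argparse.ArgumentParser(description="Run chained episodes")
    parser.add_argument("--scenario", default="foraging")
    parser.add_argument("--episodes", type=int, default=1)
    parser.add_argument("--delay-between", type=float, default=0.0)
    return parser.parse_args(argv)


def run_episodes(
    n: int,
    run_one: Callable[[int], Dict],
    reset: Callable[[], None],
    delay: float = 0.0,
) -> List[Dict]:
    """Run n episodes back to back, resetting the scene between them.

    run_one(episode_id) is assumed to block until the episode ends and
    to return its summary; reset() is assumed to restart the scene.
    """
    results = []
    for episode_id in range(1, n + 1):
        results.append(run_one(episode_id))
        if episode_id < n:  # no reset needed after the last episode
            reset()
            if delay > 0:
                time.sleep(delay)
    return results
```

With this shape, `python run.py --scenario foraging --episodes 5` would reduce to parsing the arguments and calling run_episodes with args.episodes and args.delay_between.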

Proposed Protocol

Episode State Machine

WAITING → RUNNING → ENDED → (auto) WAITING
   |         |         |
   |    tick_advanced   |
   |    observations    |
   |    tool_calls      |
   |         |          |
   |    end_condition   |
   |    met             |
   |         |          |
   |         v          |
   |      ENDED --------+--→ save persistent memory
   |         |               generate episode summary
   |         v
   +---- WAITING (reset scene, increment episode_id)
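The diagram can be read as a three-state machine with guarded transitions. The sketch below is one possible encoding, not an agreed design: the class and method names are hypothetical, and starting episode_id at 1 is an assumption.

```python
from enum import Enum, auto
from typing import Optional


class EpisodeState(Enum):
    WAITING = auto()
    RUNNING = auto()
    ENDED = auto()


class EpisodeLifecycle:
    """Tracks the current state and episode_id, rejecting invalid transitions."""

    def __init__(self) -> None:
        self.state = EpisodeState.WAITING
        self.episode_id = 1  # assumption: IDs start at 1
        self.last_reason: Optional[str] = None

    def _require(self, expected: EpisodeState) -> None:
        if self.state is not expected:
            raise RuntimeError(f"expected {expected.name}, got {self.state.name}")

    def start(self) -> None:
        """WAITING -> RUNNING."""
        self._require(EpisodeState.WAITING)
        self.state = EpisodeState.RUNNING

    def end(self, reason: str) -> None:
        """RUNNING -> ENDED, once an end condition is met."""
        self._require(EpisodeState.RUNNING)
        self.last_reason = reason
        self.state = EpisodeState.ENDED

    def reset(self) -> None:
        """ENDED -> WAITING: the scene is reset and episode_id increments."""
        self._require(EpisodeState.ENDED)
        self.episode_id += 1
        self.state = EpisodeState.WAITING
```

Saving persistent memory and generating the episode summary would hang off the ENDED state, between end() and reset().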

IPC Messages

Godot → Python (episode end):
Include in final observation or as a separate message:

{
  "episode_ended": true,
  "episode_id": 3,
  "reason": "objective_complete",
  "final_score": 85,
  "ticks_elapsed": 247,
  "metrics": {
    "resources_collected": 7,
    "damage_taken": 15,
    "distance_traveled": 142.5,
    "exploration_pct": 0.78
  }
}
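On the Python side, a message like this could be parsed into a typed record before being handed to the memory layer. The following is a minimal sketch that assumes the field names shown above; EpisodeEnded and parse_episode_ended are hypothetical names, not part of the current SDK.

```python
import json
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class EpisodeEnded:
    episode_id: int
    reason: str
    final_score: float
    ticks_elapsed: int
    metrics: Dict[str, float] = field(default_factory=dict)


def parse_episode_ended(raw: str) -> EpisodeEnded:
    """Parse an episode-end message; raise ValueError for any other message."""
    data = json.loads(raw)
    if not data.get("episode_ended"):
        raise ValueError("not an episode-end message")
    return EpisodeEnded(
        episode_id=int(data["episode_id"]),
        reason=str(data["reason"]),
        final_score=float(data["final_score"]),
        ticks_elapsed=int(data["ticks_elapsed"]),
        metrics=dict(data.get("metrics", {})),  # metrics treated as optional
    )
```

Treating "metrics" as optional keeps the parser usable for scenarios that report only a score.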

Python → Godot (episode control):
New endpoints or CLI args:

POST /reset          — restart current scenario
POST /configure      — set episode params (tick_limit, auto_restart)
--episodes N         — run N episodes then quit
--tick-limit 500     — max ticks per episode
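If the control endpoints are plain HTTP, the Python side could build the requests with the standard library alone. This sketch only constructs the request objects and does not send them; the base URL and port are placeholders, and a JSON body is an assumption about the wire format.

```python
import json
import urllib.request
from typing import Any, Dict, Optional


def build_control_request(
    base_url: str, path: str, payload: Optional[Dict[str, Any]] = None
) -> urllib.request.Request:
    """Build a POST request for an episode-control endpoint such as /reset."""
    body = json.dumps(payload or {}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder address; the real host and port depend on the IPC setup.
BASE_URL = "http://127.0.0.1:5000"
reset_request = build_control_request(BASE_URL, "/reset")
configure_request = build_control_request(
    BASE_URL, "/configure", {"tick_limit": 500, "auto_restart": True}
)
```

Sending a request is then a call to urllib.request.urlopen(reset_request), ideally with a timeout so a hung game window does not block the orchestrator.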

Godot-Side Changes

  • Add end conditions to base_scene_controller.gd (configurable per scenario)
  • Add episode_id tracking (increment on reset)
  • Add auto-restart logic (configurable delay)
  • Include episode metadata in observations

Python-Side Changes

Acceptance Criteria

  • Foraging scene has explicit end conditions (objective met OR tick limit)
  • Godot signals episode end to Python with score and metrics
  • Python can request scene reset
  • Episodes have unique IDs agreed upon by both sides
  • --episodes 5 runs 5 consecutive episodes with automatic restarts
  • Episode lifecycle hooks are available in the SDK for persistent memory (#76, "Implement persistent cross-episode memory for agent learning across runs")
  • Protocol documented for scenario developers

Estimated Effort

1-2 days

Dependencies
