From aae91a40575b0ac522e6878725979f3c532b7b27 Mon Sep 17 00:00:00 2001
From: Claude <noreply@anthropic.com>
Date: Wed, 25 Feb 2026 13:26:00 +0000
Subject: [PATCH 1/2] Restructure README: reposition single/multi-repo as
 operating modes, improve flow
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Replace "Single-Repo (Backward Compatible)" framing with equal "Operating Modes"
  section presenting single-repo and multi-repo as first-class features
- Reorder sections for better narrative flow: Hook → Features → Quick Start →
  How It Works → Benchmark → Examples → Modes → API Reference
- Add scannable "Features" section replacing the "Why SWE-AF" list
- Add social proof stats bar near the top (95/100 benchmark, $19 cost, 400+ agents)
- Consolidate duplicate curl examples (previously repeated in ~6 places)
- Tighten Quick Start into Railway / Local / Docker subsections
- Fold "Adaptive Factory Control" into "How a Build Works" with control loop table
- Move benchmark and real-world examples into dedicated sections

This is a draft for review; further iteration is expected to restore the
open-source runtime examples and the clean build-process bullets.

https://claude.ai/code/session_01XgBbDVMbE52cKdU1dCNEsV
---
 README.md | 601 +++++++++++++++++++++---------------------------------
 1 file changed, 236 insertions(+), 365 deletions(-)
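
Reviewer note: the "Operating Modes" section added by this patch describes two
request shapes for the same `swe-planner.build` endpoint. The sketch below is
only an illustration of that difference, assuming the payload layout shown in
the README's curl examples (`repo_url` at the top level of `input` for
single-repo, `config.repos` with a `role` per entry for multi-repo). The helper
name `build_request` is hypothetical and is not part of SWE-AF.

```python
import json


def build_request(goal, repo_url=None, repos=None, config=None):
    """Assemble a JSON body for POST /api/v1/execute/async/swe-planner.build.

    Hypothetical helper: pass `repo_url` for single-repository mode, or
    `repos` as a list of (repo_url, role) pairs for multi-repository mode.
    """
    if (repo_url is None) == (repos is None):
        raise ValueError("pass exactly one of repo_url or repos")

    cfg = dict(config or {})
    body = {"goal": goal}

    if repo_url is not None:
        # Single-repository mode: the URL sits at the top level of "input".
        body["repo_url"] = repo_url
    else:
        # Multi-repository mode: repositories and their roles go in config.repos.
        cfg["repos"] = [{"repo_url": url, "role": role} for url, role in repos]

    if cfg:
        body["config"] = cfg
    return {"input": body}


single = build_request("Add JWT auth", repo_url="https://github.com/user/my-project")
multi = build_request(
    "Add JWT auth across API and shared-lib",
    repos=[
        ("https://github.com/org/main-app", "primary"),
        ("https://github.com/org/shared-lib", "dependency"),
    ],
    config={"runtime": "claude_code", "models": {"default": "sonnet"}},
)
print(json.dumps(single, indent=2))
print(json.dumps(multi, indent=2))
```

Either printed body could be passed to curl with `-d @-`, as in the README examples; whether the server accepts other combinations is not confirmed by this patch.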
diff --git a/README.md b/README.md
index b9a5fc5..f2790b3 100644
--- a/README.md
+++ b/README.md
@@ -14,193 +14,45 @@
 ![WorldSpace Community Developer](https://img.shields.io/badge/WorldSpace-Community%20Developer-111827?style=for-the-badge)
 [![Example PR](https://img.shields.io/badge/Example-PR%20%23179-ff6b35?style=for-the-badge&logo=github)](https://github.com/Agent-Field/agentfield/pull/179)
 
-
+**One API call → full engineering team → shipped code.**
 
 <p>
   <a href="#quick-start">Quick Start</a> •
-  <a href="#why-swe-af">Why SWE-AF</a> •
-  <a href="#in-action">In Action</a> •
-  <a href="#adaptive-factory-control">Factory Control</a> •
-  <a href="#benchmark-snapshot">Benchmark</a> •
+  <a href="#features">Features</a> •
+  <a href="#how-a-build-works">How It Works</a> •
+  <a href="#benchmark">Benchmark</a> •
+  <a href="#operating-modes">Modes</a> •
   <a href="#api-reference">API</a> •
-  <a href="docs/ARCHITECTURE.md">Architecture Doc</a>
+  <a href="docs/ARCHITECTURE.md">Architecture</a>
 </p>
 
 </div>
 
-One API call spins up a full autonomous engineering team that can scope, build, adapt, and ship complex software end to end.
-SWE-AF is a first step toward **autonomous software engineering factories**, scaling from simple goals to hard multi-issue programs with hundreds to thousands of agent invocations.
+SWE-AF spins up a coordinated fleet of AI agents — product managers, architects, coders, reviewers, testers — that scope, build, adapt, and ship software end to end. No scaffolding, no human-in-the-loop. One goal in, verified PR out.
 
 <p align="center">
   <img src="assets/banner.jpg" alt="SWE-AF autonomous engineering fleet banner" width="100%" />
 </p>
 
-## One-Call DX
-
-```bash
-curl -X POST http://localhost:8080/api/v1/execute/async/swe-planner.build \
-  -H "Content-Type: application/json" \
-  -d @- <<'JSON'
-{
-  "input": {
-    "goal": "Refactor and harden auth + billing flows",
-    "repo_url": "https://github.com/user/my-project",
-    "config": {
-      "runtime": "claude_code",
-      "models": {
-        "default": "sonnet",
-        "coder": "opus",
-        "qa": "opus"
-      },
-      "enable_learning": true
-    }
-  }
-}
-JSON
-```
-
-Swap `models.default` and any role key (`coder`, `qa`, `architect`, etc.) to any model your runtime supports.
-
-## Multi-Repository Workspace Support
-
-SWE-AF supports coordinated work across multiple repositories in a single build. This is useful when your project consists of a primary application plus shared libraries, monorepo sub-projects, or dependent microservices.
-
-### Use Cases
-
-- **Primary App + Shared Libraries**: Build a web application that depends on a shared utilities or SDK library.
-- **Monorepo Sub-Projects**: Coordinate changes across multiple packages in a monorepo (each repo_url points to a sub-directory or separate repo).
-- **Dependent Microservices**: When a feature spans multiple services (e.g., API + Worker Queue), define roles to orchestrate changes across boundaries.
-
-### Single-Repo (Backward Compatible)
-
-Single-repository builds work exactly as before — just use `repo_url` or `repo_path` at the top level:
-
-```bash
-curl -X POST http://localhost:8080/api/v1/execute/async/swe-planner.build \
-  -H "Content-Type: application/json" \
-  -d '{
-    "input": {
-      "goal": "Add JWT auth",
-      "repo_url": "https://github.com/user/my-project"
-    }
-  }'
-```
-
-### Multi-Repo Configuration
-
-Pass `config.repos` as an array of repository objects, each with `repo_url` (or `repo_path`) and a `role`:
-
-```bash
-curl -X POST http://localhost:8080/api/v1/execute/async/swe-planner.build \
-  -H "Content-Type: application/json" \
-  -d '{
-    "input": {
-      "goal": "Add JWT auth across API and shared-lib",
-      "config": {
-        "repos": [
-          {
-            "repo_url": "https://github.com/org/main-app",
-            "role": "primary"
-          },
-          {
-            "repo_url": "https://github.com/org/shared-lib",
-            "role": "dependency"
-          }
-        ],
-        "runtime": "claude_code",
-        "models": {
-          "default": "sonnet"
-        }
-      }
-    }
-  }'
-```
-
-**Roles:**
-- `primary`: The main application being built. Changes here drive the build; failures block progress.
-- `dependency`: Libraries or services that may be modified to support the primary repo. Failures are captured but don't block.
-
-## Autonomous Build Spotlight
-
-Rust-based Python compiler benchmark (built autonomously):
-
-| Metric                 | CPython (subprocess) | RustPython (SWE-AF)          | Improvement             |
-| ---------------------- | -------------------- | ---------------------------- | ----------------------- |
-| Steady-state execution | Baseline (~19ms)     | Optimized in-process runtime | **88.3x-602.3x faster** |
-| Geometric mean         | 1.0x baseline        | 253.8x                       | **253.8x**              |
-| Peak throughput        | ~52 ops/s            | 31,807 ops/s                 | **~612x**               |
-
-<details>
-<summary>Measurement methodology</summary>
-
-Throughput comparison measures different execution models: CPython subprocess spawn (~19ms per call → ~52 ops/s) vs RustPython pre-warmed interpreter pool (in-process). This is the real-world tradeoff the system was built to optimize — replacing repeated subprocess invocations with a persistent pool for short-snippet execution.
-
-</details>
-
-Artifact trail includes **175 tracked autonomous agents** across planning, coding, review, merge, and verification.
-
-Details: [`examples/llm-rust-python-compiler-sonnet/README.md`](examples/llm-rust-python-compiler-sonnet/README.md)
-
-## Why SWE-AF
-
-Most agent frameworks are harnesses around a single coder loop. SWE-AF is a software engineering factory built from coordinated harnesses.
-
-- Hardness-aware execution: easy issues pass through quickly, while hard issues trigger deeper adaptation and DAG-level replanning instead of blind retries.
-- Factory architecture: planning, execution, and governance agents run as a coordinated control stack.
-- Continual learning (optional): with `enable_learning=true`, conventions and failure patterns discovered early are injected into downstream issues.
-- Agent-scale parallelism: dependency-level scheduling + isolated git worktrees allow large fan-out without branch collisions.
-- Fleet-scale orchestration with AgentField: many SWE-AF nodes can run continuously in parallel, driving thousands of agent invocations across concurrent builds.
-- Explicit compromise tracking: when scope is relaxed, debt is typed, severity-rated, and propagated.
-- Long-run reliability: checkpointed execution supports `resume_build` after crashes or interruptions.
-
-## In Action
-
-[PR #179: Go SDK DID/VC Registration](https://github.com/Agent-Field/agentfield/pull/179) — built entirely by SWE-AF (Claude runtime with haiku-class models). One API call, zero human code.
-
-| Metric              | Value              |
-| ------------------- | ------------------ |
-| Issues completed    | 10/10              |
-| Tests passing       | 217                |
-| Acceptance criteria | 34/34              |
-| Agent invocations   | 79                 |
-| Model               | `claude-haiku-4-5` |
-| **Total cost**      | **$19.23**         |
-
-<details>
-<summary>Cost breakdown by agent role</summary>
-
-| Role                               | Cost  | %     |
-| ---------------------------------- | ----- | ----- |
-| Coder                              | $5.88 | 30.6% |
-| Code Reviewer                      | $3.48 | 18.1% |
-| QA                                 | $1.78 | 9.2%  |
-| GitHub PR                          | $1.66 | 8.6%  |
-| Integration Tester                 | $1.59 | 8.3%  |
-| Merger                             | $1.22 | 6.3%  |
-| Workspace Ops                      | $1.77 | 9.2%  |
-| Planning (PM + Arch + TL + Sprint) | $0.79 | 4.1%  |
-| Verifier + Finalize                | $0.34 | 1.8%  |
-| Synthesizer                        | $0.05 | 0.2%  |
-
-79 invocations, 2,070 conversation turns. Planning agents scope and decompose; coders work in parallel isolated worktrees; reviewers and QA validate each issue; merger integrates branches; verifier checks acceptance criteria against the PRD.
-
-</details>
-
-**Claude & open-source models supported**: Run builds with either runtime and tune models per role in one flat config map.
-- `runtime: "claude_code"` maps to Claude backend.
-- `runtime: "open_code"` maps to OpenCode backend (OpenRouter/OpenAI/Google/Anthropic model IDs).
+<div align="center">
 
-## Adaptive Factory Control
+| Scored **95/100** on benchmark | 10/10 issues, **$19** total cost | **400–500+** agents per build | Claude, MiniMax, DeepSeek, Qwen |
+|:---:|:---:|:---:|:---:|
+| Beats Claude Code (73) & Codex (62) | [Real PR — zero human code](https://github.com/Agent-Field/agentfield/pull/179) | Planning → coding → QA → merge | Any model, any provider |
 
-SWE-AF uses three nested control loops to adapt to task difficulty in real time:
+</div>
 
-| Loop        | Scope         | Trigger              | Action                                                                             |
-| ----------- | ------------- | -------------------- | ---------------------------------------------------------------------------------- |
-| Inner loop  | Single issue  | QA/review fails      | Coder retries with feedback                                                        |
-| Middle loop | Single issue  | Inner loop exhausted | `run_issue_advisor` retries with a new approach, splits work, or accepts with debt |
-| Outer loop  | Remaining DAG | Escalated failures   | `run_replanner` restructures remaining issues and dependencies                     |
+## Features
 
-This is the core factory-control behavior: control agents supervise worker agents and continuously reshape the plan as reality changes.
+- **Factory, not a wrapper** — Planning, execution, and governance agents run as a coordinated control stack. Not just a coder loop with retries.
+- **Hardness-aware execution** — Easy issues pass through fast. Hard issues trigger deeper adaptation and DAG-level replanning instead of blind retries.
+- **Multi-model, multi-provider** — Assign different models per role (`coder: opus`, `qa: haiku`). Works with Claude, OpenRouter, OpenAI, and Google.
+- **Single-repo and multi-repo modes** — Point at one repository or orchestrate coordinated changes across multiple repos in a single build.
+- **Continual learning** — With `enable_learning: true`, conventions and failure patterns discovered early get injected into downstream issues.
+- **Agent-scale parallelism** — Dependency-level scheduling + isolated git worktrees allow large fan-out without branch collisions.
+- **Self-correcting builds** — Three nested control loops (inner retry → advisor adaptation → DAG replanning) handle failures automatically.
+- **Crash recovery** — Checkpointed execution supports `resume_build` after interruptions.
+- **Draft PR output** — Pass a `repo_url` and SWE-AF clones, builds, and opens a draft PR on GitHub.
 
 ## Quick Start
 
@@ -213,123 +65,93 @@ One click deploys SWE-AF + AgentField control plane + PostgreSQL. Set two enviro
 - `CLAUDE_CODE_OAUTH_TOKEN` — run `claude setup-token` in [Claude Code CLI](https://docs.anthropic.com/en/docs/claude-code) (uses Pro/Max subscription credits)
 - `GH_TOKEN` — GitHub personal access token with `repo` scope for draft PR creation
 
-Once deployed, trigger a build:
+Then trigger a build:
 
 ```bash
-curl -X POST https://<control-plane>.up.railway.app/api/v1/execute/async/swe-planner.build \
+curl -X POST https://<your-app>.up.railway.app/api/v1/execute/async/swe-planner.build \
   -H "Content-Type: application/json" \
   -H "X-API-Key: this-is-a-secret" \
   -d '{"input": {"goal": "Add JWT auth", "repo_url": "https://github.com/user/my-repo"}}'
 ```
 
-### 1. Requirements (local)
+### Run Locally
 
-- Python 3.12+
-- AgentField control plane (`af`)
-- AI provider API key (Anthropic, OpenRouter, OpenAI, or Google)
+```bash
+# 1. Install
+python3.12 -m venv .venv && source .venv/bin/activate
+pip install -e ".[dev]"
 
-### 2. Install
+# 2. Start the control plane and register the node
+af                 # starts AgentField on :8080
+python -m swe_af   # registers node "swe-planner"
 
-```bash
-python3.12 -m venv .venv
-source .venv/bin/activate
-python -m pip install --upgrade pip
-python -m pip install -e ".[dev]"
+# 3. Trigger a build
+curl -X POST http://localhost:8080/api/v1/execute/async/swe-planner.build \
+  -H "Content-Type: application/json" \
+  -d '{"input": {"goal": "Add JWT auth", "repo_url": "https://github.com/user/my-project"}}'
 ```
 
-### 3. Run
+### Docker
 
 ```bash
-af                 # starts AgentField control plane on :8080
-python -m swe_af   # registers node id "swe-planner"
+cp .env.example .env   # add your API key + optional GH_TOKEN
+docker compose up -d
+
+# Scale workers
+docker compose up --scale swe-agent=3 -d
 ```
 
-### 4. Trigger a build
+Use a host control plane instead of the Docker service:
 
 ```bash
-# Default (uses Claude)
-curl -X POST http://localhost:8080/api/v1/execute/async/swe-planner.build \
-  -H "Content-Type: application/json" \
-  -d @- <<'JSON'
-{
-  "input": {
-    "goal": "Add JWT auth to all API endpoints",
-    "repo_url": "https://github.com/user/my-project"
-  }
-}
-JSON
+docker compose -f docker-compose.local.yml up -d
+```
 
-# With open-source runtime + flat role map
-curl -X POST http://localhost:8080/api/v1/execute/async/swe-planner.build \
-  -H "Content-Type: application/json" \
-  -d @- <<'JSON'
-{
-  "input": {
-    "goal": "Add JWT auth",
-    "repo_url": "https://github.com/user/my-project",
-    "config": {
-      "runtime": "open_code",
-      "models": {
-        "default": "openrouter/minimax/minimax-m2.5"
-      }
-    }
-  }
-}
-JSON
+## How a Build Works
 
-# Local workspace mode (repo_path) + targeted role override
-curl -X POST http://localhost:8080/api/v1/execute/async/swe-planner.build \
-  -H "Content-Type: application/json" \
-  -d @- <<'JSON'
-{
-  "input": {
-    "goal": "Refactor and harden auth + billing flows",
-    "repo_path": "/path/to/repo",
-    "config": {
-      "runtime": "claude_code",
-      "models": {
-        "default": "sonnet",
-        "coder": "opus",
-        "qa": "opus"
-      },
-      "enable_learning": true
-    }
-  }
-}
-JSON
+```text
+Goal → PM → Architect → Tech Lead → Sprint Planner → Issue DAG
+                                                         ↓
+                        ┌────────────────────────────────┘
+                        ↓ (parallel, isolated worktrees)
+                   ┌─────────┐
+                   │  Issue N │ → Coder → QA → Reviewer → Synthesizer
+                   └─────────┘       ↑               │
+                        ↑            └── retry ───────┘ (inner loop)
+                        │
+                        └── advisor / replanner (middle + outer loops)
+                                                         ↓
+                                    Merge → Integration Test → Verify → Draft PR
 ```
 
-For OpenRouter with `open_code`, use model IDs in `openrouter/<provider>/<model>` format (for example `openrouter/minimax/minimax-m2.5`).
-
-## What Happens In One Build
+Three nested control loops adapt to task difficulty in real time:
 
-- Architecture is generated and reviewed before coding starts
-- Issues are dependency-sorted and run in parallel across isolated worktrees
-- Each issue gets dedicated coder, tester, and reviewer passes
-- Failed issues trigger advisor-driven adaptation (split, re-scope, or escalate)
-- Escalations trigger replanning of the remaining DAG
-- End result is merged, integration-tested, and verified against acceptance criteria
+| Loop | Scope | Trigger | Action |
+|------|-------|---------|--------|
+| **Inner** | Single issue | QA/review fails | Coder retries with feedback |
+| **Middle** | Single issue | Inner loop exhausted | Advisor retries with new approach, splits work, or accepts with debt |
+| **Outer** | Remaining DAG | Escalated failures | Replanner restructures remaining issues and dependencies |
 
 <p align="center">
   <img src="assets/archi.png" alt="SWE-AF architecture" width="100%" />
 </p>
 
-> Typical runs spin up 400-500+ agent instances across planning, execution, QA, and verification. For larger DAGs and repeated adaptation/replanning cycles, SWE-AF can scale into the high hundreds to thousands of agent invocations in a single build.
+> Typical runs spin up 400–500+ agent instances across planning, execution, QA, and verification. Larger DAGs and repeated adaptation cycles can push a single build into the thousands of agent invocations.
 
-## Benchmark Snapshot
+## Benchmark
 
-**95/100 with haiku and MiniMax**: SWE-AF scored 95/100 with both Claude haiku-class routing ($20) and MiniMax M2.5 via open runtime ($6), outperforming Claude Code sonnet (73), Codex o3 (62), and Claude Code haiku (59) on the same prompt.
+SWE-AF scored **95/100** with both Claude haiku-class routing (~$20) and MiniMax M2.5 via the open runtime (~$6), outperforming Claude Code sonnet (73), Codex o3 (62), and Claude Code haiku (59) on the same prompt.
 
-| Dimension       | SWE-AF (haiku) | SWE-AF (MiniMax) | CC Sonnet | Codex (o3) | CC Haiku |
-| --------------- | -------------- | ---------------- | --------- | ---------- | -------- |
-| Functional (30) | **30**         | **30**           | **30**    | **30**     | **30**   |
-| Structure (20)  | **20**         | **20**           | 10        | 10         | 10       |
-| Hygiene (20)    | **20**         | **20**           | 16        | 10         | 7        |
-| Git (15)        | **15**         | **15**           | 2         | 2          | 2        |
-| Quality (15)    | 10             | 10               | **15**    | 10         | 10       |
-| Total           | **95**         | **95**           | **73**    | **62**     | **59**   |
-| **Cost**        | **~$20**       | **~$6**          | ?         | ?          | ?        |
-| **Time**        | ~30-40 min     | 43 min           | ?         | ?          | ?        |
+| Dimension | SWE-AF (haiku) | SWE-AF (MiniMax) | CC Sonnet | Codex (o3) | CC Haiku |
+|-----------|---------------|-----------------|-----------|-----------|---------|
+| Functional (30) | **30** | **30** | **30** | **30** | **30** |
+| Structure (20) | **20** | **20** | 10 | 10 | 10 |
+| Hygiene (20) | **20** | **20** | 16 | 10 | 7 |
+| Git (15) | **15** | **15** | 2 | 2 | 2 |
+| Quality (15) | 10 | 10 | **15** | 10 | 10 |
+| **Total** | **95** | **95** | **73** | **62** | **59** |
+| **Cost** | **~$20** | **~$6** | ? | ? | ? |
+| **Time** | ~30–40 min | 43 min | ? | ? | ? |
 
 <details>
 <summary><strong>Full benchmark details and reproduction</strong></summary>
@@ -342,13 +164,13 @@ Same prompt tested across multiple agents. SWE-AF with Claude runtime (haiku-cla
 
 ### Scoring framework
 
-| Dimension  | Points | What it measures                                 |
-| ---------- | ------ | ------------------------------------------------ |
-| Functional | 30     | CLI behavior and passing tests                   |
-| Structure  | 20     | Modular source layout and test organization      |
-| Hygiene    | 20     | `.gitignore`, clean status, no junk artifacts    |
-| Git        | 15     | Commit discipline and message quality            |
-| Quality    | 15     | Error handling, package metadata, README quality |
+| Dimension | Points | What it measures |
+|-----------|--------|-----------------|
+| Functional | 30 | CLI behavior and passing tests |
+| Structure | 20 | Modular source layout and test organization |
+| Hygiene | 20 | `.gitignore`, clean status, no junk artifacts |
+| Git | 15 | Commit discipline and message quality |
+| Quality | 15 | Error handling, package metadata, README quality |
 
 ### Reproduction
 
@@ -390,13 +212,13 @@ curl -X POST http://localhost:8080/api/v1/execute/async/swe-planner.build \
 JSON
 
 # Claude Code (haiku)
-claude -p "Build a Node.js CLI todo app with add, list, complete, and delete commands. Data should persist to a JSON file. Initialize git, write tests, and commit your work." --model haiku --dangerously-skip-permissions
+claude -p "Build a Node.js CLI todo app ..." --model haiku --dangerously-skip-permissions
 
 # Claude Code (sonnet)
-claude -p "Build a Node.js CLI todo app with add, list, complete, and delete commands. Data should persist to a JSON file. Initialize git, write tests, and commit your work." --model sonnet --dangerously-skip-permissions
+claude -p "Build a Node.js CLI todo app ..." --model sonnet --dangerously-skip-permissions
 
 # Codex (gpt-5.3-codex)
-codex exec "Build a Node.js CLI todo app with add, list, complete, and delete commands. Data should persist to a JSON file. Initialize git, write tests, and commit your work." --full-auto
+codex exec "Build a Node.js CLI todo app ..." --full-auto
 ```
 
 **MiniMax M2.5 Measured Metrics (Feb 2026):**
@@ -410,77 +232,121 @@ Benchmark assets, logs, evaluator, and generated projects live in [`examples/age
 
 </details>
 
-## Docker
+## Real-World Examples
 
-```bash
-cp .env.example .env
-# Add your API key: ANTHROPIC_API_KEY, OPENROUTER_API_KEY, OPENAI_API_KEY, or GOOGLE_API_KEY
-# Optionally add GH_TOKEN for draft PR workflow
+### PR #179: Go SDK — Built Entirely by SWE-AF
 
-docker compose up -d
-```
+[PR #179: Go SDK DID/VC Registration](https://github.com/Agent-Field/agentfield/pull/179) — one API call, zero human code, haiku-class models.
 
-Submit a build:
+| Metric | Value |
+|--------|-------|
+| Issues completed | 10/10 |
+| Tests passing | 217 |
+| Acceptance criteria | 34/34 |
+| Agent invocations | 79 |
+| Model | `claude-haiku-4-5` |
+| **Total cost** | **$19.23** |
 
-```bash
-# Default (Claude)
-curl -X POST http://localhost:8080/api/v1/execute/async/swe-planner.build \
-  -H "Content-Type: application/json" \
-  -d @- <<'JSON'
-{
-  "input": {
-    "goal": "Add JWT auth",
-    "repo_url": "https://github.com/user/my-repo"
-  }
-}
-JSON
+<details>
+<summary>Cost breakdown by agent role</summary>
 
-# With open-source runtime (set OPENROUTER_API_KEY in .env)
-curl -X POST http://localhost:8080/api/v1/execute/async/swe-planner.build \
-  -H "Content-Type: application/json" \
-  -d @- <<'JSON'
-{
-  "input": {
-    "goal": "Add JWT auth",
-    "repo_url": "https://github.com/user/my-repo",
-    "config": {
-      "runtime": "open_code",
-      "models": {
-        "default": "openrouter/minimax/minimax-m2.5"
-      }
-    }
-  }
-}
-JSON
+| Role | Cost | % |
+|------|------|---|
+| Coder | $5.88 | 30.6% |
+| Code Reviewer | $3.48 | 18.1% |
+| QA | $1.78 | 9.2% |
+| GitHub PR | $1.66 | 8.6% |
+| Integration Tester | $1.59 | 8.3% |
+| Merger | $1.22 | 6.3% |
+| Workspace Ops | $1.77 | 9.2% |
+| Planning (PM + Arch + TL + Sprint) | $0.79 | 4.1% |
+| Verifier + Finalize | $0.34 | 1.8% |
+| Synthesizer | $0.05 | 0.2% |
+
+79 invocations, 2,070 conversation turns. Planning agents scope and decompose; coders work in parallel isolated worktrees; reviewers and QA validate each issue; merger integrates branches; verifier checks acceptance criteria against the PRD.
+
+</details>
+
+### Autonomous Build Spotlight
+
+Rust-based Python compiler benchmark (built autonomously):
+
+| Metric | CPython (subprocess) | RustPython (SWE-AF) | Improvement |
+|--------|---------------------|---------------------|-------------|
+| Steady-state execution | Baseline (~19ms) | Optimized in-process runtime | **88.3x–602.3x faster** |
+| Geometric mean | 1.0x baseline | 253.8x | **253.8x** |
+| Peak throughput | ~52 ops/s | 31,807 ops/s | **~612x** |
+
+<details>
+<summary>Measurement methodology</summary>
+
+Throughput comparison measures different execution models: CPython subprocess spawn (~19ms per call → ~52 ops/s) vs RustPython pre-warmed interpreter pool (in-process). This is the real-world tradeoff the system was built to optimize — replacing repeated subprocess invocations with a persistent pool for short-snippet execution.
+
+</details>
+
+Artifact trail includes **175 tracked autonomous agents** across planning, coding, review, merge, and verification.
+
+Details: [`examples/llm-rust-python-compiler-sonnet/README.md`](examples/llm-rust-python-compiler-sonnet/README.md)
+
+## Operating Modes
 
-# Local workspace mode (repo_path)
+SWE-AF works in two modes: point it at a single repository, or orchestrate coordinated changes across multiple repos in one build.
+
+### Single-Repository Mode
+
+The default. Pass `repo_url` (remote) or `repo_path` (local workspace) at the top level of `input`:
+
+```bash
 curl -X POST http://localhost:8080/api/v1/execute/async/swe-planner.build \
   -H "Content-Type: application/json" \
-  -d @- <<'JSON'
-{
-  "input": {
-    "goal": "Add JWT auth",
-    "repo_path": "/workspaces/my-repo"
-  }
-}
-JSON
+  -d '{
+    "input": {
+      "goal": "Add JWT auth",
+      "repo_url": "https://github.com/user/my-project"
+    }
+  }'
 ```
 
-Scale workers:
+### Multi-Repository Mode
+
+When your work spans multiple codebases — a primary app plus shared libraries, monorepo sub-projects, or dependent microservices — pass `config.repos` as an array with roles:
 
 ```bash
-docker compose up --scale swe-agent=3 -d
+curl -X POST http://localhost:8080/api/v1/execute/async/swe-planner.build \
+  -H "Content-Type: application/json" \
+  -d '{
+    "input": {
+      "goal": "Add JWT auth across API and shared-lib",
+      "config": {
+        "repos": [
+          {
+            "repo_url": "https://github.com/org/main-app",
+            "role": "primary"
+          },
+          {
+            "repo_url": "https://github.com/org/shared-lib",
+            "role": "dependency"
+          }
+        ],
+        "runtime": "claude_code",
+        "models": { "default": "sonnet" }
+      }
+    }
+  }'
 ```
 
-Use a host control plane instead of Docker control-plane service:
+**Roles:**
+- `primary` — The main application. Changes here drive the build; failures block progress.
+- `dependency` — Libraries or services modified to support the primary repo. Failures are captured but don't block.
 
-```bash
-docker compose -f docker-compose.local.yml up -d
-```
+**Use cases:**
+- Primary app + shared SDK or utilities library
+- Monorepo sub-projects that live in separate repos
+- Feature spanning multiple microservices (e.g., API + worker queue)
 
-## GitHub Repo Workflow (Clone -> Build -> Draft PR)
+## One-Call DX
 
-Pass `repo_url` instead of `repo_path` to let SWE-AF clone and open a draft PR after execution.
+Every build is a single API call. Swap runtimes and assign models per agent role in one flat config:
 
 ```bash
 curl -X POST http://localhost:8080/api/v1/execute/async/swe-planner.build \
@@ -488,25 +354,35 @@ curl -X POST http://localhost:8080/api/v1/execute/async/swe-planner.build \
   -d @- <<'JSON'
 {
   "input": {
+    "goal": "Refactor and harden auth + billing flows",
     "repo_url": "https://github.com/user/my-project",
-    "goal": "Add comprehensive test coverage",
     "config": {
       "runtime": "claude_code",
       "models": {
         "default": "sonnet",
         "coder": "opus",
         "qa": "opus"
-      }
+      },
+      "enable_learning": true
     }
   }
 }
 JSON
 ```
 
-Requirements:
+**Runtimes supported:**
+- `runtime: "claude_code"` — Claude backend (Anthropic)
+- `runtime: "open_code"` — OpenCode backend (OpenRouter/OpenAI/Google/Anthropic model IDs)
 
-- `GH_TOKEN` in `.env` with `repo` scope
-- Repo access for that token
+For OpenRouter, use model IDs in `openrouter/<provider>/<model>` format (e.g., `openrouter/minimax/minimax-m2.5`).
+
+## GitHub Repo Workflow
+
+Pass `repo_url` instead of `repo_path` to let SWE-AF clone the repo and open a draft PR after the build completes.
+
+Requirements:
+- `GH_TOKEN` environment variable with `repo` scope
+- Token must have access to the target repository
 
 ## API Reference
 
@@ -516,7 +392,7 @@ Requirements:
 Core async endpoints (returns an `execution_id` immediately):
 
 ```bash
-# Full build: plan -> execute -> verify
+# Full build: plan → execute → verify
 POST /api/v1/execute/async/swe-planner.build
 
 # Plan only
@@ -544,24 +420,24 @@ Every specialist is also callable directly:
 <details>
 <summary><strong>Agent execution flow</strong></summary>
 
-| Agent                    | In -> Out                                            |
-| ------------------------ | ---------------------------------------------------- |
-| `run_product_manager`    | goal -> PRD                                          |
-| `run_architect`          | PRD -> architecture                                  |
-| `run_tech_lead`          | architecture -> review                               |
-| `run_sprint_planner`     | architecture -> issue DAG                            |
-| `run_issue_writer`       | issue spec -> detailed issue                         |
-| `run_coder`              | issue + worktree -> code + tests + commit            |
-| `run_qa`                 | worktree -> test results                             |
-| `run_code_reviewer`      | worktree -> quality/security review                  |
-| `run_qa_synthesizer`     | QA + review -> FIX / APPROVE / BLOCK                 |
-| `run_issue_advisor`      | failure context -> adapt / split / accept / escalate |
-| `run_replanner`          | build state + failures -> restructured plan          |
-| `run_merger`             | branches -> merged output                            |
-| `run_integration_tester` | merged repo -> integration results                   |
-| `run_verifier`           | repo + PRD -> acceptance pass/fail                   |
-| `generate_fix_issues`    | failed criteria -> targeted fix issues               |
-| `run_github_pr`          | branch -> push + draft PR                            |
+| Agent | In → Out |
+|-------|----------|
+| `run_product_manager` | goal → PRD |
+| `run_architect` | PRD → architecture |
+| `run_tech_lead` | architecture → review |
+| `run_sprint_planner` | architecture → issue DAG |
+| `run_issue_writer` | issue spec → detailed issue |
+| `run_coder` | issue + worktree → code + tests + commit |
+| `run_qa` | worktree → test results |
+| `run_code_reviewer` | worktree → quality/security review |
+| `run_qa_synthesizer` | QA + review → FIX / APPROVE / BLOCK |
+| `run_issue_advisor` | failure context → adapt / split / accept / escalate |
+| `run_replanner` | build state + failures → restructured plan |
+| `run_merger` | branches → merged output |
+| `run_integration_tester` | merged repo → integration results |
+| `run_verifier` | repo + PRD → acceptance pass/fail |
+| `generate_fix_issues` | failed criteria → targeted fix issues |
+| `run_github_pr` | branch → push + draft PR |
 
 </details>
 
@@ -570,23 +446,23 @@ Every specialist is also callable directly:
 
 Pass `config` to `build` or `execute`. Full schema: [`swe_af/execution/schemas.py`](swe_af/execution/schemas.py)
 
-| Key                       | Default         | Description                                           |
-| ------------------------- | --------------- | ----------------------------------------------------- |
-| `runtime`                 | `"claude_code"` | Model runtime: `"claude_code"` or `"open_code"`       |
-| `models`                  | `null`          | Flat role-model map (`default` + role keys below)     |
-| `max_coding_iterations`   | `5`             | Inner-loop retry budget                               |
-| `max_advisor_invocations` | `2`             | Middle-loop advisor budget                            |
-| `max_replans`             | `2`             | Build-level replanning budget                         |
-| `enable_issue_advisor`    | `true`          | Enable issue adaptation                               |
-| `enable_replanning`       | `true`          | Enable global replanning                              |
-| `enable_learning`         | `false`         | Enable cross-issue shared memory (continual learning) |
-| `agent_timeout_seconds`   | `2700`          | Per-agent timeout                                     |
-| `agent_max_turns`         | `150`           | Tool-use turn budget                                  |
+| Key | Default | Description |
+|-----|---------|-------------|
+| `runtime` | `"claude_code"` | Model runtime: `"claude_code"` or `"open_code"` |
+| `models` | `null` | Flat role→model map (`default` + role keys below) |
+| `max_coding_iterations` | `5` | Inner-loop retry budget |
+| `max_advisor_invocations` | `2` | Middle-loop advisor budget |
+| `max_replans` | `2` | Build-level replanning budget |
+| `enable_issue_advisor` | `true` | Enable issue adaptation |
+| `enable_replanning` | `true` | Enable global replanning |
+| `enable_learning` | `false` | Enable cross-issue shared memory (continual learning) |
+| `agent_timeout_seconds` | `2700` | Per-agent timeout |
+| `agent_max_turns` | `150` | Tool-use turn budget |
 
 </details>
 
 <details>
-<summary><strong>Model Role Keys</strong></summary>
+<summary><strong>Model role keys</strong></summary>
 
 `models` supports:
 
@@ -596,12 +472,7 @@ Pass `config` to `build` or `execute`. Full schema: [`swe_af/execution/schemas.p
 - `replan`, `retry_advisor`, `issue_writer`, `issue_advisor`
 - `verifier`, `git`, `merger`, `integration_tester`
 
-</details>
-
-<details>
-<summary><strong>Resolution order</strong></summary>
-
-`runtime defaults` < `models.default` < `models.<role>`
+Resolution order: `runtime defaults` < `models.default` < `models.<role>`
 
 </details>
 
@@ -661,7 +532,7 @@ make clean-examples
 </details>
 
 <details>
-<summary><strong>Security and Community</strong></summary>
+<summary><strong>Security and community</strong></summary>
 
 - Contribution guide: [`docs/CONTRIBUTING.md`](docs/CONTRIBUTING.md)
 - Code of conduct: [`CODE_OF_CONDUCT.md`](CODE_OF_CONDUCT.md)
@@ -673,4 +544,4 @@ make clean-examples
 
 ---
 
-SWE-AF is built on [AgentField](https://github.com/Agent-Field/agentfield) as a first step from single-agent harnesses to autonomous software engineering factories.
+SWE-AF is built on [AgentField](https://github.com/Agent-Field/agentfield) — a first step from single-agent harnesses to autonomous software engineering factories.

From b2a06f02bf21a44cab741abf56a37813a59c59c2 Mon Sep 17 00:00:00 2001
From: Claude <noreply@anthropic.com>
Date: Wed, 25 Feb 2026 13:46:41 +0000
Subject: [PATCH 2/2] Reposition single/multi-repo as equal operating modes,
 improve README polish
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Rename "Multi-Repository Workspace Support" → "Operating Modes" with
  "Single-Repository Mode" and "Multi-Repository Mode" as equal subsections
- Remove all "backward compatible" and "exactly as before" framing
- Move use cases into Multi-Repository Mode section (where they're relevant)
- Add tagline "One API call → full engineering team → shipped code."
- Bold lead phrases in Why SWE-AF for scannability
- Add multi-model/multi-provider as a Why SWE-AF bullet
- Sharpen Why SWE-AF opening line
- Add "Modes" to nav links, rename "Benchmark Snapshot" → "Benchmark"
- Enumerate agent roles in intro paragraph for clarity

https://claude.ai/code/session_01XgBbDVMbE52cKdU1dCNEsV
---
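
The two request shapes documented under Operating Modes can be sketched in
Python as follows. This is a minimal illustration that assumes only the
endpoint path and JSON fields shown in the README. The helper names
`build_payload` and `build_request` are hypothetical and not part of SWE-AF,
and the exactly-one-of check is an assumption of this sketch rather than
documented server behavior.

```python
import json
from urllib.request import Request

# Endpoint path as it appears in the README curl examples.
BUILD_PATH = "/api/v1/execute/async/swe-planner.build"


def build_payload(goal, repo_url=None, repos=None, config=None):
    """Hypothetical helper: JSON body for single- or multi-repository mode."""
    # Assumption of this sketch: callers choose exactly one mode.
    if (repo_url is None) == (repos is None):
        raise ValueError("pass exactly one of repo_url or repos")
    body = {"goal": goal}
    cfg = dict(config or {})
    if repo_url is not None:
        body["repo_url"] = repo_url   # single-repository mode
    else:
        cfg["repos"] = list(repos)    # multi-repository mode uses config.repos
    if cfg:
        body["config"] = cfg
    return {"input": body}


def build_request(base_url, payload):
    """Hypothetical helper: prepare (but do not send) the POST request."""
    return Request(
        base_url.rstrip("/") + BUILD_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


single = build_payload(
    "Add JWT auth", repo_url="https://github.com/user/my-project"
)
multi = build_payload(
    "Add JWT auth across API and shared-lib",
    repos=[
        {"repo_url": "https://github.com/org/main-app", "role": "primary"},
        {"repo_url": "https://github.com/org/shared-lib", "role": "dependency"},
    ],
    config={"runtime": "claude_code", "models": {"default": "sonnet"}},
)
req = build_request("http://localhost:8080", multi)
```

Passing `req` to `urllib.request.urlopen` would submit the build. The sketch
stops short of that so it can be read and run without a control plane.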
 README.md | 592 +++++++++++++++++++++++++++++++++---------------------
 1 file changed, 361 insertions(+), 231 deletions(-)

diff --git a/README.md b/README.md
index f2790b3..36a39fb 100644
--- a/README.md
+++ b/README.md
@@ -18,8 +18,9 @@
 
 <p>
   <a href="#quick-start">Quick Start</a> •
-  <a href="#features">Features</a> •
-  <a href="#how-a-build-works">How It Works</a> •
+  <a href="#why-swe-af">Why SWE-AF</a> •
+  <a href="#in-action">In Action</a> •
+  <a href="#adaptive-factory-control">Factory Control</a> •
   <a href="#benchmark">Benchmark</a> •
   <a href="#operating-modes">Modes</a> •
   <a href="#api-reference">API</a> •
@@ -28,31 +29,179 @@
 
 </div>
 
-SWE-AF spins up a coordinated fleet of AI agents — product managers, architects, coders, reviewers, testers — that scope, build, adapt, and ship software end to end. No scaffolding, no human-in-the-loop. One goal in, verified PR out.
+One API call spins up a full autonomous engineering team — product managers, architects, coders, reviewers, testers — that scopes, builds, adapts, and ships complex software end to end.
+SWE-AF is a first step toward **autonomous software engineering factories**, scaling from simple goals to hard multi-issue programs with hundreds to thousands of agent invocations.
 
 <p align="center">
   <img src="assets/banner.jpg" alt="SWE-AF autonomous engineering fleet banner" width="100%" />
 </p>
 
-<div align="center">
+## One-Call DX
 
-| Scored **95/100** on benchmark | 10/10 issues, **$19** total cost | **400–500+** agents per build | Claude, MiniMax, DeepSeek, Qwen |
-|:---:|:---:|:---:|:---:|
-| Beats Claude Code (73) & Codex (62) | [Real PR — zero human code](https://github.com/Agent-Field/agentfield/pull/179) | Planning → coding → QA → merge | Any model, any provider |
+```bash
+curl -X POST http://localhost:8080/api/v1/execute/async/swe-planner.build \
+  -H "Content-Type: application/json" \
+  -d @- <<'JSON'
+{
+  "input": {
+    "goal": "Refactor and harden auth + billing flows",
+    "repo_url": "https://github.com/user/my-project",
+    "config": {
+      "runtime": "claude_code",
+      "models": {
+        "default": "sonnet",
+        "coder": "opus",
+        "qa": "opus"
+      },
+      "enable_learning": true
+    }
+  }
+}
+JSON
+```
 
-</div>
+Set `models.default` and any role key (`coder`, `qa`, `architect`, etc.) to any model your runtime supports.
+
+## Operating Modes
+
+SWE-AF works in two modes: point it at a single repository, or orchestrate coordinated changes across multiple repos in one build.
+
+### Single-Repository Mode
+
+The default. Pass `repo_url` (remote) or `repo_path` (local) and SWE-AF handles everything:
+
+```bash
+curl -X POST http://localhost:8080/api/v1/execute/async/swe-planner.build \
+  -H "Content-Type: application/json" \
+  -d '{
+    "input": {
+      "goal": "Add JWT auth",
+      "repo_url": "https://github.com/user/my-project"
+    }
+  }'
+```
+
+### Multi-Repository Mode
+
+When your work spans multiple codebases — a primary app plus shared libraries, monorepo sub-projects, or dependent microservices — pass `config.repos` as an array with roles:
+
+```bash
+curl -X POST http://localhost:8080/api/v1/execute/async/swe-planner.build \
+  -H "Content-Type: application/json" \
+  -d '{
+    "input": {
+      "goal": "Add JWT auth across API and shared-lib",
+      "config": {
+        "repos": [
+          {
+            "repo_url": "https://github.com/org/main-app",
+            "role": "primary"
+          },
+          {
+            "repo_url": "https://github.com/org/shared-lib",
+            "role": "dependency"
+          }
+        ],
+        "runtime": "claude_code",
+        "models": {
+          "default": "sonnet"
+        }
+      }
+    }
+  }'
+```
 
-## Features
+**Roles:**
+- `primary` — The main application. Changes here drive the build; failures block progress.
+- `dependency` — Libraries or services modified to support the primary repo. Failures are captured but don't block.
+
+**Use cases:**
+- Primary app + shared SDK or utilities library
+- Monorepo sub-projects that live in separate repos
+- Feature spanning multiple microservices (e.g., API + worker queue)
+
+## Autonomous Build Spotlight
+
+Rust-based Python compiler benchmark (built autonomously):
+
+| Metric                 | CPython (subprocess) | RustPython (SWE-AF)          | Improvement             |
+| ---------------------- | -------------------- | ---------------------------- | ----------------------- |
+| Steady-state execution | Baseline (~19ms)     | Optimized in-process runtime | **88.3x-602.3x faster** |
+| Geometric mean         | 1.0x baseline        | 253.8x                       | **253.8x**              |
+| Peak throughput        | ~52 ops/s            | 31,807 ops/s                 | **~612x**               |
+
+<details>
+<summary>Measurement methodology</summary>
+
+Throughput comparison measures different execution models: CPython subprocess spawn (~19ms per call → ~52 ops/s) vs RustPython pre-warmed interpreter pool (in-process). This is the real-world tradeoff the system was built to optimize — replacing repeated subprocess invocations with a persistent pool for short-snippet execution.
+
+</details>
+
+Artifact trail includes **175 tracked autonomous agents** across planning, coding, review, merge, and verification.
+
+Details: [`examples/llm-rust-python-compiler-sonnet/README.md`](examples/llm-rust-python-compiler-sonnet/README.md)
+
+## Why SWE-AF
+
+Most agent frameworks wrap a single coder loop. SWE-AF is a coordinated engineering factory — planning, execution, and governance agents run as a control stack that adapts in real time.
+
+- **Hardness-aware execution** — easy issues pass through quickly, while hard issues trigger deeper adaptation and DAG-level replanning instead of blind retries.
+- **Factory architecture** — not a single-agent wrapper. Planning, execution, and governance agents run as a coordinated control stack.
+- **Multi-model, multi-provider** — assign different models per role (`coder: opus`, `qa: haiku`). Works with Claude, OpenRouter, OpenAI, and Google.
+- **Continual learning** — with `enable_learning=true`, conventions and failure patterns discovered early are injected into downstream issues.
+- **Agent-scale parallelism** — dependency-level scheduling + isolated git worktrees allow large fan-out without branch collisions.
+- **Fleet-scale orchestration** — many SWE-AF nodes can run continuously in parallel via AgentField, driving thousands of agent invocations across concurrent builds.
+- **Explicit compromise tracking** — when scope is relaxed, debt is typed, severity-rated, and propagated.
+- **Long-run reliability** — checkpointed execution supports `resume_build` after crashes or interruptions.
+
+## In Action
+
+[PR #179: Go SDK DID/VC Registration](https://github.com/Agent-Field/agentfield/pull/179) — built entirely by SWE-AF (Claude runtime with haiku-class models). One API call, zero human code.
+
+| Metric              | Value              |
+| ------------------- | ------------------ |
+| Issues completed    | 10/10              |
+| Tests passing       | 217                |
+| Acceptance criteria | 34/34              |
+| Agent invocations   | 79                 |
+| Model               | `claude-haiku-4-5` |
+| **Total cost**      | **$19.23**         |
+
+<details>
+<summary>Cost breakdown by agent role</summary>
+
+| Role                               | Cost  | %     |
+| ---------------------------------- | ----- | ----- |
+| Coder                              | $5.88 | 30.6% |
+| Code Reviewer                      | $3.48 | 18.1% |
+| QA                                 | $1.78 | 9.2%  |
+| GitHub PR                          | $1.66 | 8.6%  |
+| Integration Tester                 | $1.59 | 8.3%  |
+| Merger                             | $1.22 | 6.3%  |
+| Workspace Ops                      | $1.77 | 9.2%  |
+| Planning (PM + Arch + TL + Sprint) | $0.79 | 4.1%  |
+| Verifier + Finalize                | $0.34 | 1.8%  |
+| Synthesizer                        | $0.05 | 0.2%  |
+
+79 invocations, 2,070 conversation turns. Planning agents scope and decompose; coders work in parallel isolated worktrees; reviewers and QA validate each issue; merger integrates branches; verifier checks acceptance criteria against the PRD.
 
-- **Factory, not a wrapper** — Planning, execution, and governance agents run as a coordinated control stack. Not just a coder loop with retries.
-- **Hardness-aware execution** — Easy issues pass through fast. Hard issues trigger deeper adaptation and DAG-level replanning instead of blind retries.
-- **Multi-model, multi-provider** — Assign different models per role (`coder: opus`, `qa: haiku`). Works with Claude, OpenRouter, OpenAI, and Google.
-- **Single-repo and multi-repo modes** — Point at one repository or orchestrate coordinated changes across multiple repos in a single build.
-- **Continual learning** — With `enable_learning: true`, conventions and failure patterns discovered early get injected into downstream issues.
-- **Agent-scale parallelism** — Dependency-level scheduling + isolated git worktrees allow large fan-out without branch collisions.
-- **Self-correcting builds** — Three nested control loops (inner retry → advisor adaptation → DAG replanning) handle failures automatically.
-- **Crash recovery** — Checkpointed execution supports `resume_build` after interruptions.
-- **Draft PR output** — Pass a `repo_url` and SWE-AF clones, builds, and opens a draft PR on GitHub.
+</details>
+
+**Claude & open-source models supported**: Run builds with either runtime and tune models per role in one flat config map.
+- `runtime: "claude_code"` maps to the Claude backend.
+- `runtime: "open_code"` maps to the OpenCode backend (OpenRouter/OpenAI/Google/Anthropic model IDs).
+
+## Adaptive Factory Control
+
+SWE-AF uses three nested control loops to adapt to task difficulty in real time:
+
+| Loop        | Scope         | Trigger              | Action                                                                             |
+| ----------- | ------------- | -------------------- | ---------------------------------------------------------------------------------- |
+| Inner loop  | Single issue  | QA/review fails      | Coder retries with feedback                                                        |
+| Middle loop | Single issue  | Inner loop exhausted | `run_issue_advisor` retries with a new approach, splits work, or accepts with debt |
+| Outer loop  | Remaining DAG | Escalated failures   | `run_replanner` restructures remaining issues and dependencies                     |
+
+This is the core factory-control behavior: control agents supervise worker agents and continuously reshape the plan as reality changes.
 
 ## Quick Start
 
@@ -65,93 +214,123 @@ One click deploys SWE-AF + AgentField control plane + PostgreSQL. Set two enviro
 - `CLAUDE_CODE_OAUTH_TOKEN` — run `claude setup-token` in [Claude Code CLI](https://docs.anthropic.com/en/docs/claude-code) (uses Pro/Max subscription credits)
 - `GH_TOKEN` — GitHub personal access token with `repo` scope for draft PR creation
 
-Then trigger a build:
+Once deployed, trigger a build:
 
 ```bash
-curl -X POST https://<your-app>.up.railway.app/api/v1/execute/async/swe-planner.build \
+curl -X POST https://<control-plane>.up.railway.app/api/v1/execute/async/swe-planner.build \
   -H "Content-Type: application/json" \
   -H "X-API-Key: this-is-a-secret" \
   -d '{"input": {"goal": "Add JWT auth", "repo_url": "https://github.com/user/my-repo"}}'
 ```
 
-### Run Locally
+### 1. Requirements (local)
 
-```bash
-# 1. Install
-python3.12 -m venv .venv && source .venv/bin/activate
-pip install -e ".[dev]"
+- Python 3.12+
+- AgentField control plane (`af`)
+- AI provider API key (Anthropic, OpenRouter, OpenAI, or Google)
 
-# 2. Start the control plane and register the node
-af                 # starts AgentField on :8080
-python -m swe_af   # registers node "swe-planner"
+### 2. Install
 
-# 3. Trigger a build
-curl -X POST http://localhost:8080/api/v1/execute/async/swe-planner.build \
-  -H "Content-Type: application/json" \
-  -d '{"input": {"goal": "Add JWT auth", "repo_url": "https://github.com/user/my-project"}}'
+```bash
+python3.12 -m venv .venv
+source .venv/bin/activate
+python -m pip install --upgrade pip
+python -m pip install -e ".[dev]"
 ```
 
-### Docker
+### 3. Run
 
 ```bash
-cp .env.example .env   # add your API key + optional GH_TOKEN
-docker compose up -d
-
-# Scale workers
-docker compose up --scale swe-agent=3 -d
+af                 # starts AgentField control plane on :8080
+python -m swe_af   # registers node id "swe-planner"
 ```
 
-Use a host control plane instead of the Docker service:
+### 4. Trigger a build
 
 ```bash
-docker compose -f docker-compose.local.yml up -d
-```
+# Default (uses Claude)
+curl -X POST http://localhost:8080/api/v1/execute/async/swe-planner.build \
+  -H "Content-Type: application/json" \
+  -d @- <<'JSON'
+{
+  "input": {
+    "goal": "Add JWT auth to all API endpoints",
+    "repo_url": "https://github.com/user/my-project"
+  }
+}
+JSON
 
-## How a Build Works
+# With open-source runtime + flat role map
+curl -X POST http://localhost:8080/api/v1/execute/async/swe-planner.build \
+  -H "Content-Type: application/json" \
+  -d @- <<'JSON'
+{
+  "input": {
+    "goal": "Add JWT auth",
+    "repo_url": "https://github.com/user/my-project",
+    "config": {
+      "runtime": "open_code",
+      "models": {
+        "default": "openrouter/minimax/minimax-m2.5"
+      }
+    }
+  }
+}
+JSON
 
-```text
-Goal → PM → Architect → Tech Lead → Sprint Planner → Issue DAG
-                                                         ↓
-                        ┌────────────────────────────────┘
-                        ↓ (parallel, isolated worktrees)
-                   ┌─────────┐
-                   │  Issue N │ → Coder → QA → Reviewer → Synthesizer
-                   └─────────┘       ↑               │
-                        ↑            └── retry ───────┘ (inner loop)
-                        │
-                        └── advisor / replanner (middle + outer loops)
-                                                         ↓
-                                    Merge → Integration Test → Verify → Draft PR
+# Local workspace mode (repo_path) + targeted role override
+curl -X POST http://localhost:8080/api/v1/execute/async/swe-planner.build \
+  -H "Content-Type: application/json" \
+  -d @- <<'JSON'
+{
+  "input": {
+    "goal": "Refactor and harden auth + billing flows",
+    "repo_path": "/path/to/repo",
+    "config": {
+      "runtime": "claude_code",
+      "models": {
+        "default": "sonnet",
+        "coder": "opus",
+        "qa": "opus"
+      },
+      "enable_learning": true
+    }
+  }
+}
+JSON
 ```
 
-Three nested control loops handle task difficulty in real time:
+For OpenRouter with `open_code`, use model IDs in `openrouter/<provider>/<model>` format (for example `openrouter/minimax/minimax-m2.5`).
+
+## What Happens In One Build
 
-| Loop | Scope | Trigger | Action |
-|------|-------|---------|--------|
-| **Inner** | Single issue | QA/review fails | Coder retries with feedback |
-| **Middle** | Single issue | Inner loop exhausted | Advisor retries with new approach, splits work, or accepts with debt |
-| **Outer** | Remaining DAG | Escalated failures | Replanner restructures remaining issues and dependencies |
+- Architecture is generated and reviewed before coding starts
+- Issues are dependency-sorted and run in parallel across isolated worktrees
+- Each issue gets dedicated coder, tester, and reviewer passes
+- Failed issues trigger advisor-driven adaptation (split, re-scope, or escalate)
+- Escalations trigger replanning of the remaining DAG
+- End result is merged, integration-tested, and verified against acceptance criteria
 
 <p align="center">
   <img src="assets/archi.png" alt="SWE-AF architecture" width="100%" />
 </p>
 
-> Typical runs spin up 400–500+ agent instances across planning, execution, QA, and verification. Larger DAGs and repeated adaptation cycles scale into the thousands.
+> Typical runs spin up 400-500+ agent instances across planning, execution, QA, and verification. For larger DAGs and repeated adaptation/replanning cycles, SWE-AF can scale into the high hundreds to thousands of agent invocations in a single build.
 
 ## Benchmark
 
-**95/100** with both Claude haiku-class routing ($20) and MiniMax M2.5 via open runtime ($6), outperforming Claude Code sonnet (73), Codex o3 (62), and Claude Code haiku (59) on the same prompt.
+**95/100 with haiku and MiniMax**: SWE-AF scored 95/100 with both Claude haiku-class routing ($20) and MiniMax M2.5 via open runtime ($6), outperforming Claude Code sonnet (73), Codex o3 (62), and Claude Code haiku (59) on the same prompt.
 
-| Dimension | SWE-AF (haiku) | SWE-AF (MiniMax) | CC Sonnet | Codex (o3) | CC Haiku |
-|-----------|---------------|-----------------|-----------|-----------|---------|
-| Functional (30) | **30** | **30** | **30** | **30** | **30** |
-| Structure (20) | **20** | **20** | 10 | 10 | 10 |
-| Hygiene (20) | **20** | **20** | 16 | 10 | 7 |
-| Git (15) | **15** | **15** | 2 | 2 | 2 |
-| Quality (15) | 10 | 10 | **15** | 10 | 10 |
-| **Total** | **95** | **95** | **73** | **62** | **59** |
-| **Cost** | **~$20** | **~$6** | ? | ? | ? |
-| **Time** | ~30–40 min | 43 min | ? | ? | ? |
+| Dimension       | SWE-AF (haiku) | SWE-AF (MiniMax) | CC Sonnet | Codex (o3) | CC Haiku |
+| --------------- | -------------- | ---------------- | --------- | ---------- | -------- |
+| Functional (30) | **30**         | **30**           | **30**    | **30**     | **30**   |
+| Structure (20)  | **20**         | **20**           | 10        | 10         | 10       |
+| Hygiene (20)    | **20**         | **20**           | 16        | 10         | 7        |
+| Git (15)        | **15**         | **15**           | 2         | 2          | 2        |
+| Quality (15)    | 10             | 10               | **15**    | 10         | 10       |
+| Total           | **95**         | **95**           | **73**    | **62**     | **59**   |
+| **Cost**        | **~$20**       | **~$6**          | ?         | ?          | ?        |
+| **Time**        | ~30-40 min     | 43 min           | ?         | ?          | ?        |
 
 <details>
 <summary><strong>Full benchmark details and reproduction</strong></summary>
@@ -164,13 +343,13 @@ Same prompt tested across multiple agents. SWE-AF with Claude runtime (haiku-cla
 
 ### Scoring framework
 
-| Dimension | Points | What it measures |
-|-----------|--------|-----------------|
-| Functional | 30 | CLI behavior and passing tests |
-| Structure | 20 | Modular source layout and test organization |
-| Hygiene | 20 | `.gitignore`, clean status, no junk artifacts |
-| Git | 15 | Commit discipline and message quality |
-| Quality | 15 | Error handling, package metadata, README quality |
+| Dimension  | Points | What it measures                                 |
+| ---------- | ------ | ------------------------------------------------ |
+| Functional | 30     | CLI behavior and passing tests                   |
+| Structure  | 20     | Modular source layout and test organization      |
+| Hygiene    | 20     | `.gitignore`, clean status, no junk artifacts    |
+| Git        | 15     | Commit discipline and message quality            |
+| Quality    | 15     | Error handling, package metadata, README quality |
 
 ### Reproduction
 
@@ -212,13 +391,13 @@ curl -X POST http://localhost:8080/api/v1/execute/async/swe-planner.build \
 JSON
 
 # Claude Code (haiku)
-claude -p "Build a Node.js CLI todo app ..." --model haiku --dangerously-skip-permissions
+claude -p "Build a Node.js CLI todo app with add, list, complete, and delete commands. Data should persist to a JSON file. Initialize git, write tests, and commit your work." --model haiku --dangerously-skip-permissions
 
 # Claude Code (sonnet)
-claude -p "Build a Node.js CLI todo app ..." --model sonnet --dangerously-skip-permissions
+claude -p "Build a Node.js CLI todo app with add, list, complete, and delete commands. Data should persist to a JSON file. Initialize git, write tests, and commit your work." --model sonnet --dangerously-skip-permissions
 
 # Codex (gpt-5.3-codex)
-codex exec "Build a Node.js CLI todo app ..." --full-auto
+codex exec "Build a Node.js CLI todo app with add, list, complete, and delete commands. Data should persist to a JSON file. Initialize git, write tests, and commit your work." --full-auto
 ```
 
 **MiniMax M2.5 Measured Metrics (Feb 2026):**
@@ -232,121 +411,77 @@ Benchmark assets, logs, evaluator, and generated projects live in [`examples/age
 
 </details>
 
-## Real-World Examples
-
-### PR #179: Go SDK — Built Entirely by SWE-AF
-
-[PR #179: Go SDK DID/VC Registration](https://github.com/Agent-Field/agentfield/pull/179) — one API call, zero human code, haiku-class models.
-
-| Metric | Value |
-|--------|-------|
-| Issues completed | 10/10 |
-| Tests passing | 217 |
-| Acceptance criteria | 34/34 |
-| Agent invocations | 79 |
-| Model | `claude-haiku-4-5` |
-| **Total cost** | **$19.23** |
-
-<details>
-<summary>Cost breakdown by agent role</summary>
-
-| Role | Cost | % |
-|------|------|---|
-| Coder | $5.88 | 30.6% |
-| Code Reviewer | $3.48 | 18.1% |
-| QA | $1.78 | 9.2% |
-| GitHub PR | $1.66 | 8.6% |
-| Integration Tester | $1.59 | 8.3% |
-| Merger | $1.22 | 6.3% |
-| Workspace Ops | $1.77 | 9.2% |
-| Planning (PM + Arch + TL + Sprint) | $0.79 | 4.1% |
-| Verifier + Finalize | $0.34 | 1.8% |
-| Synthesizer | $0.05 | 0.2% |
-
-79 invocations, 2,070 conversation turns. Planning agents scope and decompose; coders work in parallel isolated worktrees; reviewers and QA validate each issue; merger integrates branches; verifier checks acceptance criteria against the PRD.
-
-</details>
-
-### Autonomous Build Spotlight
+## Docker
 
-Rust-based Python compiler benchmark (built autonomously):
-
-| Metric | CPython (subprocess) | RustPython (SWE-AF) | Improvement |
-|--------|---------------------|---------------------|-------------|
-| Steady-state execution | Baseline (~19ms) | Optimized in-process runtime | **88.3x–602.3x faster** |
-| Geometric mean | 1.0x baseline | 253.8x | **253.8x** |
-| Peak throughput | ~52 ops/s | 31,807 ops/s | **~612x** |
-
-<details>
-<summary>Measurement methodology</summary>
-
-Throughput comparison measures different execution models: CPython subprocess spawn (~19ms per call → ~52 ops/s) vs RustPython pre-warmed interpreter pool (in-process). This is the real-world tradeoff the system was built to optimize — replacing repeated subprocess invocations with a persistent pool for short-snippet execution.
-
-</details>
-
-Artifact trail includes **175 tracked autonomous agents** across planning, coding, review, merge, and verification.
-
-Details: [`examples/llm-rust-python-compiler-sonnet/README.md`](examples/llm-rust-python-compiler-sonnet/README.md)
-
-## Operating Modes
-
-SWE-AF works in two modes: point it at a single repository, or orchestrate coordinated changes across multiple repos in one build.
+```bash
+cp .env.example .env
+# Add your API key: ANTHROPIC_API_KEY, OPENROUTER_API_KEY, OPENAI_API_KEY, or GOOGLE_API_KEY
+# Optionally add GH_TOKEN for draft PR workflow
 
-### Single-Repository Mode
+docker compose up -d
+```
 
-The default. Pass `repo_url` (remote) or `repo_path` (local) and SWE-AF handles everything:
+Submit a build:
 
 ```bash
+# Default (Claude)
 curl -X POST http://localhost:8080/api/v1/execute/async/swe-planner.build \
   -H "Content-Type: application/json" \
-  -d '{
-    "input": {
-      "goal": "Add JWT auth",
-      "repo_url": "https://github.com/user/my-project"
-    }
-  }'
-```
-
-### Multi-Repository Mode
-
-When your work spans multiple codebases — a primary app plus shared libraries, monorepo sub-projects, or dependent microservices — pass `config.repos` as an array with roles:
+  -d @- <<'JSON'
+{
+  "input": {
+    "goal": "Add JWT auth",
+    "repo_url": "https://github.com/user/my-repo"
+  }
+}
+JSON
 
-```bash
+# With open-source runtime (set OPENROUTER_API_KEY in .env)
 curl -X POST http://localhost:8080/api/v1/execute/async/swe-planner.build \
   -H "Content-Type: application/json" \
-  -d '{
-    "input": {
-      "goal": "Add JWT auth across API and shared-lib",
-      "config": {
-        "repos": [
-          {
-            "repo_url": "https://github.com/org/main-app",
-            "role": "primary"
-          },
-          {
-            "repo_url": "https://github.com/org/shared-lib",
-            "role": "dependency"
-          }
-        ],
-        "runtime": "claude_code",
-        "models": { "default": "sonnet" }
+  -d @- <<'JSON'
+{
+  "input": {
+    "goal": "Add JWT auth",
+    "repo_url": "https://github.com/user/my-repo",
+    "config": {
+      "runtime": "open_code",
+      "models": {
+        "default": "openrouter/minimax/minimax-m2.5"
       }
     }
-  }'
+  }
+}
+JSON
+
+# Local workspace mode (repo_path)
+curl -X POST http://localhost:8080/api/v1/execute/async/swe-planner.build \
+  -H "Content-Type: application/json" \
+  -d @- <<'JSON'
+{
+  "input": {
+    "goal": "Add JWT auth",
+    "repo_path": "/workspaces/my-repo"
+  }
+}
+JSON
 ```
 
-**Roles:**
-- `primary` — The main application. Changes here drive the build; failures block progress.
-- `dependency` — Libraries or services modified to support the primary repo. Failures are captured but don't block.
+Scale workers:
 
-**Use cases:**
-- Primary app + shared SDK or utilities library
-- Monorepo sub-projects that live in separate repos
-- Feature spanning multiple microservices (e.g., API + worker queue)
+```bash
+docker compose up --scale swe-agent=3 -d
+```
 
-## One-Call DX
+Use a host control plane instead of the Docker control-plane service:
+
+```bash
+docker compose -f docker-compose.local.yml up -d
+```
 
-Every build is a single API call. Swap runtimes and assign models per agent role in one flat config:
+## GitHub Repo Workflow (Clone -> Build -> Draft PR)
+
+Pass `repo_url` instead of `repo_path` to let SWE-AF clone the repository and open a draft PR after the build completes.
 
 ```bash
 curl -X POST http://localhost:8080/api/v1/execute/async/swe-planner.build \
@@ -354,35 +489,25 @@ curl -X POST http://localhost:8080/api/v1/execute/async/swe-planner.build \
   -d @- <<'JSON'
 {
   "input": {
-    "goal": "Refactor and harden auth + billing flows",
     "repo_url": "https://github.com/user/my-project",
+    "goal": "Add comprehensive test coverage",
     "config": {
       "runtime": "claude_code",
       "models": {
         "default": "sonnet",
         "coder": "opus",
         "qa": "opus"
-      },
-      "enable_learning": true
+      }
     }
   }
 }
 JSON
 ```
 
-**Runtimes supported:**
-- `runtime: "claude_code"` — Claude backend (Anthropic)
-- `runtime: "open_code"` — OpenCode backend (OpenRouter/OpenAI/Google/Anthropic model IDs)
-
-For OpenRouter, use model IDs in `openrouter/<provider>/<model>` format (e.g., `openrouter/minimax/minimax-m2.5`).
-
-## GitHub Repo Workflow
-
-Pass `repo_url` instead of `repo_path` to let SWE-AF clone the repo and open a draft PR after the build completes.
-
 Requirements:
-- `GH_TOKEN` environment variable with `repo` scope
-- Token must have access to the target repository
+
+- `GH_TOKEN` in `.env` with `repo` scope
+- Access to the target repository for that token
 
 ## API Reference
 
@@ -392,7 +517,7 @@ Requirements:
 Core async endpoints (returns an `execution_id` immediately):
 
 ```bash
-# Full build: plan → execute → verify
+# Full build: plan -> execute -> verify
 POST /api/v1/execute/async/swe-planner.build
 
 # Plan only
@@ -420,24 +545,24 @@ Every specialist is also callable directly:
 <details>
 <summary><strong>Agent execution flow</strong></summary>
 
-| Agent | In → Out |
-|-------|----------|
-| `run_product_manager` | goal → PRD |
-| `run_architect` | PRD → architecture |
-| `run_tech_lead` | architecture → review |
-| `run_sprint_planner` | architecture → issue DAG |
-| `run_issue_writer` | issue spec → detailed issue |
-| `run_coder` | issue + worktree → code + tests + commit |
-| `run_qa` | worktree → test results |
-| `run_code_reviewer` | worktree → quality/security review |
-| `run_qa_synthesizer` | QA + review → FIX / APPROVE / BLOCK |
-| `run_issue_advisor` | failure context → adapt / split / accept / escalate |
-| `run_replanner` | build state + failures → restructured plan |
-| `run_merger` | branches → merged output |
-| `run_integration_tester` | merged repo → integration results |
-| `run_verifier` | repo + PRD → acceptance pass/fail |
-| `generate_fix_issues` | failed criteria → targeted fix issues |
-| `run_github_pr` | branch → push + draft PR |
+| Agent                    | In -> Out                                            |
+| ------------------------ | ---------------------------------------------------- |
+| `run_product_manager`    | goal -> PRD                                          |
+| `run_architect`          | PRD -> architecture                                  |
+| `run_tech_lead`          | architecture -> review                               |
+| `run_sprint_planner`     | architecture -> issue DAG                            |
+| `run_issue_writer`       | issue spec -> detailed issue                         |
+| `run_coder`              | issue + worktree -> code + tests + commit            |
+| `run_qa`                 | worktree -> test results                             |
+| `run_code_reviewer`      | worktree -> quality/security review                  |
+| `run_qa_synthesizer`     | QA + review -> FIX / APPROVE / BLOCK                 |
+| `run_issue_advisor`      | failure context -> adapt / split / accept / escalate |
+| `run_replanner`          | build state + failures -> restructured plan          |
+| `run_merger`             | branches -> merged output                            |
+| `run_integration_tester` | merged repo -> integration results                   |
+| `run_verifier`           | repo + PRD -> acceptance pass/fail                   |
+| `generate_fix_issues`    | failed criteria -> targeted fix issues               |
+| `run_github_pr`          | branch -> push + draft PR                            |
 
 </details>
 
@@ -446,23 +571,23 @@ Every specialist is also callable directly:
 
 Pass `config` to `build` or `execute`. Full schema: [`swe_af/execution/schemas.py`](swe_af/execution/schemas.py)
 
-| Key | Default | Description |
-|-----|---------|-------------|
-| `runtime` | `"claude_code"` | Model runtime: `"claude_code"` or `"open_code"` |
-| `models` | `null` | Flat role→model map (`default` + role keys below) |
-| `max_coding_iterations` | `5` | Inner-loop retry budget |
-| `max_advisor_invocations` | `2` | Middle-loop advisor budget |
-| `max_replans` | `2` | Build-level replanning budget |
-| `enable_issue_advisor` | `true` | Enable issue adaptation |
-| `enable_replanning` | `true` | Enable global replanning |
-| `enable_learning` | `false` | Enable cross-issue shared memory (continual learning) |
-| `agent_timeout_seconds` | `2700` | Per-agent timeout |
-| `agent_max_turns` | `150` | Tool-use turn budget |
+| Key                       | Default         | Description                                           |
+| ------------------------- | --------------- | ----------------------------------------------------- |
+| `runtime`                 | `"claude_code"` | Model runtime: `"claude_code"` or `"open_code"`       |
+| `models`                  | `null`          | Flat role -> model map (`default` + role keys below)  |
+| `max_coding_iterations`   | `5`             | Inner-loop retry budget                               |
+| `max_advisor_invocations` | `2`             | Middle-loop advisor budget                            |
+| `max_replans`             | `2`             | Build-level replanning budget                         |
+| `enable_issue_advisor`    | `true`          | Enable issue adaptation                               |
+| `enable_replanning`       | `true`          | Enable global replanning                              |
+| `enable_learning`         | `false`         | Enable cross-issue shared memory (continual learning) |
+| `agent_timeout_seconds`   | `2700`          | Per-agent timeout                                     |
+| `agent_max_turns`         | `150`           | Tool-use turn budget                                  |
 
 </details>
 
 <details>
-<summary><strong>Model role keys</strong></summary>
+<summary><strong>Model Role Keys</strong></summary>
 
 `models` supports:
 
@@ -472,7 +597,12 @@ Pass `config` to `build` or `execute`. Full schema: [`swe_af/execution/schemas.p
 - `replan`, `retry_advisor`, `issue_writer`, `issue_advisor`
 - `verifier`, `git`, `merger`, `integration_tester`
 
-Resolution order: `runtime defaults` < `models.default` < `models.<role>`
+</details>
+
+<details>
+<summary><strong>Resolution Order</strong></summary>
+
+`runtime defaults` < `models.default` < `models.<role>`
 
 </details>
 
@@ -532,7 +662,7 @@ make clean-examples
 </details>
 
 <details>
-<summary><strong>Security and community</strong></summary>
+<summary><strong>Security and Community</strong></summary>
 
 - Contribution guide: [`docs/CONTRIBUTING.md`](docs/CONTRIBUTING.md)
 - Code of conduct: [`CODE_OF_CONDUCT.md`](CODE_OF_CONDUCT.md)
@@ -544,4 +674,4 @@ make clean-examples
 
 ---
 
-SWE-AF is built on [AgentField](https://github.com/Agent-Field/agentfield) — a first step from single-agent harnesses to autonomous software engineering factories.
+SWE-AF is built on [AgentField](https://github.com/Agent-Field/agentfield) as a first step from single-agent harnesses to autonomous software engineering factories.