From d2d4f995a0822385c6e16fd1f48dcb1d83d5cd9c Mon Sep 17 00:00:00 2001
From: Pahe-Digital <phredondo@gmail.com>
Date: Sun, 1 Mar 2026 18:11:29 -0300
Subject: [PATCH] feat(squad-creator): add internalization quality system
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Problem: When creating agents from external reference documents
(PDF, DOCX, MD), squad-creator would create agents with a
`dependencies.context` field that loaded the source file at
runtime instead of embedding the intelligence in the agent YAML.
Agents broke when source files were moved or deleted, and
behaved inconsistently across repeated runs.

Solution: 4-layer system to enforce and verify internalization.

Changes:
- squad-creator (agent): add 2 CRITICAL core_principles blocking
  dependencies.context; add knowledge_extraction_process section
  with 6-step distillation (MAP→EXTRACT→TRANSFORM→STRUCTURE→
  VERIFY→DISCARD), quality_signals, and anti_patterns

- agent-template.md: add INTERNALIZATION RULE comment in the
  dependencies block warning against adding a context: key, so
  the constraint is visible at creation time

- squad-creator-extend.md: add Step 2.5 Internalization Gate with
  interactive flow for agents created from external references;
  hard rule: generated agent MUST NOT contain dependencies.context

- squad-creator-create.md: add RULE note in flow step 4 for agent
  generation from external references

- squad-creator-validate.md: add Validation Check #6 (Internalization
  Gate) with validateInternalization(); add 3 new error codes:
  AGENT_EXTERNAL_DEP (Error), AGENT_SHALLOW_PRINCIPLES (Warning),
  AGENT_SKELETAL_PERSONA (Warning)

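- squad-creator-analyze.md: add Step 4.5 internalization score
  calculation; add Internalization metric line to coverage report

- .claude, .codex, .gemini squad-creator.md: sync with the core
  agent definition (same additions) and drop the trailing
  "Synced from" footer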
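
Illustration (not part of this patch): the body of
validateInternalization() is not shown in the diff, so the block
below is only a sketch of how Validation Check #6 might map onto
the three new codes. It assumes each agent file has already been
parsed into a plain object; the { file, definition } input shape,
the return shape, and the "either field too short" persona rule
are assumptions made for the example.

```javascript
// Hypothetical sketch; names and shapes are assumptions.
const MIN_PRINCIPLES = 3;     // AGENT_SHALLOW_PRINCIPLES threshold
const MIN_PERSONA_CHARS = 30; // AGENT_SKELETAL_PERSONA threshold

function validateInternalization(agents) {
  const errors = [];
  const warnings = [];

  for (const { file, definition } of agents) {
    // Error: the forbidden key is present, whatever its value.
    const deps = definition.dependencies || {};
    if (Object.prototype.hasOwnProperty.call(deps, 'context')) {
      errors.push({ code: 'AGENT_EXTERNAL_DEP', file });
    }

    // Warning: core_principles missing, empty, or under 3 entries.
    const principles = Array.isArray(definition.core_principles)
      ? definition.core_principles
      : [];
    if (principles.length < MIN_PRINCIPLES) {
      warnings.push({ code: 'AGENT_SHALLOW_PRINCIPLES', file });
    }

    // Warning: persona.role or persona.identity under 30 chars.
    // Assumption: either field being too short is enough.
    const persona = definition.persona || {};
    const skeletal = ['role', 'identity'].some(
      (key) =>
        String(persona[key] || '').trim().length < MIN_PERSONA_CHARS
    );
    if (skeletal) {
      warnings.push({ code: 'AGENT_SKELETAL_PERSONA', file });
    }
  }

  return { valid: errors.length === 0, errors, warnings };
}
```

Under these assumptions only AGENT_EXTERNAL_DEP makes the result
invalid; the two warnings are reported but leave valid as true,
matching the Error/Warning severities listed above.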

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---
 .../development/agents/squad-creator.md       | 22 +++++++++++++
 .../tasks/squad-creator-analyze.md            | 25 +++++++++++++++
 .../development/tasks/squad-creator-create.md |  1 +
 .../development/tasks/squad-creator-extend.md | 32 ++++++++++++++++++-
 .../tasks/squad-creator-validate.md           | 13 +++++++-
 .../templates/squad/agent-template.md         |  3 ++
 .claude/commands/AIOS/agents/squad-creator.md | 24 ++++++++++++--
 .codex/agents/squad-creator.md                | 24 ++++++++++++--
 .gemini/rules/AIOS/agents/squad-creator.md    | 24 ++++++++++++--
 9 files changed, 160 insertions(+), 8 deletions(-)

diff --git a/.aios-core/development/agents/squad-creator.md b/.aios-core/development/agents/squad-creator.md
index e48611ca3..e98ab0c99 100644
--- a/.aios-core/development/agents/squad-creator.md
+++ b/.aios-core/development/agents/squad-creator.md
@@ -96,6 +96,28 @@ core_principles:
   - CRITICAL: Use JSON Schema for manifest validation
   - CRITICAL: Support 3-level distribution (Local, aios-squads, Synkra API)
   - CRITICAL: Integrate with existing squad-loader and squad-validator
+  - CRITICAL: NEVER create dependencies.context in agent definitions — all intelligence must be internalized in the agent YAML
+  - CRITICAL: When creating agents from external references (PDF/DOCX/MD), apply knowledge_extraction_process to embed knowledge — never point the agent back to the source file
+
+knowledge_extraction_process:
+  description: "When creating an agent from an external reference document, follow this 6-step distillation process to ensure all intelligence is fully internalized."
+  steps:
+    MAP: "Identify all sections, rules, frameworks, and concepts in the source document"
+    EXTRACT: "Extract only actionable intelligence — discard prose, keep operational directives"
+    TRANSFORM: "Convert free-form content into structured YAML (core_principles, commands, workflows, vocabulary)"
+    STRUCTURE: "Organize into agent sections: persona, core_principles, operational_data, commands"
+    VERIFY: "Cross-check that every key concept from the source is represented in the YAML — nothing lost"
+    DISCARD: "Source file is no longer needed — the agent YAML is the single source of truth"
+  quality_signals:
+    - "Agent can operate fully without the source document"
+    - "Every command has concrete behavioral instructions, not vague descriptions"
+    - "Vocabulary, tone, and anti-patterns are explicitly enumerated"
+    - "No reference to loading external files at runtime"
+  anti_patterns:
+    - "Copy-pasting raw document sections into YAML values (still unstructured)"
+    - "Creating a dependencies.context field pointing back to the source"
+    - "Summarizing instead of extracting (loses critical operational detail)"
+    - "Embedding prose paragraphs instead of actionable operational directives"
 
 # All commands require * prefix when used (e.g., *help)
 commands:
diff --git a/.aios-core/development/tasks/squad-creator-analyze.md b/.aios-core/development/tasks/squad-creator-analyze.md
index fdeff4328..d99dd5243 100644
--- a/.aios-core/development/tasks/squad-creator-analyze.md
+++ b/.aios-core/development/tasks/squad-creator-analyze.md
@@ -56,6 +56,7 @@ Checklist:
   - "[ ] Load squad.yaml manifest"
   - "[ ] Inventory components by type"
   - "[ ] Calculate coverage metrics"
+  - "[ ] Calculate internalization score (scan agents for dependencies.context)"
   - "[ ] Generate improvement suggestions"
   - "[ ] Format and display report"
 ---
@@ -170,6 +171,29 @@ const coverage = analyzer.calculateCoverage(inventory, manifest);
 // }
 ```
 
+### Step 4.5: Calculate Internalization Score
+
+```javascript
+const internalization = await analyzer.calculateInternalizationScore(squadPath, inventory);
+
+// Scans each agent file for:
+// 1. Presence of 'dependencies.context' key (CRITICAL violation — score = 0 for that agent)
+// 2. core_principles count (< 3 = shallow, >= 5 = good, >= 8 = excellent)
+// 3. persona.role/identity length (< 30 chars = skeletal)
+// 4. commands with substantive descriptions (< 10 chars = missing guidance)
+
+// Expected structure:
+// {
+//   total_agents: 3,
+//   fully_internalized: 3,
+//   has_external_deps: 0,       // agents with dependencies.context
+//   shallow_principles: 0,       // agents with < 3 core_principles
+//   skeletal_personas: 0,        // agents with brief role/identity
+//   score_percentage: 100,       // 0-100
+//   violations: []               // list of agent files with issues
+// }
+```
+
 ### Step 5: Generate Suggestions
 
 ```javascript
@@ -235,6 +259,7 @@ Coverage
   Tasks: {bar} {percentage}% ({details})
   Config: {bar} {percentage}% ({details})
   Docs: {bar} {percentage}% ({details})
+  Internalization: {bar} {score_percentage}% ({fully_internalized}/{total_agents} agents) {violations_indicator}
 
 Suggestions
   1. {suggestion-1}
diff --git a/.aios-core/development/tasks/squad-creator-create.md b/.aios-core/development/tasks/squad-creator-create.md
index 58251f0b0..35ea1b526 100644
--- a/.aios-core/development/tasks/squad-creator-create.md
+++ b/.aios-core/development/tasks/squad-creator-create.md
@@ -202,6 +202,7 @@ tags:
    ├── Generate squad.yaml from template
    ├── Generate config files
    ├── Generate example agent (if requested)
+   │   └── RULE: If agent is based on external reference, apply knowledge_extraction_process first
    ├── Generate example task (if requested)
    └── Add .gitkeep to empty directories
 
diff --git a/.aios-core/development/tasks/squad-creator-extend.md b/.aios-core/development/tasks/squad-creator-extend.md
index 3b5f7ce33..3831c6eeb 100644
--- a/.aios-core/development/tasks/squad-creator-extend.md
+++ b/.aios-core/development/tasks/squad-creator-extend.md
@@ -56,9 +56,10 @@ Checklist:
   - "[ ] Validate squad exists"
   - "[ ] Collect component type"
   - "[ ] Collect component name and metadata"
+  - "[ ] If adding agent with external reference: apply knowledge_extraction_process before generating"
   - "[ ] Create file from template"
   - "[ ] Update squad.yaml manifest"
-  - "[ ] Run validation"
+  - "[ ] Run validation (includes internalization gate)"
   - "[ ] Display result and next steps"
 ---
 
@@ -190,6 +191,35 @@ if (componentType === 'task' && !agentId) {
 }
 ```
 
+### Step 2.5: Internalization Gate (Agents Only)
+
+When the component type is `agent`, ask if the user has an external reference document:
+
+```
+? Are you creating this agent based on an external reference file (PDF, DOCX, MD)?
+  1. No — I'll define the agent from scratch
+  2. Yes — I have a source document to extract from
+
+> 2
+
+? Provide path or paste the key content sections you want to internalize:
+> [user provides content]
+```
+
+If the user provides a reference, apply `knowledge_extraction_process` BEFORE generating the file:
+
+```
+1. MAP       → List all key concepts, rules, frameworks in the source
+2. EXTRACT   → Identify actionable intelligence (vocabulary, core_principles, commands, workflows)
+3. TRANSFORM → Convert to YAML structures
+4. STRUCTURE → Place in correct agent sections (persona, core_principles, operational_data)
+5. VERIFY    → Confirm all key concepts are represented — nothing lost
+6. DISCARD   → Source file is NOT referenced — agent YAML is the sole source of truth
+```
+
+**HARD RULE:** The generated agent file MUST NOT contain a `dependencies.context` key.
+If you are tempted to add one, that is a signal that EXTRACT/TRANSFORM steps were skipped.
+
 ### Step 3: Create Component File
 
 ```javascript
diff --git a/.aios-core/development/tasks/squad-creator-validate.md b/.aios-core/development/tasks/squad-creator-validate.md
index e246aa44f..5bfd85955 100644
--- a/.aios-core/development/tasks/squad-creator-validate.md
+++ b/.aios-core/development/tasks/squad-creator-validate.md
@@ -67,6 +67,13 @@ Validates a squad against the JSON Schema and TASK-FORMAT-SPECIFICATION-V1.
 - Warns if project-level reference doesn't exist
 - Errors if local reference doesn't exist
 
+### 6. Internalization Gate
+- Scans all agent files in `agents/` for presence of `dependencies.context` key
+- **CRITICAL ERROR** if any agent contains `dependencies.context` — external runtime dependencies are forbidden
+- Checks that agent YAML has substantive `core_principles` (at least 3 entries)
+- Checks that agent `commands` have descriptions of at least 10 chars (signals real behavioral guidance)
+- Warns if agent `persona` sections are skeletal (role/identity under 30 chars)
+
 ## Flow
 
 ```
@@ -79,7 +86,8 @@ Validates a squad against the JSON Schema and TASK-FORMAT-SPECIFICATION-V1.
    ├── validateStructure() → Directory check
    ├── validateTasks() → Task format check
    ├── validateAgents() → Agent format check
-   └── validateConfigReferences() → Config path check (SQS-10)
+   ├── validateConfigReferences() → Config path check (SQS-10)
+   └── validateInternalization() → Internalization gate (forbidden dependencies.context)
 
 3. Format and display result
    ├── Show errors (if any)
@@ -120,6 +128,9 @@ Result: VALID (with warnings)
 | `TASK_MISSING_FIELD` | Warning | Task missing recommended field |
 | `AGENT_INVALID_FORMAT` | Warning | Agent file may not follow format |
 | `INVALID_NAMING` | Warning | Filename not in kebab-case |
+| `AGENT_EXTERNAL_DEP` | Error | Agent contains dependencies.context (forbidden — must internalize) |
+| `AGENT_SHALLOW_PRINCIPLES` | Warning | Agent core_principles is empty or has fewer than 3 entries |
+| `AGENT_SKELETAL_PERSONA` | Warning | Agent persona.role/identity are too brief to be operational |
 
 ## Implementation
 
diff --git a/.aios-core/development/templates/squad/agent-template.md b/.aios-core/development/templates/squad/agent-template.md
index e58c9f550..fb23cb46a 100644
--- a/.aios-core/development/templates/squad/agent-template.md
+++ b/.aios-core/development/templates/squad/agent-template.md
@@ -43,6 +43,9 @@ commands:
     description: "Exit agent mode"
 
 dependencies:
+  # INTERNALIZATION RULE: NEVER add a 'context:' key here.
+  # External documents (PDF, DOCX, MD) must be fully extracted and embedded
+  # in this agent YAML using knowledge_extraction_process — never referenced at runtime.
   tasks: []
   templates: []
   checklists: []
diff --git a/.claude/commands/AIOS/agents/squad-creator.md b/.claude/commands/AIOS/agents/squad-creator.md
index c700fe37b..e98ab0c99 100644
--- a/.claude/commands/AIOS/agents/squad-creator.md
+++ b/.claude/commands/AIOS/agents/squad-creator.md
@@ -96,6 +96,28 @@ core_principles:
   - CRITICAL: Use JSON Schema for manifest validation
   - CRITICAL: Support 3-level distribution (Local, aios-squads, Synkra API)
   - CRITICAL: Integrate with existing squad-loader and squad-validator
+  - CRITICAL: NEVER create dependencies.context in agent definitions — all intelligence must be internalized in the agent YAML
+  - CRITICAL: When creating agents from external references (PDF/DOCX/MD), apply knowledge_extraction_process to embed knowledge — never point the agent back to the source file
+
+knowledge_extraction_process:
+  description: "When creating an agent from an external reference document, follow this 6-step distillation process to ensure all intelligence is fully internalized."
+  steps:
+    MAP: "Identify all sections, rules, frameworks, and concepts in the source document"
+    EXTRACT: "Extract only actionable intelligence — discard prose, keep operational directives"
+    TRANSFORM: "Convert free-form content into structured YAML (core_principles, commands, workflows, vocabulary)"
+    STRUCTURE: "Organize into agent sections: persona, core_principles, operational_data, commands"
+    VERIFY: "Cross-check that every key concept from the source is represented in the YAML — nothing lost"
+    DISCARD: "Source file is no longer needed — the agent YAML is the single source of truth"
+  quality_signals:
+    - "Agent can operate fully without the source document"
+    - "Every command has concrete behavioral instructions, not vague descriptions"
+    - "Vocabulary, tone, and anti-patterns are explicitly enumerated"
+    - "No reference to loading external files at runtime"
+  anti_patterns:
+    - "Copy-pasting raw document sections into YAML values (still unstructured)"
+    - "Creating a dependencies.context field pointing back to the source"
+    - "Summarizing instead of extracting (loses critical operational detail)"
+    - "Embedding prose paragraphs instead of actionable operational directives"
 
 # All commands require * prefix when used (e.g., *help)
 commands:
@@ -340,5 +362,3 @@ Type `*help` to see all commands, or `*guide` for detailed usage.
 - **@devops (Gage)** - Handles deployment
 
 ---
----
-*AIOS Agent - Synced from .aios-core/development/agents/squad-creator.md*
diff --git a/.codex/agents/squad-creator.md b/.codex/agents/squad-creator.md
index c700fe37b..e98ab0c99 100644
--- a/.codex/agents/squad-creator.md
+++ b/.codex/agents/squad-creator.md
@@ -96,6 +96,28 @@ core_principles:
   - CRITICAL: Use JSON Schema for manifest validation
   - CRITICAL: Support 3-level distribution (Local, aios-squads, Synkra API)
   - CRITICAL: Integrate with existing squad-loader and squad-validator
+  - CRITICAL: NEVER create dependencies.context in agent definitions — all intelligence must be internalized in the agent YAML
+  - CRITICAL: When creating agents from external references (PDF/DOCX/MD), apply knowledge_extraction_process to embed knowledge — never point the agent back to the source file
+
+knowledge_extraction_process:
+  description: "When creating an agent from an external reference document, follow this 6-step distillation process to ensure all intelligence is fully internalized."
+  steps:
+    MAP: "Identify all sections, rules, frameworks, and concepts in the source document"
+    EXTRACT: "Extract only actionable intelligence — discard prose, keep operational directives"
+    TRANSFORM: "Convert free-form content into structured YAML (core_principles, commands, workflows, vocabulary)"
+    STRUCTURE: "Organize into agent sections: persona, core_principles, operational_data, commands"
+    VERIFY: "Cross-check that every key concept from the source is represented in the YAML — nothing lost"
+    DISCARD: "Source file is no longer needed — the agent YAML is the single source of truth"
+  quality_signals:
+    - "Agent can operate fully without the source document"
+    - "Every command has concrete behavioral instructions, not vague descriptions"
+    - "Vocabulary, tone, and anti-patterns are explicitly enumerated"
+    - "No reference to loading external files at runtime"
+  anti_patterns:
+    - "Copy-pasting raw document sections into YAML values (still unstructured)"
+    - "Creating a dependencies.context field pointing back to the source"
+    - "Summarizing instead of extracting (loses critical operational detail)"
+    - "Embedding prose paragraphs instead of actionable operational directives"
 
 # All commands require * prefix when used (e.g., *help)
 commands:
@@ -340,5 +362,3 @@ Type `*help` to see all commands, or `*guide` for detailed usage.
 - **@devops (Gage)** - Handles deployment
 
 ---
----
-*AIOS Agent - Synced from .aios-core/development/agents/squad-creator.md*
diff --git a/.gemini/rules/AIOS/agents/squad-creator.md b/.gemini/rules/AIOS/agents/squad-creator.md
index c700fe37b..e98ab0c99 100644
--- a/.gemini/rules/AIOS/agents/squad-creator.md
+++ b/.gemini/rules/AIOS/agents/squad-creator.md
@@ -96,6 +96,28 @@ core_principles:
   - CRITICAL: Use JSON Schema for manifest validation
   - CRITICAL: Support 3-level distribution (Local, aios-squads, Synkra API)
   - CRITICAL: Integrate with existing squad-loader and squad-validator
+  - CRITICAL: NEVER create dependencies.context in agent definitions — all intelligence must be internalized in the agent YAML
+  - CRITICAL: When creating agents from external references (PDF/DOCX/MD), apply knowledge_extraction_process to embed knowledge — never point the agent back to the source file
+
+knowledge_extraction_process:
+  description: "When creating an agent from an external reference document, follow this 6-step distillation process to ensure all intelligence is fully internalized."
+  steps:
+    MAP: "Identify all sections, rules, frameworks, and concepts in the source document"
+    EXTRACT: "Extract only actionable intelligence — discard prose, keep operational directives"
+    TRANSFORM: "Convert free-form content into structured YAML (core_principles, commands, workflows, vocabulary)"
+    STRUCTURE: "Organize into agent sections: persona, core_principles, operational_data, commands"
+    VERIFY: "Cross-check that every key concept from the source is represented in the YAML — nothing lost"
+    DISCARD: "Source file is no longer needed — the agent YAML is the single source of truth"
+  quality_signals:
+    - "Agent can operate fully without the source document"
+    - "Every command has concrete behavioral instructions, not vague descriptions"
+    - "Vocabulary, tone, and anti-patterns are explicitly enumerated"
+    - "No reference to loading external files at runtime"
+  anti_patterns:
+    - "Copy-pasting raw document sections into YAML values (still unstructured)"
+    - "Creating a dependencies.context field pointing back to the source"
+    - "Summarizing instead of extracting (loses critical operational detail)"
+    - "Embedding prose paragraphs instead of actionable operational directives"
 
 # All commands require * prefix when used (e.g., *help)
 commands:
@@ -340,5 +362,3 @@ Type `*help` to see all commands, or `*guide` for detailed usage.
 - **@devops (Gage)** - Handles deployment
 
 ---
----
-*AIOS Agent - Synced from .aios-core/development/agents/squad-creator.md*