From 988a7eab317e3eff37198ac5a845f0e6022ad8fc Mon Sep 17 00:00:00 2001
From: Richard Hart <rhart@proactive-resolutions.com>
Date: Tue, 27 Jan 2026 10:55:38 -0800
Subject: [PATCH] Add Generated Artifact Verification to reflection framework

Add verification steps for generated artifacts before declaring work complete.
This catches common AI agent failures:
- Cross-references to non-existent tools/APIs
- Sensitive information in committed files (absolute paths, usernames)
- Documentation drift (stale counts, outdated references)
- Claims not verified against actual system state

Changes:
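- Add Step 1.4: Dependency & Impact Verification with checklist and commands
- Add Step 1.6: Generated Artifact Verification with checklist and commands
- Add Dependency/Impact Gaps to Refinement Triggers, including 4
  artifact-verification items
- Add 5 new items to Self-Refine Checklist (4 artifact-verification checks,
  1 dependency check)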

Motivation: External review consistently catches issues that self-reflection
misses. These verification steps formalize what external reviewers check.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---
 plugins/reflexion/commands/reflect.md | 68 ++++++++++++++++++++++++++-
 1 file changed, 66 insertions(+), 2 deletions(-)
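
Note for reviewers: the security scan added in Step 1.6 could also be
wrapped in a small reusable function. The following is a minimal sketch and
is not part of this patch; `scan_for_user_paths` is a hypothetical name, and
the pattern is the one used by the checklist command, written as an extended
regex. It assumes a POSIX shell and a grep that supports -E and -l.

```shell
#!/bin/sh
# Hypothetical helper (not in this patch): print every given file that
# contains a user-specific absolute path, and return 1 if any was found.
scan_for_user_paths() {
  # Nothing to scan when no files are passed.
  [ "$#" -gt 0 ] || return 0

  # -l lists matching file names only; -E makes | mean alternation.
  # "--" ends option parsing so file names starting with "-" are safe.
  # "|| true" keeps the assignment from failing when grep finds no match.
  hits=$(grep -lE '/home/|/Users/|C:\\Users|%USERPROFILE%' -- "$@" 2>/dev/null || true)

  if [ -n "$hits" ]; then
    printf '%s\n' "$hits"
    return 1
  fi
  return 0
}
```

A non-zero return makes the function usable as a gate, for example in a
pre-commit hook that passes it the staged file names. Matches are only
candidates: a documentation URL containing /home/ would also be reported.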

diff --git a/plugins/reflexion/commands/reflect.md b/plugins/reflexion/commands/reflect.md
index 0e3b0ed..be36a1a 100644
--- a/plugins/reflexion/commands/reflect.md
+++ b/plugins/reflexion/commands/reflect.md
@@ -81,12 +81,60 @@ Before proceeding, evaluate your most recent output against these criteria:
    - [ ] Are there edge cases that haven't been considered?
    - [ ] Could there be unintended side effects?
 
-4. **Fact-Checking Required**
+4. **Dependency & Impact Verification** (CRITICAL - per ISSUE-086, DEC-096)
+   - [ ] For ANY proposed addition/deletion/modification, have you checked for dependencies?
+   - [ ] Have you searched for related decisions (DEC-###) that this change may supersede, or that may supersede it?
+   - [ ] Have you checked AUTHORITATIVE.yaml for active evaluations or status?
+   - [ ] Have you searched the ecosystem for files/processes that depend on items being changed?
+   - [ ] If recommending removal of anything, have you verified nothing depends on it?
+
+   **Mandatory Checks Before Recommending Changes:**
+   ```bash
+   # Check for active evaluations/status
+   grep -A20 "item_name" ~/dev/AUTHORITATIVE.yaml | grep -iE "status|evaluation|active"
+
+   # Check for ecosystem dependencies
+   grep -ri "item_name" ~/dev/infrastructure/ --include="*.md" --include="*.yaml" | head -20
+
+   # Check for related/superseding decisions
+   grep -i "item_name" ~/dev/infrastructure/dev-env-docs/DECISIONS-LOG.md | head -10
+
+   # Check for dedicated project directories
+   find ~/dev/infrastructure -maxdepth 2 -type d -iname "*item_name*" 2>/dev/null
+   ```
+
+   **HARD RULE:** If ANY check reveals active dependencies, evaluations, or pending decisions, FLAG THIS IN THE EVALUATION. Do not approve work that recommends changes without dependency verification.
+
+5. **Fact-Checking Required**
    - [ ] Have you made any claims about performance? (needs verification)
    - [ ] Have you stated any technical facts? (needs source/verification)
    - [ ] Have you referenced best practices? (needs validation)
    - [ ] Have you made security assertions? (needs careful review)
 
+6. **Generated Artifact Verification** (CRITICAL for any generated code/content)
+   - [ ] **Cross-references validated**: Any references to external tools, APIs, or files verified to exist with correct names
+   - [ ] **Security scan**: Generated files checked for sensitive information (absolute paths with usernames, credentials, internal URLs)
+   - [ ] **Documentation sync**: If counts, stats, or references changed, all documentation citing them updated
+   - [ ] **State verification**: Claims about system state verified with actual commands, not memory
+
+   **Verification Commands (run before declaring complete):**
+   ```bash
+   # Cross-reference check: verify tool/API names exist
+   # Example for MCP tools:
+   grep -o 'mcp_[a-z_]*' generated_file.py | sort -u | while read -r tool; do
+     grep -qF -- "$tool" ~/.config/claude/claude_desktop_config.json || echo "MISSING: $tool"
+   done
+
+   # Security scan: check staged files for sensitive paths (Linux, macOS, Windows)
+   git diff --cached --name-only -z | xargs -0 grep -lE '/home/|/Users/|C:\\Users|%USERPROFILE%' 2>/dev/null
+
+   # Documentation sync: find docs referencing old values after changes
+   # Example: if you changed a count from 117 to 118
+   grep -rnw "117" docs/ *.md | grep -iE "count|total|items"
+   ```
+
+   **HARD RULE:** Do not declare work complete until verification commands confirm claims match reality.
+
 ### Step 2: Decision Point
 
 Based on the assessment above, determine:
@@ -526,7 +574,18 @@ Automatically trigger refinement if any of these conditions are met:
    - No library search for common problems
    - No consideration of existing services
 
-4. **Architecture Violations**
+4. **Dependency/Impact Gaps** (CRITICAL)
+   - Recommended deletion/removal without dependency check
+   - Cited prior decision (DEC-###) without checking for superseding decisions
+   - Proposed config changes without checking AUTHORITATIVE.yaml
+   - Modified ecosystem files without searching for dependents
+   - Any destructive action without PRE-MODIFICATION GATE checks
+   - Generated cross-references without validation against source of truth
+   - Committed files containing absolute paths or usernames
+   - Changed counts/stats without updating referencing documentation
+   - Declared complete without running verification commands
+
+5. **Architecture Violations**
    - Business logic in controllers/views
    - Domain logic depending on infrastructure
    - Unclear boundaries between contexts
@@ -548,6 +607,11 @@ Before finalizing any output:
 - [ ] Did I search for existing libraries before writing custom code?
 - [ ] Is the architecture aligned with Clean Architecture/DDD principles?
 - [ ] Are names domain-specific rather than generic (utils/helpers)?
+- [ ] **CROSS-REFERENCE CHECK:** Any tool/API/file references verified against actual inventory (not assumed)
+- [ ] **SECURITY CHECK:** Generated files scanned for sensitive info (paths, usernames, credentials)
+- [ ] **DOCUMENTATION SYNC:** All docs referencing changed values have been updated
+- [ ] **STATE VERIFICATION:** Claims verified with actual commands, not memory
+- [ ] **DEPENDENCY CHECK:** For any additions/deletions/modifications, have I verified no active dependencies, evaluations, or superseding decisions exist?
 
 ### Reflexion Questions