-
Notifications
You must be signed in to change notification settings - Fork 309
fix: Prevent mock data implementations with infrastructure features #95
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
fix: Prevent mock data implementations with infrastructure features #95
Conversation
Problem: The coding agent can implement in-memory storage (e.g., `dev-store.ts` with `globalThis`) instead of a real database. These implementations pass all tests because data persists during runtime, but data is lost on server restart. This is a root cause for leonvanzyl#68 - agent "passes" features that don't actually work. Solution: 1. Add 5 mandatory Infrastructure Features (indices 0-4) that run first: - Feature 0: Database connection established - Feature 1: Database schema applied correctly - Feature 2: Data persists across server restart (CRITICAL) - Feature 3: No mock data patterns in codebase - Feature 4: Backend API queries real database 2. Add STEP 5.7: Server Restart Persistence Test to coding prompt: - Create test data, stop server, restart, verify data still exists 3. Extend grep patterns for mock detection in STEP 5.6: - globalThis., devStore, dev-store, mockData, fakeData - TODO.*real, STUB, MOCK, new Map() as data stores Changes: - .claude/templates/initializer_prompt.template.md - Infrastructure features - .claude/templates/coding_prompt.template.md - STEP 5.6/5.7 enhancements - .claude/commands/create-spec.md - Phase 3b database question Backwards Compatible: - Works with YOLO mode (uses bash/grep, not browser automation) - Stateless apps can skip database features via create-spec question Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
|
Warning Rate limit exceeded
⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. 📝 WalkthroughWalkthroughAdds a mandatory Phase 3b question about persistent data that branches spec generation for DB-needed/unsure vs stateless, mandates five Infrastructure features (indices 0–4) as prerequisites when DB is required/uncertain, enforces infrastructure-first dependency rules, and expands persistence verification including grep checks and mandatory restart tests. Changes
Sequence Diagram(s)sequenceDiagram
participant User as User
participant SpecGen as SpecGenerator
participant Infra as InfraPlanner
participant Verifier as Verifier
participant DB as Database
User->>SpecGen: Start spec creation
SpecGen->>User: Ask Phase 3b: "Persistent data needed?"
alt DB needed / Unsure
SpecGen->>Infra: Insert mandatory infra features (0-4)
Infra->>DB: Provision/connect database
Infra->>Verifier: Mark infra as prerequisites
Verifier->>DB: Run persistence checks & restart test
Verifier-->>SpecGen: Report persistence status
else Stateless
SpecGen->>SpecGen: Annotate \<database\>none - stateless\</database>
SpecGen->>Infra: Skip DB infra items
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
.claude/templates/initializer_prompt.template.md (1)
196-219: Math error in category distribution totals.The TOTAL row (line 219) doesn't match the sum of the category rows:
Simple tier calculation:
- Infrastructure: 5
- Categories A-T: 5+15+20+10+10+10+8+5+5+5+5+8+10+8+8+8+5+5+5+5 = 160
- Actual sum: 165
- Table shows: 155 ❌ (off by 10)
Medium tier calculation:
- Infrastructure: 5
- Categories A-T: 20+25+30+20+15+20+10+10+8+10+8+12+15+10+10+10+8+8+6+5 = 260
- Actual sum: 265
- Table shows: 255 ❌ (off by 10)
Complex tier appears to sum correctly to 405+.
🔢 Corrected totals
-| **TOTAL** | **155** | **255** | **405+** | +| **TOTAL** | **165** | **265** | **405+** |Note: This change cascades to other files:
.claude/commands/create-spec.mdlines 231-233 reference these counts- Feature count examples throughout need updating
🤖 Fix all issues with AI agents
In @.claude/commands/create-spec.md:
- Around line 231-233: The feature/test counts and tier naming are inconsistent
between .claude/commands/create-spec.md (sections "Simple apps", "Medium apps",
"Advanced apps") and initializer_prompt.template.md (tiers with ~155/255/405+
tests and "Complex" naming); choose the authoritative source (or define a single
scope: "features" vs "tests"), standardize the tier name ("Advanced" or
"Complex"), and update both files so the numeric ranges and tier labels match
exactly — e.g., decide whether counts represent features or tests, adjust the
numbers in create-spec.md or initializer_prompt.template.md accordingly, and
ensure the headings and any cross-references use the same tier names.
- Around line 98-117: The spec text "Phase 3b: Database Requirements" currently
promises a "simplified infrastructure" for stateless apps but doesn't define it
and the initializer always creates exactly 5 infrastructure features (indices
0-4); fix by defining the simplified infra and adding conditional initializer
logic: explicitly state that for "<database>none - stateless
application</database>" the spec will create 0 infrastructure persistence tests
(or e.g., 2 lightweight infra checks like build and health endpoints) and show
exactly where the <database> tag sits (under the spec root/database element),
then update the initializer template to branch on the database requirement flag
(e.g., needsDatabase or the <database> value) so that when the value is "none -
stateless application" it generates the simplified feature set instead of
indices 0-4, otherwise generate the full five persistence tests.
In @.claude/templates/coding_prompt.template.md:
- Around line 226-230: After backgrounding ./init.sh and sleeping, add a
health-check to verify the server started: perform an HTTP request to the server
health endpoint (e.g., /api/health) using curl -f (or equivalent) and on failure
emit a clear error ("ERROR: Server failed to start") and exit non‑zero; make the
port configurable or replace PORT with the actual port; update the restart step
that runs ./init.sh & and sleep 15 to include this verification so the script
fails fast when the server never becomes responsive.
In @.claude/templates/initializer_prompt.template.md:
- Around line 167-176: Update the Feature 3 grep steps to match the
coding_prompt template by extending the --include flags to include ".js" in
addition to ".ts" and ".tsx", add "testData" to the mock-data pattern (so the
existing "mockData\|fakeData\|sampleData\|dummyData" becomes
"mockData\|testData\|fakeData\|sampleData\|dummyData"), add the TODO/STUB/MOCK
search pattern (e.g., "TODO.*real\|TODO.*database\|TODO.*API\|STUB\|MOCK"), add
development-conditionals search (e.g.,
"isDevelopment\|isDev\|process\.env\.NODE_ENV.*development"), and ensure the
grep for directories that may not exist uses error suppression "2>/dev/null"
where used (apply these changes to the lines containing "globalThis\.",
"dev-store\|devStore\|DevStore\|mock-db\|mockDb",
"mockData\|fakeData\|sampleData\|dummyData", and "new Map()\|new Set()" so the
initializer_prompt.template.md mirrors coding_prompt.template.md).
- Around line 86-112: The example JSON violates the rule that all features after
index 4 must depend on infrastructure indices [0,1,2,3,4]; update each object
with index >4 (e.g., "User can register", "User can login", "User can logout",
"User can create todo", "User can view todos", "User can edit todo", "User can
delete todo", "User can filter todos", "User can search todos") to include
[0,1,2,3,4] in their depends_on_indices arrays (while preserving existing
dependencies like foundation/auth links such as 5,8,9,11,12 to maintain
transitive ordering). Ensure every entry after the INFRASTRUCTURE TIER
explicitly contains [0,1,2,3,4] so the initializer follows the
infrastructure-first requirement.
🧹 Nitpick comments (3)
.claude/templates/coding_prompt.template.md (2)
219-224: Consider more robust server shutdown verification.The current approach has potential timing issues:
- Line 221:
sleep 5may not be sufficient for all server processes to fully terminate, especially under load.- Line 223: The verification
pgrep -f "node" && echo "ERROR..." && exit 1will cause the entire verification to fail if used in a script withset -e, even when successful.♻️ More robust shutdown verification
3. **STOP the server completely:** ```bash - pkill -f "node" || pkill -f "npm" || pkill -f "next" - sleep 5 + # Send SIGTERM first, then SIGKILL if needed + pkill -f "node" || pkill -f "npm" || pkill -f "next" + sleep 3 + pkill -9 -f "node" 2>/dev/null || true + sleep 2 # Verify server is stopped - pgrep -f "node" && echo "ERROR: Server still running!" && exit 1 + if pgrep -f "node" > /dev/null; then + echo "ERROR: Server still running!" + exit 1 + fi ```
236-239: Avoid redundant grep commands - reference STEP 5.6.Lines 237-238 repeat the grep patterns already defined in STEP 5.6. This creates maintenance burden if patterns need to be updated.
♻️ Reference existing step instead
6. **If data is GONE:** Implementation uses in-memory storage → CRITICAL FAIL - - Search for: `grep -r "globalThis\|devStore\|dev-store" src/` - - You MUST fix the mock data implementation before proceeding + - Run all grep commands from STEP 5.6 to identify the mock pattern + - You MUST fix the in-memory storage implementation before proceeding - Replace in-memory storage with real database queries.claude/templates/initializer_prompt.template.md (1)
289-297: Add package.json verification for mock backend libraries.Lines 291-293 list prohibited mock backend tools (
json-server,mirage,msw), but these are npm packages, not code patterns. The grep commands in Feature 3 (lines 170-173) won't catch these.📦 Add package.json check to Feature 3
In the test steps for Feature 3 (lines 167-176), add a package.json check:
Steps: 1. Run: grep -r "globalThis\." --include="*.ts" --include="*.tsx" --include="*.js" src/ 2. Run: grep -r "dev-store\|devStore\|DevStore\|mock-db\|mockDb" --include="*.ts" --include="*.tsx" src/ 3. Run: grep -r "mockData\|fakeData\|sampleData\|dummyData" --include="*.ts" src/ 4. Run: grep -r "new Map()\|new Set()" --include="*.ts" src/lib/ src/store/ src/data/ -5. ALL grep commands must return empty (exit code 1) -6. If any returns results → investigate and fix before passing +5. Check package.json for mock backend libraries: + grep -E "json-server|miragejs|msw" package.json +6. ALL checks must return empty (exit code 1) +7. If any returns results → investigate and fix before passingThis catches mock backends installed as dependencies.
| ### Phase 3b: Database Requirements (MANDATORY) | ||
|
|
||
| **Always ask this question regardless of mode:** | ||
|
|
||
| > "One foundational question about data storage: | ||
| > | ||
| > **Does this application need to store user data persistently?** | ||
| > | ||
| > 1. **Yes, needs a database** - Users create, save, and retrieve data (most apps) | ||
| > 2. **No, stateless** - Pure frontend, no data storage needed (calculators, static sites) | ||
| > 3. **Not sure** - Let me describe what I need and you decide" | ||
|
|
||
| **Branching logic:** | ||
|
|
||
| - **If "Yes" or "Not sure"**: Continue normally. The spec will include database in tech stack and the initializer will create 5 mandatory Infrastructure features (indices 0-4) to verify database connectivity and persistence. | ||
|
|
||
| - **If "No, stateless"**: Note this in the spec. Skip database from tech stack. Infrastructure features will be simplified (no database persistence tests). Mark this clearly: | ||
| ```xml | ||
| <database>none - stateless application</database> | ||
| ``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# List files in .claude directory structure
find .claude -type f -name "*.md" | head -20Repository: leonvanzyl/autocoder
Length of output: 607
🏁 Script executed:
#!/bin/bash
# Check if the mentioned files exist and get their sizes
ls -lh .claude/commands/create-spec.md .claude/templates/initializer_prompt.template.md 2>&1Repository: leonvanzyl/autocoder
Length of output: 240
🏁 Script executed:
#!/bin/bash
# Examine the context around lines 98-117 in create-spec.md
sed -n '95,120p' .claude/commands/create-spec.mdRepository: leonvanzyl/autocoder
Length of output: 1151
🏁 Script executed:
#!/bin/bash
# Search for stateless/conditional logic in initializer_prompt.template.md
rg -n "stateless|no database|skip infrastructure|Phase 3b" .claude/templates/initializer_prompt.template.md -C 3Repository: leonvanzyl/autocoder
Length of output: 46
🏁 Script executed:
#!/bin/bash
# Read initializer_prompt.template.md to see what it contains
wc -l .claude/templates/initializer_prompt.template.mdRepository: leonvanzyl/autocoder
Length of output: 116
🏁 Script executed:
#!/bin/bash
# Read initializer_prompt.template.md contents
cat -n .claude/templates/initializer_prompt.template.mdRepository: leonvanzyl/autocoder
Length of output: 19205
🏁 Script executed:
#!/bin/bash
# Search for infrastructure references in both files
rg -n "infrastructure|Infrastructure" .claude/commands/create-spec.md .claude/templates/initializer_prompt.template.md -C 2Repository: leonvanzyl/autocoder
Length of output: 8294
🏁 Script executed:
#!/bin/bash
# Search broadly for stateless in the entire .claude directory
rg -r "stateless" .claude/Repository: leonvanzyl/autocoder
Length of output: 1267
Reconcile stateless app handling between spec creation and initializer implementation.
The spec creation process mentions "simplified infrastructure" for stateless apps, but the initializer template doesn't implement this scenario:
-
Undefined simplification (line 114): States infrastructure will be "simplified (no database persistence tests)" for stateless apps, but doesn't specify:
- How many infrastructure features should be created (0? Fewer than 5?)
- What "simplified" features look like
- How this affects overall feature count
-
Unclear XML tag placement (lines 115-117): The XML tag
<database>none - stateless application</database>is shown but its location in the spec structure is not defined. -
Missing initializer implementation: The initializer template (lines 39-41, 119-131, 225) has no conditional logic for stateless apps. It always mandates exactly 5 infrastructure features (indices 0-4) for all complexity tiers, regardless of whether the app needs a database. This creates an inconsistency: the spec process promises simplified infrastructure for stateless apps, but the initializer can't deliver it.
Resolve by either:
- Defining what "simplified infrastructure" means (and updating the initializer to handle it), or
- Clarifying that stateless apps still require the 5 infrastructure features but with different test steps
🤖 Prompt for AI Agents
In @.claude/commands/create-spec.md around lines 98 - 117, The spec text "Phase
3b: Database Requirements" currently promises a "simplified infrastructure" for
stateless apps but doesn't define it and the initializer always creates exactly
5 infrastructure features (indices 0-4); fix by defining the simplified infra
and adding conditional initializer logic: explicitly state that for
"<database>none - stateless application</database>" the spec will create 0
infrastructure persistence tests (or e.g., 2 lightweight infra checks like build
and health endpoints) and show exactly where the <database> tag sits (under the
spec root/database element), then update the initializer template to branch on
the database requirement flag (e.g., needsDatabase or the <database> value) so
that when the value is "none - stateless application" it generates the
simplified feature set instead of indices 0-4, otherwise generate the full five
persistence tests.
| - **Simple apps** (todo list, calculator, notes): ~25-55 features (includes 5 infrastructure) | ||
| - **Medium apps** (blog, task manager with auth): ~105 features (includes 5 infrastructure) | ||
| - **Advanced apps** (e-commerce, CRM, full SaaS): ~155-205 features (includes 5 infrastructure) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Extract feature count references from both files
echo "=== create-spec.md feature counts ==="
rg "Simple apps|Medium apps|Advanced apps|Complex apps" .claude/commands/create-spec.md -B 1 -A 1
echo ""
echo "=== initializer_prompt.template.md feature counts ==="
rg "Simple|Medium|Complex|Advanced" .claude/templates/initializer_prompt.template.md | grep -E "apps|features|tests|Simple|Medium|Complex|Advanced" | head -30Repository: leonvanzyl/autocoder
Length of output: 735
Feature count inconsistency between create-spec.md and initializer_prompt.template.md.
The feature/test counts for corresponding app tiers don't align:
| Tier | create-spec.md | initializer_prompt.template.md |
|---|---|---|
| Simple | ~25-55 features | ~155 tests |
| Medium | ~105 features | ~255 tests |
| Advanced/Complex | ~155-205 features | ~405+ tests |
Additionally, the tiers use different naming ("Advanced" vs "Complex"), making it unclear whether they represent the same tier. The magnitude differences suggest these may be counting different things or based on different scope definitions. Clarify which counts are authoritative and align both templates accordingly.
🤖 Prompt for AI Agents
In @.claude/commands/create-spec.md around lines 231 - 233, The feature/test
counts and tier naming are inconsistent between .claude/commands/create-spec.md
(sections "Simple apps", "Medium apps", "Advanced apps") and
initializer_prompt.template.md (tiers with ~155/255/405+ tests and "Complex"
naming); choose the authoritative source (or define a single scope: "features"
vs "tests"), standardize the tier name ("Advanced" or "Complex"), and update
both files so the numeric ranges and tier labels match exactly — e.g., decide
whether counts represent features or tests, adjust the numbers in create-spec.md
or initializer_prompt.template.md accordingly, and ensure the headings and any
cross-references use the same tier names.
- Fix math error in category totals (155→165, 255→265) - Fix example JSON to include [0,1,2,3,4] dependencies for all features - Add more robust server shutdown (SIGTERM then SIGKILL) - Add health check after server restart - Align grep patterns between templates (add .js, testData, TODO/STUB/MOCK) - Add package.json check for mock backend libraries - Reference STEP 5.6 instead of duplicating grep commands Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In @.claude/templates/coding_prompt.template.md:
- Line 197: The grep pattern uses literal parentheses which are interpreted by
the shell/regex engine, so update the command that searches for "new Map()" and
"new Set()" to escape the parentheses (e.g., escape both '(' and ')' in the
patterns) so the literal strings "new Map()" and "new Set()" are matched; target
the grep invocation shown (the command containing new Map() and new Set() in the
template) and modify the patterns accordingly, or alternatively switch to an
extended regex flag and use escaped or bracketed forms to reliably match the
parentheses.
♻️ Duplicate comments (2)
.claude/templates/initializer_prompt.template.md (1)
171-175: CRITICAL: Incomplete file extension coverage in grep patterns.Lines 171-175 are missing
--include="*.js"in their grep commands. Only line 170 includes JavaScript files, but the remaining patterns (dev-store, mockData, TODO/STUB/MOCK, isDevelopment, Map/Set) only search.tsand.tsxfiles.Impact: If the project contains
.jsfiles with mock data patterns (e.g.,devStorein a.jsfile), they will not be detected, allowing mock implementations to slip through.🐛 Add .js extension to all grep patterns
-2. Run: grep -r "dev-store\|devStore\|DevStore\|mock-db\|mockDb" --include="*.ts" --include="*.tsx" src/ -3. Run: grep -r "mockData\|testData\|fakeData\|sampleData\|dummyData" --include="*.ts" --include="*.tsx" src/ -4. Run: grep -r "TODO.*real\|TODO.*database\|TODO.*API\|STUB\|MOCK" --include="*.ts" --include="*.tsx" src/ -5. Run: grep -r "isDevelopment\|isDev\|process\.env\.NODE_ENV.*development" --include="*.ts" --include="*.tsx" src/ -6. Run: grep -r "new Map()\|new Set()" --include="*.ts" --include="*.tsx" src/lib/ src/store/ src/data/ 2>/dev/null +2. Run: grep -r "dev-store\|devStore\|DevStore\|mock-db\|mockDb" --include="*.ts" --include="*.tsx" --include="*.js" src/ +3. Run: grep -r "mockData\|testData\|fakeData\|sampleData\|dummyData" --include="*.ts" --include="*.tsx" --include="*.js" src/ +4. Run: grep -r "TODO.*real\|TODO.*database\|TODO.*API\|STUB\|MOCK" --include="*.ts" --include="*.tsx" --include="*.js" src/ +5. Run: grep -r "isDevelopment\|isDev\|process\.env\.NODE_ENV.*development" --include="*.ts" --include="*.tsx" --include="*.js" src/ +6. Run: grep -r "new Map()\|new Set()" --include="*.ts" --include="*.tsx" --include="*.js" src/lib/ src/store/ src/data/ 2>/dev/null.claude/templates/coding_prompt.template.md (1)
232-238: Health check doesn't fail on server startup failure.The restart verification uses
|| echo "WARNING"which prevents the script from exiting when the server fails to start. This means the test could incorrectly report data loss when the server never actually restarted.🔧 Proposed fix
4. **RESTART the server:** ```bash ./init.sh & sleep 15 # Allow server to fully start # Verify server is responding - curl -f http://localhost:3000/api/health || curl -f http://localhost:3000 || echo "WARNING: Health check failed" + if ! curl -f http://localhost:3000/api/health && ! curl -f http://localhost:3000; then + echo "ERROR: Server failed to start after restart" + exit 1 + fi ```This ensures the test fails fast with a clear error when the server doesn't respond, rather than proceeding to incorrectly report that data was lost.
🧹 Nitpick comments (2)
.claude/templates/initializer_prompt.template.md (1)
136-189: Consider adding language specifiers to fenced code blocks.The infrastructure feature descriptions use fenced code blocks without language specifiers (lines 136, 145, 154, 168, 182). While this doesn't affect functionality, adding language identifiers improves syntax highlighting and follows markdown best practices.
♻️ Add language specifiers
For the test steps, you could use
textorbash(for grep commands):**Feature 0 - Database connection established:** -``` +```text Steps: 1. Start the development server ...Feature 1 - Database schema applied correctly:
-+text
Steps:
...**Feature 2 - Data persists across server restart (CRITICAL):** -``` +```bash Steps: ...Feature 3 - No mock data patterns in codebase:
-+bash
Steps:
...**Feature 4 - Backend API queries real database:** -``` +```text Steps: ...</details> Based on static analysis hints. </blockquote></details> <details> <summary>.claude/templates/coding_prompt.template.md (1)</summary><blockquote> `220-221`: **Consider more specific process matching.** The pattern `pkill -f "node"` matches any process with "node" in its command line, which could unintentionally kill other development servers or node processes running on the system. <details> <summary>💡 More specific alternatives</summary> If the project uses specific npm scripts or has a known structure, you could make the pattern more specific: ```bash # Option 1: Match the specific server script pkill -f "node.*init.sh\|next dev\|npm run dev" || pkill -f "npm" || true # Option 2: Kill by port if known lsof -ti:3000 | xargs kill -TERM 2>/dev/null || true sleep 3 lsof -ti:3000 | xargs kill -9 2>/dev/null || trueHowever, since this is a generic template that needs to work across different project structures, the current broad pattern may be acceptable as a tradeoff for simplicity.
- Escape parentheses in grep patterns: new Map\(\) and new Set\(\) - Add --include="*.js" to all grep commands for complete coverage Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Changed from warning-only to proper error handling: - if server doesn't respond after restart, exit with error - prevents false negatives when server fails to start Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In @.claude/templates/coding_prompt.template.md:
- Line 237: The health check is hardcoded to port 3000 ("curl -f
http://localhost:3000/api/health" and "http://localhost:3000") but elsewhere the
template uses a PORT placeholder, causing false failures; update the curl checks
to use the PORT variable (or detect the port from init.sh/env) and ensure the
same PORT variable is used at the later line that currently references PORT so
both checks are consistent; target the two curl invocations in the template (the
lines containing "curl -f http://localhost:3000/api/health" and "curl -f
http://localhost:3000") and replace them with a configurable $PORT or a
detection mechanism.
- Line 180: The grep lines in .claude/templates/coding_prompt.template.md claim
to exclude test files but do not actually include exclusion patterns; update
each of the five grep commands to add explicit exclusions such as
--exclude='*.test.*' --exclude='*.spec.*' and --exclude-dir='__tests__' (or
equivalent for your grep implementation) so src/ test files are skipped; ensure
you modify the exact grep command occurrences in the template (the five grep
commands) to include these flags consistently.
🧹 Nitpick comments (1)
.claude/templates/coding_prompt.template.md (1)
197-197: Inconsistent directory scope for Map/Set detection.This grep only searches
src/lib/ src/store/ src/data/, while all other patterns search the entiresrc/directory. In-memory Map/Set usage in othersrc/subdirectories (e.g.,src/components/,src/api/) would be missed.♻️ Align directory scope with other checks
If the intent is to catch Map/Set anywhere in production code:
-grep -r "new Map\(\)\|new Set\(\)" --include="*.ts" --include="*.tsx" --include="*.js" src/lib/ src/store/ src/data/ 2>/dev/null +grep -r "new Map\(\)\|new Set\(\)" --include="*.ts" --include="*.tsx" --include="*.js" src/ 2>/dev/nullIf limiting to specific directories is intentional, add a comment explaining why.
leonvanzyl
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PR Review: Infrastructure Features for Mock Data Prevention
Thanks for this PR - it addresses a real and significant problem where in-memory storage implementations pass all tests but fail in production.
Overall Assessment: Approve with Minor Changes Requested
The core approach is sound. CodeRabbit's concerns have been mostly addressed through your follow-up commits. A few issues remain:
Critical Issues
1. Windows Compatibility (HIGH)
STEP 5.7 uses pkill which doesn't exist on Windows:
pkill -f "node" || pkill -f "npm" || pkill -f "next"This will fail silently on Windows (due to || true and 2>/dev/null), leaving the server running while the test proceeds - potentially causing false "data lost" results.
Suggestion: Add a comment noting this is Unix/macOS only, or add Windows alternative:
# Windows: taskkill /F /FI "IMAGENAME eq node.exe" 2>nul || true2. Broad Process Killing Scope (HIGH)
pkill -f "node" kills ALL Node.js processes on the machine, including:
- VS Code (Electron-based)
- Claude Code itself
- Other development servers
- Any Node.js apps running
Suggestion: Use more specific patterns or port-based killing:
# Option 1: More specific pattern
pkill -f "node.*server\|next dev" || true
# Option 2: Kill by port (safer)
lsof -ti :3000 | xargs kill -9 2>/dev/null || trueMinor Issues (Non-Blocking)
-
Port hardcoded to 3000 - Health check uses hardcoded port while other parts of template use
$PORTvariable -
Map/Set pattern - Already scoped to
src/lib/ src/store/ src/data/which is acceptable, butnew Map()/new Set()are legitimate JS patterns that may cause false positives in some projects -
Stateless apps - Phase 3b says "Infrastructure features will be simplified" for stateless apps, but the exact simplified set isn't defined in the initializer template
What's Working Well
- Math totals corrected to 165/265/405+
- Example JSON now includes
[0,1,2,3,4]dependencies correctly - Health check properly exits on failure
- All grep patterns include
.jsextension - Parentheses properly escaped in grep patterns
Recommendation
Address the Windows compatibility note (even just a comment) and consider narrowing the pkill scope. The rest looks good to merge.
… support
Changes:
- Replace pkill -f "node" with port-based killing (lsof -ti :PORT)
- Safer: only kills dev server, not VS Code/Claude Code/other Node apps
- More specific: targets exact port instead of all Node processes
- Add Windows alternative commands (commented, for reference)
- Use ${PORT:-3000} variable instead of hardcoded port 3000
- Update health check and API verification to use PORT variable
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In @.claude/templates/coding_prompt.template.md:
- Line 197: The grep invocation currently limits the search to "src/lib/
src/store/ src/data/" and may miss occurrences; update the command (the string
"grep -r \"new Map\\(\\)\\|new Set\\(\\)\" --include=\"*.ts\"
--include=\"*.tsx\" --include=\"*.js\" src/lib/ src/store/ src/data/
2>/dev/null") to search the entire src/ tree (i.e., use src/ like the other grep
commands) so it finds in-memory stores in components/services/utils/pages.
🧹 Nitpick comments (2)
.claude/templates/coding_prompt.template.md (2)
184-194: Consider adding word boundaries to reduce false positives.The grep patterns for
mockData,fakeData,sampleData,dummyData,testData(line 188) and similar identifiers could match substrings in unrelated variables (e.g.,mockDatawould matchunmockDataHelperormockDataValidator).Adding word boundaries (
\bin grep with-Eor[[:<:]]/[[:>:]]in basic regex) would make these patterns more precise, though the current patterns may be intentionally permissive to catch variations.♻️ Add word boundaries for precision
# 2. Mock data variables -grep -r "mockData\|fakeData\|sampleData\|dummyData\|testData" --include="*.ts" --include="*.tsx" --include="*.js" src/ +grep -rE "\b(mockData|fakeData|sampleData|dummyData|testData)\b" --include="*.ts" --include="*.tsx" --include="*.js" src/Apply similar changes to other grep commands if desired.
227-229: Clarify when and how to use Windows alternative commands.The Windows alternative for server shutdown is provided in comments, but there's no clear instruction on when agents should use these commands vs. the Unix/macOS
lsofcommands. Additionally, the example shows only the port discovery step (netstat+findstr) but doesn't complete the workflow to actually kill the process.📝 Add usage guidance for Windows commands
- # Windows alternative (use if lsof not available): - # netstat -ano | findstr :${PORT:-3000} | findstr LISTENING - # taskkill /F /PID <pid_from_above> 2>nul + # Windows alternative (use if lsof command fails or on Windows): + # FOR /F "tokens=5" %P IN ('netstat -ano ^| findstr :${PORT:-3000} ^| findstr LISTENING') DO taskkill /F /PID %P 2>nul + # sleep 2Or provide clearer multi-step instructions for extracting and using the PID.
- Changed grep for "new Map()/new Set()" to search all of src/ - Previously only searched src/lib/, src/store/, src/data/ - Now consistent with other grep patterns that search entire src/ - Applied to both coding_prompt and initializer_prompt templates Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In @.claude/templates/initializer_prompt.template.md:
- Around line 136-142: The fenced code blocks in the initializer prompt template
(the block that begins with the "Steps:" list and the other similar fenced
examples) lack language identifiers which triggers markdownlint MD040; update
each triple-backtick fence to include an appropriate language tag (e.g., ```bash
for shell steps, ```json for JSON examples or ```text if none apply) for the
blocks shown in the template so the linter stops flagging them.
- Around line 153-164: In the "Feature 2 - Data persists across server restart
(CRITICAL)" section replace the unsafe pkill-based shutdown/check (the literal
pkill -f "node" usage) with a port-based stop/start verification: detect the
server process bound to the configured PORT (e.g., via lsof/ss/netstat lookup of
that port or by using an exposed /shutdown endpoint), use the specific PID from
that lookup to stop the server, verify the port is closed (no process listening)
before restart, and then restart and confirm the port is listening again; update
the step list and any example commands in the template to reference the
port-based approach and the template's PORT or server start/stop helper instead
of pkill.
Addresses reviewer feedback: 1. Windows Compatibility: Added Windows alternative using netstat/taskkill 2. Safer Process Killing: Changed from `pkill -f "node"` to port-based killing (`lsof -ti :$PORT`) to avoid killing unrelated Node processes like VS Code, Claude Code, or other development tools Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…r startup health check - Added test suite (tests/test_feature_95_orchestrator_startup_health_check.py) - 18 tests covering all 5 verification steps - Tests for empty database, healthy features, self-references, missing targets - Tests for cycles blocking startup, multiple issue handling - Tests verifying health check is called before feature loop - Added standalone verification script (tests/verify_feature_95.py) - Verifies startup hook exists and is called in run_loop - Verifies features are loaded from database - Verifies validate_dependency_graph() is called with feature dicts - Verifies issues handled by type (auto-fix vs block) - Verifies summary is logged Implementation verified in parallel_orchestrator.py: - _run_dependency_health_check() method runs before feature loop - Loads all features and calls validate_dependency_graph() - Auto-fixes self-references and missing targets - Blocks startup on cycles (returns False) - Logs comprehensive summary of health check results Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…n cycles - Modified _run_dependency_health_check() to return False when cycles are detected - Modified run_loop() to check health check result and exit early on cycles - Added comprehensive test suite (13 tests in tests/test_feature_97_cycle_blocks_startup.py) - Added verification script (tests/verify_feature_97.py) - Updated Feature leonvanzyl#95 tests to expect blocking behavior on cycles Verification steps implemented: 1. Insert features A -> B -> A into database - PASS 2. Attempt to start the orchestrator - PASS (returns False) 3. Verify startup is blocked with clear error message - PASS 4. Verify error message includes the cycle path: [A, B, A] - PASS 5. Verify error message instructs user to remove one dependency - PASS Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…nzyl#100, leonvanzyl#108, leonvanzyl#109, leonvanzyl#110 PR leonvanzyl#110 (Quality Gates): - Move quality checks before DB session to avoid holding locks - Return error instead of None for missing configured custom script - Use contextlib.closing for SQLite connections in progress.py PR leonvanzyl#109 (Rate Limit): - Extract rate limit logic to shared rate_limit_utils.py module - Remove duplicated code from agent.py and test_agent.py PR leonvanzyl#108 (SQLite Parallel): - Sort imports alphabetically in feature_mcp.py PR leonvanzyl#100 (Config Diagnostics): - Add logger.warning for pkill_processes validation failures PR leonvanzyl#95 (Infrastructure Mock): - Add language tags to fenced code blocks in initializer template Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Added 'text' language identifier to all fenced code blocks in the Infrastructure Feature Descriptions section to satisfy MD040. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
SUMMARY:
Fixed TypeScript build error in ProjectSetupRequired.tsx where startAgent
was being called with a boolean instead of an options object.
DETAILS:
- The startAgent API function signature was updated (in previous PR merges)
to accept an options object: { yoloMode?, parallelMode?, maxConcurrency?, testingAgentRatio? }
- ProjectSetupRequired.tsx was still calling it with the old signature:
startAgent(projectName, yoloMode) - passing boolean directly
- Changed to: startAgent(projectName, { yoloMode }) - wrapping in options object
This was the only remaining build error after merging 13+ PRs from upstream:
- PR leonvanzyl#112: Security vulnerabilities and race conditions
- PR leonvanzyl#89: Windows subprocess blocking fix
- PR leonvanzyl#109: Rate limit handling with exponential backoff
- PR leonvanzyl#88: MCP server config for ExpandChatSession
- PR leonvanzyl#100: Diagnostic warnings for config loading
- PR leonvanzyl#110: Quality gates (quality_gates.py)
- PR leonvanzyl#113: Structured logging (structured_logging.py)
- PR leonvanzyl#48: Knowledge files support (API, schemas, prompts)
- PR leonvanzyl#29: Feature editing/deletion (MCP tools)
- PR leonvanzyl#45: Chat persistence
- PR leonvanzyl#52: Refactoring feature guidance
- PR leonvanzyl#4: Project reset functionality
- PR leonvanzyl#95: UI polish, health checks, cross-platform fixes
Build now passes: npm run build succeeds with all 2245 modules transformed.
The comment said "excluding test files" but the grep commands didn't actually exclude them. Added common test file exclusion patterns. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Problem
The coding agent can implement in-memory storage (e.g.,
dev-store.tswithglobalThis) instead of a real database. These implementations pass all tests because data persists during runtime, but data is lost on server restart.This is a root cause for #68 - agent "passes" features that don't actually work.
Solution
1. Mandatory Infrastructure Features (indices 0-4)
All projects with a database now include 5 infrastructure features that run FIRST:
All other features depend on indices [0,1,2,3,4], ensuring infrastructure is verified first.
2. STEP 5.7: Server Restart Persistence Test
Added to
coding_prompt.template.md:pkill -f "node")3. Extended Mock Data Detection (STEP 5.6)
Enhanced grep patterns to catch:
globalThis.(in-memory storage)dev-store,devStore,mock-db,mockDbmockData,fakeData,sampleData,dummyDataTODO.*real,STUB,MOCKnew Map()/new Set()as data storesFiles Changed
.claude/templates/initializer_prompt.template.md- Infrastructure features definition.claude/templates/coding_prompt.template.md- STEP 5.6/5.7 enhancements.claude/commands/create-spec.md- Phase 3b database questionBackwards Compatible
Why This Matters
In-memory stores like
globalThis.devStorepass all simple tests because data persists during a single server run. Only a full server restart reveals this bug. Without infrastructure features, the agent can "pass" features with fundamentally broken implementations.Summary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings.