feat: visual validation, notable components, and Figma context improvements by rubenmarcus · Pull Request #254 · multivmlabs/ralph-starter

rubenmarcus · 2026-02-26T03:49:25Z

Summary

Visual validation: Automatic pixel-level + LLM vision comparison of implementation vs Figma screenshots on final loop iterations, with 2-iteration auto-extension to fix issues
Notable component detection: Identifies scroll indicators, social sidebars, nav elements in Figma sections with exact position hints, dimensions, and scroll behavior
Improved Figma context: Better z-index stacking guidance, STRETCH→cover mapping, scroll indicator patterns, completeness checklist, absolute positioning precision
New modules: dev-server.ts (Vite launcher + screenshot capture), visual-validation.ts (pixelmatch + vision API comparison)
CLI: Added --no-visual-check flag, vision cost tracking in CostTracker

Test plan

Run pnpm build — passes
Run ralph-starter run with a Figma spec that has screenshots in public/images/screenshots/ — visual validation triggers on final iteration
Verify --no-visual-check disables visual validation
Verify notable components appear in generated Figma plans (scroll indicators, sidebars)
Verify cost tracker reports vision API calls

🤖 Generated with Claude Code

…d Figma context - Add visual comparison validation (pixelmatch + LLM vision) that runs on final iterations when Figma screenshots are present, with automatic 2-iteration extension to fix visual issues - Add --no-visual-check CLI flag to disable visual validation - Add dev-server module for launching/capturing implementation screenshots - Detect notable sub-components in Figma sections (scroll indicators, social sidebars, nav elements) with position hints and scroll behavior - Improve context-builder Figma instructions: z-index stacking clarity, STRETCH→cover guidance, scroll indicator patterns, completeness checklist, absolute positioning precision - Enhance plan-generator with inner shadow CSS output, deeper node traversal (depth 8), and text-descendant inclusion at deep levels - Add vision call tracking to CostTracker Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

chatgpt-codex-connector · 2026-02-26T03:49:32Z

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

github-actions · 2026-02-26T03:49:32Z

Issue Linking Reminder

This PR doesn't appear to have a linked issue. Consider linking to:

This repo: Closes #123
ralph-ideas: Closes multivmlabs/ralph-ideas#123

Using Closes, Fixes, or Resolves will auto-close the issue when this PR is merged.

If this PR doesn't need an issue, you can ignore this message.

github-actions · 2026-02-26T03:50:03Z

⚠️ Bundle Size Analysis

Metric	Value
Base	2234.84 KB
PR	2314.06 KB
Diff	79.22 KB (3.00%)

Bundle breakdown

156K	dist/auth
32K	dist/automation
4.0K	dist/cli.d.ts
4.0K	dist/cli.d.ts.map
16K	dist/cli.js
12K	dist/cli.js.map
528K	dist/commands
28K	dist/config
4.0K	dist/index.d.ts
4.0K	dist/index.d.ts.map
4.0K	dist/index.js
4.0K	dist/index.js.map
828K	dist/integrations
84K	dist/llm
884K	dist/loop
188K	dist/mcp
32K	dist/presets
92K	dist/setup
40K	dist/skills
392K	dist/sources
76K	dist/ui
124K	dist/utils
336K	dist/wizard

src/loop/visual-validation.ts

+    body: JSON.stringify({
+      model: model || config.defaultModel,
+      max_tokens: 2048,
+      messages: [
+        { role: 'system', content: VISION_SYSTEM_PROMPT },
+        {
+          role: 'user',
+          content: [
+            { type: 'text', text: 'TARGET DESIGN from Figma:' },
+            {
+              type: 'image_url',
+              image_url: { url: `data:image/png;base64,${designBase64}` },
+            },
+            { type: 'text', text: 'CURRENT IMPLEMENTATION:' },
+            {
+              type: 'image_url',
+              image_url: { url: `data:image/png;base64,${implBase64}` },
+            },
+            { type: 'text', text: 'PIXEL DIFF OVERLAY (red = different):' },
+            {
+              type: 'image_url',
+              image_url: { url: `data:image/png;base64,${diffBase64}` },
+            },
+            { type: 'text', text: VISION_COMPARISON_PROMPT },
+          ],
+        },
+      ],
+    }),


greptile-apps · 2026-02-26T03:53:28Z

Greptile Summary

This PR adds a three-layer visual validation pipeline (pixelmatch → LLM Vision → strict pixel gate) that automatically compares implementation screenshots against Figma design screenshots on the final loop iteration, with 2-iteration auto-extension to fix issues. It also introduces notable component detection (scroll indicators, social sidebars, nav elements) in Figma plan generation and improves Figma context with better z-index stacking guidance, STRETCH→cover mapping, and a completeness checklist.

New modules: visual-validation.ts (609 lines — pixelmatch + LLM vision comparison) and dev-server.ts (171 lines — auto-detect and launch dev servers for screenshot capture)
Bug: CostTracker.reset() doesn't clear the new _visionCalls/_visionCost fields, leading to stale vision metrics after reset
Inconsistency: design-spec.ts still emits z-index: 1 in 4 places for text/content layers, while context-builder.ts and plan-generator.ts were updated to recommend z-index: 10+ — this creates contradictory guidance that may confuse the coding agent
Safety concern: dev-server.ts uses null as unknown as ResultPromise for pre-existing servers, which could cause runtime errors if .process is accessed by callers

Confidence Score: 3/5

Largely safe but has a z-index inconsistency bug that directly undermines one of the PR's goals, plus a missing reset of new state fields
The PR is well-structured and the visual validation pipeline is thoughtfully designed with proper fallback behavior. However, two concrete bugs lower confidence: (1) the z-index inconsistency in design-spec.ts contradicts the very fix this PR introduces — the agent will receive "use z-index: 1" from the spec output and "use z-index: 10+" from the context builder simultaneously, which is the stacking issue the PR aims to solve; (2) the CostTracker.reset() not clearing vision state will produce incorrect metrics. The unsafe null cast in dev-server.ts is also a runtime risk.
Pay close attention to src/integrations/figma/parsers/design-spec.ts (z-index inconsistency at lines 608, 670, 674, 704) and src/loop/cost-tracker.ts (incomplete reset method)

Important Files Changed

Filename	Overview
src/loop/visual-validation.ts	New 609-line module implementing a three-layer visual comparison pipeline (pixelmatch → LLM Vision → strict pixel gate). Well-structured with clear separation. Minor concern: auto-installing Playwright globally with npm rather than the project's package manager.
src/loop/dev-server.ts	New module for auto-detecting and launching dev servers. Uses unsafe `null as unknown as ResultPromise` cast when a server is already running on the target port, which could cause runtime errors if `.process` is accessed.
src/loop/executor.ts	Adds visual validation integration to the main loop with 2-iteration auto-extension pattern (mirroring existing build-fix extension). Non-fatal error handling is appropriate.
src/loop/cost-tracker.ts	Adds vision call tracking with `_visionCalls` and `_visionCost` fields. Bug: `reset()` method doesn't clear the new vision tracking state, leading to stale data after reset.
src/integrations/figma/parsers/design-spec.ts	Adds interactive element detection, deeper depth traversal (6→8), and `containsText` filter for deep nodes. Bug: z-index guidance still says `z-index: 1` in 4 places while context-builder was updated to `z-index: 10`, creating contradictory instructions.
src/integrations/figma/parsers/plan-generator.ts	Adds `NotableComponent` detection (indicators, sidebars, nav), inner shadow support, updated z-index stacking guidance. Well-implemented with proper category classification and position hints.
src/loop/context-builder.ts	Updates implementation guidance: STRETCH→cover mapping, z-index 10+ for text layers, scroll indicator patterns, completeness checklist, absolute positioning precision. All changes are prompt text updates — no logic risk.
src/cli.ts	Adds `--no-visual-check` CLI flag. Straightforward commander option addition.
src/commands/run.ts	Wires visual validation by auto-detecting screenshot PNGs in `public/images/screenshots/` and passing paths to the loop. Clean integration with existing patterns.
package.json	Adds `pixelmatch`, `pngjs`, and `@types/pngjs` dependencies. All are well-known, lightweight npm packages appropriate for pixel-level image comparison.
pnpm-lock.yaml	Lock file updated for new dependencies. Auto-generated.

Sequence Diagram

sequenceDiagram
    participant Executor as Loop Executor
    participant VV as Visual Validation
    participant DS as Dev Server
    participant PW as Playwright
    participant PM as pixelmatch
    participant LLM as LLM Vision API
    participant CT as Cost Tracker

    Executor->>VV: runVisualValidation(cwd, screenshots)
    VV->>PW: ensurePlaywright()
    alt Not installed
        PW-->>PW: npm install -g playwright
    end
    VV->>DS: startDevServer(cwd)
    DS->>DS: detectDevCommand(cwd)
    DS->>DS: waitForServer(url)
    DS-->>VV: DevServerInfo {url, port, kill}
    VV->>PW: captureScreenshot(url)
    PW-->>VV: implScreenshot (PNG Buffer)
    
    loop For each design screenshot
        VV->>PM: pixelDiff(design, impl)
        PM-->>VV: {diffRatio, diffImage}
        alt diffRatio ≤ 5% (normal) or ≤ 2% (strict)
            VV-->>VV: PASS — skip LLM
        else diffRatio > threshold
            VV->>LLM: compareWithLLMVision(design, impl, diff)
            LLM-->>VV: {matches, issues, usage}
        end
    end
    
    VV->>DS: server.kill()
    VV-->>Executor: VisualValidationResult
    Executor->>CT: recordVisionCall(usage)
    
    alt Issues found & first pass
        Executor-->>Executor: Extend loop +2 iterations
        Note over Executor: Agent fixes issues
        Executor->>VV: runVisualValidation(strictMode=true)
    end

_{Last reviewed commit: c14b83e}

greptile-apps

_{11 files reviewed, 4 comments}

_{Edit Code Review Agent Settings | Greptile}

greptile-apps · 2026-02-26T03:53:38Z

src/loop/visual-validation.ts

+  try {
+    await execa('npm', ['install', '-g', 'playwright'], {
+      timeout: 120_000,
+      reject: true,


Global npm install -g has side effects and may fail in restricted environments

Auto-installing Playwright globally with npm install -g playwright can fail in CI, Docker containers, or systems without global npm write access. It also bypasses the project's package manager (the repo uses pnpm). Consider using the project's package manager (pnpm add -D playwright), installing locally, or documenting this requirement so users can pre-install Playwright.

Prompt To Fix With AI

This is a comment left during a code review. Path: src/loop/visual-validation.ts Line: 125-128 Comment: **Global `npm install -g` has side effects and may fail in restricted environments** Auto-installing Playwright globally with `npm install -g playwright` can fail in CI, Docker containers, or systems without global npm write access. It also bypasses the project's package manager (the repo uses pnpm). Consider using the project's package manager (`pnpm add -D playwright`), installing locally, or documenting this requirement so users can pre-install Playwright. How can I resolve this? If you propose a fix, please make it concise.

greptile-apps · 2026-02-26T03:53:39Z

src/loop/dev-server.ts

+    return {
+      url,
+      port: devCmd.port,
+      process: null as unknown as ResultPromise,
+      kill: async () => {}, // Nothing to kill — not our process
+    };
+  }


Unsafe cast of null to ResultPromise for pre-existing server

When a server is already running on the detected port, process is set to null as unknown as ResultPromise. Any code that accesses server.process (e.g., checking .pid, calling methods on it) will get a runtime error. Consider making process optional (process?: ResultPromise) in the DevServerInfo interface, or using a discriminated union, so callers handle this case explicitly.

Suggested change

return {

url,

port: devCmd.port,

process: null as unknown as ResultPromise,

kill: async () => {}, // Nothing to kill — not our process

};

}

return {

url,

port: devCmd.port,

process: undefined as unknown as ResultPromise,

kill: async () => {}, // Nothing to kill — not our process

};

Prompt To Fix With AI

This is a comment left during a code review. Path: src/loop/dev-server.ts Line: 123-129 Comment: **Unsafe cast of `null` to `ResultPromise` for pre-existing server** When a server is already running on the detected port, `process` is set to `null as unknown as ResultPromise`. Any code that accesses `server.process` (e.g., checking `.pid`, calling methods on it) will get a runtime error. Consider making `process` optional (`process?: ResultPromise`) in the `DevServerInfo` interface, or using a discriminated union, so callers handle this case explicitly. ```suggestion return { url, port: devCmd.port, process: undefined as unknown as ResultPromise, kill: async () => {}, // Nothing to kill — not our process }; ``` How can I resolve this? If you propose a fix, please make it concise.

greptile-apps · 2026-02-26T03:53:42Z

Additional Comments (2)

src/loop/cost-tracker.ts
reset() doesn't clear vision tracking state

The reset() method clears this.iterations but does not reset the new _visionCalls and _visionCost fields. If reset() is called (e.g., between sessions), the vision metrics will carry stale data, producing incorrect stats.

  reset(): void {
    this.iterations = [];
    this._visionCalls = 0;
    this._visionCost = 0;
  }

Prompt To Fix With AI

This is a comment left during a code review.
Path: src/loop/cost-tracker.ts
Line: 454-456

Comment:
**`reset()` doesn't clear vision tracking state**

The `reset()` method clears `this.iterations` but does not reset the new `_visionCalls` and `_visionCost` fields. If `reset()` is called (e.g., between sessions), the vision metrics will carry stale data, producing incorrect stats.

```suggestion
  reset(): void {
    this.iterations = [];
    this._visionCalls = 0;
    this._visionCost = 0;
  }
```

How can I resolve this? If you propose a fix, please make it concise.

src/integrations/figma/parsers/design-spec.ts
Inconsistent z-index guidance (1 vs 10)

The context-builder.ts and plan-generator.ts were updated in this PR to recommend z-index: 10 for text/content layers over background images. However, design-spec.ts still emits z-index: 1 in several places (lines 608, 670, 674, 704). This creates contradictory guidance — the context tells the agent "use z-index 10+" while the spec output says "z-index: 1". When there are multiple stacked image layers (z-0 through z-4), z-index: 1 will place text behind higher image layers, which is exactly the bug this PR aims to fix.

Lines affected:

Line 608: z-index: 1 for child content over background image
Line 670: z-index: 1 for text/content in composite hero parallax
Line 674: z-index: 1 for content in smaller composite backgrounds
Line 704: z-index: 1 for text/content in non-text-overlay composites

              `  * All child content must use \`position: relative; z-index: 10\` to appear above the image`

Prompt To Fix With AI

This is a comment left during a code review.
Path: src/integrations/figma/parsers/design-spec.ts
Line: 608

Comment:
**Inconsistent z-index guidance (1 vs 10)**

The `context-builder.ts` and `plan-generator.ts` were updated in this PR to recommend `z-index: 10` for text/content layers over background images. However, `design-spec.ts` still emits `z-index: 1` in several places (lines 608, 670, 674, 704). This creates contradictory guidance — the context tells the agent "use z-index 10+" while the spec output says "z-index: 1". When there are multiple stacked image layers (z-0 through z-4), z-index: 1 will place text *behind* higher image layers, which is exactly the bug this PR aims to fix.

Lines affected:
- Line 608: `z-index: 1` for child content over background image
- Line 670: `z-index: 1` for text/content in composite hero parallax
- Line 674: `z-index: 1` for content in smaller composite backgrounds
- Line 704: `z-index: 1` for text/content in non-text-overlay composites

```suggestion
              `  * All child content must use \`position: relative; z-index: 10\` to appear above the image`
```

How can I resolve this? If you propose a fix, please make it concise.

github-actions bot assigned rubenmarcus Feb 26, 2026

github-actions bot added candidate-release PR is ready for release config core enhancement labels Feb 26, 2026

github-advanced-security bot found potential problems Feb 26, 2026

View reviewed changes

greptile-apps bot reviewed Feb 26, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat: visual validation, notable components, and Figma context improvements#254

feat: visual validation, notable components, and Figma context improvements#254
rubenmarcus wants to merge 1 commit intomainfrom
feat/visual-validation-and-figma-improvements

rubenmarcus commented Feb 26, 2026

Uh oh!

chatgpt-codex-connector bot commented Feb 26, 2026

Uh oh!

github-actions bot commented Feb 26, 2026

Uh oh!

github-actions bot commented Feb 26, 2026

Uh oh!

Check warning

greptile-apps bot commented Feb 26, 2026

Uh oh!

greptile-apps bot left a comment

Uh oh!

greptile-apps bot Feb 26, 2026

Uh oh!

greptile-apps bot Feb 26, 2026

Uh oh!

greptile-apps bot commented Feb 26, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

rubenmarcus commented Feb 26, 2026

Summary

Test plan

Uh oh!

chatgpt-codex-connector bot commented Feb 26, 2026

Uh oh!

github-actions bot commented Feb 26, 2026

Issue Linking Reminder

Uh oh!

github-actions bot commented Feb 26, 2026

⚠️ Bundle Size Analysis

Uh oh!

Check warning

Uh oh!

greptile-apps bot commented Feb 26, 2026

Greptile Summary

Confidence Score: 3/5

Important Files Changed

Sequence Diagram

Uh oh!

greptile-apps bot left a comment

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot Feb 26, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot Feb 26, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot commented Feb 26, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant