Skip to content

test: add comprehensive unit tests for surface-checker#232

Closed
nikolasdehor wants to merge 1 commit intoSynkraAI:mainfrom
nikolasdehor:test/surface-checker-coverage
Closed

test: add comprehensive unit tests for surface-checker#232
nikolasdehor wants to merge 1 commit intoSynkraAI:mainfrom
nikolasdehor:test/surface-checker-coverage

Conversation

@nikolasdehor
Copy link
Contributor

@nikolasdehor nikolasdehor commented Feb 18, 2026

Summary

Closes #231

  • Add 56 unit tests for the SurfaceChecker class
  • Tests cover all condition evaluation operators and edge cases
  • Tests verify message interpolation, surface decision logic, and validation

Test Coverage

Area Tests Key Scenarios
Constructor/Loading 6 YAML loading, missing files, parse errors
Condition Evaluation 14 >, >=, <, <=, ==, IN, OR, AND, boolean, scope, floats
Message Interpolation 6 Variables, cost formatting, missing vars, null handling
shouldSurface 8 Match/no-match, evaluation order, missing criteria
Helper Methods 10 getActionConfig, getCriteria, destructive actions
Validation 7 Valid/invalid criteria, missing fields
Factory Functions 5 createSurfaceChecker, convenience shouldSurface

Test Plan

  • All 56 tests pass with npx jest tests/core/orchestration/surface-checker.test.js
  • No impact on existing tests

Summary by CodeRabbit

  • Tests
    • Added comprehensive unit test suite with extensive coverage for core orchestration functionality, including constructor behavior, criteria loading, condition evaluation, message interpolation, and data validation across various scenarios and error conditions.

Add 56 tests covering the SurfaceChecker class that determines when
Bob should surface to ask the human for decisions.

Tests cover:
- Constructor and criteria loading from YAML
- Condition evaluation: >, >=, <, <=, ==, IN, OR, AND, boolean, scope
- Message interpolation with variable substitution and cost formatting
- shouldSurface decision logic with evaluation order priority
- Helper methods: getActionConfig, getCriteria, getDestructiveActions
- Validation of criteria file structure
- Factory functions: createSurfaceChecker, shouldSurface convenience
Copilot AI review requested due to automatic review settings February 18, 2026 19:11
@vercel
Copy link

vercel bot commented Feb 18, 2026

@nikolasdehor is attempting to deploy a commit to the Pedro Valério Lopez's projects Team on Vercel.

A member of the Team first needs to authorize it.

@coderabbitai
Copy link

coderabbitai bot commented Feb 18, 2026

Walkthrough

Adds a comprehensive unit test suite for the SurfaceChecker module with 638 lines of test coverage. Tests include mocked filesystem and YAML loading, constructor behavior, criteria loading scenarios, condition evaluation across multiple operators, message interpolation with variable substitution, surface decision workflow, and validation of helper methods.

Changes

Cohort / File(s) Summary
Test Suite Addition
tests/core/orchestration/surface-checker.test.js
New comprehensive unit test file covering constructor behavior, criteria loading (success/error cases), condition evaluation operators (>, >=, <, <=, ==, IN, OR, AND, boolean), message interpolation with variable substitution and number formatting, shouldSurface workflow with evaluation ordering, helper methods (getActionConfig, getCriteria, getDestructiveActions, validate), factory functions, and surface decision scenarios.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested labels

core, tests

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and concisely summarizes the main change: adding comprehensive unit tests for the surface-checker module.
Linked Issues check ✅ Passed The PR comprehensively addresses all acceptance criteria from issue #231: tests cover all condition operators, message interpolation, surface decision logic with evaluation order, criteria validation, and factory functions.
Out of Scope Changes check ✅ Passed All changes are within scope—the PR adds only unit tests for surface-checker.js as required, with no modifications to production code or unrelated files.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds comprehensive unit tests for the SurfaceChecker class, which determines when Bob should interrupt and ask for human decisions based on codified criteria (cost thresholds, risk levels, errors, etc.). The tests use mocked dependencies (fs, js-yaml) to isolate the unit being tested from the file system, distinguishing them from the existing integration tests in tests/core/surface-checker.test.js that load actual YAML files.

Changes:

  • Added 56 unit tests organized into 7 logical sections covering all public methods and factory functions
  • Tests verify condition evaluation operators (>, >=, <, <=, ==, IN, OR, AND), message interpolation, surface decision logic, and validation
  • Comprehensive edge case coverage including missing files, parse errors, null handling, and default values

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.


expect(checker.evaluateCondition('score > 50 AND risk_level == "HIGH"', {
score: 80, risk_level: 'LOW',
})).toBe(false);
Copy link

Copilot AI Feb 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The AND condition test is missing a case where both conditions are false. Consider adding a test case like:

expect(checker.evaluateCondition('score > 50 AND risk_level == "HIGH"', {
score: 30, risk_level: 'LOW',
})).toBe(false);

This ensures comprehensive coverage of all AND logic branches (both true, one true, both false).

Suggested change
})).toBe(false);
})).toBe(false);
expect(checker.evaluateCondition('score > 50 AND risk_level == "HIGH"', {
score: 30, risk_level: 'LOW',
})).toBe(false);

Copilot uses AI. Check for mistakes.
const result = checker.shouldSurface({ score: 80 });

expect(result.should_surface).toBe(true);
expect(result.criterion_id).toBe('TEST-1');
Copy link

Copilot AI Feb 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test criterion at line 370-375 doesn't include a severity field, which provides a perfect opportunity to test the default severity behavior. Consider adding an assertion to verify that result.severity equals 'info' when severity is not specified in the criterion, to ensure complete coverage of the default behavior in surface-checker.js line 262.

Suggested change
expect(result.criterion_id).toBe('TEST-1');
expect(result.criterion_id).toBe('TEST-1');
expect(result.severity).toBe('info');

Copilot uses AI. Check for mistakes.
const checker = new SurfaceChecker('/test/bad.yaml');
const result = checker.load();

expect(result).toBe(false);
Copy link

Copilot AI Feb 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test verifies that load() returns false on parse error, but doesn't verify the internal state. Consider adding assertions to check that checker._loaded remains false and checker.criteria remains null after a parse error, similar to the test at lines 108-109 for missing file.

Suggested change
expect(result).toBe(false);
expect(result).toBe(false);
expect(checker._loaded).toBe(false);
expect(checker.criteria).toBeNull();

Copilot uses AI. Check for mistakes.
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (2)
tests/core/orchestration/surface-checker.test.js (2)

215-223: Consider adding the short-circuit false AND true → false case to the AND test.

The current tests cover (true AND true) and (true AND false). Adding (false AND true) exercises the short-circuit path explicitly.

💡 Suggested addition
     expect(checker.evaluateCondition('score > 50 AND risk_level == "HIGH"', {
       score: 80, risk_level: 'LOW',
     })).toBe(false);
+
+    expect(checker.evaluateCondition('score > 50 AND risk_level == "HIGH"', {
+      score: 30, risk_level: 'HIGH',
+    })).toBe(false);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/core/orchestration/surface-checker.test.js` around lines 215 - 223, Add
an explicit short-circuit test for the `AND` operator by asserting that
`checker.evaluateCondition('score > 50 AND risk_level == "HIGH"', { score: 40,
risk_level: 'HIGH' })` returns false; update the
`tests/core/orchestration/surface-checker.test.js` `test('evaluates AND
conditions correctly', ...)` block that calls `checker.evaluateCondition` so it
includes this `false AND true → false` case (use the existing `checker` instance
and the same expression string to keep consistency).

122-131: Consider adding a test that _ensureLoaded is idempotent — i.e., it does not call load() again when already loaded.

The current test only covers the "not yet loaded" path. The short-circuit guard (_loaded === true → skip) is a meaningful correctness invariant that's worth asserting.

💡 Suggested addition
+    test('_ensureLoaded does not reload when already loaded', () => {
+      const checker = new SurfaceChecker('/test/criteria.yaml');
+      checker._loaded = true;
+      checker.criteria = MOCK_CRITERIA;
+      const loadSpy = jest.spyOn(checker, 'load');
+
+      checker._ensureLoaded();
+
+      expect(loadSpy).not.toHaveBeenCalled();
+    });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/core/orchestration/surface-checker.test.js` around lines 122 - 131, Add
an idempotency test for SurfaceChecker._ensureLoaded: create a SurfaceChecker
instance (SurfaceChecker), set its _loaded flag to true, spy/mock the load
method (load) and any fs/yaml calls as needed, call checker._ensureLoaded() and
assert that load was not called and checker._loaded remains true; this ensures
the short-circuit branch (_loaded === true) prevents re-loading.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/core/orchestration/surface-checker.test.js`:
- Around line 112-120: Rename the existing test to reflect a file read failure
(e.g., "returns false on file read error") since it currently throws from
fs.readFileSync, and add a new test for a real YAML parse error that targets
SurfaceChecker.load: mock fs.existsSync to return true, have fs.readFileSync
return a YAML string, and mock yaml.load (or the YAML parser used) to throw a
parse Error (e.g., "YAML parse error"), then assert load() returns false;
reference SurfaceChecker and its load() method to locate where the behavior is
exercised.
- Around line 360-365: Rename the test to reflect that it verifies null criteria
(e.g., "returns no-surface when criteria is null") and add a new test that sets
checker._loaded = false and mocks/stubs checker._ensureLoaded to simulate a
failed load (or let it run and assert behavior) to verify shouldSurface handles
an unloaded checker and returns should_surface false; refer to checker._loaded,
checker.criteria, checker.shouldSurface and checker._ensureLoaded when locating
where to change/extend tests.

---

Nitpick comments:
In `@tests/core/orchestration/surface-checker.test.js`:
- Around line 215-223: Add an explicit short-circuit test for the `AND` operator
by asserting that `checker.evaluateCondition('score > 50 AND risk_level ==
"HIGH"', { score: 40, risk_level: 'HIGH' })` returns false; update the
`tests/core/orchestration/surface-checker.test.js` `test('evaluates AND
conditions correctly', ...)` block that calls `checker.evaluateCondition` so it
includes this `false AND true → false` case (use the existing `checker` instance
and the same expression string to keep consistency).
- Around line 122-131: Add an idempotency test for SurfaceChecker._ensureLoaded:
create a SurfaceChecker instance (SurfaceChecker), set its _loaded flag to true,
spy/mock the load method (load) and any fs/yaml calls as needed, call
checker._ensureLoaded() and assert that load was not called and checker._loaded
remains true; this ensures the short-circuit branch (_loaded === true) prevents
re-loading.

Comment on lines +112 to +120
test('returns false on parse error', () => {
fs.existsSync.mockReturnValue(true);
fs.readFileSync.mockImplementation(() => { throw new Error('parse error'); });

const checker = new SurfaceChecker('/test/bad.yaml');
const result = checker.load();

expect(result).toBe(false);
});
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Test name/behavior mismatch — and actual YAML parse error is not covered.

The test "returns false on parse error" throws from fs.readFileSync, which simulates a file read failure, not a YAML parse error. A genuine parse error would require yaml.load to throw (e.g., malformed YAML). Consider renaming the existing test and adding a separate case:

🛠️ Suggested additions
-    test('returns false on parse error', () => {
+    test('returns false on file read error', () => {
       fs.existsSync.mockReturnValue(true);
       fs.readFileSync.mockImplementation(() => { throw new Error('parse error'); });

       const checker = new SurfaceChecker('/test/bad.yaml');
       const result = checker.load();

       expect(result).toBe(false);
     });

+    test('returns false on YAML parse error', () => {
+      fs.existsSync.mockReturnValue(true);
+      fs.readFileSync.mockReturnValue('yaml content');
+      yaml.load.mockImplementation(() => { throw new Error('invalid yaml'); });
+
+      const checker = new SurfaceChecker('/test/bad.yaml');
+      const result = checker.load();
+
+      expect(result).toBe(false);
+    });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/core/orchestration/surface-checker.test.js` around lines 112 - 120,
Rename the existing test to reflect a file read failure (e.g., "returns false on
file read error") since it currently throws from fs.readFileSync, and add a new
test for a real YAML parse error that targets SurfaceChecker.load: mock
fs.existsSync to return true, have fs.readFileSync return a YAML string, and
mock yaml.load (or the YAML parser used) to throw a parse Error (e.g., "YAML
parse error"), then assert load() returns false; reference SurfaceChecker and
its load() method to locate where the behavior is exercised.

Comment on lines +360 to +365
test('returns no-surface when criteria not loaded', () => {
checker.criteria = null;
const result = checker.shouldSurface({});

expect(result.should_surface).toBe(false);
});
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Misleading test name: checker._loaded is true here — this tests null criteria, not the unloaded state.

beforeEach sets checker._loaded = true (line 314), and this test only sets checker.criteria = null. The scenario where shouldSurface triggers _ensureLoaded on a genuinely unloaded checker (and fails to load) is not covered.

🛠️ Suggested fix
-    test('returns no-surface when criteria not loaded', () => {
-      checker.criteria = null;
-      const result = checker.shouldSurface({});
-
-      expect(result.should_surface).toBe(false);
-    });
+    test('returns no-surface when criteria is null', () => {
+      checker.criteria = null;
+      const result = checker.shouldSurface({});
+
+      expect(result.should_surface).toBe(false);
+    });
+
+    test('returns no-surface when criteria file cannot be loaded', () => {
+      fs.existsSync.mockReturnValue(false);
+      const unloadedChecker = new SurfaceChecker('/missing/path.yaml');
+      const result = unloadedChecker.shouldSurface({ estimated_cost: 25 });
+
+      expect(result.should_surface).toBe(false);
+    });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/core/orchestration/surface-checker.test.js` around lines 360 - 365,
Rename the test to reflect that it verifies null criteria (e.g., "returns
no-surface when criteria is null") and add a new test that sets checker._loaded
= false and mocks/stubs checker._ensureLoaded to simulate a failed load (or let
it run and assert behavior) to verify shouldSurface handles an unloaded checker
and returns should_surface false; refer to checker._loaded, checker.criteria,
checker.shouldSurface and checker._ensureLoaded when locating where to
change/extend tests.

@nikolasdehor
Copy link
Contributor Author

Consolidated into #426

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

test: add unit tests for surface-checker module

2 participants