test: add comprehensive unit tests for surface-checker#232
test: add comprehensive unit tests for surface-checker#232nikolasdehor wants to merge 1 commit intoSynkraAI:mainfrom
Conversation
Add 56 tests covering the SurfaceChecker class that determines when Bob should surface to ask the human for decisions. Tests cover: - Constructor and criteria loading from YAML - Condition evaluation: >, >=, <, <=, ==, IN, OR, AND, boolean, scope - Message interpolation with variable substitution and cost formatting - shouldSurface decision logic with evaluation order priority - Helper methods: getActionConfig, getCriteria, getDestructiveActions - Validation of criteria file structure - Factory functions: createSurfaceChecker, shouldSurface convenience
|
@nikolasdehor is attempting to deploy a commit to the Pedro Valério Lopez's projects Team on Vercel. A member of the Team first needs to authorize it. |
WalkthroughAdds a comprehensive unit test suite for the SurfaceChecker module with 638 lines of test coverage. Tests include mocked filesystem and YAML loading, constructor behavior, criteria loading scenarios, condition evaluation across multiple operators, message interpolation with variable substitution, surface decision workflow, and validation of helper methods. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Suggested labels
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Pull request overview
This PR adds comprehensive unit tests for the SurfaceChecker class, which determines when Bob should interrupt and ask for human decisions based on codified criteria (cost thresholds, risk levels, errors, etc.). The tests use mocked dependencies (fs, js-yaml) to isolate the unit being tested from the file system, distinguishing them from the existing integration tests in tests/core/surface-checker.test.js that load actual YAML files.
Changes:
- Added 56 unit tests organized into 7 logical sections covering all public methods and factory functions
- Tests verify condition evaluation operators (>, >=, <, <=, ==, IN, OR, AND), message interpolation, surface decision logic, and validation
- Comprehensive edge case coverage including missing files, parse errors, null handling, and default values
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
||
| expect(checker.evaluateCondition('score > 50 AND risk_level == "HIGH"', { | ||
| score: 80, risk_level: 'LOW', | ||
| })).toBe(false); |
There was a problem hiding this comment.
The AND condition test is missing a case where both conditions are false. Consider adding a test case like:
expect(checker.evaluateCondition('score > 50 AND risk_level == "HIGH"', {
score: 30, risk_level: 'LOW',
})).toBe(false);
This ensures comprehensive coverage of all AND logic branches (both true, one true, both false).
| })).toBe(false); | |
| })).toBe(false); | |
| expect(checker.evaluateCondition('score > 50 AND risk_level == "HIGH"', { | |
| score: 30, risk_level: 'LOW', | |
| })).toBe(false); |
| const result = checker.shouldSurface({ score: 80 }); | ||
|
|
||
| expect(result.should_surface).toBe(true); | ||
| expect(result.criterion_id).toBe('TEST-1'); |
There was a problem hiding this comment.
The test criterion at line 370-375 doesn't include a severity field, which provides a perfect opportunity to test the default severity behavior. Consider adding an assertion to verify that result.severity equals 'info' when severity is not specified in the criterion, to ensure complete coverage of the default behavior in surface-checker.js line 262.
| expect(result.criterion_id).toBe('TEST-1'); | |
| expect(result.criterion_id).toBe('TEST-1'); | |
| expect(result.severity).toBe('info'); |
| const checker = new SurfaceChecker('/test/bad.yaml'); | ||
| const result = checker.load(); | ||
|
|
||
| expect(result).toBe(false); |
There was a problem hiding this comment.
The test verifies that load() returns false on parse error, but doesn't verify the internal state. Consider adding assertions to check that checker._loaded remains false and checker.criteria remains null after a parse error, similar to the test at lines 108-109 for missing file.
| expect(result).toBe(false); | |
| expect(result).toBe(false); | |
| expect(checker._loaded).toBe(false); | |
| expect(checker.criteria).toBeNull(); |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (2)
tests/core/orchestration/surface-checker.test.js (2)
215-223: Consider adding the short-circuitfalse AND true → falsecase to the AND test.The current tests cover
(true AND true)and(true AND false). Adding(false AND true)exercises the short-circuit path explicitly.💡 Suggested addition
expect(checker.evaluateCondition('score > 50 AND risk_level == "HIGH"', { score: 80, risk_level: 'LOW', })).toBe(false); + + expect(checker.evaluateCondition('score > 50 AND risk_level == "HIGH"', { + score: 30, risk_level: 'HIGH', + })).toBe(false);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/core/orchestration/surface-checker.test.js` around lines 215 - 223, Add an explicit short-circuit test for the `AND` operator by asserting that `checker.evaluateCondition('score > 50 AND risk_level == "HIGH"', { score: 40, risk_level: 'HIGH' })` returns false; update the `tests/core/orchestration/surface-checker.test.js` `test('evaluates AND conditions correctly', ...)` block that calls `checker.evaluateCondition` so it includes this `false AND true → false` case (use the existing `checker` instance and the same expression string to keep consistency).
122-131: Consider adding a test that_ensureLoadedis idempotent — i.e., it does not callload()again when already loaded.The current test only covers the "not yet loaded" path. The short-circuit guard (
_loaded === true→ skip) is a meaningful correctness invariant that's worth asserting.💡 Suggested addition
+ test('_ensureLoaded does not reload when already loaded', () => { + const checker = new SurfaceChecker('/test/criteria.yaml'); + checker._loaded = true; + checker.criteria = MOCK_CRITERIA; + const loadSpy = jest.spyOn(checker, 'load'); + + checker._ensureLoaded(); + + expect(loadSpy).not.toHaveBeenCalled(); + });🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/core/orchestration/surface-checker.test.js` around lines 122 - 131, Add an idempotency test for SurfaceChecker._ensureLoaded: create a SurfaceChecker instance (SurfaceChecker), set its _loaded flag to true, spy/mock the load method (load) and any fs/yaml calls as needed, call checker._ensureLoaded() and assert that load was not called and checker._loaded remains true; this ensures the short-circuit branch (_loaded === true) prevents re-loading.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/core/orchestration/surface-checker.test.js`:
- Around line 112-120: Rename the existing test to reflect a file read failure
(e.g., "returns false on file read error") since it currently throws from
fs.readFileSync, and add a new test for a real YAML parse error that targets
SurfaceChecker.load: mock fs.existsSync to return true, have fs.readFileSync
return a YAML string, and mock yaml.load (or the YAML parser used) to throw a
parse Error (e.g., "YAML parse error"), then assert load() returns false;
reference SurfaceChecker and its load() method to locate where the behavior is
exercised.
- Around line 360-365: Rename the test to reflect that it verifies null criteria
(e.g., "returns no-surface when criteria is null") and add a new test that sets
checker._loaded = false and mocks/stubs checker._ensureLoaded to simulate a
failed load (or let it run and assert behavior) to verify shouldSurface handles
an unloaded checker and returns should_surface false; refer to checker._loaded,
checker.criteria, checker.shouldSurface and checker._ensureLoaded when locating
where to change/extend tests.
---
Nitpick comments:
In `@tests/core/orchestration/surface-checker.test.js`:
- Around line 215-223: Add an explicit short-circuit test for the `AND` operator
by asserting that `checker.evaluateCondition('score > 50 AND risk_level ==
"HIGH"', { score: 40, risk_level: 'HIGH' })` returns false; update the
`tests/core/orchestration/surface-checker.test.js` `test('evaluates AND
conditions correctly', ...)` block that calls `checker.evaluateCondition` so it
includes this `false AND true → false` case (use the existing `checker` instance
and the same expression string to keep consistency).
- Around line 122-131: Add an idempotency test for SurfaceChecker._ensureLoaded:
create a SurfaceChecker instance (SurfaceChecker), set its _loaded flag to true,
spy/mock the load method (load) and any fs/yaml calls as needed, call
checker._ensureLoaded() and assert that load was not called and checker._loaded
remains true; this ensures the short-circuit branch (_loaded === true) prevents
re-loading.
| test('returns false on parse error', () => { | ||
| fs.existsSync.mockReturnValue(true); | ||
| fs.readFileSync.mockImplementation(() => { throw new Error('parse error'); }); | ||
|
|
||
| const checker = new SurfaceChecker('/test/bad.yaml'); | ||
| const result = checker.load(); | ||
|
|
||
| expect(result).toBe(false); | ||
| }); |
There was a problem hiding this comment.
Test name/behavior mismatch — and actual YAML parse error is not covered.
The test "returns false on parse error" throws from fs.readFileSync, which simulates a file read failure, not a YAML parse error. A genuine parse error would require yaml.load to throw (e.g., malformed YAML). Consider renaming the existing test and adding a separate case:
🛠️ Suggested additions
- test('returns false on parse error', () => {
+ test('returns false on file read error', () => {
fs.existsSync.mockReturnValue(true);
fs.readFileSync.mockImplementation(() => { throw new Error('parse error'); });
const checker = new SurfaceChecker('/test/bad.yaml');
const result = checker.load();
expect(result).toBe(false);
});
+ test('returns false on YAML parse error', () => {
+ fs.existsSync.mockReturnValue(true);
+ fs.readFileSync.mockReturnValue('yaml content');
+ yaml.load.mockImplementation(() => { throw new Error('invalid yaml'); });
+
+ const checker = new SurfaceChecker('/test/bad.yaml');
+ const result = checker.load();
+
+ expect(result).toBe(false);
+ });🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/core/orchestration/surface-checker.test.js` around lines 112 - 120,
Rename the existing test to reflect a file read failure (e.g., "returns false on
file read error") since it currently throws from fs.readFileSync, and add a new
test for a real YAML parse error that targets SurfaceChecker.load: mock
fs.existsSync to return true, have fs.readFileSync return a YAML string, and
mock yaml.load (or the YAML parser used) to throw a parse Error (e.g., "YAML
parse error"), then assert load() returns false; reference SurfaceChecker and
its load() method to locate where the behavior is exercised.
| test('returns no-surface when criteria not loaded', () => { | ||
| checker.criteria = null; | ||
| const result = checker.shouldSurface({}); | ||
|
|
||
| expect(result.should_surface).toBe(false); | ||
| }); |
There was a problem hiding this comment.
Misleading test name: checker._loaded is true here — this tests null criteria, not the unloaded state.
beforeEach sets checker._loaded = true (line 314), and this test only sets checker.criteria = null. The scenario where shouldSurface triggers _ensureLoaded on a genuinely unloaded checker (and fails to load) is not covered.
🛠️ Suggested fix
- test('returns no-surface when criteria not loaded', () => {
- checker.criteria = null;
- const result = checker.shouldSurface({});
-
- expect(result.should_surface).toBe(false);
- });
+ test('returns no-surface when criteria is null', () => {
+ checker.criteria = null;
+ const result = checker.shouldSurface({});
+
+ expect(result.should_surface).toBe(false);
+ });
+
+ test('returns no-surface when criteria file cannot be loaded', () => {
+ fs.existsSync.mockReturnValue(false);
+ const unloadedChecker = new SurfaceChecker('/missing/path.yaml');
+ const result = unloadedChecker.shouldSurface({ estimated_cost: 25 });
+
+ expect(result.should_surface).toBe(false);
+ });🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/core/orchestration/surface-checker.test.js` around lines 360 - 365,
Rename the test to reflect that it verifies null criteria (e.g., "returns
no-surface when criteria is null") and add a new test that sets checker._loaded
= false and mocks/stubs checker._ensureLoaded to simulate a failed load (or let
it run and assert behavior) to verify shouldSurface handles an unloaded checker
and returns should_surface false; refer to checker._loaded, checker.criteria,
checker.shouldSurface and checker._ensureLoaded when locating where to
change/extend tests.
|
Consolidated into #426 |
Summary
Closes #231
Test Coverage
Test Plan
npx jest tests/core/orchestration/surface-checker.test.jsSummary by CodeRabbit