Skip to content

test: add unit tests for context-manager module#242

Closed
nikolasdehor wants to merge 1 commit intoSynkraAI:mainfrom
nikolasdehor:test/context-manager-coverage
Closed

test: add unit tests for context-manager module#242
nikolasdehor wants to merge 1 commit intoSynkraAI:mainfrom
nikolasdehor:test/context-manager-coverage

Conversation

@nikolasdehor
Copy link
Contributor

@nikolasdehor nikolasdehor commented Feb 18, 2026

Summary

Closes #241

Add 51 unit tests for the ContextManager class in .aios-core/core/orchestration/context-manager.js (615 lines, previously 0% coverage).

Test Coverage

Area Tests Key Scenarios
Constructor 1 Path setup, cache init
ensureStateDir 1 Directory creation
initialize 2 New state, existing state
loadState 3 Cache hit, file load, fallback
savePhaseOutput 3 State update, handoff file, target info
getContextForPhase 3 Previous phases, phase 1, handoffs
Phase getters 4 Outputs, null, cache
Status marking 2 Completed, failed
updateMetadata 1 Merge semantics
Phase completion 5 Last completed, isCompleted
getSummary 2 Cached, no cache
Delivery confidence 2 Null, from metadata
Handoff packages 5 Structure, decisions, evidence, dedup, risks
Confidence calc 13 All 5 components, threshold, env override, regression
Reset 2 Keep/discard metadata
Export/Import 2 Copy semantics, persistence

Testing Approach

  • Mocks fs-extra for isolation
  • Tests the full delivery confidence formula with all 5 weighted components
  • Validates handoff package extraction from multiple output formats
  • Covers state lifecycle: initialize → save phases → mark complete/fail → reset

All 51 tests passing.

Summary by CodeRabbit

  • Tests
    • Added comprehensive unit tests for the ContextManager, improving code quality and reliability across multiple scenarios including state management, phase operations, and delivery confidence calculations.

Add 51 tests covering ContextManager class:
- State initialization, loading, and caching
- Phase output saving with handoff packages
- Context assembly for subsequent phases
- Workflow status marking (completed, failed)
- Phase completion tracking and metadata updates
- Handoff package building with decision logs, evidence, risks
- Delivery confidence calculation with all 5 weighted components
- State reset, export, and import
Copilot AI review requested due to automatic review settings February 18, 2026 19:23
@vercel
Copy link

vercel bot commented Feb 18, 2026

@nikolasdehor is attempting to deploy a commit to the Pedro Valério Lopez's projects Team on Vercel.

A member of the Team first needs to authorize it.

@coderabbitai
Copy link

coderabbitai bot commented Feb 18, 2026

Walkthrough

Adds 655 lines of comprehensive unit tests for the ContextManager class, covering state management, phase tracking, handoff package assembly, delivery confidence calculations, and metadata operations through 51 tests across 14 describe blocks using fs-extra mocks.

Changes

Cohort / File(s) Summary
Test Suite for Context Manager
tests/core/orchestration/context-manager.test.js
New test file with 51 tests covering constructor initialization, state lifecycle (initialize/loadState/savePhaseOutput), phase context assembly, status marking, metadata management, handoff package structure and risk aggregation, delivery confidence scoring with threshold overrides, state reset semantics, and export/import operations with caching validation.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested labels

core, tests

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'test: add unit tests for context-manager module' accurately describes the main change - adding comprehensive unit tests for the ContextManager class.
Linked Issues check ✅ Passed All coding requirements from issue #241 are met: 51 tests cover constructor, state lifecycle, phase outputs, context assembly, phase tracking, status marking, metadata handling, handoff packages, delivery confidence (all 5 components), reset, and export/import with fs-extra mocking.
Out of Scope Changes check ✅ Passed All changes are scoped to the test file and directly address requirements in issue #241; no out-of-scope modifications to production code or unrelated functionality detected.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds comprehensive unit tests for the ContextManager class, which manages workflow state persistence, phase outputs, handoff packages, and delivery confidence scoring in the AIOS orchestration system. The tests provide 100% coverage for a previously untested 615-line module.

Changes:

  • Added 51 unit tests organized into 14 describe blocks covering all public methods and internal logic
  • Used jest.mock('fs-extra') for isolated unit testing, complementing the existing integration tests in context-manager-handoff.test.js
  • Comprehensive coverage of the delivery confidence calculation with all 5 weighted components and edge cases

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +470 to +477
process.env.AIOS_DELIVERY_CONFIDENCE_THRESHOLD = '85';

await mgr.savePhaseOutput(1, { result: {} });

const confidence = mgr.getDeliveryConfidence();
expect(confidence.threshold).toBe(85);

delete process.env.AIOS_DELIVERY_CONFIDENCE_THRESHOLD;
Copy link

Copilot AI Feb 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The environment variable handling should save and restore the original value to prevent test pollution. The current approach only deletes the variable after the test, which could cause issues if the variable was already set before the test. Consider using the pattern seen elsewhere in the codebase: save the original value, set the test value, then restore the original value (or delete if it didn't exist).

Suggested change
process.env.AIOS_DELIVERY_CONFIDENCE_THRESHOLD = '85';
await mgr.savePhaseOutput(1, { result: {} });
const confidence = mgr.getDeliveryConfidence();
expect(confidence.threshold).toBe(85);
delete process.env.AIOS_DELIVERY_CONFIDENCE_THRESHOLD;
const originalThreshold = process.env.AIOS_DELIVERY_CONFIDENCE_THRESHOLD;
try {
process.env.AIOS_DELIVERY_CONFIDENCE_THRESHOLD = '85';
await mgr.savePhaseOutput(1, { result: {} });
const confidence = mgr.getDeliveryConfidence();
expect(confidence.threshold).toBe(85);
} finally {
if (originalThreshold === undefined) {
delete process.env.AIOS_DELIVERY_CONFIDENCE_THRESHOLD;
} else {
process.env.AIOS_DELIVERY_CONFIDENCE_THRESHOLD = originalThreshold;
}
}

Copilot uses AI. Check for mistakes.
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (3)
tests/core/orchestration/context-manager.test.js (3)

118-124: Redundant mock re-setup after jest.clearAllMocks().

jest.clearAllMocks() and jest.resetAllMocks() serve distinct purposes: the former clears usage data without touching implementations, while the latter resets everything including implementations. Since jest.clearAllMocks() at line 121 preserves the mockResolvedValue implementations already set by the outer beforeEach, lines 122–123 are redundant. The same redundancy appears in the getContextForPhase inner beforeEach (lines 171–172), where there is not even a preceding clearAllMocks() call.

♻️ Proposed cleanup
 beforeEach(async () => {
   fs.pathExists.mockResolvedValue(false);
   await mgr.initialize();
   jest.clearAllMocks();
-  fs.ensureDir.mockResolvedValue(undefined);
-  fs.writeJson.mockResolvedValue(undefined);
 });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/core/orchestration/context-manager.test.js` around lines 118 - 124, The
tests reconfigure fs mocks redundantly after calling jest.clearAllMocks() (and
in one case without any clear/reset), so remove the extra mockResolvedValue
calls (fs.ensureDir.mockResolvedValue and fs.writeJson.mockResolvedValue) that
appear after jest.clearAllMocks() in the beforeEach and in the
getContextForPhase inner beforeEach; instead set all required mock
implementations once in the outer beforeEach (or use jest.resetAllMocks() if you
intend to wipe implementations) so the tests rely on the single setup in the
beforeEach and avoid duplicated mock setup around the beforeEach and
getContextForPhase hooks.

54-112: No error-path tests for async fs operations.

None of the 51 tests simulate a rejection from fs.readJson, fs.writeJson, or fs.ensureDir. If the source module catches and handles these errors (e.g., corrupt state file, disk full), that logic has 0% coverage. At minimum, covering initialize and savePhaseOutput rejection scenarios would verify the error handling path.

💡 Example skeleton
test('initialize handles readJson failure gracefully', async () => {
  fs.pathExists.mockResolvedValue(true);
  fs.readJson.mockRejectedValue(new Error('ENOENT'));

  // Expect either a thrown error or a fallback to initial state,
  // depending on the implementation contract.
  await expect(mgr.initialize()).rejects.toThrow(); // or resolves gracefully
});

As per coding guidelines, error handling should be comprehensively verified in .js files.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/core/orchestration/context-manager.test.js` around lines 54 - 112, Add
tests that simulate async fs failures for initialize and savePhaseOutput by
mocking fs.readJson, fs.writeJson, and fs.ensureDir to reject; for initialize,
add a case where fs.pathExists resolves true and fs.readJson rejects and assert
the module either rejects (await expect(mgr.initialize()).rejects.toThrow()) or
returns the fallback initial state per the implementation; for savePhaseOutput,
mock fs.ensureDir/writeJson to reject and assert the method surface handles the
error as intended (rejects or logs/returns a specific value); reference the
existing test-suite helpers and functions initialize and savePhaseOutput and the
mocked fs methods (fs.readJson, fs.writeJson, fs.ensureDir) to locate where to
add these tests.

44-48: Prefer path-level assertions over bare call-count checks.

toHaveBeenCalledTimes(3) at line 47 (and the analogous toHaveBeenCalledTimes(3) at line 149) will silently break if a directory is added or removed, and the error message won't indicate which directory is missing. Asserting on the actual paths is both more resilient to refactors and more self-documenting.

♻️ Suggested approach
-expect(fs.ensureDir).toHaveBeenCalledTimes(3);
+expect(fs.ensureDir).toHaveBeenCalledWith(expect.stringContaining('.aios/workflow-state'));
+expect(fs.ensureDir).toHaveBeenCalledWith(expect.stringContaining('handoffs'));
+expect(fs.ensureDir).toHaveBeenCalledWith(expect.stringContaining('confidence'));
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/core/orchestration/context-manager.test.js` around lines 44 - 48, The
test for ensureStateDir uses a brittle toHaveBeenCalledTimes(3); update the
assertions to verify the specific directory paths passed to fs.ensureDir so
failures show which path is missing or incorrect: locate the test block under
describe('ensureStateDir') where mgr.ensureStateDir() is invoked and replace the
count assertion with explicit assertions against fs.ensureDir calls (e.g.,
toHaveBeenCalledWith or toHaveBeenNthCalledWith for the known state directories
used by ensureStateDir) ensuring you reference the same mock fs.ensureDir and
the mgr.ensureStateDir invocation.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/core/orchestration/context-manager.test.js`:
- Around line 469-478: The test mutates
process.env.AIOS_DELIVERY_CONFIDENCE_THRESHOLD without guaranteeing cleanup;
wrap the assignment and the calls to mgr.savePhaseOutput and
mgr.getDeliveryConfidence in a try/finally so you always delete or restore the
original env value in the finally block; update the test that sets
AIOS_DELIVERY_CONFIDENCE_THRESHOLD = '85' to capture any previous value (e.g.,
const prev = process.env.AIOS_DELIVERY_CONFIDENCE_THRESHOLD) before changing it
and then in finally restore process.env.AIOS_DELIVERY_CONFIDENCE_THRESHOLD =
prev (or delete if prev is undefined) to prevent cross-test pollution involving
mgr.savePhaseOutput and mgr.getDeliveryConfidence.

---

Nitpick comments:
In `@tests/core/orchestration/context-manager.test.js`:
- Around line 118-124: The tests reconfigure fs mocks redundantly after calling
jest.clearAllMocks() (and in one case without any clear/reset), so remove the
extra mockResolvedValue calls (fs.ensureDir.mockResolvedValue and
fs.writeJson.mockResolvedValue) that appear after jest.clearAllMocks() in the
beforeEach and in the getContextForPhase inner beforeEach; instead set all
required mock implementations once in the outer beforeEach (or use
jest.resetAllMocks() if you intend to wipe implementations) so the tests rely on
the single setup in the beforeEach and avoid duplicated mock setup around the
beforeEach and getContextForPhase hooks.
- Around line 54-112: Add tests that simulate async fs failures for initialize
and savePhaseOutput by mocking fs.readJson, fs.writeJson, and fs.ensureDir to
reject; for initialize, add a case where fs.pathExists resolves true and
fs.readJson rejects and assert the module either rejects (await
expect(mgr.initialize()).rejects.toThrow()) or returns the fallback initial
state per the implementation; for savePhaseOutput, mock fs.ensureDir/writeJson
to reject and assert the method surface handles the error as intended (rejects
or logs/returns a specific value); reference the existing test-suite helpers and
functions initialize and savePhaseOutput and the mocked fs methods (fs.readJson,
fs.writeJson, fs.ensureDir) to locate where to add these tests.
- Around line 44-48: The test for ensureStateDir uses a brittle
toHaveBeenCalledTimes(3); update the assertions to verify the specific directory
paths passed to fs.ensureDir so failures show which path is missing or
incorrect: locate the test block under describe('ensureStateDir') where
mgr.ensureStateDir() is invoked and replace the count assertion with explicit
assertions against fs.ensureDir calls (e.g., toHaveBeenCalledWith or
toHaveBeenNthCalledWith for the known state directories used by ensureStateDir)
ensuring you reference the same mock fs.ensureDir and the mgr.ensureStateDir
invocation.

Comment on lines +469 to +478
test('uses custom threshold from env', async () => {
process.env.AIOS_DELIVERY_CONFIDENCE_THRESHOLD = '85';

await mgr.savePhaseOutput(1, { result: {} });

const confidence = mgr.getDeliveryConfidence();
expect(confidence.threshold).toBe(85);

delete process.env.AIOS_DELIVERY_CONFIDENCE_THRESHOLD;
});
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

process.env mutation without failure-safe cleanup causes test pollution.

If mgr.savePhaseOutput or mgr.getDeliveryConfidence throws before line 477, AIOS_DELIVERY_CONFIDENCE_THRESHOLD stays set at '85' for the immediately following 'defaults to threshold 70' test (line 480), causing it to fail with 85 !== 70 — a cascading false positive that masks real issues.

🔒 Proposed fix: guard with try/finally
 test('uses custom threshold from env', async () => {
   process.env.AIOS_DELIVERY_CONFIDENCE_THRESHOLD = '85';
-
-  await mgr.savePhaseOutput(1, { result: {} });
-
-  const confidence = mgr.getDeliveryConfidence();
-  expect(confidence.threshold).toBe(85);
-
-  delete process.env.AIOS_DELIVERY_CONFIDENCE_THRESHOLD;
+  try {
+    await mgr.savePhaseOutput(1, { result: {} });
+    const confidence = mgr.getDeliveryConfidence();
+    expect(confidence.threshold).toBe(85);
+  } finally {
+    delete process.env.AIOS_DELIVERY_CONFIDENCE_THRESHOLD;
+  }
 });
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
test('uses custom threshold from env', async () => {
process.env.AIOS_DELIVERY_CONFIDENCE_THRESHOLD = '85';
await mgr.savePhaseOutput(1, { result: {} });
const confidence = mgr.getDeliveryConfidence();
expect(confidence.threshold).toBe(85);
delete process.env.AIOS_DELIVERY_CONFIDENCE_THRESHOLD;
});
test('uses custom threshold from env', async () => {
process.env.AIOS_DELIVERY_CONFIDENCE_THRESHOLD = '85';
try {
await mgr.savePhaseOutput(1, { result: {} });
const confidence = mgr.getDeliveryConfidence();
expect(confidence.threshold).toBe(85);
} finally {
delete process.env.AIOS_DELIVERY_CONFIDENCE_THRESHOLD;
}
});
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/core/orchestration/context-manager.test.js` around lines 469 - 478, The
test mutates process.env.AIOS_DELIVERY_CONFIDENCE_THRESHOLD without guaranteeing
cleanup; wrap the assignment and the calls to mgr.savePhaseOutput and
mgr.getDeliveryConfidence in a try/finally so you always delete or restore the
original env value in the finally block; update the test that sets
AIOS_DELIVERY_CONFIDENCE_THRESHOLD = '85' to capture any previous value (e.g.,
const prev = process.env.AIOS_DELIVERY_CONFIDENCE_THRESHOLD) before changing it
and then in finally restore process.env.AIOS_DELIVERY_CONFIDENCE_THRESHOLD =
prev (or delete if prev is undefined) to prevent cross-test pollution involving
mgr.savePhaseOutput and mgr.getDeliveryConfidence.

@nikolasdehor
Copy link
Contributor Author

Consolidated into #426

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add unit tests for context-manager module

2 participants