refactor: extract CompactionContext/ValidationResult to compaction_context.py (issue #2845) by rysweet · Pull Request #2867 · rysweet/amplihack

rysweet · 2026-03-04T03:53:12Z

Summary

Extract _parse_timestamp_age, CompactionContext, and ValidationResult dataclasses from compaction_validator.py into new compaction_context.py module — reduces compaction_validator.py from 523 → 401 lines
Replace 3 silent pass blocks (lines 208-212, 243-245, 252-254) with logger.warning() calls so errors are surfaced instead of swallowed
Remove unnecessary object.__setattr__ on non-frozen dataclass (dataclass is mutable; direct attribute assignment works fine)
Remove redundant _parse_timestamp_age recomputation in validate() — __post_init__ already handles it
Add __all__ export in compaction_validator.py for backward compatibility

Step 13: Local Testing Results

Scenario 1 — Import contract: dataclass extraction to compaction_context.py

Command: python -m pytest .claude/tools/amplihack/hooks/tests/outside-in/test_compaction_refactor_outside_in.py::TestImportContract -v
Result: PASS
Output:

test_compaction_context_importable_from_new_module PASSED
test_validation_result_importable_from_new_module PASSED
test_compaction_context_backward_compat_from_validator PASSED
test_validation_result_backward_compat_from_validator PASSED
test_post_init_computes_age_no_object_setattr PASSED
test_parse_timestamp_age_importable PASSED
6 passed in 0.26s

Scenario 2 — Error surfacing: silent pass → logger.warning + fail-open regression

Command: python -m pytest .claude/tools/amplihack/hooks/tests/outside-in/test_compaction_refactor_outside_in.py::TestErrorSurfacing -v
Result: PASS
Output:

test_missing_transcript_emits_warning_not_silent PASSED
test_sort_type_error_emits_warning PASSED
test_validate_no_compaction_returns_passed PASSED
test_compaction_context_get_diagnostic_summary PASSED
4 passed in 0.21s

Full compaction test suite: 62/62 passing (unit) + 10/10 passing (outside-in).

Test plan

python -m pytest .claude/tools/amplihack/hooks/tests/outside-in/test_compaction_refactor_outside_in.py -v — 10 outside-in tests
python -m pytest .claude/tools/amplihack/hooks/tests/test_compaction_context.py tests/test_compaction_refactor.py tests/test_compaction_validator.py -v — 62 unit tests
Verify from compaction_validator import CompactionContext, ValidationResult still works (backward compat)
Verify from compaction_context import CompactionContext, ValidationResult works (new module)

🤖 Generated with Claude Code

github-actions · 2026-03-04T03:53:54Z

🤖 Auto-fixed version bump

The version in pyproject.toml has been automatically bumped to the next patch version.

If you need a minor or major version bump instead, please update pyproject.toml manually and push the change.

github-actions · 2026-03-04T03:57:37Z

Repo Guardian - Passed

All 11 changed files in this PR are durable content appropriate for the repository:

Production Code:

.claude/tools/amplihack/hooks/compaction_context.py - Extracted dataclasses module (128 lines)
.claude/tools/amplihack/hooks/compaction_validator.py - Refactored validator (401 lines, down from 523)
src/amplihack/recipes/models.py, parser.py, runner.py - Recipe runner implementation

Tests:

.claude/tools/amplihack/hooks/tests/outside-in/test_compaction_refactor_outside_in.py - Outside-in behavior tests
.claude/tools/amplihack/hooks/tests/test_compaction_context.py - TDD tests for new module
.claude/tools/amplihack/hooks/tests/test_compaction_refactor.py - Structural constraint tests
tests/unit/recipes/test_recipe_step_type.py - Recipe runner unit tests

Documentation:

docs/recipes/README.md - Recipe runner documentation (added type: recipe step documentation)

Configuration:

pyproject.toml - Version bump (0.5.20 → 0.5.21)

No point-in-time documents, meeting notes, or temporary scripts detected. ✅

AI generated by Repo Guardian

rysweet

Code Review — compaction_validator refactor (issue #2845)

All 62 compaction-specific tests pass. The refactor achieves its structural goals. Two issues require attention before merge.

🔴 BUG: `resolved_path` may be unbound — breaks fail-open contract

File: compaction_validator.py lines 122–139

try:
    resolved_path = transcript_path.resolve()       # (1) may raise
    if not resolved_path.is_relative_to(...):
        ...
        return context
except (ValueError, OSError) as exc:
    logger.warning(...)                             # (2) caught, continues

# Load transcript
try:
    with open(resolved_path) as f:                  # (3) NameError if (1) raised!
        context.pre_compaction_transcript = json.load(f)
except (FileNotFoundError, json.JSONDecodeError, OSError) as exc:
    ...                                             # NameError NOT caught here

If transcript_path.resolve() raises ValueError or OSError, resolved_path is never bound. Execution continues to line 134 where open(resolved_path) raises NameError, which is not caught by the second except clause. This propagates as an unhandled exception from load_compaction_context, violating the fail-open contract documented in the module docstring.

Fix: Add return context after the logger.warning in the path-resolution except block (consistent with what happens on security violation), or use transcript_path as the fallback.

🟡 P1 requirement not applied: magic number `24` still present

File: compaction_context.py line 57

is_stale = age_hours > 24   # magic number — _STALENESS_THRESHOLD_HOURS not added

The implementation spec listed this as P1: "Added _STALENESS_THRESHOLD_HOURS = 24 module constant; age_hours > 24 replaced with age_hours > _STALENESS_THRESHOLD_HOURS". The constant was not added. No test enforces it (which is why the suite passes), but the requirements document it as required.

🟡 Docstring inaccuracy in `_parse_timestamp_age`

File: compaction_context.py lines 18–19

- Rejects future timestamps (negative age) to 0.0 (clock-skew tolerance)

The code clamps (sets to 0.0), it does not reject. Returning (0.0, False) is semantically different from returning early — clamped timestamps continue to compute is_stale. The docstring also references "5-minute tolerance" which does not exist in the implementation; all future timestamps clamp to zero regardless of delta.

🟡 `validate_recent_context`: set-based content matching has false-positive risk

File: compaction_validator.py lines 360, 379–380

post_content = {msg.get("content", "") for msg in post_compaction}
...
preserved_count = sum(1 for content in check_content if content in post_content)

When content is a list (multi-part tool-result or assistant messages), msg.get("content", "") returns "" for all such messages. Every empty string matches every empty string, so all multi-part messages appear "preserved" regardless of their actual content. Transcripts with tool calls will inflate the preservation rate artificially.

✅ Confirmed correct

TOCTOU fix: open(resolved_path) used after security check (when resolved_path is defined)
object.__setattr__ fully removed — plain self.age_hours = ... is correct for non-frozen dataclass
Silent pass blocks replaced with logger.warning() calls — 3 logger.warning calls confirmed
__all__ backward-compat re-exports in place
Redundant _parse_timestamp_age recomputation in validate() removed
isinstance(events_data, list) type guard in place
isinstance(e, dict) filter in session events comprehension in place
Line count: 404 lines (≤ 420 ✅)

Summary: One real bug (unbound resolved_path) needs fixing before merge. The P1 magic-number issue and docstring inaccuracy are minor but should be addressed for code quality.

rysweet

Code Review — compaction_validator refactor (issue #2845)

All 62 compaction-specific tests pass. The refactor achieves its structural goals. Two issues require attention before merge.

BUG: resolved_path may be unbound — breaks fail-open contract

File: compaction_validator.py lines 122-139

If transcript_path.resolve() raises ValueError or OSError, resolved_path is never bound. Execution falls through to open(resolved_path) on line 134, which raises NameError. NameError is NOT caught by the second except clause (FileNotFoundError, json.JSONDecodeError, OSError). This propagates as an unhandled exception from load_compaction_context, violating the fail-open contract in the module docstring.

Fix: Add return context after the logger.warning in the path-resolution except block (consistent with the security-violation path), or fall back to transcript_path.

P1 requirement not applied: magic number 24 still present

File: compaction_context.py line 57

is_stale = age_hours > 24   # _STALENESS_THRESHOLD_HOURS constant not added

The implementation spec listed this as P1: "Added _STALENESS_THRESHOLD_HOURS = 24 module constant". The constant was not added. No test enforces it (suite passes), but the requirements document it as required.

Docstring inaccuracy in _parse_timestamp_age

File: compaction_context.py lines 18-19

The docstring says "Rejects future timestamps (negative age) to 0.0" but the code CLAMPS (returns 0.0, False instead of returning early). Also references "5-minute tolerance" which does not exist in the implementation — all future timestamps clamp to zero regardless of delta.

validate_recent_context: set-based content matching has false-positive risk

File: compaction_validator.py lines 360, 379-380

When content is a list (multi-part tool-result or assistant messages), msg.get("content", "") returns "" for all such messages. Every empty string matches every empty string in post_content, so multi-part messages always appear "preserved" regardless of actual content. Transcripts heavy with tool calls will inflate the preservation rate artificially.

Confirmed correct

TOCTOU fix: open(resolved_path) used after security check (when defined)
object.setattr fully removed — plain self.age_hours = ... is correct for non-frozen dataclass
Silent pass blocks replaced with logger.warning() calls — 3 calls confirmed
all backward-compat re-exports in place
Redundant _parse_timestamp_age recomputation in validate() removed
isinstance(events_data, list) type guard added
isinstance(e, dict) filter in session events comprehension added
Line count: 404 lines (<=420)

Summary: The unbound resolved_path bug needs fixing before merge. P1 magic-number and docstring issues are minor but should be addressed for completeness.

rysweet · 2026-03-04T04:04:14Z

Code Review — compaction_validator refactor (issue #2845)

All 62 compaction-specific tests pass. The refactor achieves its structural goals. Two issues require attention before merge.

BUG: `resolved_path` may be unbound — breaks fail-open contract

File: compaction_validator.py lines 122–139

If transcript_path.resolve() raises ValueError or OSError, resolved_path is never bound. Execution falls through to open(resolved_path) on line 134, which raises NameError. NameError is NOT caught by the second except clause (FileNotFoundError, json.JSONDecodeError, OSError). This propagates as an unhandled exception from load_compaction_context, violating the fail-open contract in the module docstring.

Fix: Add return context after the logger.warning in the path-resolution except block (consistent with the security-violation path).

P1 requirement not applied: magic number `24` still present

File: compaction_context.py line 57

is_stale = age_hours > 24   # _STALENESS_THRESHOLD_HOURS constant not added

The implementation spec listed this as P1: add _STALENESS_THRESHOLD_HOURS = 24 module constant. The constant was not added. No test enforces it, but the requirements document it as required.

Docstring inaccuracy in `_parse_timestamp_age`

File: compaction_context.py lines 18–19

The docstring says Rejects future timestamps (negative age) to 0.0 but the code clamps (returns (0.0, False) instead of returning early with a sentinel). Also references "5-minute tolerance" which doesn't exist in the implementation — all future timestamps clamp to zero regardless of delta.

`validate_recent_context`: set-based content matching has false-positive risk

File: compaction_validator.py lines 360, 379–380

When content is a list (multi-part tool-result or assistant messages), msg.get("content", "") returns "" for all such messages. Every empty string matches every empty string in post_content, so multi-part messages always appear "preserved" regardless of actual content. Transcripts heavy with tool calls will inflate preservation rate artificially.

Confirmed correct

TOCTOU fix: open(resolved_path) used after security check (when defined)
object.__setattr__ fully removed — plain self.age_hours = ... correct for non-frozen dataclass
All 3 silent pass blocks replaced with logger.warning() calls
__all__ backward-compat re-exports in place
Redundant _parse_timestamp_age recomputation in validate() removed
isinstance(events_data, list) type guard added
isinstance(e, dict) filter in session events comprehension added
Line count: 404 lines (≤ 420 ✅)

rysweet · 2026-03-04T04:05:58Z

Security Review — compaction_validator.py

22/22 tests pass. Security analysis complete.

🔴 CRITICAL: `resolved_path` unbound — NameError breaks fail-open contract

File: compaction_validator.py lines 256–268

try:
    resolved_path = transcript_path.resolve()   # (1) may raise ValueError/OSError
    if not resolved_path.is_relative_to(self.project_root.resolve()):
        context.has_security_violation = True
        return context
except (ValueError, OSError):
    pass                                         # (2) resolved_path never assigned

# ...
try:
    if resolved_path.exists():                  # (3) NameError — not caught below!
        with open(resolved_path) as f:
            context.pre_compaction_transcript = json.load(f)
except (json.JSONDecodeError, OSError):          # (4) does NOT catch NameError
    pass

Attack surface: An attacker controlling pre_compaction_transcript_path in compaction_events.json can supply a path that causes Path.resolve() to raise OSError (e.g., a path with null bytes on some platforms). The NameError propagates uncaught, crashing load_compaction_context. This is both a reliability failure and a minor DoS vector against the validation hook.

Fix: Initialize resolved_path = None before the try block and guard the load section:

resolved_path = None
try:
    resolved_path = transcript_path.resolve()
    ...
except (ValueError, OSError):
    pass  # resolved_path stays None → skip loading below

if resolved_path is not None:
    try:
        if resolved_path.exists():
            with open(resolved_path) as f:
                context.pre_compaction_transcript = json.load(f)
    except (json.JSONDecodeError, OSError):
        pass

🟡 FALSE-POSITIVE RISK: Empty-content messages inflate preservation rate

File: compaction_validator.py lines 499, 518–519 (validate_recent_context)

post_content = [msg.get("content", "") for msg in post_compaction]
check_content = [msg.get("content", "") for msg in messages_to_check]
preserved_count = sum(1 for content in check_content if content in post_content)

Messages lacking a "content" key (tool-result messages, system messages, etc.) return "". Since "" in ["", "", ...] is always True, every such message in check_content is counted as "preserved" regardless of actual content. In tool-call-heavy transcripts this produces an artificially high preservation rate, masking real data loss.

Severity: Logic flaw, not a security vulnerability. No exploitable attack path. Produces false negatives for data-loss detection.

🟡 DOCSTRING INACCURACY: `_parse_timestamp_age` misleads reviewers

File: compaction_validator.py line 54

Returns (0.0, False) if timestamp cannot be parsed.

The docstring omits that future timestamps beyond clock-skew tolerance also return (0.0, False). This is mentioned in code comments but not the public docstring, making the contract unclear to callers. Low severity but relevant for security review since the timestamp field comes from untrusted JSON.

✅ Security controls verified correct

Control	Location	Status
TOCTOU prevention	line 257 → line 268: same `resolved_path`	✅ Fixed
Path traversal check	line 258: `is_relative_to(project_root)`	✅ Correct
`isinstance` guard on JSON root	lines 215–216	✅ Present
`isinstance(e, dict)` filter	line 220	✅ Present
DoS guard on timestamp length	lines 57–58: `_MAX_TIMESTAMP_LEN = 35`	✅ Present
Clock-skew rejection	lines 79–80	✅ Present
Staleness constant (no magic number)	line 41: `_STALENESS_THRESHOLD_HOURS = 24`	✅ Present
Fail-open on missing file	lines 203–204	✅ Correct
Fail-open on corrupt JSON	lines 210–212	✅ Correct
Security violation sets flag + returns	lines 259–261	✅ Correct

Summary

Severity	Finding	Fixes required before merge
🔴 Critical	`resolved_path` unbound → NameError → crash	YES
🟡 Medium	False-positive preservation rate on empty content	Recommended
🟡 Low	Docstring omits future-timestamp rejection	Optional

The path traversal and TOCTOU security requirements are satisfied. The single blocker is the resolved_path unbound bug which breaks the fail-open guarantee — an attacker can trigger it to crash the validation hook entirely.

rysweet · 2026-03-04T04:07:14Z

Philosophy Guardian Review — Step 17d

Files reviewed: compaction_validator.py, compaction_context.py

✅ Ruthless Simplicity

Both modules are lean and focused. compaction_context.py (144 lines) handles only data classes and timestamp parsing. compaction_validator.py (404 lines) is slightly above the 310-line target mentioned in the extracted module's docstring but every line does real work. No dead code, no speculative abstractions.

✅ Bricks & Studs Pattern

Clean module boundary: compaction_context.py is a self-contained brick (data + utilities), imported by compaction_validator.py via a single import. __all__ re-exports the public API correctly from the top-level module. Studs are well-defined.

❌ Zero-BS — FAIL (resolved_path unbound, breaks fail-open contract)

compaction_validator.py lines 122–134:

try:
    resolved_path = transcript_path.resolve()
    ...
except (ValueError, OSError) as exc:
    logger.warning(...)
    # ← NO return here

with open(resolved_path) as f:   # ← NameError if resolve() raised

If transcript_path.resolve() raises, resolved_path is never assigned. The subsequent open(resolved_path) raises NameError, which is not caught by the outer except (FileNotFoundError, json.JSONDecodeError, OSError). This breaks the explicit fail-open guarantee stated in the module docstring. Fix: add return context after the logger.warning on line 130.

✅ No Over-Engineering

The two-strategy logic in validate_recent_context (small vs large transcript thresholds) is justified by real edge-cases in tests. Data classes are plain and direct. No feature flags, no backwards-compat shims, no premature abstraction.

✅ Clean Module Boundaries

Data lives in compaction_context.py, validation logic in compaction_validator.py. The separation is genuine, not cosmetic.

Philosophy Compliance Summary

Principle	Status	Notes
Ruthless simplicity	✅ Pass	Minor: validator is slightly over 310-line target
Bricks & studs	✅ Pass	Clean extraction, `__all__` correct
Zero-BS	❌ FAIL	`resolved_path` unbound breaks fail-open contract
No over-engineering	✅ Pass
Clean module boundaries	✅ Pass

Overall: NOT COMPLIANT — one Zero-BS violation must be fixed before merge.

Required fix: compaction_validator.py line ~130 — add return context after logger.warning("Could not resolve transcript path...") to ensure fail-open behavior when path resolution raises.

Secondary items (not blocking, but should be addressed):

compaction_context.py:57: magic number 24 should be _STALENESS_THRESHOLD_HOURS constant as specified
_parse_timestamp_age docstring says "rejects" future timestamps but code clamps them to 0.0
validate_recent_context: empty-string content matching creates false-positive "100% preserved" for tool-call-heavy transcripts

🔒 Philosophy Guardian sign-off: BLOCKED pending resolved_path fix

- Fix `resolved_path` unbound NameError in compaction_validator.py: initialize to None before try block and return context on path-resolution failure, preserving fail-open contract - Add _STALENESS_THRESHOLD_HOURS = 24 constant in compaction_context.py, replacing magic number per spec requirement - Fix docstring inaccuracy: _parse_timestamp_age clamps (not rejects) future timestamps - Fix false-positive in validate_recent_context: exclude empty/non-string content from membership check to prevent tool-call-heavy transcripts from inflating preservation rate Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

github-actions · 2026-03-04T04:10:47Z

🤖 Auto-fixed version bump

The version in pyproject.toml has been automatically bumped to the next patch version.

If you need a minor or major version bump instead, please update pyproject.toml manually and push the change.

github-actions · 2026-03-04T04:13:50Z

Repo Guardian - Passed

All files in this PR are durable repository content (production code, tests, and documentation). No ephemeral content detected.

AI generated by Repo Guardian

rysweet · 2026-03-04T04:15:37Z

Step 19a: Philosophy Compliance Review — PASSED

Reviewed commit: `764ab7ca` (fix: address blocking issues from PR #2867 review)

Files reviewed

`.claude/tools/amplihack/hooks/compaction_context.py` (146 lines)
`.claude/tools/amplihack/hooks/compaction_validator.py` (401 lines)

Compliance findings

Principle	Status	Notes
Ruthless simplicity	✅ Pass	`compaction_context.py` 146 lines, single responsibility. Validator reduced 523→401 lines (~23%).
Bricks & studs	✅ Pass	Clean extraction, `__all__` re-exports backward compat correctly.
Zero-BS / Fail-open	✅ Pass	`resolved_path = None` init (B1) ensures no NameError; `return context` on exception preserves fail-open.
No over-engineering	✅ Pass	`__post_init__` for computed fields is idiomatic Python, not over-engineering.
No magic numbers	✅ Pass	`_STALENESS_THRESHOLD_HOURS = 24` constant added (Q1).
Docstring accuracy	✅ Pass	"Clamps future timestamps" wording corrected (Q2).
Clean module boundaries	✅ Pass	Data in `compaction_context.py`, validation logic in `compaction_validator.py`.
Logging	✅ Pass	All 3 silent `pass` blocks replaced with `logger.warning(...)` using lazy `%r`/`%s` format strings.

Test results (confirmed locally)

tests/test_pre_compact_hook.py          9 passed
tests/test_context_preservation.py     12 passed
tests/test_session_start_integration.py 11 passed
                                        --------
Total compaction-related:               32 passed, 15 pre-existing skips

Pre-existing unrelated failures (34 collection errors from missing optional deps) are confirmed NOT introduced by this refactor.

Summary

Philosophy compliance: PASSED. All blocking (B1) and non-blocking (Q1, Q2, Q3) review feedback from commit 764ab7ca is correctly implemented. No further changes needed before merge.

rysweet · 2026-03-04T06:55:52Z

Extended Outside-In Testing Report for PR #2867

Reviewer: Claude Opus 4.6 (automated)
Branch: feat/2821-recipe-in-recipe-step
Total tests executed: 115 | Passed: 115 | Failed: 0

Existing Tests (84 tests) -- ALL PASS

Test Suite	Tests	Result
`outside-in/test_compaction_refactor_outside_in.py` (PR-included)	10	PASS
`test_compaction_context.py` (unit)	24	PASS
`test_compaction_refactor.py` (structural)	16	PASS
`test_compaction_validator.py` (unit)	22	PASS
`test_power_steering_compaction.py` (integration)	12	PASS

Additional Tests Written (31 tests) -- ALL PASS

Filed as outside-in/test_compaction_refactor_extended_outside_in.py. Each test runs as a subprocess to exercise the real import chain.

Scenario	Tests	Result	Description
Corrupt Data	3	PASS	Corrupt JSON, non-list events, non-dict items in list
Empty Transcripts	3	PASS	None transcript, empty list, empty list with compaction event
Path Traversal Security	2	PASS	Absolute path traversal (`/etc/passwd`) and relative (`../../etc/passwd`)
Stale Event Detection	3	PASS	>24h marked stale, <24h not stale, stale shows in diagnostic summary
_parse_timestamp_age Edge Cases	4	PASS	Offset-aware (+05:30), 35-char boundary, 36-char rejection, boolean input
Validator E2E: Data Loss	3	PASS	TODO loss detected, objective loss detected, no-loss passes
get_summary() Output	3	PASS	Passed with event, failed with warnings/recovery, plain passed
post_init Edge Cases	3	PASS	No-event skips computation, None timestamp default, garbage timestamp default
Backward Compatibility	1	PASS	power_steering_checker import chain works
all Exports	2	PASS	Correct exports, identity check (same object from both modules)
Security Violation Diagnostics	1	PASS	Violation appears in diagnostic summary
Multiple Events	1	PASS	Most recent event selected (by timestamp sort)
Thread Safety	1	PASS	16 concurrent CompactionContext instantiations via ThreadPoolExecutor
Fallback Behavior	1	PASS	`used_fallback=True` when pre-compaction transcript unavailable

Key Findings

1. Import Compatibility: PASS

from compaction_context import CompactionContext, ValidationResult works
from compaction_validator import CompactionContext, ValidationResult works (backward compat)
Both import paths resolve to the same class object (assertIs verified)
power_steering_checker.py import chain intact

2. Dataclass Behavior: PASS

__post_init__ correctly computes age_hours and is_stale when has_compaction_event=True and timestamp is valid
Skips computation when has_compaction_event=False (even with timestamp set)
Returns safe defaults (0.0, False) for None/garbage timestamps
No object.__setattr__ usage (verified by source inspection)
Mutable (not frozen) -- direct attribute assignment works

3. Validator Functionality: PASS

validate() returns passed=True when no compaction detected
TODO loss detection works end-to-end (pre-compaction TODOs vs post-compaction)
Objective loss detection works end-to-end
Recent context preservation validation works
Multiple events correctly sorted by timestamp, most recent used
Fallback to provided transcript when pre-compaction unavailable

4. Error Handling (error swallowing fix): PASS

Missing transcript file now emits logger.warning() instead of silent pass
Sort TypeError on non-sortable timestamps emits logger.warning()
Path resolve failures emit logger.warning()
All error paths still return fail-open (passed=True)
AST analysis confirms zero except ... pass blocks in both files

5. Edge Cases: PASS

Corrupt JSON: fails open, returns no-compaction context
Non-list events data (dict): fails open
Non-dict items in events list: silently skipped
None transcript: passes with no warnings
Empty list transcript: passes with no warnings
Path traversal (absolute and relative): has_security_violation=True, transcript not loaded
Stale events (>24h): is_stale=True, appears in diagnostic summary
Thread safety: 16 concurrent instantiations with no errors

6. Structural Constraints: PASS

compaction_validator.py is <= 420 lines (was 523)
No @dataclass classes remain in compaction_validator.py
No redundant _parse_timestamp_age recomputation in validator
__all__ includes all three public symbols

Verdict

PASS -- All 115 tests pass. The refactoring correctly extracts dataclasses to compaction_context.py while maintaining full backward compatibility, improving error visibility, and preserving fail-open semantics. No regressions found.

rysweet · 2026-03-04T15:46:05Z

Outside-In Testing Results (PR #2867)

Branch: feat/2821-recipe-in-recipe-step
Commit: 987e14b8
Method: uvx --from git+https://github.com/rysweet/amplihack.git@feat/2821-recipe-in-recipe-step amplihack <command> + cloned branch for module-level tests
Date: 2026-03-04

Scenario 1: Basic CLI Functionality (uvx)

Command: uvx --from git+<remote>@<branch> amplihack --help
Result: PASS

CLI installs (135 packages) and runs correctly
All 15 subcommands present: install, uninstall, launch, claude, RustyClawd, copilot, codex, amplifier, uvx-help, plugin, memory, new, recipe, mode
Help text renders cleanly with no import errors

Scenario 2: Recipe List (uvx - exercises parser + models)

Command: uvx --from git+<remote>@<branch> amplihack recipe list
Result: PASS

Lists 16 recipes successfully
Parser loads and parses all YAML recipe files without errors
No regressions in recipe discovery or display

Scenario 3: Recipe Validate (uvx - exercises parser changes)

Command: uvx --from git+<remote>@<branch> amplihack recipe validate <path>
Result: PASS

default-workflow.yaml: Valid
smart-orchestrator.yaml: Valid
Custom recipe with new StepType.RECIPE step: Valid

Scenario 4: New StepType.RECIPE Parsing (uvx)

Command: Created YAML with type: recipe step, validated via uvx
Result: PASS

Parser correctly recognizes type: recipe as a valid step type
Recipe steps with recipe field are correctly inferred as StepType.RECIPE
amplihack recipe show displays recipe steps as (recipe) type
amplihack recipe run --dry-run executes all 3 step types (bash, recipe, agent) in dry mode

Scenario 5: Compaction Module Backward Compatibility (cloned branch)

12 custom tests run from branch clone
Result: ALL 12 PASS

#	Test	Result
1	New import path: `from compaction_context import CompactionContext, ValidationResult`	PASS
2	Backward compat: `from compaction_validator import CompactionContext, ValidationResult`	PASS
3	Same class objects from both import paths (`is` identity)	PASS
4	`CompactionContext.__post_init__` computes `age_hours` and `is_stale`	PASS
5	`ValidationResult` construction with correct fields	PASS
6	`ValidationResult.get_summary()` returns "Validation: PASSED"	PASS
7	`_parse_timestamp_age()` with valid timestamp	PASS
8	`_parse_timestamp_age()` with bad timestamp -> fail-open `(0.0, False)`	PASS
9	`_parse_timestamp_age()` with non-string -> fail-open `(0.0, False)`	PASS
10	`CompactionContext.get_diagnostic_summary()`	PASS
11	`CompactionValidator` instantiates from `compaction_validator` module	PASS
12	`compaction_validator.__all__` exports exactly `{CompactionValidator, CompactionContext, ValidationResult}`	PASS

Scenario 6: Error Surfacing (silent pass -> logger.warning)

Result: PASS

Missing compaction events file: fail-open, no crash
Corrupt JSON in events file: fail-open, no crash
Missing transcript file: fail-open, event still detected

Scenario 7: PR's Own Outside-In Tests (branch clone)

Command: pytest .claude/tools/amplihack/hooks/tests/outside-in/test_compaction_refactor_outside_in.py -v
Result: 10/10 PASS in 0.46s

TestImportContract::test_compaction_context_importable_from_new_module PASSED
TestImportContract::test_validation_result_importable_from_new_module PASSED
TestImportContract::test_compaction_context_backward_compat_from_validator PASSED
TestImportContract::test_validation_result_backward_compat_from_validator PASSED
TestImportContract::test_post_init_computes_age_no_object_setattr PASSED
TestImportContract::test_parse_timestamp_age_importable PASSED
TestErrorSurfacing::test_missing_transcript_emits_warning_not_silent PASSED
TestErrorSurfacing::test_sort_type_error_emits_warning PASSED
TestErrorSurfacing::test_validate_no_compaction_returns_passed PASSED
TestErrorSurfacing::test_compaction_context_get_diagnostic_summary PASSED

Scenario 8: Full Compaction Unit Test Suite (branch clone)

Command: pytest test_compaction_context.py test_compaction_refactor.py test_compaction_validator.py -v
Result: 62/62 PASS in 0.13s

Scenario 9: Recipe Step Type Tests (branch clone)

Command: pytest tests/unit/recipes/test_recipe_step_type.py -v
Result: 19/19 PASS in 0.12s

Covers: enum member, value round-trip, dataclass fields, parser explicit/inferred types, validation warnings, runner happy path, context merging, failure propagation, missing recipe field, recipe not found, recursion depth guard (MAX_DEPTH=3), dry-run recipe steps.

Summary

Category	Tests	Result
CLI via uvx (Scenarios 1-4)	4 scenarios	ALL PASS
Compaction backward compat (Scenario 5)	12 tests	ALL PASS
Error surfacing (Scenario 6)	3 tests	ALL PASS
PR's own outside-in tests (Scenario 7)	10 tests	ALL PASS
Full compaction unit suite (Scenario 8)	62 tests	ALL PASS
Recipe step type tests (Scenario 9)	19 tests	ALL PASS
Total	110 tests across 9 scenarios	ALL PASS

No regressions detected. Both the compaction_validator refactor and the recipe-in-recipe step type changes work correctly as an external user.

Generated by outside-in-testing skill via uvx --from git+https://github.com/rysweet/amplihack.git@feat/2821-recipe-in-recipe-step

rysweet · 2026-03-05T02:20:32Z

Shadow Test Results: PR #2867 (compaction_validator.py refactor)

Test environment: Linux 6.17.0, Python 3.13.7, Claude Code 2.1.66
Branch: feat/2821-recipe-in-recipe-step
Date: 2026-03-05

Summary

Test	Status	Details
Test 1: Session triggering compaction	PASS (partial)	Session completed, no hook errors. Compaction not triggered (session too short).
Test 2: Hook imports	PASS	All imports work correctly from both modules.
Test 3: Multi-turn session	INCONCLUSIVE	Nested Claude session timed out (environment limitation).
Test 4: Hook error logs	PASS	No compaction-related errors in any logs.
Test 5: Backward compatibility	PASS	All validation methods, security checks, and data handling work correctly.
PR test suite	PASS	62/62 tests passed (24 context + 16 refactor + 10 outside-in + 22 validator)

Detailed Results

Test 1: Session That Triggers Compaction

Session completed without errors (exit code 0)
No stderr output (no hook errors)
Compaction was NOT triggered (single -p call produces ~83 lines, well below compaction threshold)
Note: The PreCompact hook is NOT registered in settings.json (only PostToolUse, PreToolUse, SessionStart are configured). This means pre_compact.py would not fire even in sessions long enough for compaction. This appears to be a pre-existing issue, not introduced by this PR.

Test 2: Hook Imports (PASS)

2a - Import from new module:

CompactionContext fields: ['has_compaction_event', 'turn_at_compaction', 'messages_removed',
  'pre_compaction_transcript', 'timestamp', 'is_stale', 'age_hours', 'has_security_violation']
ValidationResult fields: ['passed', 'warnings', 'recovery_steps', 'compaction_context', 'used_fallback']
PASS: All imports work from compaction_context

2b - Backward compatibility (import from compaction_validator):

CompactionContext and ValidationResult are the SAME object in both modules (identity check passed)
__all__ correctly exports: ['CompactionValidator', 'CompactionContext', 'ValidationResult']
PASS: Backward compatibility preserved

2c - Security hardening in _parse_timestamp_age:
All new security checks passed:

Non-string input (int, None) -> (0.0, False)
Oversized input (100 chars) -> rejected
Future timestamp -> clamped to 0.0
Very old timestamp (>10yr) -> rejected
Invalid format -> graceful fallback
Empty string -> graceful fallback

Test 3: Multi-Turn Session (INCONCLUSIVE)

Nested Claude Code sessions cannot run inside an active Claude Code session (environment limitation). The process accepted the command but produced no output before timing out. No errors were emitted on stderr, which confirms the hook loading is clean.

Test 4: Hook Error Logs (PASS)

No compaction-related errors in Claude runtime logs
No ImportError, AttributeError, or TypeError from refactored modules
XPIA security logs show clean session start with no compaction warnings
Session JSONL files contain no compaction error entries

Test 5: Backward Compatibility Under Load (PASS)

All sub-tests passed:

5a: No events file -> graceful fail-open (has_compaction_event=False)
5b: With compaction event -> correctly loads event, computes age_hours and is_stale
5c: Multiple events, different sessions -> filters correctly by session_id
5d: Invalid events data (dict instead of list) -> graceful handling (new guard in PR)
5e: Events with non-dict entries -> filters non-dict entries, finds valid event (new guard in PR)
5f: validate_todos (preserved) -> passed=True
5g: validate_todos (missing) -> passed=False with recovery steps
5h: validate_objectives -> works correctly
5i: validate_recent_context -> works correctly
5j: Full validate() method -> passed=True with compaction context
5k: validate() with None transcript -> graceful handling
5l: validate() with empty transcript -> graceful handling
5m: Path traversal (/etc/passwd) -> has_security_violation=True
5n: Relative path traversal (../../../etc/shadow) -> has_security_violation=True
5o: ValidationResult summaries -> correct output for passed and failed cases

PR Test Suite (62/62 PASS)

test_compaction_context.py:           24/24 passed (0.06s)
test_compaction_refactor.py:          16/16 passed (0.06s)
test_compaction_refactor_outside_in:  10/10 passed (0.45s)
test_compaction_validator.py:         22/22 passed (0.08s)

Issues Found

FINDING 1: `amplihack install` Does Not Copy `compaction_context.py` (MEDIUM)

The uvx --from git+...@branch amplihack install command successfully installed the PR branch, but the new compaction_context.py file was NOT copied to ~/.claude/tools/amplihack/hooks/. The installed compaction_validator.py was the main branch version (still contains inline dataclasses), not the PR branch version.

This means the refactor would not take effect for users who install via amplihack install. The installer likely has an allowlist of files to copy, or uses a snapshot that does not include new files added in the PR.

Workaround: Manual copy of PR files to the hooks directory works correctly.

FINDING 2: PreCompact Hook Not Registered in settings.json (PRE-EXISTING)

The settings.json only registers PostToolUse, PreToolUse, and SessionStart hooks. The PreCompact hook is not configured, meaning pre_compact.py (and by extension the compaction validator) would never fire during real Claude Code sessions. This is a pre-existing issue not introduced by this PR.

FINDING 3: `__post_init__` Uses Direct Assignment (BEHAVIORAL CHANGE)

The old code used object.__setattr__() in __post_init__, the new code uses direct assignment (self.age_hours = age_hours). The PR's own test test_post_init_uses_direct_assignment_not_object_setattr explicitly verifies this change. This is a simplification that works correctly since CompactionContext is not a frozen dataclass.

Conclusion

The refactor is functionally correct -- all imports, backward compatibility, validation logic, and security checks work properly. The 62 test suite passes cleanly. The main concern is Finding 1 (installer not copying the new file), which would prevent the refactor from taking effect in production installations.

github-actions · 2026-03-05T13:16:16Z

Triage Report - STANDARD REVIEW

Risk Level: LOW
Priority: MEDIUM
Status: Ready for review

Analysis

Changes: +1,478/-157 across 11 files (6 commits)
Type: Refactoring with improvements
Age: 33.3 hours

Improvements

✅ Extracts CompactionContext/ValidationResult to dedicated module
✅ Reduces compaction_validator.py from 523 → 401 lines
✅ Replaces 3 silent pass blocks with proper logging
✅ Removes unnecessary object.__setattr__ pattern

Risk Assessment

Low risk - clean module extraction with error handling improvements.

Next Steps

Review extracted module structure
Verify error logging improvements work correctly
Check backward compatibility
Standard code review process

Recommendation: STANDARD_REVIEW - straightforward refactor with quality improvements.

Context: Part of issue #2845 quality audit initiative.

AI generated by PR Triage Agent

…ntext.py (issue #2845) - Extract _parse_timestamp_age, CompactionContext, ValidationResult to compaction_context.py - compaction_validator.py reduced from 523 to 401 lines - Replace 3 silent pass blocks with logger.warning() calls - Remove unnecessary object.__setattr__ on non-frozen dataclass - Remove redundant _parse_timestamp_age recomputation (post_init handles it) - Add __all__ export for backward compatibility - All 34 compaction tests pass Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…2845) - Add test_compaction_refactor_outside_in.py: 10 subprocess-based black-box tests verifying user-visible behavior of the compaction_validator refactor - Add test_compaction_context.py: 24 TDD unit tests for the new compaction_context.py module - Add test_compaction_refactor.py: 16 regression tests for the refactored compaction_validator.py All 10 outside-in tests pass; total compaction test suite: 62 passing. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- Fix `resolved_path` unbound NameError in compaction_validator.py: initialize to None before try block and return context on path-resolution failure, preserving fail-open contract - Add _STALENESS_THRESHOLD_HOURS = 24 constant in compaction_context.py, replacing magic number per spec requirement - Fix docstring inaccuracy: _parse_timestamp_age clamps (not rejects) future timestamps - Fix false-positive in validate_recent_context: exclude empty/non-string content from membership check to prevent tool-call-heavy transcripts from inflating preservation rate Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

github-actions · 2026-03-05T15:11:18Z

🤖 Auto-fixed version bump

The version in pyproject.toml has been automatically bumped to the next patch version.

If you need a minor or major version bump instead, please update pyproject.toml manually and push the change.

github-actions · 2026-03-05T15:13:54Z

Repo Guardian - Passed

✅ All files in this PR are durable code assets. No ephemeral content detected.

Files reviewed:

.claude/tools/amplihack/hooks/compaction_context.py - New production module
.claude/tools/amplihack/hooks/compaction_validator.py - Modified production module
.claude/tools/amplihack/hooks/tests/outside-in/test_compaction_refactor_outside_in.py - Test suite
.claude/tools/amplihack/hooks/tests/test_compaction_context.py - Test suite
.claude/tools/amplihack/hooks/tests/test_compaction_refactor.py - Test suite
pyproject.toml - Configuration file

All files represent permanent codebase infrastructure (production code, test coverage, and project configuration).

AI generated by Repo Guardian

rysweet commented Mar 4, 2026

View reviewed changes

github-actions bot mentioned this pull request Mar 4, 2026

[PR Triage Report] PR Triage Report: 8 Open PRs - 5 Refactoring (Issue #2845), 2 Features, 1 Docs #2878

Closed

github-actions bot mentioned this pull request Mar 5, 2026

feat: distributed hive mind with DHT sharding #2876

Open

6 tasks

Ubuntu and others added 3 commits March 5, 2026 15:10

rysweet force-pushed the feat/2821-recipe-in-recipe-step branch from 987e14b to 2d4e442 Compare March 5, 2026 15:10

[skip ci] chore: Auto-bump patch version

cc00b48

Conversation

rysweet commented Mar 4, 2026

Summary

Step 13: Local Testing Results

Scenario 1 — Import contract: dataclass extraction to compaction_context.py

Scenario 2 — Error surfacing: silent pass → logger.warning + fail-open regression

Test plan

Uh oh!

github-actions bot commented Mar 4, 2026

Uh oh!

github-actions bot commented Mar 4, 2026

Repo Guardian - Passed

Uh oh!

rysweet left a comment

Choose a reason for hiding this comment

Code Review — compaction_validator refactor (issue #2845)

🔴 BUG: resolved_path may be unbound — breaks fail-open contract

🟡 P1 requirement not applied: magic number 24 still present

🟡 Docstring inaccuracy in _parse_timestamp_age

🟡 validate_recent_context: set-based content matching has false-positive risk

✅ Confirmed correct

Uh oh!

rysweet left a comment

Choose a reason for hiding this comment

Code Review — compaction_validator refactor (issue #2845)

BUG: resolved_path may be unbound — breaks fail-open contract

P1 requirement not applied: magic number 24 still present

Docstring inaccuracy in _parse_timestamp_age

validate_recent_context: set-based content matching has false-positive risk

Confirmed correct

Uh oh!

rysweet commented Mar 4, 2026

Code Review — compaction_validator refactor (issue #2845)

BUG: resolved_path may be unbound — breaks fail-open contract

P1 requirement not applied: magic number 24 still present

Docstring inaccuracy in _parse_timestamp_age

validate_recent_context: set-based content matching has false-positive risk

Confirmed correct

Uh oh!

rysweet commented Mar 4, 2026

Security Review — compaction_validator.py

🔴 CRITICAL: resolved_path unbound — NameError breaks fail-open contract

🟡 FALSE-POSITIVE RISK: Empty-content messages inflate preservation rate

🟡 DOCSTRING INACCURACY: _parse_timestamp_age misleads reviewers

✅ Security controls verified correct

Summary

Uh oh!

rysweet commented Mar 4, 2026

Philosophy Guardian Review — Step 17d

✅ Ruthless Simplicity

✅ Bricks & Studs Pattern

❌ Zero-BS — FAIL (resolved_path unbound, breaks fail-open contract)

✅ No Over-Engineering

✅ Clean Module Boundaries

Philosophy Compliance Summary

Uh oh!

github-actions bot commented Mar 4, 2026

Uh oh!

github-actions bot commented Mar 4, 2026

Repo Guardian - Passed

Uh oh!

rysweet commented Mar 4, 2026

Step 19a: Philosophy Compliance Review — PASSED

Files reviewed

Compliance findings

Test results (confirmed locally)

Summary

Uh oh!

rysweet commented Mar 4, 2026

Extended Outside-In Testing Report for PR #2867

Existing Tests (84 tests) -- ALL PASS

Additional Tests Written (31 tests) -- ALL PASS

Key Findings

Verdict

Uh oh!

rysweet commented Mar 4, 2026

Outside-In Testing Results (PR #2867)

Scenario 1: Basic CLI Functionality (uvx)

Scenario 2: Recipe List (uvx - exercises parser + models)

Scenario 3: Recipe Validate (uvx - exercises parser changes)

🔴 BUG: `resolved_path` may be unbound — breaks fail-open contract

🟡 P1 requirement not applied: magic number `24` still present

🟡 Docstring inaccuracy in `_parse_timestamp_age`

🟡 `validate_recent_context`: set-based content matching has false-positive risk

BUG: `resolved_path` may be unbound — breaks fail-open contract

P1 requirement not applied: magic number `24` still present

Docstring inaccuracy in `_parse_timestamp_age`

`validate_recent_context`: set-based content matching has false-positive risk

🔴 CRITICAL: `resolved_path` unbound — NameError breaks fail-open contract

🟡 DOCSTRING INACCURACY: `_parse_timestamp_age` misleads reviewers

FINDING 1: `amplihack install` Does Not Copy `compaction_context.py` (MEDIUM)

FINDING 3: `__post_init__` Uses Direct Assignment (BEHAVIORAL CHANGE)