Problem
src/millstone/policy/schemas.py contains complex multi-layer fallback parsing logic (JSON → regex → heuristics) for:
- parse_review_decision
- parse_sanity_result
- parse_builder_completion
- parse_design_review
These parsers are safety gates: a parsing bug could cause millstone to approve bad code or reject good code. The fallback paths currently have no unit tests; they are exercised only indirectly through integration tests.
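To make the shape of the problem concrete, here is a minimal sketch of a JSON → regex → heuristics chain. It is not the code in schemas.py: `parse_decision_sketch`, the `decision` key, the `approve`/`reject` values, and the returned layer name are all assumptions made for illustration.

```python
import json
import re
from typing import Optional, Tuple

# Hypothetical verdict values; the real schemas may use different ones.
VALID = ("approve", "reject")


def parse_decision_sketch(text: str) -> Tuple[Optional[str], str]:
    """Return (decision, layer); layer names the fallback step that matched."""
    # Layer 1: structured JSON, either the whole input or the outermost
    # {...} span found inside surrounding prose or a code fence.
    candidates = [text]
    embedded = re.search(r"\{.*\}", text, re.DOTALL)
    if embedded:
        candidates.append(embedded.group(0))
    for candidate in candidates:
        try:
            data = json.loads(candidate)
        except ValueError:  # json.JSONDecodeError is a ValueError subclass
            continue
        if isinstance(data, dict) and data.get("decision") in VALID:
            return data["decision"], "json"

    # Layer 2: regex for a decision key/value pair in malformed JSON.
    pair = re.search(
        r'"?decision"?\s*[:=]\s*"?(approve|reject)\b', text, re.IGNORECASE
    )
    if pair:
        return pair.group(1).lower(), "regex"

    # Layer 3: heuristic on the first non-blank line of the input.
    stripped = text.strip()
    first_line = stripped.splitlines()[0].lower() if stripped else ""
    for verdict in VALID:
        if first_line.startswith(verdict):
            return verdict, "heuristic"

    # Nothing matched: report no decision rather than guessing.
    return None, "none"
```

Each layer only runs when the one before it fails, so a bug in layer 2 or 3 stays invisible as long as the model emits clean JSON. That is why the later layers need direct unit tests.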
Fix
Add tests/unit/test_schemas.py (~200 LoC) covering each parser with:
- Valid structured JSON input
- JSON embedded in prose ("Here is my review:\n```json\n{...}")
- Malformed JSON that triggers regex fallback
- Completely invalid/garbage input
- Edge cases: empty string, whitespace-only, non-ASCII
These tests pin down the behavior of each layer in the fallback chain, so a regression in the parsing logic fails a fast unit test instead of surfacing only through integration tests.
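One possible layout for the test file is a single table of inputs, one row per category listed above, that every parser is run against. The sketch below is hedged: `CASES`, `failing_cases`, and the expected values are hypothetical, and it assumes a parser that takes a string and returns "approve", "reject", or None. In particular, whether malformed JSON should be recovered or rejected is a policy choice that the real tests would need to match.

```python
from typing import Callable, List, Optional, Tuple

FENCE = "`" * 3  # built at runtime to keep a literal fence out of this example

# (category, input text, expected decision) -- all values are illustrative.
CASES: List[Tuple[str, str, Optional[str]]] = [
    ("structured_json", '{"decision": "approve"}', "approve"),
    (
        "json_in_prose",
        'Here is my review:\n' + FENCE + 'json\n{"decision": "reject"}\n' + FENCE,
        "reject",
    ),
    ("malformed_json", '{"decision": "approve",', "approve"),
    ("garbage", "%%% not a review %%%", None),
    ("empty", "", None),
    ("whitespace_only", " \n\t ", None),
    ("non_ascii", "d\u00e9cision: \u2705 \u627f\u8a8d", None),
]


def failing_cases(
    parser: Callable[[str], Optional[str]],
    cases: List[Tuple[str, str, Optional[str]]] = CASES,
) -> List[str]:
    """Run parser over every case and describe each mismatch or exception."""
    failures = []
    for name, text, expected in cases:
        try:
            actual = parser(text)
        except Exception as exc:  # a safety gate should not crash on bad input
            failures.append(f"{name}: raised {type(exc).__name__}")
            continue
        if actual != expected:
            failures.append(f"{name}: expected {expected!r}, got {actual!r}")
    return failures
```

In tests/unit/test_schemas.py the same table could feed `pytest.mark.parametrize`, so each category shows up as its own test ID, and a per-parser variant of the table would cover the differing return types of the four functions.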