
feat: LCD-anchored confidence scoring algorithm (v2) #32

Merged
rsalus merged 7 commits into main from confidence-algorithm-v2
Feb 22, 2026
Conversation


@rsalus rsalus commented Feb 22, 2026

Summary

Replaces the naive 3-tier fixed confidence scoring (90%/70%/60%) with a weighted, LCD-anchored algorithm that evaluates individual policy criteria against extracted clinical evidence. Introduces a PolicyRegistry with 5 real CMS Local Coverage Determination seed policies and a generic fallback, enabling granular confidence scoring with bypass logic, hard gates, and empirically-derived recommendations.

Changes

  • PolicyCriterion / PolicyDefinition models — Pydantic models with weight, required flag, LCD section reference, and bypass lists
  • Confidence scorer — Weighted algorithm: Σ(weight × status × confidence) / Σ(weight × confidence) with hard gate penalties for required NOT_MET criteria and configurable floor/ceiling
  • PolicyRegistry — Singleton registry resolving CPT codes to LCD-backed policies with generic fallback
  • 5 LCD seed policies — MRI Lumbar (L34220), MRI Brain (L37373), TKA (L36575), Physical Therapy (L34049), Epidural Steroid (L39240)
  • Evidence extractor enhancement — Accepts PolicyDefinition/dict union, includes LCD section context in LLM prompts, parses confidence levels from LLM output
  • Form generator refactor — Delegates scoring to confidence_scorer, includes policy_id/lcd_reference in PAFormResponse
  • Analyze endpoint cleanup — Uses registry.resolve() instead of hardcoded policy, removes unsupported CPT rejection
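
As a rough illustration of the weighted formula above, here is a minimal Python sketch of such a scorer. The class names, the status-to-value mapping, and the floor/ceiling/gate constants are illustrative assumptions, not the PR's actual code, and bypass handling is omitted for brevity:

```python
from dataclasses import dataclass

# Hypothetical minimal stand-ins for the PR's PolicyCriterion / evidence types.
@dataclass
class Criterion:
    id: str
    weight: float            # 0.0-1.0, clinical importance
    required: bool = False   # hard gate when NOT_MET

@dataclass
class Evidence:
    criterion_id: str
    status: str              # "MET", "NOT_MET", or "UNCLEAR"
    confidence: float        # extractor-reported confidence, 0.0-1.0

# Assumed numeric value for each status.
STATUS_VALUE = {"MET": 1.0, "UNCLEAR": 0.5, "NOT_MET": 0.0}

def calculate_confidence(evidence, criteria, floor=0.1, ceiling=0.98, gate_cap=0.4):
    """Weighted score: sum(weight * status * confidence) / sum(weight * confidence),
    capped when a required criterion is NOT_MET, then clamped to [floor, ceiling]."""
    by_id = {c.id: c for c in criteria}
    num = den = 0.0
    gated = False
    for e in evidence:
        c = by_id.get(e.criterion_id)
        if c is None:
            continue
        num += c.weight * STATUS_VALUE[e.status] * e.confidence
        den += c.weight * e.confidence
        if c.required and e.status == "NOT_MET":
            gated = True  # hard gate: a required criterion failed
    score = num / den if den else 0.0
    if gated:
        score = min(score, gate_cap)
    return max(floor, min(ceiling, score))
```

With every criterion MET at full confidence the score hits the ceiling; a required NOT_MET criterion caps it at the gate value regardless of how the rest scored.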

Test Plan

  • 93 Intelligence tests passing (29 baseline + 64 new)
  • PolicyCriterion/PolicyDefinition model validation (5 tests)
  • Confidence scorer edge cases: all MET, all NOT_MET, bypass logic, hard gates, empty evidence (13 tests)
  • Generic policy builder (6 tests)
  • Registry resolve with seed/fallback (7 tests)
  • All 5 seed policies validated via parametrize (19 tests)
  • PAFormResponse backward compatibility (4 tests)
  • Evidence extractor LCD context injection (5 new tests)
  • Form generator delegation to scorer (4 new tests)
  • Analyze endpoint registry integration (2 new tests)

Results: 93 tests pass | +1,266 / -163 lines | 23 files
Design: docs/designs/2026-02-22-confidence-algorithm-v2.md
Plan: docs/plans/2026-02-22-confidence-algorithm-v2.md

rsalus and others added 6 commits February 21, 2026 19:36
… policy

- T001: PolicyCriterion + PolicyDefinition Pydantic models with LCD metadata
- T002: Weighted LCD compliance scoring algorithm with bypass logic
- T003: Generic fallback policy builder for unsupported procedure codes

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- T001: PolicyCriterion + PolicyDefinition data models
- T003: Generic fallback policy for unsupported procedure codes
- T004: PolicyRegistry with resolve() and generic fallback
- T007: 5 seed policies (MRI Lumbar L34220, MRI Brain L37373, TKA L36575, PT L34049, ESI L39240)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- T005: Add optional policy_id and lcd_reference to PAFormResponse
- T006: Evidence extractor accepts PolicyDefinition, includes LCD context in prompts, parses confidence signals

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
# Conflicts:
#	apps/intelligence/src/models/policy.py
- T008: Form generator delegates scoring to ConfidenceScorer, includes policy metadata
  - generate_form_data now accepts PolicyDefinition instead of dict
  - Recommendation and confidence_score come from calculate_confidence()
  - PAFormResponse includes policy_id and lcd_reference fields
- T009: Analyze endpoint uses PolicyRegistry.resolve() instead of hardcoded policy
  - Removed SUPPORTED_PROCEDURE_CODES gate (unknown CPTs get generic policy)
  - Removed EXAMPLE_POLICY import and dead _build_field_mappings helper
  - evidence_extractor accepts PolicyDefinition | dict via _normalize_criteria()
- T010: All 34 tests pass

Prerequisite modules created:
  - src/models/policy.py: PolicyCriterion, PolicyDefinition
  - src/reasoning/confidence_scorer.py: ScoreResult, calculate_confidence
  - src/policies/generic_policy.py: build_generic_policy
  - src/policies/registry.py: PolicyRegistry with 5 seed LCD policies

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

rsalus commented Feb 22, 2026

This stack of pull requests is managed by Graphite. Learn more about stacking.


coderabbitai bot commented Feb 22, 2026

Caution

Review failed

The pull request is closed.

📝 Walkthrough

Introduces LCD-backed PolicyDefinition models, a PolicyRegistry with seeded policies and generic fallback, a weighted confidence scorer, and wiring to use registry-resolved policies in evidence extraction, scoring, and form generation; removes hard-coded CPT validation and related mapping helpers.

Changes

  • Policy Models & PA Form — apps/intelligence/src/models/policy.py, apps/intelligence/src/models/pa_form.py: Adds PolicyCriterion and PolicyDefinition models; extends PAFormResponse with optional policy_id and lcd_reference.
  • Policy Registry & Generic Fallback — apps/intelligence/src/policies/registry.py, apps/intelligence/src/policies/generic_policy.py: Implements an in-memory PolicyRegistry with register/resolve and a build_generic_policy fallback for unknown CPTs; exposes a module-level registry.
  • Seed Policies — apps/intelligence/src/policies/seed/__init__.py, apps/intelligence/src/policies/seed/*.py: Adds LCD-backed seed policy modules (mri_lumbar, mri_brain, tka, physical_therapy, epidural_steroid) and an auto-registration helper, register_all_seeds.
  • Evidence Extraction — apps/intelligence/src/reasoning/evidence_extractor.py: Updates evaluation/extraction to accept PolicyCriterion/PolicyDefinition, include lcd_section in prompts, parse explicit confidence signals, add a clinical summary helper, and preserve bounded concurrency.
  • Confidence Scorer — apps/intelligence/src/reasoning/confidence_scorer.py: Adds weighted scoring with bypass handling, gating for required NOT_MET criteria, score clamping, and recommendation mapping; exports ScoreResult and calculate_confidence.
  • Form Generation — apps/intelligence/src/reasoning/form_generator.py: Switches to PolicyDefinition input; delegates the recommendation to calculate_confidence; populates policy_id and lcd_reference on output; removes inline eligibility logic.
  • API / Analyze Flow — apps/intelligence/src/api/analyze.py: Removes the hard-coded SUPPORTED_PROCEDURE_CODES gate and _build_field_mappings; resolves policies via registry.resolve(procedure_code) and uses policy-driven evidence/form flows.
  • Tests — apps/intelligence/src/tests/*: Adds and updates tests for policy models, registry, seed policies, confidence scorer, evidence extractor, form generator, the PAFormResponse model, and analyze API behavior to reflect registry/generic-policy and scorer behavior.
  • Other — apps/gateway/Gateway.API/Services/PostgresPARequestStore.cs: Adjusts PA request ID generation to iterate existing IDs and compute the next sequential counter, filtering out non-conforming IDs.
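
The register-then-resolve-with-fallback behavior described above can be sketched as follows. This uses plain dicts and illustrative field names; the PR itself uses Pydantic PolicyDefinition models:

```python
def build_generic_policy(cpt: str) -> dict:
    # Fallback policy for CPT codes without an LCD-backed seed policy.
    return {"policy_id": f"GENERIC-{cpt}", "procedure_codes": [cpt], "lcd_reference": None}

class PolicyRegistry:
    """In-memory CPT -> policy lookup with a generic fallback."""

    def __init__(self) -> None:
        self._by_cpt: dict[str, dict] = {}

    def register(self, policy: dict) -> None:
        # A policy may cover several CPT codes; index it under each one.
        for cpt in policy["procedure_codes"]:
            self._by_cpt[cpt] = policy

    def resolve(self, cpt: str) -> dict:
        # Unknown CPTs get a generic policy instead of being rejected.
        return self._by_cpt.get(cpt) or build_generic_policy(cpt)
```

This shape is why the analyze endpoint can drop its hard-coded supported-codes gate: every CPT resolves to some policy, LCD-backed or generic.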

Sequence Diagram

sequenceDiagram
    participant Client
    participant AnalyzeAPI as Analyze API
    participant Registry as PolicyRegistry
    participant Extractor as EvidenceExtractor
    participant LLM
    participant Scorer as ConfidenceScorer
    participant Generator as FormGenerator

    Client->>AnalyzeAPI: POST /analyze (clinical_bundle, procedure_code)
    AnalyzeAPI->>Registry: resolve(procedure_code)
    Registry-->>AnalyzeAPI: PolicyDefinition
    AnalyzeAPI->>Extractor: extract_evidence(clinical_bundle, policy)
    Extractor->>LLM: evaluate_criterion (concurrent per criterion)
    LLM-->>Extractor: assessment + confidence
    Extractor-->>AnalyzeAPI: list[EvidenceItem]
    AnalyzeAPI->>Scorer: calculate_confidence(evidence, policy)
    Scorer-->>AnalyzeAPI: ScoreResult(score, recommendation)
    AnalyzeAPI->>Generator: generate_form_data(clinical_bundle, evidence, policy)
    Generator-->>AnalyzeAPI: PAFormResponse (includes policy_id, lcd_reference)
    AnalyzeAPI-->>Client: PAFormResponse

Estimated Code Review Effort

🎯 3 (Moderate) | ⏱️ ~25 minutes


Poem

🌿 Policies once fixed in stone now roam,
Registry seeds give each CPT a home.
LLMs whisper status, confidences rise,
Scorer weighs truth where bypass logic lies.
Forms return with LCD IDs — onward we comb! ✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
  • Title check — Passed: Title clearly captures the main change: introducing an LCD-anchored confidence scoring algorithm with weighted criteria evaluation and a policy registry.
  • Description check — Passed: Description is well-structured and directly related to the changeset, covering the new models, confidence scorer, policy registry, seed policies, and integration points.
  • Docstring coverage — Passed: Docstring coverage is 82.02%, which is sufficient. The required threshold is 80.00%.


@rsalus rsalus changed the title feat: add policy data models, confidence scorer, and generic fallback policy feat: LCD-anchored confidence scoring algorithm (v2) Feb 22, 2026

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 6

🧹 Nitpick comments (4)
apps/intelligence/src/reasoning/confidence_scorer.py (1)

17-21: Consider using Pydantic BaseModel for consistency.

The codebase uses Pydantic for data validation (per coding guidelines). While dataclass works fine for this simple container, using BaseModel would maintain consistency and provide validation/serialization for free.

♻️ Optional: Convert to Pydantic model
-from dataclasses import dataclass
+from pydantic import BaseModel


-@dataclass
-class ScoreResult:
-    score: float
-    recommendation: Literal["APPROVE", "MANUAL_REVIEW", "NEED_INFO"]
+class ScoreResult(BaseModel):
+    score: float
+    recommendation: Literal["APPROVE", "MANUAL_REVIEW", "NEED_INFO"]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/intelligence/src/reasoning/confidence_scorer.py` around lines 17 - 21,
Replace the dataclass ScoreResult with a Pydantic BaseModel to align with
project validation/serialization conventions: convert the class named
ScoreResult to inherit from pydantic.BaseModel, keep the fields score: float and
recommendation: Literal["APPROVE","MANUAL_REVIEW","NEED_INFO"], and add any
needed type validators or field constraints (e.g., ensure score is within
expected range) so serialization and validation behave like other models in the
codebase.
apps/intelligence/src/tests/test_confidence_scorer.py (1)

8-9: Parameter id shadows builtin.

Minor style issue: id shadows the Python builtin. Consider renaming to criterion_id for clarity.

Rename parameter
-def _make_criterion(id: str, weight: float, required: bool = False, bypasses: list[str] | None = None) -> PolicyCriterion:
-    return PolicyCriterion(id=id, description=f"Test {id}", weight=weight, required=required, bypasses=bypasses or [])
+def _make_criterion(criterion_id: str, weight: float, required: bool = False, bypasses: list[str] | None = None) -> PolicyCriterion:
+    return PolicyCriterion(id=criterion_id, description=f"Test {criterion_id}", weight=weight, required=required, bypasses=bypasses or [])
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/intelligence/src/tests/test_confidence_scorer.py` around lines 8 - 9,
The helper function _make_criterion currently uses parameter name id which
shadows the built-in; rename the parameter to criterion_id (and update its type
hint) and replace all uses inside the function (e.g., the
PolicyCriterion(id=...) argument) and any test calls in this file to use
criterion_id instead of id so the code no longer hides the builtin and remains
clear; ensure the function signature and all references (calls to
_make_criterion) are updated consistently.
apps/intelligence/src/policies/registry.py (1)

13-15: Consider warning on CPT collision during registration.

If two policies claim the same CPT code, the second silently overwrites the first. This is likely fine for the current seed-only use case, but could cause subtle bugs if policies are dynamically registered later.

Optional: Add collision detection
     def register(self, policy: PolicyDefinition) -> None:
         for cpt in policy.procedure_codes:
+            if cpt in self._by_cpt:
+                import logging
+                logging.warning(f"CPT {cpt} already registered; overwriting with {policy.policy_id}")
             self._by_cpt[cpt] = policy
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/intelligence/src/policies/registry.py` around lines 13 - 15, The
register method currently overwrites existing entries in self._by_cpt when two
PolicyDefinition objects share a procedure_codes value; update register(self,
policy: PolicyDefinition) to detect collisions by checking "if cpt in
self._by_cpt" before assignment and emit a warning (using the module logger or
warnings.warn) that includes the CPT value and both the existing policy and the
incoming policy identifiers (e.g., policy.name or repr(policy) and
repr(self._by_cpt[cpt])) so developers can see which policy is being replaced,
then continue to assign self._by_cpt[cpt] = policy.
apps/intelligence/src/models/policy.py (1)

6-14: Consider adding Field constraints for weight.

The comment indicates weight should be 0.0-1.0, but there's no validation. Malformed policy definitions could produce unexpected confidence scores.

Add Field validation
+from pydantic import BaseModel, Field
-from pydantic import BaseModel


 class PolicyCriterion(BaseModel):
     """A single criterion from a coverage policy."""

     id: str
     description: str
-    weight: float  # 0.0-1.0, clinical importance
+    weight: float = Field(ge=0.0, le=1.0, description="Clinical importance weight")
     required: bool = False  # Hard gate — if NOT_MET, caps score
     lcd_section: str | None = None  # e.g. "L34220 §4.2"
     bypasses: list[str] = []  # criterion IDs this one bypasses when MET
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/intelligence/src/models/policy.py` around lines 6 - 14, The
PolicyCriterion model lacks validation for weight and uses a mutable default for
bypasses; update the weight field in class PolicyCriterion to enforce 0.0-1.0
bounds using a Pydantic Field (e.g., Field(ge=0.0, le=1.0)) and change bypasses
to use a safe default factory (e.g., list) instead of a literal empty list;
import Field from pydantic if not already present so validation applies at model
init.

Comment on lines +11 to +13
def register_all_seeds(registry) -> None:
for policy in ALL_SEED_POLICIES:
registry.register(policy)

⚠️ Potential issue | 🟡 Minor

Add type annotation for registry parameter.

Per coding guidelines, all functions must have complete type annotations.

🔧 Add type hint
+from typing import TYPE_CHECKING
+
+if TYPE_CHECKING:
+    from src.policies.registry import PolicyRegistry
+
 
-def register_all_seeds(registry) -> None:
+def register_all_seeds(registry: "PolicyRegistry") -> None:
     for policy in ALL_SEED_POLICIES:
         registry.register(policy)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/intelligence/src/policies/seed/__init__.py` around lines 11 - 13, The
function register_all_seeds lacks a type annotation for its registry parameter;
update its signature to include the appropriate registry interface/type (the
object expected by register_all_seeds that exposes register), e.g., annotate
registry with the correct protocol or concrete class used in this module so
callers and linters know its type (refer to register_all_seeds,
ALL_SEED_POLICIES, and the registry.register usage to determine the proper type
to import and use).

Comment on lines +34 to +41
if isinstance(criterion, PolicyCriterion):
criterion_id = criterion.id
criterion_desc = criterion.description
lcd_section = criterion.lcd_section
else:
criterion_id = criterion.get("id", "unknown")
criterion_desc = criterion.get("description", "")
lcd_section = None

⚠️ Potential issue | 🟡 Minor

Preserve lcd_section for dict-based criteria.
Currently Line 41 forces lcd_section = None, so LCD context is dropped even if provided in dict form.

🛠️ Suggested fix
-        lcd_section = None
+        lcd_section = criterion.get("lcd_section")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/intelligence/src/reasoning/evidence_extractor.py` around lines 34 - 41,
The dict branch drops lcd_section by setting it to None; change it to read
lcd_section from the dict (e.g., lcd_section = criterion.get("lcd_section")) so
that any provided LCD context is preserved; update the code around the
PolicyCriterion check where criterion_id, criterion_desc, and lcd_section are
set to use criterion.get("id", "unknown"), criterion.get("description", ""), and
criterion.get("lcd_section") respectively (defaulting to None if absent).
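
For context on the extractor's "parse explicit confidence signals" behavior mentioned in the walkthrough, one plausible shape for such a parser is sketched below. The CONFIDENCE: marker format and the level-to-number mapping are assumptions for illustration, not the PR's actual prompt contract:

```python
import re

# Assumed mapping from a textual confidence marker to a numeric level.
CONFIDENCE_LEVELS = {"HIGH": 0.9, "MEDIUM": 0.7, "LOW": 0.5}

def parse_confidence(llm_text: str, default: float = 0.7) -> float:
    """Extract an explicit confidence signal from LLM output, falling back
    to a default when no marker is present."""
    m = re.search(r"CONFIDENCE:\s*(HIGH|MEDIUM|LOW)", llm_text, re.IGNORECASE)
    return CONFIDENCE_LEVELS[m.group(1).upper()] if m else default
```

A parsed level like this is what the weighted scorer can consume as the per-criterion confidence term.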


has_not_met = any(e.status == "NOT_MET" for e in evidence)
has_unclear = any(e.status == "UNCLEAR" for e in evidence)
procedure_code = policy.procedure_codes[0] if policy.procedure_codes else "72148"

⚠️ Potential issue | 🟠 Major

Avoid hardcoded CPT fallback in Line 46.
Defaulting to "72148" can mislabel non‑MRI policies when procedure_codes is empty. Prefer "Unknown" or raise/propagate a policy validation error.

🛠️ Suggested fix
-    procedure_code = policy.procedure_codes[0] if policy.procedure_codes else "72148"
+    procedure_code = policy.procedure_codes[0] if policy.procedure_codes else "Unknown"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/intelligence/src/reasoning/form_generator.py` at line 46, Replace the
hardcoded CPT fallback "72148" used when procedure_codes is empty: in the
assignment to procedure_code (variable procedure_code, referencing
policy.procedure_codes) either default to a neutral sentinel like "Unknown"
(procedure_code = policy.procedure_codes[0] if policy.procedure_codes else
"Unknown") or raise/propagate a validation exception (e.g., raise
PolicyValidationError("missing procedure_codes") or ValueError) so callers can
handle invalid policies; if you choose to raise, update any callers of the form
generation path to catch/handle that exception.

@@ -0,0 +1,60 @@
"""Tests for PAFormResponse model update."""
import pytest

⚠️ Potential issue | 🟡 Minor

Unused import.

pytest is imported but not used in this test module.

🧹 Remove unused import
 """Tests for PAFormResponse model update."""
-import pytest
 from src.models.pa_form import PAFormResponse
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/intelligence/src/tests/test_pa_form_model.py` at line 2, The test module
test_pa_form_model contains an unused top-level import "import pytest"; remove
that import line (or if you intended to use pytest fixtures/assertions, update
the tests to reference pytest) so the module no longer has an unused import;
look for the "import pytest" statement at the top of test_pa_form_model and
delete it.

@@ -0,0 +1,75 @@
"""Tests for policy data models."""
import pytest

⚠️ Potential issue | 🟡 Minor

Unused import.

pytest is imported but not used in this test module.

🧹 Remove unused import
 """Tests for policy data models."""
-import pytest
 from src.models.policy import PolicyCriterion, PolicyDefinition
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/intelligence/src/tests/test_policy_model.py` at line 2, Remove the
unused top-level import "pytest" from the test module (the import statement that
reads "import pytest"), leaving only the necessary imports for tests in
test_policy_model.py; if you expect to use pytest fixtures later, instead add a
TODO comment or reintroduce the import when needed.

Comment on lines +46 to +57
def test_all_seed_cpts_resolve_to_lcd_policy():
"""All 14 seed CPT codes resolve to LCD-backed policies."""
seed_cpts = [
"72148", "72149", "72158", # MRI Lumbar
"70551", "70552", "70553", # MRI Brain
"27447", # TKA
"97161", "97162", "97163", # Physical Therapy
"62322", "62323", # Epidural Steroid
]
for cpt in seed_cpts:
result = registry.resolve(cpt)
assert result.lcd_reference is not None, f"CPT {cpt} should have LCD reference"

⚠️ Potential issue | 🟡 Minor

Docstring claims 14 CPTs, but list contains 12.

The docstring states "All 14 seed CPT codes" but the seed_cpts list only contains 12 codes. Either the docstring is stale or two CPTs are missing from the test coverage.

Proposed fix
 def test_all_seed_cpts_resolve_to_lcd_policy():
-    """All 14 seed CPT codes resolve to LCD-backed policies."""
+    """All 12 seed CPT codes resolve to LCD-backed policies."""
     seed_cpts = [
         "72148", "72149", "72158",  # MRI Lumbar
         "70551", "70552", "70553",  # MRI Brain
         "27447",                    # TKA
         "97161", "97162", "97163",  # Physical Therapy
         "62322", "62323",           # Epidural Steroid
     ]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/intelligence/src/tests/test_policy_registry.py` around lines 46 - 57,
The test test_all_seed_cpts_resolve_to_lcd_policy has a mismatched docstring
("All 14 seed CPT codes") versus the actual seed_cpts list (12 items); fix by
either updating the docstring to "All 12 seed CPT codes" or adding the two
missing CPT strings to the seed_cpts list so it truly contains 14 entries,
ensuring each CPT is validated via registry.resolve and the assert remains
unchanged.

GenerateIdAsync used a string OrderByDescending, which returned PA-DEMO-004
as the max ID. int.TryParse then failed on "DEMO-004", the counter reset to 1,
and a duplicate key PA-001 was generated. The fix filters to only PA-NNN IDs
and finds the true numeric max.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
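
A Python sketch of the corrected ID logic this commit describes (the actual fix is in C# in PostgresPARequestStore.cs; the function name here is illustrative):

```python
import re

def next_pa_id(existing_ids: list[str]) -> str:
    """Compute the next sequential PA-NNN id, ignoring non-conforming ids
    like 'PA-DEMO-004' that broke the old string-ordering approach."""
    pattern = re.compile(r"^PA-(\d+)$")
    # Keep only ids matching PA-NNN and take the true numeric max.
    numbers = [int(m.group(1)) for i in existing_ids if (m := pattern.match(i))]
    return f"PA-{max(numbers, default=0) + 1:03d}"
```

Sorting "PA-DEMO-004" above "PA-002" lexicographically is exactly the failure mode the regex filter avoids.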
@rsalus rsalus force-pushed the confidence-algorithm-v2 branch from d98ede4 to b687927 Compare February 22, 2026 04:13
@rsalus rsalus merged commit 552ce90 into main Feb 22, 2026
3 of 7 checks passed
@github-project-automation github-project-automation bot moved this from Todo to Done in Authscript Demo Feb 22, 2026