Clinical Notes Summarizer

Project Overview

A healthcare AI solution that transforms incomprehensible medical summaries into patient-friendly "fridge magnet" summaries using a hybrid structured + AI approach. This addresses the real problem where patients can't understand their own medical information.

🆕 NOW SUPPORTING BOTH FHIR R4 JSON AND CCDA XML with identical safety guarantees!

🏥 The Problem

Patients receive complex medical summaries filled with jargon
Critical medication information is buried in technical language
Healthcare literacy barriers prevent proper self-care
Patients struggle to follow treatment plans they can't understand

💡 The Solution

Hybrid Structured + AI Processing with Dual Format Support:

PRESERVE EXACTLY: Medications, dosages, lab values, appointments (zero AI modification)
AI-ENHANCE: Narrative explanations, procedure descriptions, care instructions
"Fridge Magnet" Format: Scannable, mobile-friendly, printable summaries
Dual Input: FHIR R4 JSON and CCDA XML processing with identical safety guarantees
Enhanced Security: XML security validation (XXE/DTD protection) for CCDA documents

🤖 Claude Code Sub-Agent Architecture

This project demonstrates advanced use of Claude Code specialized sub-agents for healthcare software development, showcasing role-based AI collaboration.

Agent Personas

We created specialized Claude Code sub-agents with healthcare domain expertise, defined in the /agents/ directory:

Healthcare Product Owner (`healthcare-product-owner`)

Based on agents/product-owner.md

# Specialized for clinical workflows and patient safety
Task(description="Define user stories", 
     prompt="Create user stories for medication summary feature",
     subagent_type="healthcare-product-owner")

Domain Expertise:

Clinical Workflows: EHR systems (EPIC, NextGen, Cerner) integration patterns
Patient Experience: Health literacy and communication clarity focus
Regulatory Knowledge: HIPAA, FDA guidelines, patient safety requirements
Data Standards: FHIR, HL7, clinical terminology (SNOMED, ICD-10)
Quality Metrics: Patient satisfaction and clinical outcome tracking

Key Deliverables:

User stories with healthcare-specific acceptance criteria
Feature prioritization framework (P0: Safety, P1: Core features, P2: Enhancement)
Clinical scenario validation and risk management strategies
EHR integration planning and stakeholder communication

Healthcare Software Engineer (`healthcare-swe`)

Based on agents/swe.md

# Specialized for medical software with TDD approach
Task(description="Implement TDD framework", 
     prompt="Build FHIR-compliant medication processor with safety validation",
     subagent_type="healthcare-swe")

Technical Expertise:

Core Technologies: Python 3.9+, FastAPI, Pydantic, Hugging Face Transformers
Healthcare Standards: FHIR R4, medical data processing, clinical terminology
AI/ML Integration: BART models for text summarization with medical accuracy
Safety Architecture: Fail-safe design, PHI protection, audit logging

Implementation Areas:

Hybrid Clinical Processor: Structured data preservation + AI narrative enhancement
FHIR Data Processing: R4 resource parsing with validation and error handling
Safety Testing Framework: Healthcare-specific test scenarios and validation
Security Patterns: Input sanitization, PHI protection, regulatory compliance

Code Quality Standards:

Healthcare-specific code review checklist (safety, accuracy, privacy)
Type safety with Pydantic medical data models
Comprehensive error handling for clinical processing failures
Documentation standards explaining medical decisions and safety rationale

Sub-Agent Collaboration Workflow

graph TD
    A[Healthcare Product Owner] -->|User Stories & Requirements| B[Healthcare SWE]
    B -->|Implementation & Tests| C[Clinical Validation]
    C -->|Feedback| A
    A -->|Refined Requirements| B
    B -->|Production Code| D[Healthcare Application]

🔬 Development Approach: Healthcare TDD

Test-Driven Development with Clinical Safety

Red: Write failing tests based on clinical requirements
Green: Implement minimal code to pass safety tests
Refactor: Improve code while maintaining healthcare compliance
Validate: Test with real clinical scenarios

Current Test Results

source venv/bin/activate
python3 -m pytest tests/ -v

# Results: 92/99 tests passing
# ✅ Medication safety tests: 100% pass rate
# ✅ Emergency protocols: Clinically validated
# ✅ FHIR compliance: R4 standard adherence

🏗️ Architecture

Hybrid Processing Engine

class HybridClinicalProcessor:
    """
    Core processor implementing hybrid structured + AI approach
    - Preserves exact: medications, labs, vitals, appointments
    - AI-enhances: narratives, explanations, instructions
    """
    
    def process_clinical_note(self, fhir_data: dict) -> PatientSummary:
        # Extract structured data (no AI processing)
        structured = self._extract_structured_data(fhir_data)
        
        # Extract narratives for AI enhancement
        narratives = self._extract_narrative_sections(fhir_data)
        
        # Apply AI only to narrative sections
        enhanced_narratives = self._enhance_narratives(narratives)
        
        # Combine for patient-friendly output
        return self._format_patient_summary(structured, enhanced_narratives)

Safety-First Data Models

class MedicationRequest(BaseModel):
    """FHIR-compliant medication with zero-tolerance validation"""
    medication_name: str = Field(..., min_length=1)
    dosage_amount: str = Field(..., regex=r'^[\d\.]+ \w+$')
    frequency: MedicationFrequency
    
    @validator('medication_name')
    def preserve_exact_name(cls, v):
        # Critical: medication names must never be modified
        return v.strip()

🧪 Clinical Test Scenarios

Our comprehensive test suite covers real-world healthcare scenarios:

Diabetes Management

def test_complex_insulin_regimen_processing():
    """Test realistic diabetes medication regimen"""
    # Basal-bolus insulin with carb ratios and sliding scale
    # Validates: exact dosing preservation, timing instructions

Emergency Protocols

def test_epipen_adult_emergency_protocol():
    """Test anaphylaxis emergency protocol"""
    # EpiPen 0.3mg administration, call 911, follow-up care
    # Validates: life-critical instruction preservation

Medication Interactions

def test_warfarin_antibiotic_interaction_scenario():
    """Test dangerous drug interaction detection"""
    # Warfarin + azithromycin bleeding risk
    # Validates: safety alert generation, clinical decision support

📊 Success Metrics

Patient Safety (Zero Tolerance)

Medication Errors: 0 incidents of incorrect dosage/frequency/name
Lab Value Errors: 0 incidents of incorrect numbers/units/ranges
Emergency Protocol Errors: 0 incidents of incorrect instructions

Clinical Accuracy (>90% Target)

Data Preservation: >95% of structured data fields preserved exactly
Medical Term Accuracy: >90% of terminology correctly explained
Treatment Plan Integrity: >90% of care instructions remain actionable

Patient Comprehension (50% Improvement)

Reading Grade Level: Reduce from 12+ to 6-8 grade level
Medical Jargon: <10% unexplained medical terms in output
Scan-ability: Key actions identifiable in <30 seconds

🚀 Getting Started

Quick Start - Docker Deployment (Recommended)

# Clone repository
git clone <repository-url>
cd clinical-notes-summarizer

# Production deployment with Docker
docker-compose up -d

# Verify deployment
curl -f http://localhost:8000/api/v1/health/startup

# Access API documentation
open http://localhost:8000/docs

Development Setup

# Python 3.9+ required
python3 --version

# CRITICAL: Always use virtual environment (see CLAUDE.md)
python3 -m venv venv
source venv/bin/activate  # macOS/Linux
# or
venv\Scripts\activate     # Windows

# Install dependencies (with venv activated)
pip3 install -r requirements.txt

# Run comprehensive test suite
python3 -m pytest tests/ -v

# Expected: 90+ tests passing with healthcare scenarios

Docker Options

# Development environment with hot reload
docker-compose -f docker-compose.yml -f docker-compose.dev.yml up -d

# Production with monitoring
docker-compose -f docker-compose.yml up -d

# Security scan before deployment
./docker/security/security-scan.sh

Usage Example

from src.summarizer.hybrid_processor import HybridClinicalProcessor
from src.summarizer.fhir_parser import FHIRMedicationParser

# Initialize processor
processor = HybridClinicalProcessor()
parser = FHIRMedicationParser()

# Process FHIR clinical document
fhir_document = {...}  # FHIR R4 Bundle
clinical_data = parser.parse_fhir_document(fhir_document)
summary = processor.process_clinical_data(clinical_data)

# Generate patient-friendly "fridge magnet" summary
print(summary.format_for_patient())

🎯 Claude Code Sub-Agent Demonstration

This project showcases advanced Claude Code capabilities:

1. Role-Specific Expertise

Domain Knowledge: Healthcare-specific agents with clinical expertise
Technical Specialization: Medical software development patterns
Compliance Awareness: HIPAA, FDA guidelines, patient safety requirements

2. Collaborative Development

Requirements Definition: Product Owner creates clinically-validated user stories
Implementation: SWE agent builds with healthcare TDD approach
Quality Assurance: Integrated testing with clinical scenario validation

3. Safety-Critical Development

Zero-Tolerance Testing: Medical data preservation requirements
Regulatory Compliance: Healthcare-specific validation patterns
Clinical Authenticity: Real-world medical scenarios and protocols

4. Agent Orchestration Example

# 1. Product Owner defines requirements
healthcare_po_result = Task(
    description="Define medication safety requirements",
    prompt="Create acceptance criteria for zero-tolerance medication preservation",
    subagent_type="healthcare-product-owner"
)

# 2. SWE implements with TDD
healthcare_swe_result = Task(
    description="Implement medication processor", 
    prompt="Build FHIR medication parser with exact data preservation",
    subagent_type="healthcare-swe"
)

# 3. Product Owner validates clinical authenticity
validation_result = Task(
    description="Validate clinical scenarios",
    prompt="Review test scenarios for clinical realism and practical applicability", 
    subagent_type="healthcare-product-owner"
)

🏥 EHR Integration

Designed for seamless integration with major Electronic Health Record systems:

EPIC MyChart: FHIR R4 compatibility for patient portal integration
NextGen: Clinical workflow optimization for provider efficiency
Cerner PowerChart: Structured data exchange patterns
Generic REST API: Universal integration capability

🔒 Privacy & Security

No PHI Storage: Secure processing without data retention
HIPAA Compliance: Privacy by design architecture
Container Security: Non-root execution, read-only filesystem, vulnerability scanning
PHI Protection: Multi-layer filtering and sanitization in logs and processing
Audit Trails: Complete processing metadata without patient data
Input Validation: Comprehensive sanitization and error handling
Healthcare Compliance: Automated security scanning with healthcare-specific checks

📋 Development Status

✅ Completed (Phase 1)

Hybrid clinical processor with structured data extraction
FHIR R4 medication parser with exact preservation
Comprehensive TDD test suite (92+ passing tests)
Healthcare-specific safety validation framework
Clinical scenario testing (diabetes, cardiac, emergency protocols)
AI narrative enhancement module (BART integration)
FastAPI endpoints with FHIR compatibility
Patient-friendly output formatter ("fridge magnet" format)

✅ Completed (Phase 2)

Docker deployment with comprehensive health checks
Healthcare-compliant container orchestration (Docker Compose)
Production security scanning and compliance validation
Comprehensive logging with PHI protection (Fluent Bit)
Multi-environment deployment support (dev/staging/production)

🚧 In Progress (Phase 3)

Web interface for testing and demonstration

📋 Planned (Phase 4)

Advanced drug interaction detection
Multi-language patient summary support
EHR integration examples and documentation

🔮 Future Considerations (Patient Comprehension Validation)

Based on Healthcare Product Owner validation of MVP formatter implementation

Patient Experience Enhancement (High Priority)

Readability Scoring System: Implement automated readability assessment for AI-enhanced content sections to ensure appropriate health literacy levels
Clinical Review Workflow: Create mandatory clinical review process to validate formatter output before patient delivery
Patient Comprehension Testing Framework: Develop measurable criteria and testing protocols to validate actual patient understanding

Real-World Validation (Medium Priority)

Patient Focus Groups: Conduct structured testing with diverse health literacy levels to validate comprehension effectiveness
Provider Training Materials: Create comprehensive training resources for clinical workflow integration and optimal usage patterns
Patient Outcome Measurement: Design systematic tracking of comprehension effectiveness and health outcomes improvement

Clinical Integration Considerations

EHR Workflow Integration: Seamless embedding into existing clinical documentation and patient communication workflows
Provider Adoption Support: Training protocols and best practices for healthcare teams implementing patient-friendly summaries
Quality Assurance Metrics: Ongoing monitoring of patient comprehension rates and clinical effectiveness

Accessibility & Inclusion

Diverse Population Testing: Validation across different cultural backgrounds, languages, and health literacy levels
Age-Specific Adaptations: Specialized formatting for pediatric, adult, and geriatric patient populations
Technology Access Considerations: Ensure functionality across various devices and technical capabilities

Note: These recommendations ensure the MVP formatter evolves into a clinically-validated solution that demonstrably improves patient health outcomes through better comprehension of medical information.

🐳 Production Deployment

Healthcare-Compliant Docker Deployment

Full production deployment with monitoring, security scanning, and healthcare compliance:

Comprehensive Documentation: See DOCKER_DEPLOYMENT.md for complete deployment guide
Security Validation: Automated vulnerability scanning with healthcare-specific checks
Multi-Environment Support: Development, staging, and production configurations
Health Monitoring: Kubernetes-ready health check endpoints
Compliance Logging: PHI-protected audit trails with Fluent Bit integration
Resource Management: Proper CPU/memory limits for healthcare workloads

Quick Production Start

# Security scan and deploy
./docker/security/security-scan.sh
docker-compose up -d

# Monitor health
curl http://localhost:8000/api/v1/health

# Test CCDA processing
curl -X POST http://localhost:8000/api/v1/ccda/validate \
  -F "ccda_file=@sample_ccda.xml"

curl -X POST http://localhost:8000/api/v1/ccda/summarize \
  -F "ccda_file=@sample_ccda.xml"

🆕 CCDA Document Processing

Supported CCDA Features

Security Validation: XXE/DTD attack protection
Standard Compliance: CCDA R2.1 template support
Section Processing: Medications, labs, vitals, allergies
Data Preservation: Same zero-tolerance guarantees as FHIR
Format Detection: Automatic routing for XML vs JSON input

CCDA API Endpoints

# Validate CCDA document structure and security
POST /api/v1/ccda/validate

# Generate patient-friendly summary from CCDA
POST /api/v1/ccda/summarize

# Get HTML "fridge magnet" format
POST /api/v1/ccda/summarize/html

# CCDA service health check
GET /api/v1/ccda/health

🤝 Contributing

This project demonstrates best practices for healthcare AI development using Claude Code sub-agents. Contributions should maintain:

Clinical Authenticity: All scenarios reviewed by healthcare professionals
Patient Safety: Zero tolerance for medical information errors
Regulatory Compliance: HIPAA, FDA guidelines adherence
Test Coverage: Comprehensive healthcare-specific testing
Security Standards: Container security and PHI protection requirements

📄 License

MIT License - Released for maximum healthcare benefit and community collaboration.

🎯 Key Takeaways for Claude Code Users

This repository demonstrates:

Advanced Sub-Agent Usage: Role-specific healthcare expertise with specialized agents
Domain-Specific Development: Medical software with clinical validation and safety-first architecture
Safety-Critical Systems: Zero-tolerance testing and validation for healthcare environments
Collaborative AI Development: Product Owner + SWE agent workflow with healthcare domain expertise
Production-Ready Deployment: Healthcare-compliant Docker containerization with security scanning
Real-World Application: Complete healthcare software lifecycle from TDD to production deployment

Perfect for learning: Healthcare AI, FHIR integration, TDD with domain experts, Docker deployment, healthcare security compliance, and advanced Claude Code sub-agent orchestration.

Healthcare lives depend on accuracy. When in doubt, preserve exact information rather than risk AI hallucination.

Name		Name	Last commit message	Last commit date
Latest commit History 22 Commits
.claude/agents		.claude/agents
agents		agents
docker		docker
docs		docs
src		src
tests		tests
.gitignore		.gitignore
DOCKER_DEPLOYMENT.md		DOCKER_DEPLOYMENT.md
Dockerfile		Dockerfile
QUICKSTART.md		QUICKSTART.md
README.md		README.md
claude.md		claude.md
demo_translation_architecture.py		demo_translation_architecture.py
docker-compose.yml		docker-compose.yml
eb1e7359de4bb1a6543227d125406046d46a48e678a2f057a8f78807f9f87307.png		eb1e7359de4bb1a6543227d125406046d46a48e678a2f057a8f78807f9f87307.png
pytest.ini		pytest.ini
requirements.txt		requirements.txt
run_api.py		run_api.py
sample_ccda.xml		sample_ccda.xml
sample_mockup.png		sample_mockup.png
test_translation_demo.py		test_translation_demo.py

codekunoichi/clinical-notes-summarizer

Folders and files

Latest commit

History

Repository files navigation