**Click here to see the live demo**
A small web app that turns a short primary-care consultation transcript into:
- SOAP note (Subjective, Objective, Assessment, Plan)
- Problem list with brief rationales + up to 3 ICD-10 suggestions
- Billing hint (likely E/M level or short CPT list)
- Compliance banner + simple guardrails (emergency keywords → require user acknowledgement)
- Trace tab showing prompts and decision log

Tech stack:
- Frontend: React + Vite + TypeScript, Tailwind CSS, Radix UI components
- Backend: Node.js + Express.js, Zod for validation
- AI: OpenAI API (server-side only, using structured outputs)
- Storage: Stateless (no persistent storage, all processing in-memory)

Time spent: ~14 hours for the initial implementation, plus later improvements.

- Monorepo structure: npm workspaces manage the frontend and backend in a single repo for easier development and deployment.
- Type safety: TypeScript in the frontend, JavaScript in the backend (with JSDoc comments). This balance was chosen for faster backend iteration while keeping frontend type safety.
- API design: RESTful API with clear separation between the formatting and analysis endpoints. All AI calls happen server-side to protect API keys.
- Validation: Zod schemas validate both incoming requests and parsed LLM responses, which keeps types consistent and catches malformed responses early.
- Structured outputs: OpenAI's structured outputs feature is used with Zod schemas to get consistent JSON responses. This reduces parsing errors and improves reliability.
- Prompt engineering:
  - System prompt emphasizes cautious language and emergency detection
  - User prompt keeps the transcript in a triple-quoted block for clarity
  - Model is instructed to return ONLY JSON per the schema
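
The prompt layout described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual prompt text: the names `SYSTEM_PROMPT` and `buildUserPrompt` and the exact wording are assumptions. The only behavior taken from the description is that the transcript sits inside a triple-quoted block and the model is told to return JSON only.

```typescript
// Hypothetical system prompt; the real wording lives in server/prompts/ and may differ.
const SYSTEM_PROMPT: string = [
  "You are a clinical documentation assistant producing DRAFT notes.",
  "Use cautious, non-definitive language.",
  "Explicitly flag anything that could indicate an emergency.",
  "Return ONLY JSON that matches the provided schema.",
].join(" ");

// Wraps the transcript in a triple-quoted block so the model can tell
// instructions apart from transcript content.
function buildUserPrompt(transcript: string): string {
  // Replace any triple quotes inside the transcript so they cannot
  // close the delimiter block early.
  const safe = transcript.replace(/"""/g, "'''").trim();
  return `Transcript:\n"""\n${safe}\n"""`;
}

console.log(SYSTEM_PROMPT);
console.log(buildUserPrompt('Doctor: Any allergies?\nPatient: None that I know of.'));
```

Neutralizing embedded triple quotes is an extra precaution added in this sketch; it keeps a pasted transcript from breaking out of its delimiter.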
- Retry logic: Single retry on LLM failures to handle transient errors without excessive API costs.
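
A single-retry wrapper along these lines would match the behavior described. It is a sketch under assumptions: `withSingleRetry` and the default 500 ms pause are hypothetical, and the real service may decide differently which errors are worth retrying.

```typescript
// Runs `call` once; if it rejects, waits briefly and tries exactly one more time.
// A second failure propagates to the caller unchanged.
async function withSingleRetry<T>(
  call: () => Promise<T>,
  delayMs: number = 500, // assumed pause between attempts
): Promise<T> {
  try {
    return await call();
  } catch {
    await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    return call();
  }
}

// Usage with a stand-in for the LLM request (hypothetical function).
let attempts = 0;
async function flakyLlmCall(): Promise<string> {
  attempts += 1;
  if (attempts === 1) throw new Error("transient failure");
  return '{"ok": true}';
}

withSingleRetry(flakyLlmCall, 10).then((body) => console.log(attempts, body));
```

Capping the attempts at two bounds the worst-case cost per request to two API calls.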
- Guardrails:
  - Pre-scan transcript for emergency keywords before the LLM call
  - Post-process LLM output to soften definitive claims ("will cure" → "may help")
  - Require user acknowledgement for high-risk transcripts
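
The claim-softening step could look something like the sketch below. Only the "will cure" → "may help" rule comes from the description above; the second rule, the `softenClaims` name, and the idea of returning a change list (for example, to feed a decision log) are assumptions for illustration.

```typescript
// Each rule maps a definitive phrase to a more cautious one.
const SOFTENING_RULES: Array<[RegExp, string]> = [
  [/\bwill cure\b/gi, "may help"], // from the README
  [/\bdefinitely\b/gi, "possibly"], // hypothetical extra rule
];

interface SoftenResult {
  text: string;
  changes: string[]; // human-readable record of each replacement
}

function softenClaims(input: string): SoftenResult {
  const changes: string[] = [];
  let text = input;
  for (const [pattern, replacement] of SOFTENING_RULES) {
    text = text.replace(pattern, (match: string) => {
      changes.push(`"${match}" -> "${replacement}"`);
      return replacement;
    });
  }
  return { text, changes };
}

console.log(softenClaims("This antibiotic will cure the infection."));
```

Phrase-level replacement is deliberately simple; it does not fix grammar around the replaced phrase, so the rules need to be chosen so the substitution reads naturally.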
- ICD-10 codes: Model extracts up to 3 codes per problem, ranked by relevance. Confidence levels (low/med/high) help clinicians assess suggestions.
- E/M levels: Model suggests a likely E/M level (99212-99215) based on visit complexity inferred from the transcript.
- CPT codes: Up to 3 CPT codes suggested with one-line justifications. Model uses clinical context to infer appropriate procedure codes.
- Problem extraction: Model identifies problems from the transcript and provides brief rationales (1-2 lines) linking symptoms to diagnoses.
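
To make the problem-list shape concrete, here is one possible representation plus a small normalizer. The field names, `normalizeProblem`, and the format regex are assumptions, not the project's schema. The regex only checks the general ICD-10-CM code shape (a letter, a digit, an alphanumeric, then an optional dot and up to four alphanumerics); it does not confirm that a code actually exists.

```typescript
type Confidence = "low" | "med" | "high";

interface IcdSuggestion {
  code: string;
  description: string;
  confidence: Confidence;
}

interface Problem {
  name: string;
  rationale: string; // 1-2 lines linking symptoms to the diagnosis
  icd10: IcdSuggestion[]; // assumed to arrive ranked by relevance
}

const MAX_CODES_PER_PROBLEM = 3;
const ICD10_SHAPE = /^[A-Z][0-9][0-9A-Z](\.[0-9A-Z]{1,4})?$/;

// Drops suggestions whose code is not ICD-10-shaped, then keeps at most
// three, preserving the model's relevance order.
function normalizeProblem(problem: Problem): Problem {
  const valid = problem.icd10
    .map((s) => ({ ...s, code: s.code.trim().toUpperCase() }))
    .filter((s) => ICD10_SHAPE.test(s.code));
  return { ...problem, icd10: valid.slice(0, MAX_CODES_PER_PROBLEM) };
}

const example: Problem = {
  name: "Sore throat",
  rationale: "Three days of throat pain with fever.",
  icd10: [
    { code: "j02.9", description: "Acute pharyngitis, unspecified", confidence: "high" },
    { code: "not-a-code", description: "Malformed entry", confidence: "low" },
  ],
};
console.log(normalizeProblem(example).icd10.map((s) => s.code));
```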
- API key protection: All OpenAI API calls happen server-side. No keys are exposed to the browser.
- PII handling: Transcripts are processed in-memory only. No persistent storage of PHI. The application is fully stateless.
- Compliance banner: Always displayed, dynamically styled based on risk level. Text emphasizes "draft only" and "not a medical device."
- Emergency detection: Keyword-based scanning for common emergencies (chest pain, suicidal ideation, etc.). Requires explicit user acknowledgement before showing results.
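
A keyword pre-scan with an acknowledgement gate might be sketched like this. The keyword list is illustrative: only "chest pain" and "suicidal ideation" are named above, and the rest, along with the function names, are assumptions.

```typescript
// Illustrative list; the project's real list may be longer or different.
const EMERGENCY_KEYWORDS: string[] = [
  "chest pain",
  "suicidal ideation",
  "shortness of breath", // hypothetical addition
  "overdose", // hypothetical addition
];

// Returns every keyword found in the transcript (case-insensitive).
function scanForEmergencies(transcript: string): string[] {
  const haystack = transcript.toLowerCase();
  return EMERGENCY_KEYWORDS.filter((keyword) => haystack.includes(keyword));
}

// Results stay hidden until the user acknowledges a flagged transcript.
function canShowResults(matches: string[], acknowledged: boolean): boolean {
  return matches.length === 0 || acknowledged;
}

const matches = scanForEmergencies("Patient reports Chest Pain since this morning.");
console.log(matches, canShowResults(matches, false));
```

Plain substring matching errs on the side of caution: a negated mention such as "denies chest pain" is still flagged, which costs an extra acknowledgement click but avoids missing a real emergency.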
- Single-page app: All functionality on one page with tabbed results for better flow.
- Transcript formatting: Optional pre-processing step to clean rough transcripts (remove timestamps, merge broken lines) before analysis.
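
The cleanup step could be approximated with plain string handling, as in the sketch below. It assumes timestamps appear at the start of a line (for example `[00:05]` or `00:05:12`) and that speaker turns begin with a `Name:` label. Both are assumptions about the input, and `formatTranscript` is a hypothetical name; the project's formatting endpoint may work differently.

```typescript
// Leading timestamp such as "[00:05] ", "(1:23:45) " or "00:05 ".
const LEADING_TIMESTAMP = /^\s*[\[(]?\d{1,2}:\d{2}(?::\d{2})?[\])]?\s*/;
// A line that starts a new speaker turn, e.g. "Doctor:" or "Dr. Smith:".
const SPEAKER_LABEL = /^[A-Za-z][A-Za-z .]{0,30}:/;

function formatTranscript(raw: string): string {
  const lines = raw
    .split(/\r?\n/)
    .map((line) => line.replace(LEADING_TIMESTAMP, "").trim())
    .filter((line) => line.length > 0);

  const merged: string[] = [];
  for (const line of lines) {
    if (merged.length > 0 && !SPEAKER_LABEL.test(line)) {
      // No speaker label: treat as a continuation of the previous turn.
      merged[merged.length - 1] += " " + line;
    } else {
      merged.push(line);
    }
  }
  return merged.join("\n");
}

const rough = [
  "[00:01] Doctor: What brings you in",
  "today?",
  "[00:05] Patient: I have had a sore throat",
  "for three days.",
].join("\n");
console.log(formatTranscript(rough));
```

Only leading timestamps are stripped, so a clinically relevant time mentioned mid-sentence is left alone.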
- Copy & export: Each section has a copy button; full JSON export is available for integration with other systems.
- Trace tab: Shows exact prompts (with redacted secrets) and decision log for transparency and debugging.
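
Redacting secrets before text reaches the trace could be done roughly as follows. This is a heuristic sketch: `redactSecrets` is a hypothetical helper, and the pattern relies on OpenAI-style keys starting with `sk-`. Passing known secret values explicitly is the more dependable half of the approach.

```typescript
const REDACTED = "[REDACTED]";

// Removes explicitly known secrets first, then anything that looks like
// an "sk-" style API key.
function redactSecrets(text: string, knownSecrets: string[] = []): string {
  let out = text;
  for (const secret of knownSecrets) {
    if (secret.length > 0) {
      out = out.split(secret).join(REDACTED);
    }
  }
  return out.replace(/\bsk-[A-Za-z0-9_-]{16,}/g, REDACTED);
}

console.log(redactSecrets("Authorization: Bearer sk-abcdefghijklmnopqrstuvwx"));
```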
- Confidence badges: Visual indicators (green/yellow/red) for model confidence levels help clinicians assess suggestion quality.
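
One way to drive the badges is a lookup from confidence level to label and Tailwind classes. The mapping direction (high = green, med = yellow, low = red) and the specific class names are assumptions; the README only states that green, yellow, and red are used.

```typescript
type ConfidenceLevel = "low" | "med" | "high";

interface BadgeStyle {
  label: string;
  className: string; // Tailwind utility classes
}

// Assumed mapping: higher confidence reads as "safer" (green).
const BADGE_STYLES: Record<ConfidenceLevel, BadgeStyle> = {
  high: { label: "High", className: "bg-green-100 text-green-800" },
  med: { label: "Medium", className: "bg-yellow-100 text-yellow-800" },
  low: { label: "Low", className: "bg-red-100 text-red-800" },
};

function badgeFor(level: ConfidenceLevel): BadgeStyle {
  return BADGE_STYLES[level];
}

console.log(badgeFor("med"));
```

Typing the table as `Record<ConfidenceLevel, BadgeStyle>` makes the compiler reject a missing level if the set of confidence values ever changes.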
- Time limit: Focused on core requirements over stretch goals (RAG, model comparison, PHI scrubbing).
- Storage: Application is stateless (no database). All processing happens in-memory. Suitable for serverless deployments like Vercel.
- Error handling: Basic error handling with user-friendly messages. More robust error states could be added.
- Testing: No unit tests yet. Would add tests for guardrails, validation, and parsing logic in production.
- Coding accuracy: Relies on the LLM for coding suggestions. In production, would integrate with official coding databases and add validation rules.

Prerequisites:
- Node.js v22.1.0 (see `.nvmrc`)
- OpenAI API key

Setup:
- Clone the repository.
- Install dependencies: `npm install`
- Set up environment variables: copy `server/.env.example` to `server/.env` (`cp server/.env.example server/.env`), then add your `OPENAI_API_KEY` to `server/.env`.
- Run the application: `npm run dev`
  This starts both the frontend (typically http://localhost:5173) and the backend (typically http://localhost:3000) concurrently.

| name | commands |
|---|---|
| install dependencies | npm install |
| run locally | npm run dev |
| run backend only | npm run dev -w server |
| run frontend only | npm run dev -w frontend |

.
├── frontend/              # React + Vite + TypeScript
│   ├── src/
│   │   ├── app/           # Main App component
│   │   ├── components/    # UI components
│   │   ├── hooks/         # Custom React hooks
│   │   ├── types/         # TypeScript types
│   │   └── utils/         # Utility functions
│   └── public/            # Sample transcripts
├── server/                # Express.js backend
│   ├── config/            # Configuration (OpenAI, app settings)
│   ├── controllers/       # Request handlers
│   ├── services/          # Business logic (LLM, guardrails)
│   ├── prompts/           # LLM prompt templates
│   ├── schemas/           # Zod validation schemas
│   ├── routes/            # API routes
│   └── utils/             # Utility functions
├── README.md              # This file
└── Research.md            # Research sources and takeaways

- Transcript input (paste or upload .txt file)
- Transcript formatting/cleaning
- SOAP note generation
- Problem list with ICD-10 codes (up to 3)
- Billing hints (E/M level + CPT codes)
- Compliance banner (dynamic based on risk)
- Emergency keyword detection & acknowledgement
- Claims softening (definitive → cautious language)
- Trace tab (prompts + decision log)
- Copy buttons for each section
- JSON export
- Confidence level display (low/med/high badges)

Not yet implemented:
- Session/history persistence (would require an external database or storage service)
- Unit tests (especially for guardrails and validation)
- RAG with an ICD-10 CSV for better coding suggestions
- Model comparison feature
- PHI scrubbing with a mapping table
- More robust error states and retry UI

Assumptions and limitations:
- Transcript format: Assumes transcripts are text-based and may contain timestamps or rough formatting. The formatting step handles cleanup.
- Primary care focus: Optimized for primary care consultations. May need adjustments for specialty care.
- Short transcripts: Designed for transcripts ≤ 5 KB. Longer transcripts may need chunking.
- English only: Currently supports English transcripts only.
- Single consultation: Each analysis is independent. No multi-visit tracking.
- Clinician review: All outputs are drafts requiring clinician review. Not intended for direct billing use.

License: ISC