Epoch 'type' selection added with audit#34
Conversation
There was a problem hiding this comment.
Pull request overview
This PR adds support for Epoch type selection in the UI by integrating with the CDISC Library API (SDTM Controlled Terminology codelist C99079). The implementation includes database schema changes, API integration, audit trail enhancements, and UI components for selecting and displaying epoch types.
Key changes:
- Added
typecolumn to theepochtable to store selected epoch type via code_uid reference - Integrated CDISC Library API to fetch epoch type options (C99079 codelist) with caching
- Extended epoch audit functionality to track type changes in before/after snapshots
- Added UI components for epoch type selection and audit trail viewers
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/test_epoch_type_options.py | Unit tests for epoch type options API fetching, caching, and error handling |
| tests/test_epoch_type_audit.py | Integration tests verifying epoch type persistence in audit trail for create/update/delete operations |
| src/soa_builder/web/utils.py | API client functions for fetching epoch type options and mappings from CDISC Library with TTL caching |
| src/soa_builder/web/templates/edit.html | UI dropdown components for epoch type selection in add/edit forms and audit trail viewers |
| src/soa_builder/web/routers/epochs.py | Updated epoch router handlers to include type in audit snapshots |
| src/soa_builder/web/migrate_database.py | Database migration to add optional type column to epoch table |
| src/soa_builder/web/app.py | Core application logic for epoch type handling, code table persistence, and UI rendering with enriched epoch data |
Comments suppressed due to low confidence (1)
src/soa_builder/web/app.py:1
- Deeply nested lambda expressions make this code difficult to read and maintain. Consider extracting this logic into a helper function that fetches epoch types for the audit snapshot.
from __future__ import annotations
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| <table style="width:100%;font-size:0.75em;border-collapse:collapse;"> | ||
| <tr style="background:#eee;"> | ||
| <th style="text-align:left;padding:4px;">ID</th> | ||
| <th style="text-align:left;padding:4px;">Arm</th> |
There was a problem hiding this comment.
The column header 'Arm' is misleading for epoch audit entries. This should be 'Epoch' since au.arm_id is actually au.epoch_id for epoch_audits. The variable name in the template context appears to be inconsistent with the column header.
| {% for au in epoch_audits %} | ||
| <tr> | ||
| <td style="padding:4px;">{{ au.id }}</td> | ||
| <td style="padding:4px;">{{ au.arm_id }}</td> |
There was a problem hiding this comment.
The column header 'Arm' is misleading for epoch audit entries. This should be 'Epoch' since au.arm_id is actually au.epoch_id for epoch_audits. The variable name in the template context appears to be inconsistent with the column header.
| {% for au in activity_audits %} | ||
| <tr> | ||
| <td style="padding:4px;">{{ au.id }}</td> | ||
| <td style="padding:4px;">{{ au.arm_id }}</td> |
There was a problem hiding this comment.
The variable name au.arm_id is misleading in the activity audit context. Based on the SQL query in app.py line 3158, this should be au.activity_id to match the column being selected from the activity_audit table.
| <td style="padding:4px;">{{ au.arm_id }}</td> | |
| <td style="padding:4px;">{{ au.activity_id }}</td> |
src/soa_builder/web/app.py
Outdated
| activity_audits = [ | ||
| { | ||
| "id": r[0], | ||
| "arm_id": r[1], |
There was a problem hiding this comment.
Incorrect key name for activity audit. The column being selected is activity_id (line 3158), but it's being stored with key 'arm_id'. This should be 'activity_id' to match the database schema.
| "arm_id": r[1], | |
| "activity_id": r[1], |
| epoch_audits = [ | ||
| { | ||
| "id": r[0], | ||
| "epoch_id": r[1], |
There was a problem hiding this comment.
Inconsistent key name in epoch audit. While the database column is correctly epoch_id, the template references this as au.arm_id (line 351 in edit.html), creating a mismatch. Either change this to 'arm_id' or update the template reference.
| "epoch_id": r[1], | |
| "arm_id": r[1], |
src/soa_builder/web/app.py
Outdated
| "types": ( | ||
| lambda: ( | ||
| (lambda rows: [{"id": rid, "type": rtype} for rid, rtype in rows])( | ||
| ( | ||
| lambda conn: ( | ||
| lambda cur: ( | ||
| cur.execute( | ||
| "SELECT id,type FROM epoch WHERE soa_id=? ORDER BY order_index", | ||
| (soa_id,), | ||
| ), | ||
| cur.fetchall(), | ||
| conn.close(), | ||
| )[1] | ||
| )(conn := _connect()) | ||
| ) | ||
| ) | ||
| ) | ||
| )(), |
There was a problem hiding this comment.
Duplicated complex lambda expression from lines 4249-4262. This exact pattern appears twice. Extract into a reusable helper function to reduce duplication and improve maintainability.
| ) | ||
| return HTMLResponse(f"<script>window.location='/ui/soa/{soa_id}/edit';</script>") | ||
| return HTMLResponse( | ||
| f"<script>window.location='/ui/soa/{int(soa_id)}/edit';</script>" |
There was a problem hiding this comment.
Potential XSS vulnerability. While soa_id is expected to be an integer, the string formatting could be exploited if validation fails elsewhere. Use parameterized redirects or proper HTML escaping for all user-controlled values in HTML responses.
| escaped_selection = json.dumps(data_origin_type_submission) | ||
| return HTMLResponse( | ||
| f"<script>alert({escaped_selection});window.location='/ui/soa/{soa_id}/edit';</script>", | ||
| f"<script>alert({escaped_selection});window.location='/ui/soa/{int(soa_id)}/edit';</script>", |
Check warning
Code scanning / CodeQL
Reflected server-side cross-site scripting Medium
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix
AI 3 months ago
To properly fix this vulnerability, the user-provided value reflected in the HTML response (specifically inside the <script>alert(...)</script>) needs to be robustly escaped. FastAPI includes an escape() function via the Flask interface, but in this case, since the input is ultimately used in JavaScript context, you must both:
- Make sure that the user input is safely encoded for JS (as a string literal—this is done by
json.dumps()), - Ensure that the output is also safely embedded in the HTML context. FastAPI's
HTMLResponsedoes not auto-escape payloads; it's better to use a templating engine (like Jinja2, already imported) or at minimum, explicitly escape the input for HTML.
Since the code in question simply returns a snippet of JavaScript that alerts the user input, a good solution is to:
- Use Jinja2's escaping by rendering with a template (using
templates.TemplateResponse) instead of raw string interpolation, - Or, for minimal code changes, use Python's
html.escape()to escape any special HTML characters, in addition to retainingjson.dumpsfor JS safety.
While double-escaping could over-escape, the approach below will sufficiently mitigate XSS. We recommend using html.escape() plus json.dumps() for the inserted value.
Edits are needed in:
src/soa_builder/web/app.py: Importhtml.escapeand use it to escapedata_origin_type_submissionprior to returning in the script block, at line 3753 and 3755.
| @@ -27,6 +27,7 @@ | ||
| from contextlib import asynccontextmanager | ||
| from datetime import datetime, timezone | ||
| from typing import Any, List, Optional | ||
| import html | ||
|
|
||
| import pandas as pd | ||
| import requests | ||
| @@ -3750,9 +3751,10 @@ | ||
| ) | ||
| conn.close() | ||
| # Properly escape the value for safety in HTML/JS context | ||
| escaped_selection = json.dumps(data_origin_type_submission) | ||
| # Escape user input for HTML and JS context before inserting into the page | ||
| safe_selection = json.dumps(html.escape(data_origin_type_submission)) | ||
| return HTMLResponse( | ||
| f"<script>alert({escaped_selection});window.location='/ui/soa/{int(soa_id)}/edit';</script>", | ||
| f"<script>alert({safe_selection});window.location='/ui/soa/{int(soa_id)}/edit';</script>", | ||
| status_code=400, | ||
| ) | ||
| # Create Code_N (continue numbering) |
| conn.close() | ||
| return HTMLResponse( | ||
| f"<script>alert({json.dumps('Unknown Arm Type selection: ' + arm_type_submission)});window.location='/ui/soa/{soa_id}/edit';</script>", | ||
| f"<script>alert({json.dumps('Unknown Arm Type selection: ' + arm_type_submission)});window.location='/ui/soa/{int(soa_id)}/edit';</script>", |
Check warning
Code scanning / CodeQL
Reflected server-side cross-site scripting Medium
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix
AI 3 months ago
The best fix is to escape the user-controlled data before embedding it into any HTML or JavaScript context. Since the offending value is interpolated into a JavaScript alert inside an HTML response, we must ensure it cannot break out of its intended context. The minimal sufficient fix is to use html.escape() (or flask.escape()) before embedding in the HTML/template. However, the value is used as the single argument to alert(), so a robust approach is to always use json.dumps (for JS-string escaping) and then to avoid direct HTML injection by also making sure the value can't inject anything into the surrounding HTML. Alternatively, for full safety, return an error page via a templated response (which autoescapes) rather than hand-crafting the error in a <script> block.
For this codebase, the most straightforward fix (without changing functionality or flow) is to escape arm_type_submission for HTML using html.escape() before interpolating. Also, ensure the error message is encoded as a JS string properly. This requires importing html (a standard library module in Python >=3.2).
Only change the line where the interpolated response is constructed (3882) and add the import html statement if not yet present.
| @@ -27,7 +27,7 @@ | ||
| from contextlib import asynccontextmanager | ||
| from datetime import datetime, timezone | ||
| from typing import Any, List, Optional | ||
|
|
||
| import html | ||
| import pandas as pd | ||
| import requests | ||
| from dotenv import load_dotenv | ||
| @@ -3878,8 +3878,10 @@ | ||
| arm_id, | ||
| ) | ||
| conn.close() | ||
| # Escape the user input before using in JS context | ||
| escaped_arm_type = html.escape(arm_type_submission, quote=True) | ||
| return HTMLResponse( | ||
| f"<script>alert({json.dumps('Unknown Arm Type selection: ' + arm_type_submission)});window.location='/ui/soa/{int(soa_id)}/edit';</script>", | ||
| f"<script>alert({json.dumps('Unknown Arm Type selection: ' + escaped_arm_type)});window.location='/ui/soa/{int(soa_id)}/edit';</script>", | ||
| status_code=400, | ||
| ) | ||
|
|
| conn.close() | ||
| return HTMLResponse( | ||
| f"<script>alert({json.dumps(f'Unknown Data Origin Type selection: {data_origin_type_submission}')});window.location='/ui/soa/{soa_id}/edit';</script>", | ||
| f"<script>alert({json.dumps(f'Unknown Data Origin Type selection: {data_origin_type_submission}')});window.location='/ui/soa/{int(soa_id)}/edit';</script>", |
Check warning
Code scanning / CodeQL
Reflected server-side cross-site scripting Medium
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix
AI 3 months ago
The best fix is to robustly escape or validate the user-provided value before reflecting it into the HTML/JS context. Since the string is being put inside a <script>alert(...)</script> returned within an HTMLResponse, we must ensure that the serialized value is properly quoted, and does not introduce a risk of script breaks or HTML injection, even if it is currently safely wrapped. The most reliable approach is:
- Continue using
json.dumpsto serialize the value as a JS string literal in the alert. - Additionally, ensure the value itself is HTML-escaped (so it cannot accidentally break out of JS context if code changes).
- For defense-in-depth, use Flask/FastAPI's built-in escaping for HTML output (e.g.,
html.escapefrom the Python standard library). - Optionally, validate the value against allowed values (whitelist), if possible.
Since only lines within src/soa_builder/web/app.py can be changed, and external packages should be well-known, we will:
- Import
html(for Python 3.2+) and usehtml.escapeto escape the value before embedding. - The fix is to escape
data_origin_type_submissionusinghtml.escapeprior to inclusion in the response. - Make the escaping explicit in the code where the response is constructed on line 3942.
| @@ -24,6 +24,7 @@ | ||
| import urllib.parse | ||
| import tempfile | ||
| import time | ||
| import html | ||
| from contextlib import asynccontextmanager | ||
| from datetime import datetime, timezone | ||
| from typing import Any, List, Optional | ||
| @@ -3938,8 +3939,11 @@ | ||
| arm_id, | ||
| ) | ||
| conn.close() | ||
| # Escape user input for defense-in-depth before embedding | ||
| escaped_submission = html.escape(data_origin_type_submission, quote=True) | ||
| alert_msg = f"Unknown Data Origin Type selection: {escaped_submission}" | ||
| return HTMLResponse( | ||
| f"<script>alert({json.dumps(f'Unknown Data Origin Type selection: {data_origin_type_submission}')});window.location='/ui/soa/{int(soa_id)}/edit';</script>", | ||
| f"<script>alert({json.dumps(alert_msg)});window.location='/ui/soa/{int(soa_id)}/edit';</script>", | ||
| status_code=400, | ||
| ) | ||
| # Maintain/Upsert immutable Code_N for DDF mapping |
Added 'type' column to epoch table to store UI selection of Epoch type.
Epoch type selections are populated from the submissionValue returned from API request using codelist_code=C99079 in the SDTM Controlled Terminology. Uses the latest CT: /mdr/ct/packages/sdtmct-2025-09-26
New code.code_uid created and persisted and linked to epoch.type.
Extended epoch_audit to include type in before_json and after_json.
Added epoch_audit and activity_audit viewers to the edit.html.