Conversation
…ormance improvements included
…ls.py; added caching for API functions
There was a problem hiding this comment.
Pull request overview
This PR is described as adding "CI configuration" but actually includes two significant changes: (1) new CI infrastructure with GitHub Actions workflow and updated pre-commit hooks, and (2) a substantial refactoring of USDM generator utilities into a centralized shared module (usdm_utils.py), plus improvements to the biomedical concept property population logic in app.py.
Changes:
- Added
.github/workflows/ci.ymlfor automated testing/linting across Python 3.10–3.13, and updated.pre-commit-config.yamlwith restructured hooks (black, pytest, flake8). - Consolidated duplicated helper functions from 8+ USDM generator files into a single
src/usdm/usdm_utils.pymodule, eliminating boilerplatetry/except ImportErrorpatterns and standardizing imports tosoa_builder.web.db._connect. - Refactored
_populate_bc_properties_bgand_enrich_biomedical_concept_bginapp.pyto minimize database lock contention during API calls and properly handle connection cleanup.
Reviewed changes
Copilot reviewed 17 out of 17 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
.github/workflows/ci.yml |
New CI workflow with multi-version Python matrix, ruff + flake8 linting, pytest + coverage |
.pre-commit-config.yaml |
Updated pre-commit hooks: black (24.3.0), flake8 (from PyCQA repo), pytest with always_run |
src/usdm/usdm_utils.py |
New shared utilities module consolidating helpers from all USDM generators |
src/usdm/generate_usdm.py |
Replaced local _nz/_get_soa_metadata with imports from shared modules |
src/usdm/generate_study_timings.py |
Replaced local helpers with imports from usdm_utils |
src/usdm/generate_study_epochs.py |
Replaced local _get_epoch_code_values with import from usdm_utils |
src/usdm/generate_study_cells.py |
Replaced local _get_element_ids with import from usdm_utils |
src/usdm/generate_scheduled_decision_instances.py |
Replaced local helpers with imports from usdm_utils |
src/usdm/generate_scheduled_activity_instances.py |
Replaced local helpers with imports from usdm_utils |
src/usdm/generate_schedule_timelines.py |
Replaced local loader/helper functions with imports from usdm_utils |
src/usdm/generate_encounters.py |
Replaced local helpers with imports; fixed codeSystem prefix, added conditional scheduledAt |
src/usdm/generate_elements.py |
Replaced local transition rule helpers with imports from usdm_utils |
src/usdm/generate_biomedical_concepts.py |
Major rewrite: uses DB-backed properties instead of live API lookups |
src/usdm/generate_arms.py |
Replaced local helpers with imports from usdm_utils |
src/usdm/generate_activities.py |
Replaced local helpers with imports from usdm_utils |
src/soa_builder/web/utils.py |
Removed _get_biomedical_concept_ids (moved to usdm_utils) |
src/soa_builder/web/app.py |
Refactored _populate_bc_properties_bg and _enrich_biomedical_concept_bg for better lock management |
You can also share your feedback on Copilot code review. Take the survey.
| if timing_uid: | ||
| encounter["scheduledAt"] = timing_uid |
There was a problem hiding this comment.
scheduledAt is set unconditionally on line 219 (which may be None when no timing exists), and then lines 227-228 conditionally overwrite it with the same value. This means that when timing_uid is None, the encounter dict will contain "scheduledAt": None, which may not be the intended behavior. If the intent is to only include scheduledAt when a timing UID is available, remove it from the dict literal on line 219 and keep only the conditional assignment on lines 227-228.
| return code_code, code_decode, code_system, code_system_version | ||
|
|
||
|
|
||
| # Helpders for Elements |
There was a problem hiding this comment.
"Helpders" is a typo — should be "Helpers".
| # Helpders for Elements | |
| # Helpers for Elements |
| url = "https://library.cdisc.org/api/mdr/ct/packages/sdtmct-2025-09-26/codelists/C99079" | ||
| headers: dict[str, str] = {"Accept": "application/json"} | ||
| subscription_key = os.environ.get("CDISC_SUBSCRIPTION_KEY") | ||
| api_key = os.environ.get("CDISC_API_KEY") or os.environ.get( | ||
| "CDISC_SUBSCRIPTION_KEY" | ||
| ) | ||
| unified_key = subscription_key or api_key | ||
|
|
||
| if unified_key: | ||
| headers["Ocp-Apim-Subscription-Key"] = unified_key | ||
| if api_key: | ||
| headers["Authorization"] = f"Bearer {api_key}" | ||
| headers["api-key"] = api_key |
There was a problem hiding this comment.
The _get_epoch_code_values function accepts epoch_type and code parameters but the URL is hardcoded to https://library.cdisc.org/api/mdr/ct/packages/sdtmct-2025-09-26/codelists/C99079. While the code parameter is used to search within the returned terms, epoch_type is entirely unused. Also, this function manually constructs API headers instead of using the _build_api_headers() helper defined at the top of this same file (line 15), which would reduce code duplication and ensure consistency.
|
New pre-commit-config and .github/workflows/ci.yaml files are successful. |
|
Closing this pull request. Another was completed and merged this branch with another awaiting merge with master |
Adds new CI configuration for git.
Includes new pre-commit-config