983 handle missing external dictionaries#1611

Merged
gerrycampion merged 10 commits into main from 983-handle-missing-external-dictionaries
Mar 16, 2026

@RakeshBobba03
Collaborator

This PR fixes Issue #983, where CORE would crash with an unhandled MissingDataError if an external dictionary (e.g. MedDRA) could not be read. The change wraps dictionary installation in fill_cache_with_dictionaries with a try/except MissingDataError, logs a warning that the specific dictionary/path could not be loaded, skips installing that dictionary into the cache, and continues validation. As a result, validation runs complete and produce reports, while any rules that later fail due to missing dictionaries are surfaced as EXECUTION_ERROR at the rule level instead of terminating the entire process.
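
The pattern described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual CORE code: only `fill_cache_with_dictionaries` and `MissingDataError` are named in this PR, while the `load_dictionary` helper, the dict-based cache, the logger name, and the function signature are hypothetical stand-ins.

```python
import logging

logger = logging.getLogger("validator")  # hypothetical logger name


class MissingDataError(Exception):
    """Stand-in for the engine's exception of the same name."""


def load_dictionary(name, path):
    # Hypothetical loader: pretends any path containing "bad" is unreadable.
    if "bad" in path:
        raise MissingDataError(f"Necessary {name} files missing")
    return {"name": name, "path": path}


def fill_cache_with_dictionaries(cache, dictionary_paths):
    """Install each readable dictionary into the cache; skip unreadable ones."""
    for name, path in dictionary_paths.items():
        try:
            cache[name] = load_dictionary(name, path)
        except MissingDataError as error:
            # Log and move on instead of letting the error end the whole run.
            logger.warning(
                "External dictionary '%s' at '%s' could not be loaded "
                "and will be skipped: %s",
                name, path, error,
            )
    return cache
```

With this shape, a call such as `fill_cache_with_dictionaries({}, {"meddra": "./bad_meddra"})` returns an empty cache and emits one warning, so later rule execution can report the missing dictionary per rule.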

Collaborator

@RamilCDISC RamilCDISC left a comment

It would be good to have a unit test for this case. Could you please add one? If you can think of other cases, please add tests for those too.
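
A test along these lines might look like the sketch below. It is only an illustration: the function under test is a simplified stand-in with an injected `loader` argument (an assumption made so the example is self-contained), and the logger name `validator` is hypothetical. The real test would target the engine's own `fill_cache_with_dictionaries`.

```python
import logging
import unittest

logger = logging.getLogger("validator")  # hypothetical logger name


class MissingDataError(Exception):
    """Stand-in for the engine's exception of the same name."""


def fill_cache_with_dictionaries(cache, dictionary_paths, loader):
    # Simplified stand-in: the loader is injected so the test can force a failure.
    for name, path in dictionary_paths.items():
        try:
            cache[name] = loader(name, path)
        except MissingDataError as error:
            logger.warning(
                "External dictionary '%s' at '%s' could not be loaded "
                "and will be skipped: %s",
                name, path, error,
            )
    return cache


def failing_loader(name, path):
    raise MissingDataError(f"Necessary {name} files missing")


class TestMissingDictionary(unittest.TestCase):
    def test_bad_path_is_skipped_with_warning(self):
        # assertLogs fails the test if no WARNING is emitted on this logger.
        with self.assertLogs("validator", level="WARNING") as captured:
            cache = fill_cache_with_dictionaries(
                {}, {"meddra": "./bad_meddra"}, failing_loader
            )
        self.assertEqual(cache, {})
        self.assertIn("could not be loaded", captured.output[0])
```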

@gerrycampion
Collaborator

@RakeshBobba03 can you also provide an output report that shows the results.

@RakeshBobba03
Collaborator Author

RakeshBobba03 commented Feb 11, 2026

can you also provide an output report that shows the results.

@gerrycampion I ran the same validation twice (before and after the fix) with:

python core.py validate -s sdtmig -v 3.4 --dataset-path tests\resources\datasets\ae.xpt --meddra .\bad_meddra -l info -of json -o output_xxxxxx.json

For the “before” run I redirected stdout/stderr to logs_before.txt; for the “after” run I used logs_after.txt and wrote the report to output_after.json.

logs_before.txt shows the run crashing with an unhandled MissingDataError during dictionary load and no report. logs_after.txt shows a WARNING that MedDRA at .\bad_meddra could not be loaded and will be skipped, then validation continues and finishes normally.

There is no “before” report (run crashed). output_after.json is the full JSON report produced after the fix.

With the fix, rules that fail at run time (including those that need the missing dictionary) are reported with EXECUTION_ERROR for that rule/dataset in the JSON (e.g. “rule evaluation error - operation failed”) instead of the whole run stopping. You can see those entries in output_after.json.

logs_after.txt
logs_before.txt
output_after.json

@gerrycampion
Collaborator

output_after.json

@RakeshBobba03
Which rules here are failing because of meddra? I see "rule evaluation error"s but none of them have descriptive enough text to indicate that it is because the external dictionary file could not be found.

@RakeshBobba03
Collaborator Author

Which rules here are failing because of meddra? I see "rule evaluation error"s but none of them have descriptive enough text to indicate that it is because the external dictionary file could not be found.

I used that run as an example to show the fix: with a bad MedDRA path the engine no longer crashes, it logs a warning, and it still produces a full report.
The exact scenario (single AE dataset, no DM/SV/TV, no define.xml) wasn’t meant to show MedDRA-specific rule failures. In that example, all the “rule evaluation error - operation failed” entries come from other issues (e.g. missing domains like DM/TV/SV, missing columns, or missing define.xml), not from the missing dictionary. The only place the missing MedDRA dictionary is called out in that run is the WARNING at the start of logs_after.txt.
So the files illustrate “bad path → no crash, warning logged, report generated”; they don’t illustrate which rules would fail specifically because MedDRA wasn’t loaded.

@RamilCDISC
Collaborator

All seems okay to me. @gerrycampion could you please let us know if you are okay with the PR too? Then I will do a final validation and approve.

@RakeshBobba03
Collaborator Author

RakeshBobba03 commented Mar 12, 2026

Validated the changes end‑to‑end using the ae.xpt file provided and the MedDRA‑dependent rule CG0378.yaml, plus a non‑MedDRA rule (CORE-000351), to show both failure and non‑failure scenarios with a bad MedDRA path.

1. MedDRA‑dependent rule (CG0378) with bad MedDRA path

Command:

python core.py validate -s sdtmig -v 3.4 --dataset-path ae.xpt --meddra .\bad_meddra --local-rules CG0378.yaml -r CDISC.SDTMIG.CG0378 -l info -of json -o cg0378_bad_meddra.json

Key observations:

  • Console log shows the dictionary load warning and that validation continues:

    • WARNING ... External dictionary 'meddra' at '.\bad_meddra' could not be loaded and will be skipped: Necessary meddra files missing
    • Rule execution proceeds and a report is generated.
  • In cg0378_bad_meddra.json:

    • Rules_Report for CDISC.SDTMIG.CG0378 has status: "EXECUTION ERROR".

    • Issue_Summary contains a single entry:

      • "core_id": "CDISC.SDTMIG.CG0378", "message": "rule evaluation error - operation failed", "issues": 1.
    • Issue_Details[0].values now clearly states that the failure is due to the missing MedDRA files:

      "values": "Failed to execute rule operation. Operation: valid_external_dictionary_code, Target: AELLTCD, Domain: AE, Error: Necessary meddra files missing"
      

    This ties the EXECUTION_ERROR directly to the missing external dictionary instead of a generic "rule evaluation error".

2. Non‑MedDRA rule (CORE-000351) with same bad MedDRA path

Ran with -r CORE-000351. The rule is reported as SKIPPED (domain not applicable to AE), with no EXECUTION_ERROR and empty Issue_Summary/Issue_Details, confirming the engine doesn’t fail and non‑MedDRA rules aren’t incorrectly marked as failed when MedDRA is missing.

Net effect of the fix

  • When an external dictionary (e.g. MedDRA) cannot be loaded, the engine:
    • Logs a clear warning and skips that dictionary.
    • Continues validation instead of crashing.
    • Marks only the affected rules as EXECUTION_ERROR.
    • Includes a descriptive message in the JSON report indicating that the rule failed because the external dictionary files were missing (e.g. "Error: Necessary meddra files missing"), making it obvious which rules are impacted by the missing dictionary versus other causes.
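
To pick out the affected rules from a report like the ones attached, something like the following sketch could be used. The report layout is an assumption inferred from the fields quoted in this comment (`Issue_Details`, `values`, `core_id`); in particular, that each `Issue_Details` entry carries a `core_id` is not confirmed here, and the second sample entry is invented purely for contrast.

```python
def dictionary_failures(report, marker="files missing"):
    """Return the core_ids whose issue details mention the marker text."""
    hits = []
    for detail in report.get("Issue_Details", []):
        # "values" is treated as free text, as in the message quoted above.
        if marker in str(detail.get("values", "")):
            hits.append(detail.get("core_id"))
    return hits


# Sample shaped after the fields quoted in this comment; the second entry is hypothetical.
sample_report = {
    "Issue_Details": [
        {
            "core_id": "CDISC.SDTMIG.CG0378",
            "values": (
                "Failed to execute rule operation. "
                "Operation: valid_external_dictionary_code, Target: AELLTCD, "
                "Domain: AE, Error: Necessary meddra files missing"
            ),
        },
        {"core_id": "HYPOTHETICAL.RULE", "values": "unrelated failure"},
    ]
}

print(dictionary_failures(sample_report))  # ['CDISC.SDTMIG.CG0378']
```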

cg0378_bad_meddra.json
core351_bad_meddra.json

@gerrycampion gerrycampion merged commit a242d4a into main Mar 16, 2026
12 checks passed
@gerrycampion gerrycampion deleted the 983-handle-missing-external-dictionaries branch March 16, 2026 20:09

Development

Successfully merging this pull request may close these issues.

Suggestion: when external dictionary cannot be read, CORE execution should not crash

3 participants