Skip to content

Resolve rule execution errors#1653

Open
gerrycampion wants to merge 16 commits intomainfrom
1578-resolve-existing-rule-execution-and-rule-evaluation-errors-that-appear-in-the-logs
Open

Resolve rule execution errors#1653
gerrycampion wants to merge 16 commits intomainfrom
1578-resolve-existing-rule-execution-and-rule-evaluation-errors-that-appear-in-the-logs

Conversation

@gerrycampion
Copy link
Collaborator

@gerrycampion gerrycampion commented Mar 6, 2026

  • fix dy operation to return a SKIPPED DomainNotFoundError when DM missing
  • fix domain wildcard replacement to use domain instead of dataset unsplit name
  • simplify args for rule_processor.perform_rule_operations to use DatasetMetadata instead of individual args
  • reraise Operations DomainNotFoundError and KeyError to trickle up as SKIPS
  • Better handling of empty datasets and treat EMPTY_DATASET as a SKIP
  • Change ignored data files from warning to info as this is a pretty normal occurrence in a standard study package
  • Find dataset, class, and dataset variables from the model if dataset is in the model but not in the ig
  • Fix some instances of confusion between domain and dataset
  • Move sdtm-related utils from utils to sdtm_utils
  • Move function from utils to dataset_metadata property
  • replace extract_file_name_from_path_string with built-in os.path.basename
  • fix distinct's call to data service referenced datasets
  • fix relrec merge on columns with differing datatypes
  • fix relrec merge to trigger skip when it results in no records
  • clarified SDTMDatasetMetadata wildcard_replacement variable and fixed wildcard replacement instances
  • fix merge on define xml dataset metadata

To compare the Execution Errors before and after, see these reports:
Report before
Report after

CORE Test Suite Updates

@gerrycampion gerrycampion changed the title Fix dy operation and domain wildcard replacement Resolve rule execution errors Mar 6, 2026
@gerrycampion gerrycampion marked this pull request as ready for review March 13, 2026 22:27
@gerrycampion gerrycampion linked an issue Mar 13, 2026 that may be closed by this pull request
dataset = self.data_service.get_dataset(dataset_meta.filename)
referenced_datasets[dataset_meta.name] = dataset
for dataset_metadata in self.data_service.get_datasets():
dataset = self.data_service.get_dataset(dataset_metadata.name)
Copy link
Collaborator

@SFJohnson24 SFJohnson24 Mar 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this needs to stay as
dataset = self.data_service.get_dataset(dataset_meta.filename)

Looking at the interface--i think it is another case of name and filename getting mixed. It needs the file extension, otherwise it returns an empty dataframe. I tested this on cg0370 and was not getting the correct result as it was not finding the LB dataset referenced in the CO dataset (was returning nothing for the distinct operation)

we should change all the get_datasets parameters in the interface and excel/dummy/local/usdm data services to dataset_path to avoid this

Copy link
Collaborator

@SFJohnson24 SFJohnson24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice changes to resolve several issues, organize code, and use native functionality/metadata properties over unneeded functions.
PR preserves relrec merge functionality--tested cg0602
Correctly resolves execution error vs. skip status from parent issue

Only found one issue that needs to be addressed. Please see comment

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Run Status in Rules Report, engine v0.15 resolve existing rule execution and rule evaluation errors that appear in the logs

2 participants