#1274: Add inline compared syntax for N-way set-based output variable comparison #1491
RakeshBobba03 wants to merge 40 commits into main
RamilCDISC
left a comment
I ran a validation using the dev editor and updated the rule so that USUBJID is also in the output variables, as follows:
Outcome:
  Message: At least one expected variable is missing from dataset
  Output Variables:
    - USUBJID
    - compared:
        - $dataset_variables
        - $expected_variables
The AE dataset in the attached Excel file has a USUBJID column, but the engine reports the column as not found:
[
{
"executionStatus": "success",
"dataset": "ae.xpt",
"domain": "AE",
"variables": [
"$dataset_variables",
"$expected_variables",
"USUBJID"
],
"message": "At least one expected variable is missing from dataset",
"errors": [
{
"value": {
"$dataset_variables": [
"STUDYID",
"DOMAIN",
"USUBJID",
"AESEQ",
"AELNKID",
"AETERM",
"AELLT",
"AELLTCD",
"AEDECOD",
"AEPTCD",
"AEHLT",
"AEHLTCD",
"AEHLGT",
"AEHLGTCD",
"AEBDSYCD",
"AESOC",
"AESOCCD",
"AESEV",
"AEACN",
"AEREL",
"AEOUT",
"AESCAN",
"AESCONG",
"AESDISAB",
"AESDTH",
"AESHOSP",
"AESLIFE",
"AESOD",
"EPOCH",
"AESTDTC",
"AEENDTC",
"AESTDY",
"AEENDY",
"AEENRTPT",
"AEENTPT"
],
"$expected_variables": [
"AELLT",
"AELLTCD",
"AEPTCD",
"AEHLT",
"AEHLTCD",
"AEHLGT",
"AEHLGTCD",
"AEBODSYS",
"AEBDSYCD",
"AESOC",
"AESOCCD",
"AESER",
"AEACN",
"AEREL",
"AESTDTC",
"AEENDTC"
],
"USUBJID": "Not in dataset"
},
"dataset": "ae.xpt"
}
],
"compare_groups": [
[
"$dataset_variables",
"$expected_variables"
]
]
}
]
…sc-org/cdisc-rules-engine into 1274-Comparison-in-Reporting
…metadata context' for metadata check rules
Thanks for bringing this to my attention, @RamilCDISC. USUBJID is record-level data, not metadata. A Variable Metadata Check rule operates on a metadata dataset rather than on the original data records, so when USUBJID is included in Output Variables it is looked up in a dataset where it does not exist. That is why it was showing "Not in dataset".
Sam and I agreed that "Not in dataset" is misleading in this context: USUBJID exists in the original dataset, it is just not available in the metadata context. We decided to change the message to "not available in metadata context" for Variable Metadata Check rules and Dataset Metadata Check rules (and their variants). I updated the error message logic and the documentation accordingly.
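The message selection described above can be sketched roughly as follows. This is a minimal illustration, not the engine's actual code: the function name, the constant, and the exact rule type strings are assumptions, and the "variants" of the metadata rule types are not enumerated here.

```python
# Hypothetical sketch: names and rule type strings are assumptions,
# not taken from the cdisc-rules-engine code base.
METADATA_RULE_TYPES = {
    "Variable Metadata Check",
    "Dataset Metadata Check",
}


def missing_value_placeholder(rule_type: str) -> str:
    """Return the text reported for an output variable that cannot be resolved."""
    if rule_type in METADATA_RULE_TYPES:
        # Record-level columns such as USUBJID do not exist in a metadata dataset.
        return "not available in metadata context"
    return "Not in dataset"


print(missing_value_placeholder("Variable Metadata Check"))
print(missing_value_placeholder("Some Other Rule Type"))
```

Keeping the rule types in one set means any variants could be added in a single place if this approach were adopted.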
RamilCDISC
left a comment
If the rule has only one variable in the compared block, like
"Output_Variables": [
  "variable_name",
  {
    "compared": [
      "$dataset_variables"
    ]
  }
]
then the first row in the Issue Details sheet of the report correctly outputs both $dataset_variables and variable_name, but every remaining row contains only variable_name. Please see the attached report.
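For reference, here is a rough sketch of how an Output Variables list like the one above could be split into a flat list of variable names and a list of comparison groups. The function name is hypothetical, and keeping declaration order in the flat list is an assumption; the engine may order the names differently.

```python
from typing import Any


def split_output_variables(
    output_variables: list[Any],
) -> tuple[list[str], list[list[str]]]:
    """Split Output Variables into (flat variable names, compare groups).

    Hypothetical helper: plain strings are sibling variables, and a dict
    with a "compared" key contributes its children as one compare group.
    """
    flat: list[str] = []
    groups: list[list[str]] = []
    for entry in output_variables:
        if isinstance(entry, dict) and "compared" in entry:
            children = list(entry["compared"])
            groups.append(children)
            names = children
        else:
            names = [entry]
        for name in names:
            if name not in flat:  # avoid listing the same variable twice
                flat.append(name)
    return flat, groups


flat, groups = split_output_variables(
    ["variable_name", {"compared": ["$dataset_variables"]}]
)
print(flat)
print(groups)
```

Under this sketch every report row would be built from the same flat list, so a single-variable compared block should not change which columns appear after the first row.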
…sc-org/cdisc-rules-engine into 1274-Comparison-in-Reporting
Sorry to throw this in here, but after seeing this PR and giving it some thought, I don't think this should be a change in the reporting mechanism. I think we should just have a new operation, like:
Operations:
  - id: $expected_variables
    operator: expected_variables
  - id: $dataset_variables
    operator: get_column_order_from_dataset
  - id: $expected_minus_dataset
    name: $expected_variables
    operator: minus
    value: $dataset_variables
Check:
  all:
    - name: $expected_minus_dataset
      operator: non_empty
Outcome:
  Message: At least one expected variable is missing from dataset
  Output Variables:
    - $expected_variables
    - $dataset_variables
    - $expected_minus_dataset
Would this work?
Btw, if we go this route, it's probably easiest to just create a new branch.
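The proposed minus operation could be sketched as an order-preserving difference. This is only an illustration of the idea: the function name and signature are assumptions, not an existing engine operation, and the lists are short excerpts of the AE example from the first review comment.

```python
def minus(left: list[str], right: list[str]) -> list[str]:
    """Return items of `left` that are absent from `right`.

    Keeps the order of `left` and drops duplicates. Hypothetical sketch
    of the proposed operation, not the engine's implementation.
    """
    exclude = set(right)
    seen: set[str] = set()
    result: list[str] = []
    for item in left:
        if item not in exclude and item not in seen:
            seen.add(item)
            result.append(item)
    return result


# Excerpts of the variable lists from the AE example.
expected_variables = ["AELLT", "AEBODSYS", "AESER", "AEACN"]
dataset_variables = ["STUDYID", "USUBJID", "AELLT", "AEACN"]

expected_minus_dataset = minus(expected_variables, dataset_variables)
print(expected_minus_dataset)  # ['AEBODSYS', 'AESER']
```

With a result like this, the non_empty check in the proposed rule would fire exactly when at least one expected variable is missing.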
#1274: Implements explicit inline comparison syntax for output variables in rules via a new compared: dict block within Output Variables. The block enables N-way set-based comparisons (len >= 2): the first variable serves as the baseline and each subsequent variable is compared against it. The implementation flattens all variables (siblings plus compared children) for UI display, restricts comparison logic to the variables inside compared blocks, and always uses set-based (order-independent) comparison. Reporting now shows formatted comparison summaries (missing/extra items) followed by the raw variable lists, with multi-line rendering support in Excel.
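A minimal sketch of the baseline comparison described above, under stated assumptions: the function name is hypothetical, and the direction of "missing" (in the baseline but not in the compared variable) and "extra" (in the compared variable but not in the baseline) is a guess at the intended meaning, not confirmed by this PR.

```python
def compare_to_baseline(
    values: dict[str, list[str]],
) -> dict[str, dict[str, list[str]]]:
    """Compare every variable after the first against the first (baseline).

    Hypothetical sketch. The comparison is set-based, so order and
    duplicates within each list are ignored.
    """
    names = list(values)  # dicts preserve insertion order
    if len(names) < 2:
        raise ValueError("a compared block needs at least two variables")
    baseline = set(values[names[0]])
    summary: dict[str, dict[str, list[str]]] = {}
    for name in names[1:]:
        other = set(values[name])
        summary[name] = {
            "missing": sorted(baseline - other),  # assumed direction
            "extra": sorted(other - baseline),  # assumed direction
        }
    return summary


summary = compare_to_baseline(
    {
        "$dataset_variables": ["STUDYID", "AELLT", "AEACN"],
        "$expected_variables": ["AEACN", "AELLT", "AEBODSYS", "AESER"],
    }
)
print(summary)
```

Sorting the differences makes the summary deterministic, which helps when the result is rendered as multi-line text in a report cell.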
Attached are the Rule and dataset used for testing:
CORE-000334.yaml
unit-test-coreid-CG0016-negative.xlsx