operator rework by SFJohnson24 · Pull Request #1638 · cdisc-org/cdisc-rules-engine

SFJohnson24 · 2026-02-25T19:58:20Z

this PR does a few things all involving the target_is_sorted_by operator

it restores the null position argument. Nulls are now sorted correctly in this operator. Nulls that appear at the correct end of the sequence — after all non-null values for last, or before all non-null values for first — are considered valid and not flagged.
it implements a 2 pass method for ensuring sort order:
Pass 1 — Positional order check: The group arrives pre-sorted by target. Null comparator values are validated against the na_pos argument (nulls should all be last or all be first) while non-null rows are left in their original target-sorted order. A second copy of the non-null rows is sorted by comparator. The two target sequences are compared positionally — i.e. if SESEQ is correctly ordered by the comparator dates, the sequences should be identical. Any mismatch means a SESEQ/target value is in the wrong position relative to the dates. Rows involved in date overlaps (where two dates have different precisions and are ambiguously ordered, e.g. 2025-10 and 2025-10-08) are excluded from this check entirely since their sort position cannot be determined — both the precise and imprecise date are flagged by date_overlap_check (imprecise dates that overlap sort() to the front of the dates they overlap as they technically fall first lexicographical).
Pass 2 — Neighbor consistency check: Walks through rows sorted by target and for each row that passed Pass 1, compares its comparator value against its nearest non-null neighbors. If ascending, prev <= curr <= next must hold. This guards against incidental correctness — cases where a target value happens to land in the right position despite the surrounding sequence being wrong, which the positional comparison alone would miss. Only rows not already marked invalid are checked, and overlap-excluded rows are also skipped here.

Overlap date rows are never reconsidered-- they are all marked false and skipped by both passes as we cannot infer their order. One test case where an imprecise date that overlaps with 3 other rows is included--the imprecise date and the 3 it overlaps with all are marked as not sorted

to test:
DatasetsN.json
DatasetsN2.json
DatasetsN3.json
Rule_underscores.json

…isc-rules-engine into target_is_sorted_by

RamilCDISC · 2026-02-26T22:19:10Z

cdisc_rules_engine/check_operators/dataframe_operators.py

            sorted_df = self.value[selected_columns].sort_values(
-                by=[*within_columns, comparator],
-                ascending=[True] * len(within_columns) + [ascending],
+                by=[*within_columns, target],


Could you please explain why are we changing the sorting to [*within_columns, target] from [*within_columns, comparator]. What I understand is the operator is supposed to detect "in dataset's current order, do target follow comparator order". but if the code sorts the data by target first then it changes the current order before checking. this can lead to false passes (it says sorted when original data was not)
Please let me know if I am understanding the logic incorrectly.

so technically we want both the target and the comparator to be in order. If we sort by comparator then check the order of target it is the same as sorting by the target and checking the comparator.
CG0662:
Check:
all:
- name: --SEQ
operator: target_is_not_sorted_by
value:
- name: --STDTC
null_position: last
sort_order: asc
within: USUBJID
if we take the group within USUBJID, sort by comparator (--STDTC date), is the sequence in order? We can do this either direction technically--we can sort the sequence in order then check --stdtc order. This is what we are doing-- we are sorting by target (which is always ascending in this kind of check) and then checking comparator is sorted in the order we want (asc/desc based on sort_order) and null position.

RamilCDISC · 2026-02-26T22:21:53Z

cdisc_rules_engine/check_operators/dataframe_operators.py

+        )
        return is_valid

    def check_date_overlaps(self, group, target, comparator):


Could we please add test for new behavior where if there is a overlap with different precision we mark both as invalid.

RamilCDISC · 2026-02-26T22:23:07Z

resources/schema/rule/Operator.md

 ### target_is_sorted_by

-True if the values in `name` are ordered according to the values specified by `value` in ascending/descending order, grouped by the values in `within`. Each `value` requires a variable `name` and an ordering of 'asc' or 'desc' specified by `order`. `within` accepts either a single column or an ordered list of columns. Columns can be either number or Char Dates in ISO8601 'YYYY-MM-DD' format
+True if the values in name are ordered according to the values specified by value in ascending/descending order, grouped by the values in wit hin.Each value entry requires a variable name, a sort_order of asc or desc, and an optional null_position of first or last (defaults to last) which controls where null/empty comparator values are placed in the expected ordering. within accepts either a single column or an ordered list of columns. Columns can be either number or Char Dates in ISO8601 YYYY-MM-DD format. Date values with different precisions that overlap (e.g. 2005-10 and 2005-10-08) are also flagged as not sorted as their order cannot be inferred.


There are some typos in the updated documentation. Could you please fix those?

RamilCDISC

The PR reworks the target_is_not_sorted_by operator. The validation was done by:

Reviewing the PR for any unwanted code or comments.
Reviewing the updated logic in accordance to the AC.
Reviewing the updated documentation for accuracy.
Reviewing the updated tests for proper coverage.
Requested more test for regression cases.
Running manual validation using dev rule editor covering cases like:

nulls in comparator and target
row order scramble
backward date as " seq " increases
mixed precision overlap dates
duplicate comparator dates

SFJohnson24 added 2 commits February 24, 2026 18:12

operator rework

08d141e

Merge branch 'main' into target_is_sorted_by

fc95245

SFJohnson24 temporarily deployed to DEV February 25, 2026 19:58 — with GitHub Actions Inactive

SFJohnson24 requested review from RamilCDISC and gerrycampion February 25, 2026 19:59

gerrycampion and others added 4 commits February 25, 2026 18:48

Merge branch 'main' into target_is_sorted_by

789f4ba

Merge branch 'main' into target_is_sorted_by

f9911a9

numeric check

b5b23d2

Merge branch 'target_is_sorted_by' of https://github.com/cdisc-org/cd…

3f9535c

…isc-rules-engine into target_is_sorted_by

SFJohnson24 linked an issue Feb 26, 2026 that may be closed by this pull request

Rule blocked: CG0418 #1599

Closed

RamilCDISC requested changes Feb 26, 2026

View reviewed changes

SFJohnson24 added 5 commits February 27, 2026 11:42

test update

833d8f2

docs

554fa70

Merge branch 'main' into target_is_sorted_by

cefc4eb

overlap

051ad08

comment

a2bc0c5

SFJohnson24 requested a review from RamilCDISC February 27, 2026 19:21

RamilCDISC temporarily deployed to DEV February 27, 2026 19:59 — with GitHub Actions Inactive

RamilCDISC approved these changes Feb 27, 2026

View reviewed changes

SFJohnson24 merged commit 9d6d7e5 into main Mar 2, 2026
12 checks passed

SFJohnson24 deleted the target_is_sorted_by branch March 2, 2026 17:20

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

operator rework#1638

operator rework#1638
SFJohnson24 merged 11 commits intomainfrom
target_is_sorted_by

SFJohnson24 commented Feb 25, 2026 •

edited

Loading

Uh oh!

RamilCDISC Feb 26, 2026

Uh oh!

SFJohnson24 Feb 27, 2026

Uh oh!

RamilCDISC Feb 26, 2026

Uh oh!

SFJohnson24 Feb 27, 2026

Uh oh!

RamilCDISC Feb 26, 2026

Uh oh!

SFJohnson24 Feb 27, 2026

Uh oh!

RamilCDISC left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

SFJohnson24 commented Feb 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

RamilCDISC Feb 26, 2026

Choose a reason for hiding this comment

Uh oh!

SFJohnson24 Feb 27, 2026

Choose a reason for hiding this comment

Uh oh!

RamilCDISC Feb 26, 2026

Choose a reason for hiding this comment

Uh oh!

SFJohnson24 Feb 27, 2026

Choose a reason for hiding this comment

Uh oh!

RamilCDISC Feb 26, 2026

Choose a reason for hiding this comment

Uh oh!

SFJohnson24 Feb 27, 2026

Choose a reason for hiding this comment

Uh oh!

RamilCDISC left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

SFJohnson24 commented Feb 25, 2026 •

edited

Loading