Incomplete pivot for SUPP merge #1659

@HajimeShimizu

Feature Description

CORE: v0.14.2

cdisc_rules_engine/utilities/data_processor.py

I see the following code in "def process_supp":

        if "RDOMAIN" in supp_dataset.columns and supp_dataset["RDOMAIN"][0] == "DM":
            excluded_columns = list(supp_dataset["QNAM"].unique()) + columns_to_drop
            group_cols = [c for c in supp_dataset.columns if c not in excluded_columns]
            supp_dataset = PandasDataset(
                supp_dataset.data.groupby(group_cols, dropna=False, as_index=False).agg(
                    lambda x: (x.dropna().iloc[0] if not x.dropna().empty else pd.NA)
                )
            )

The pivot is only collapsed when RDOMAIN is DM. As a result, when I merge SUPPAE into AE, I get the following data, with one row per QNAM instead of one row per AE record:

  STUDYID DOMAIN USUBJID SPDEVID AESEQ  ... RDOMAIN QORIG QEVAL  TEST ALPHA
0    TEST     AE    A001    None     1  ...      AE  None  None     A  <NA>
1    TEST     AE    A001    None     1  ...      AE  None  None  <NA>     C
2    TEST     AE    A001    None     2  ...      AE  None  None     B  <NA>
3    TEST     AE    A001    None     2  ...      AE  None  None  <NA>     D
4    TEST     AE    A001    None     3  ...     NaN   NaN   NaN   NaN   NaN

I believe we need this operation for all datasets, not only DM. I suggest removing the "if" condition so that the grouping is always applied.
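
To illustrate the expected result, here is a minimal sketch in plain pandas of the same grouping applied without the DM check. It is not the engine's code: the helper name `collapse_supp_rows`, its parameters, and the sample frame are assumptions made for this example, and it uses a plain DataFrame rather than `PandasDataset`.

```python
import pandas as pd


def collapse_supp_rows(df, qnam_columns, columns_to_drop=()):
    """Collapse one-row-per-QNAM data into one row per parent record.

    Hypothetical helper mirroring the groupby in process_supp,
    but applied regardless of RDOMAIN.
    """
    excluded = set(qnam_columns) | set(columns_to_drop)
    group_cols = [c for c in df.columns if c not in excluded]

    def first_non_null(series):
        # Keep the first populated value in the group, else a missing marker.
        non_null = series.dropna()
        return non_null.iloc[0] if not non_null.empty else pd.NA

    # dropna=False keeps groups whose key columns contain missing values.
    return df.groupby(group_cols, dropna=False, as_index=False).agg(first_non_null)


# Simplified stand-in for the SUPPAE rows shown above (assumed sample data).
merged = pd.DataFrame(
    {
        "USUBJID": ["A001"] * 4,
        "AESEQ": [1, 1, 2, 2],
        "RDOMAIN": ["AE"] * 4,
        "TEST": ["A", pd.NA, "B", pd.NA],
        "ALPHA": [pd.NA, "C", pd.NA, "D"],
    }
)

collapsed = collapse_supp_rows(merged, qnam_columns=["TEST", "ALPHA"])
print(collapsed)
```

With this sample, each AESEQ ends up on a single row carrying both TEST and ALPHA, which is the shape I would expect from the SUPPAE merge.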

Labels: SDTM (CDISC SDTM/SDTMIG), question (Further information is requested)