#1442 fix dataset filtering when -dp is provided #1645

Merged
SFJohnson24 merged 4 commits into main from
1442-fix-dataset-filetype-filtering
Mar 4, 2026
Conversation

@alexfurmenkov
Collaborator

No description provided.

Collaborator

@RamilCDISC RamilCDISC left a comment

When -dp points to a file whose type does not match the one given by -ft, the message should state why no files were processed.

Right now, if I specify a single dataset file with -dp and use -ft to select a different file type, the message is unclear. The Excel file passed with the -dp flag is valid and exists on the system. It was not used because the -ft flag restricted processing to JSON files.

python core.py validate -s sdtmig -v 3.4 -dp tests/resources/report_test_data/test_datasets.xlsx -ft json
[ERROR 2026-03-02 14:29:57,900 - core.py:184] - No valid dataset files provided.
Supported formats: SAS V5 XPT, Dataset-JSON (JSON or NDJSON), or Excel (XLSX)
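
To illustrate the kind of message I have in mind, here is a minimal sketch. It is not the engine's actual code: `filter_by_filetype` and `no_datasets_message` are hypothetical helper names, and the message wording is only a suggestion. The idea is to remember which files were dropped by the -ft filter so the error can name them.

```python
from pathlib import Path


def filter_by_filetype(dataset_paths, filetype=None):
    """Split paths into (kept, skipped) by extension. Hypothetical helper."""
    if not filetype:
        # No -ft given: nothing is filtered out.
        return list(dataset_paths), []
    wanted = "." + filetype.lower().lstrip(".")
    kept, skipped = [], []
    for path in dataset_paths:
        if Path(path).suffix.lower() == wanted:
            kept.append(path)
        else:
            skipped.append(path)
    return kept, skipped


def no_datasets_message(skipped, filetype):
    """Build an error message that names the reason nothing was processed."""
    if skipped and filetype:
        names = ", ".join(Path(p).name for p in skipped)
        return (
            f"No dataset files left after applying -ft {filetype}: "
            f"skipped {names} because the extension does not match."
        )
    # Fall back to the generic message when the filter is not the cause.
    return "No valid dataset files provided."


kept, skipped = filter_by_filetype(
    ["tests/resources/report_test_data/test_datasets.xlsx"], "json"
)
if not kept:
    print(no_datasets_message(skipped, "json"))
```

With the command above, a message along these lines would make it clear that the Excel file was found but excluded by the filter.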

README.md Outdated
-jcf, --jsonata-custom-functions Pair containing a variable name and a Path to directory containing a set of custom JSONata functions. Can be specified multiple times
-e, --encoding TEXT File encoding for reading datasets. If not specified, defaults to utf-8. Supported encodings: utf-8, utf-16, utf-32, cp1252, latin-1, etc.
-e, --encoding TEXT File encoding for reading datasets. If not specified, defaults to utf-8. Supported encodings: utf-8, utf-16, utf-32, cp1252, latin-1, etc.
-ft, --filetype TEXT File extension to filter datasets. Has higher priority then --dataset-path parameter.
Collaborator

A small typo. I believe it should be higher priority 'than' the --dataset-path parameter, not 'then'.

Collaborator

@RamilCDISC RamilCDISC left a comment

The PR fixes dataset filtering with the -ft parameter when a dataset is specified with the -dp parameter. Validation was done by:

  1. Reviewing the PR for any unwanted code or comments.
  2. Reviewing the PR in accordance with the AC.
  3. Ensuring all unit and regression tests pass.
  4. Ensuring relevant tests are updated.
  5. Running manual validation using the CLI, covering cases such as:
  • file with -dp and the same -ft extension
  • file with -dp and a different -ft extension
  • -ft extension with multiple files in a folder specified by -d
  • missing file specified with -dp where -ft is of the same extension
  • missing file specified with -dp where -ft is of a different extension
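
For reference, the manual cases above can be written down as a small table-driven check. This is only a sketch under assumptions: `select` is a hypothetical stand-in for the engine's dataset selection, the file names are made up, and the expected results reflect my reading of the intended behaviour (a file is processed only if it exists and matches -ft), not output captured from the CLI.

```python
def select(requested, existing, filetype=None):
    """Hypothetical stand-in: keep requested files that exist and match -ft."""
    selected = []
    for name in requested:
        if name not in existing:
            continue  # missing file, nothing to process
        if filetype and not name.lower().endswith("." + filetype.lower()):
            continue  # excluded by the -ft filter
        selected.append(name)
    return selected


# (description, requested files, files that exist, -ft value, assumed result)
CASES = [
    ("-dp file, same -ft", ["dm.json"], {"dm.json"}, "json", ["dm.json"]),
    ("-dp file, different -ft", ["dm.xlsx"], {"dm.xlsx"}, "json", []),
    (
        "-d folder with mixed files",
        ["ae.json", "dm.xpt", "lb.json"],
        {"ae.json", "dm.xpt", "lb.json"},
        "json",
        ["ae.json", "lb.json"],
    ),
    ("missing -dp file, same -ft", ["dm.json"], set(), "json", []),
    ("missing -dp file, different -ft", ["dm.xlsx"], set(), "json", []),
]

for description, requested, existing, filetype, expected in CASES:
    assert select(requested, existing, filetype) == expected, description
```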

Collaborator

@SFJohnson24 SFJohnson24 left a comment

The PR correctly addresses the issue.

@SFJohnson24 SFJohnson24 merged commit ad638dd into main Mar 4, 2026
10 of 11 checks passed
@SFJohnson24 SFJohnson24 deleted the 1442-fix-dataset-filetype-filtering branch March 4, 2026 20:50

Development

Successfully merging this pull request may close these issues.

Request to add a CLI option to filter the filetype of files to process

3 participants