Conversation
scripts/publish.py
Outdated
```python
for cid in coreids:
    match = CORE_PATTERN.match(cid)
    if match:
        numbers.append(int(match.group(1)))
```
The current regex pattern has no capture group, so `match.group(1)` will raise an error. We should update the regex to something like: `CORE_PATTERN = re.compile(r"^CORE-(\d{6})$")`
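A minimal sketch of the suggested fix, reusing the loop from the diff above (`coreids` is assumed to be a list of strings like `"CORE-000123"`; the function name is for illustration only):

```python
import re

# Suggested pattern: one capture group around the six-digit number.
CORE_PATTERN = re.compile(r"^CORE-(\d{6})$")

def extract_core_numbers(coreids):
    """Return the numeric part of every well-formed CORE id."""
    numbers = []
    for cid in coreids:
        match = CORE_PATTERN.match(cid)
        if match:  # malformed ids simply don't match
            numbers.append(int(match.group(1)))
    return numbers
```

With the capture group in place, `extract_core_numbers(["CORE-000123", "CORE-9"])` returns `[123]`, since the second id does not have six digits.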
RamilCDISC left a comment
All looks good to me now. The workflow should work as intended.
There are a few structural things this PR will need to change:
- When assigning the directory name for the published rule, the script puts the rule in the repository root. The planned structure has `Published` and `Unpublished` directories, since we need to house unpublished rules separately from published ones as we port from the editor.
```
root/
├── Published/
│   ├── Standard_Name/
│   │   ├── CORE-XXXX/
│   │   │   ├── rule.yml
│   │   │   ├── negative/
│   │   │   └── positive/
│   │   └── CORE-XXXX/
├── Unpublished/
└── mappings/
```
I changed my PR to use `rule.yml` for rule files, so that logic is good as long as it looks for that name.
- I have created mappings to map Rule ID to CORE ID. They act as a ledger for the data.
SDTMIG_mapping.csv — when your publishing script has the rule.yml, it will need to find all applicable standards and their Rule IDs, add/sort them into each applicable CSV (named via Standard + '_mappings'), and also grab the version. There is a one-to-many relationship between CORE ID and Rule ID. I also had to add logic for FDA_Business_Rules, since that won't be listed as a standard: if a rule has Organization FDA and an FB Rule ID, the script adds it to that bucket as well. The status will always be Published; the CORE ID just needs to use the one found by the algorithm in the GitHub Action.
Generate_mappings.py — this is the script I used to generate the initial mappings, in case any of the code helps.
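A sketch of the ledger update described above. The rule.yml schema here is an assumption (a `Standards` list of dicts with `Name`, `Version`, and `Rule ID`, plus a top-level `Organization`), as are the CSV column names; the real schema may differ:

```python
import csv
from pathlib import Path

MAPPINGS_DIR = Path("mappings")                       # assumed ledger location
FIELDS = ["Rule ID", "CORE ID", "Version", "Status"]  # assumed column names

def mapping_rows(rule, core_id):
    """Build one ledger row per applicable standard for a parsed rule.yml."""
    rows = {}
    for std in rule.get("Standards", []):
        row = {"Rule ID": std["Rule ID"], "CORE ID": core_id,
               "Version": std.get("Version", ""), "Status": "Published"}
        rows.setdefault(std["Name"], []).append(row)
        # FDA_Business_Rules won't appear as a standard; rules from
        # Organization FDA with an FB rule id go into that bucket too.
        if rule.get("Organization") == "FDA" and std["Rule ID"].startswith("FB"):
            rows.setdefault("FDA_Business_Rules", []).append(dict(row))
    return rows

def write_mappings(rows):
    """Merge rows into <Standard>_mappings.csv, kept sorted by Rule ID."""
    for standard, new_rows in rows.items():
        path = MAPPINGS_DIR / f"{standard}_mappings.csv"
        existing = []
        if path.exists():
            with path.open(newline="") as fh:
                existing = list(csv.DictReader(fh))
        merged = sorted(existing + new_rows, key=lambda r: r["Rule ID"])
        with path.open("w", newline="") as fh:
            writer = csv.DictWriter(fh, fieldnames=FIELDS)
            writer.writeheader()
            writer.writerows(merged)
```

Keeping the merge-and-sort step in one place means the CSVs stay deterministic regardless of the order rules are published in.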
- The scan for CORE IDs will need to recurse through the Published/ directory in root and account for the standards nested inside it. It could be easier to do this in mappings/ and use the CSVs, which are going to be a source of truth, so maybe use them for the CORE ID algorithm?
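The CSV-based variant could look like the sketch below. It assumes each ledger CSV has a "CORE ID" column and that ids follow the `CORE-` plus six digits format from the earlier comment:

```python
import csv
import re
from pathlib import Path

CORE_PATTERN = re.compile(r"^CORE-(\d{6})$")

def next_core_id(mappings_dir="mappings"):
    """Scan every *_mappings.csv ledger and return the next unused CORE id.

    Treats the CSVs as the source of truth instead of recursing
    through Published/ and its per-standard subdirectories.
    """
    numbers = [0]  # so an empty ledger yields CORE-000001
    for csv_path in Path(mappings_dir).glob("*_mappings.csv"):
        with csv_path.open(newline="") as fh:
            for row in csv.DictReader(fh):
                match = CORE_PATTERN.match(row.get("CORE ID", ""))
                if match:
                    numbers.append(int(match.group(1)))
    return f"CORE-{max(numbers) + 1:06d}"
```

Taking the max rather than the first gap keeps retired ids from ever being reused.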
- I think the most seamless way for this to work is to have authors work in an Unpublished/StandardX directory and have the publish script move the rule from the Unpublished directory to Published, grabbing the standard name off the directory path to know which standard folder in Published/ to put it into.
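The move step above can be sketched as follows, assuming the Unpublished/Standard/CORE-XXXX layout proposed earlier and paths relative to the repository root:

```python
import shutil
from pathlib import Path

def publish_rule(rule_dir):
    """Move a rule folder from Unpublished/<Standard>/ to Published/<Standard>/.

    The standard name is read off the directory path, e.g.
    Unpublished/SDTMIG/CORE-000123 -> Published/SDTMIG/CORE-000123.
    """
    rule_dir = Path(rule_dir)
    if rule_dir.parts[0] != "Unpublished":
        raise ValueError(f"{rule_dir} is not under Unpublished/")
    standard = rule_dir.parts[1]  # grab the standard off the path
    target = Path("Published") / standard / rule_dir.name
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(rule_dir), str(target))
    return target
```

Because the standard is taken from the path, authors never have to restate it, and the script cannot file a rule under the wrong standard.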
A workflow that supports ledger CSVs to publish and keep track of the rules, refreshing the CSVs and rule folder names accordingly.