-
Notifications
You must be signed in to change notification settings - Fork 12
best practices guide #50
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
a-frantz
wants to merge
2
commits into
openwdl:main
Choose a base branch
from
a-frantz:best-practices
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from all commits
Commits
Show all changes
2 commits
Select commit
Hold shift + click to select a range
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,45 @@ | ||
| # WDL Best Practices | ||
a-frantz marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
|
||
| This guide is includes best practices that should be followed while authoring WDL, regardless of WDL version or backend or engine. | ||
|
|
||
| ## Shell Safety | ||
|
|
||
| - All tasks with multiple commands (including any pipes (`|`)) should have `set -euo pipefail` before any other commands. | ||
| - Tasks without multiple commands or pipes can omit this. | ||
| - These options will cause common classes of bugs in Bash scripts to fail immediately and loudly, instead of causing silent or subtle bugs in your task behavior. | ||
|
|
||
| ## Resource Management | ||
|
|
||
| - If the _contents_ of a `File` are not read or do not need to be localized for a task, try to coerce the `File` variable to a `Boolean` (with `defined()`) or a `String` (with `basename()`) to avoid unnecessary disk space usage and networking. | ||
| - Most tasks should have a default `maxRetries` of 1. | ||
| - This is because many WDL backends are prone to intermittent failures that can be recovered from with a retry. | ||
| - Certain tasks are especially prone to intermittent failure (often if any networking is involved) and can have a higher default `maxRetries`. | ||
| - Certain tasks with potentially high compute costs in cloud environments may default to `0`. This should be used in combination with call caching to aid rerunning while minimizing costs. | ||
| - Tasks should have easily configurable memory and disk space allocations. | ||
| - Often, tasks have a dynamic calculation for resource requirements based on input sizes. Users of a WDL should have an easy way to fine tune this calculation. | ||
| - This may mean incorporating an `Int` or `Float` in the inputs of the task that is applied to the dynamic calculation. | ||
| - For WDL 1.3 and later, WDL authors can change resource requirements between retry attempts. This enables mitigation of errors relating to resources limits, but users may inadvertently disable these mitigations by introducing runtime overrides. | ||
| - WDL authors should expose resource fine tuning via the input section and incorporate those user values in any dynamic calculations as an alternative to overriding the requirements with static values. | ||
| - Check all assumptions made about workflow inputs before beginning long running or expensive executions. | ||
| - Common examples of assumptions that should be checked: | ||
| - mutually exclusive parameters | ||
| - missing optional file for selected parameters | ||
| - filename extensions | ||
| - valid `String` choice | ||
| - For WDL 1.3 and later, an `enum` should be used in place of `String`s with a fixed set of valid options | ||
| - Use `after` clauses in workflows to ensure that all these assumptions are valid before beginning tasks with heavy computation. | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The |
||
|
|
||
| ## Reproducibility | ||
|
|
||
| - All tasks should run in a persistently versioned container. | ||
| - e.g. do not use `latest` tags for Docker images. | ||
| - This helps ensure reproducibility across time and environments. | ||
|
|
||
| ## File Management | ||
|
|
||
| - Tasks which assume a file and any accessory files (e.g. a BAM and a BAI) have specific extensions and/or are in the same directory should *always* create symlinks from the mounted inputs to the work directory of the task | ||
| - This is because individual `File` types are not guaranteed to be in the same mounted directory. | ||
| - The `command` may include something like: `ln -s "~{<input name>}" "./<expected name>"` | ||
| - Tasks should `rm` any temporary or intermediate files created in the work directory (including symlinks). | ||
| - This helps reduce disk bloat from keeping unnecessary files around. | ||
| - This is especially important for any large or uncompressed files, such as reference FASTAs or databases. | ||
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure "Getting Started" is the right home for this — would you mind moving it to a new top-level "Guides" section between Design Patterns and Reference?