Planfile
Nimrod/G uses a simple declarative language to describe the experiments. This description is usually written as a simple script file we call a 'plan file'. There are two sections in this plan file; the first one describes the parameters that your experiment has; the second one describes the task/s that Nimrod/G needs to execute to complete a single instance (or a job) from your experiment. The example below shows a fictional plan file for an experiment investigating wing performance.
parameter aircraft_model files select anyof "A3??.dat" "737-*.dat"
parameter AoA float range from -45 to 45 step 2.5
parameter winglets text select anyof "none" "fence" "blended" "raked"
parameter airspeed integer range from 50 to 600 step 50
parameter turbulence float random from 1 to 2
task main
copy root:${aircraft_model} node:.
copy root:wing_test.zip node:.
exec unzip wing_test.zip
shexec "./run_wing_test.sh ${aircraft_model} ${winglets} ${AoA} ${airspeed} ${turbulence} >> output.${jobindex}"
shexec "zip results.${jobindex} *"
copy node:results.${jobindex}.zip root:.
endtask
This is a somewhat contrived example demonstrating various parameter types but it illustrates the basic functionality of defining parameters for the main task and handling input, execution and output for each of the parameter combinations. The two subsections below explain the 'parameter' and 'task' definitions in the plan file.
- Identifier - A variable identifier: [a-zA-Z_]([0-9a-zA-Z_])*
- Substitution - A block of the form ${identifier}. It is replaced with the value of identifier.
- String Literal - A C-style quoted string. May contain substitutions. C11 escape sequences are possible.
- Raw Literal - An unquoted block of the form (Substitution|Identifier|[./<>&?])+.
- Literal - Either a string literal or a raw literal.
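The substitution rule can be illustrated with a small Python sketch. It is not Nimrod/G's parser; it only applies the identifier pattern quoted in the list to a string. The function name `substitute` and the choice to raise `KeyError` for an unknown identifier are assumptions made for illustration.

```python
import re

# Matches ${identifier}, where identifier follows the pattern given above:
# a letter or underscore, then any number of letters, digits or underscores.
SUBSTITUTION = re.compile(r"\$\{([a-zA-Z_][0-9a-zA-Z_]*)\}")


def substitute(text, values):
    """Replace each ${identifier} in text with str(values[identifier]).

    Raising KeyError for an unknown identifier is an assumption of this
    sketch; the documentation does not say what Nimrod/G does in that case.
    """
    return SUBSTITUTION.sub(lambda match: str(values[match.group(1)]), text)


if __name__ == "__main__":
    job = {"aircraft_model": "A320.dat", "jobindex": 7}
    print(substitute("copy root:${aircraft_model} node:.", job))
    print(substitute("output.${jobindex}", job))
```

Text that does not match the `${identifier}` form, such as a bare `$jobindex`, is left untouched by this sketch.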
Parameters define lists of values, constant single values or dynamic values (of various types). A unique combination of parameter values is assigned to each job, and each parameter value is bound to a named identifier in the job environment. Nimrod/G experiments usually create a cross-product of all parameters to define the jobs which make up the experiment; we call this a full parameter sweep. Parameters can be passed to the executable(s)/script(s) which run your computation. This is usually done on the command line (like the example above), or your code can read them from the environment. Results from your computation(s) might consist of a single file with a single value or multiple large files. You could name your output files using your parameters or a $jobindex suffix. $jobindex is a special parameter automatically configured by Nimrod to give each job an integer ID that is unique within the experiment.
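To make the idea of a full parameter sweep concrete, here is a minimal Python sketch that builds the cross-product of parameter value lists and numbers the resulting jobs from 1. The function name `full_sweep` and the order in which combinations are enumerated are assumptions; Nimrod/G may order its jobs differently.

```python
from itertools import product


def full_sweep(parameters):
    """Yield one dict per job: the cross-product of all parameter value lists.

    parameters maps each parameter name to its list of values.
    jobindex starts at 1; the enumeration order is an assumption.
    """
    names = list(parameters)
    combos = product(*(parameters[name] for name in names))
    for index, combo in enumerate(combos, start=1):
        job = dict(zip(names, combo))
        job["jobindex"] = index
        yield job


if __name__ == "__main__":
    params = {
        "winglets": ["none", "fence", "blended", "raked"],
        "airspeed": list(range(50, 601, 50)),
    }
    jobs = list(full_sweep(params))
    print(len(jobs))  # 4 winglet choices x 12 airspeeds = 48
    print(jobs[0])
```

The number of jobs is the product of the sizes of all value lists, so adding a parameter with many values multiplies the size of the experiment.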
NOTE: The old Nimrod/G also provided an implicit parameter $jobname. This functionality is provided for compatibility purposes only and its use is discouraged.
The syntax for parameter lines in the plan file is:
parameter <name> [label <label>] [<type> <domain>]
- name - The name of the parameter, must be unique.
- label - Not used, provided for syntactic compatibility with old Nimrod.
- type - The type of the parameter, must be one of {float, integer, text, files}.
  - Types are only skin-deep: Nimrod does not enforce their validity and uses them only to expand the given domain into a list of values. If no type is specified, the parameter has no values.
- domain - See below.
A single value.
<value>
- value - The value. Must match the type of the parameter.
Types: float, integer, text
Generate numbers in the range [start, end].
range from <start> to <end> step <number>
range from <start> to <end> points <number>
- If the first form is used, a step value will be used to generate the values, starting at start.
- If the second form is used, number uniformly-spaced points will be generated.
Types: float, integer
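The two range forms can be sketched in Python as follows. This is an illustration of the description, not Nimrod/G's implementation: the function names, the inclusive treatment of both endpoints in the points form, and the small floating-point tolerance are all assumptions.

```python
def range_step(start, end, step):
    """Values start, start + step, ... that do not exceed end."""
    if step <= 0:
        raise ValueError("step must be positive in this sketch")
    values = []
    i = 0
    # Multiply rather than accumulate to limit floating-point drift; the
    # small tolerance keeps an endpoint that is hit "almost exactly".
    while start + i * step <= end + 1e-9:
        values.append(start + i * step)
        i += 1
    return values


def range_points(start, end, number):
    """number uniformly spaced values from start to end, both included."""
    if number == 1:
        return [start]
    spacing = (end - start) / (number - 1)
    return [start + i * spacing for i in range(number)]


if __name__ == "__main__":
    # "range from -45 to 45 step 2.5" from the example plan file
    print(len(range_step(-45, 45, 2.5)))
    # "range from 0 to 1 points 5" (hypothetical parameter)
    print(range_points(0, 1, 5))
```

Under these assumptions the AoA parameter in the example plan file expands to 37 values and the airspeed parameter to 12.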
Generate count random numbers in the range [start, end].
random from <start> to <end> [points <count>]
- start - The lower bound of the range.
- end - The upper bound of the range. If this is less than start, no values will be generated.
- count - The number of random points to generate. If omitted, defaults to one.
Types: float, integer
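A rough Python sketch of the random domain is shown below. The uniform distribution, the `integer` flag and the `seed` argument are assumptions introduced for illustration; the documentation only states the bounds, the default count and the empty result when end is less than start.

```python
import random


def random_domain(start, end, count=1, integer=False, seed=None):
    """Draw count values from [start, end]; return [] if end < start.

    A uniform distribution is assumed. seed is a hypothetical argument
    that makes the sketch reproducible.
    """
    if end < start:
        return []
    rng = random.Random(seed)
    if integer:
        return [rng.randint(start, end) for _ in range(count)]
    return [rng.uniform(start, end) for _ in range(count)]


if __name__ == "__main__":
    # "random from 1 to 2" from the example plan file: a single value
    print(random_domain(1, 2))
    # "random from 1 to 2 points 3"
    print(random_domain(1, 2, count=3))
```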
A list of values.
anyof <value> [<value> [...]]
- If the type is files, then each value may use glob(7) syntax and will be expanded in the current working directory.
- value - The value. Must be enclosed in quotes and escaped appropriately.
Types: float, integer, text, files
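For a files parameter, the expansion of glob patterns might look like the Python sketch below. It uses Python's `glob` module as a stand-in for glob(7); the function name `expand_files`, the sorted order within each pattern and the removal of duplicate matches are assumptions, not documented Nimrod/G behaviour.

```python
import glob
import os


def expand_files(patterns, directory="."):
    """Expand each glob pattern in directory and return matching file names.

    Matches are listed pattern by pattern, sorted within each pattern, and a
    name matched by several patterns is kept once (all assumptions).
    """
    matched = []
    for pattern in patterns:
        # glob.escape protects special characters in the directory itself.
        full_pattern = os.path.join(glob.escape(directory), pattern)
        for path in sorted(glob.glob(full_pattern)):
            name = os.path.basename(path)
            if name not in matched:
                matched.append(name)
    return matched


if __name__ == "__main__":
    # Patterns from the example plan file, expanded in the current directory.
    print(expand_files(["A3??.dat", "737-*.dat"]))
```

Here `?` matches exactly one character and `*` matches any run of characters, so "A3??.dat" would match A320.dat but not A3.dat.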
task <name>
<command>
<command>
...
...
<command>
endtask
- name - The name of the task. Must be one of the following:
| Name | Context | Required |
|---|---|---|
| main | Node | Yes |
| nodestart | Node | No |
- main - The main attraction! Run by the agents, once per job.
- nodestart - Run once per logical "resource" before any main tasks are executed on the resource. In cluster-based environments, this may only run once. In cloud-based environments, it may be run on each spawned VM.
Set the agent's on-error behaviour for subsequent commands.
onerror <fail|ignore>
- fail - Fail the job.
- ignore - Ignore the execution failure and continue. This is best used for unimportant commands.
Configure stdout/stderr redirection for subsequent commands.
redirect stdout|stderr off
redirect stdout|stderr [append] to <file>
- file - A literal that contains the path of the file to write to.
  - Paths are relative to the run's assigned working directory.
Copy a file from the specified source to the specified destination.
copy [<context>:]<source> [<context>:]<destination>
- context - The context of the source/destination. Defaults to node.
  - root denotes the root (master) node, i.e. where Nimrod/G is running.
    - Paths are relative to the run's assigned working directory.
  - node denotes the agent currently executing the job.
    - Paths are relative to the job's temporary working directory.
- source - A literal containing the path to the source file/directory. Use POSIX path separators ('/').
- destination - A literal containing the path to the destination file/directory. Use POSIX path separators ('/').
copy root:wing_test.zip .
copy ${aircraft_model}-results.txt root:./results
Execute a command on the assigned resource node.
exec <path> [<arg1> [<arg2> [...]]]
- path - A literal that contains the path of the file to execute. If not found, search the system's PATH environment variable.
- arg{n} - A literal that contains the n'th argument to path.
  - This is actually the n+1'th argument if you count argv[0].
  - The value of argv[0] is substituted as the resolved path of the executable.
exec python /path/to/script.py arg1
Execute a command line on the system's default shell. Use this at your own risk.
shexec <command>
- command - The command line to execute. This is passed directly to the system's default shell.
This uses the system's DEFAULT shell. I.e. this might be cmd.exe, powershell.exe, or COMMAND.COM on Windows-based nodes, or /bin/bash, /bin/ash, or even just /bin/sh on *nix nodes.
Execute a command on the assigned resource node, allowing argv[0] specification. Don't search the system's PATH.
This is equivalent to the execl function on Unix systems.
lexec <path> [<arg0> [<arg1> [...]]]
- path - A literal containing the path of the file to execute. The system's PATH environment variable is NOT searched.
- arg{n} - A literal that contains the n'th argument to path.
  - arg{n} = argv[n]
  - If arg0 is an empty string (given using ""), then it is substituted with the value of path.
  - It is not possible to specify an empty arg0 string.
lexec /usr/bin/python /usr/bin/python /path/to/script.py arg1
lexec /usr/bin/python "" /path/to/script.py arg1
Execute a command on the assigned resource node, allowing argv[0] specification. Search the system's PATH if not found.
This is equivalent to the execlp function on Unix systems.
lpexec <path> [<arg0> [<arg1> [...]]]
- path - A literal that contains the path of the file to execute. The system's PATH environment variable is searched.
- arg{n} - A literal that contains the n'th argument to path.
  - arg{n} = argv[n]
  - If arg0 is an empty string (given using ""), then it is substituted with the resolved path of the executable.
  - It is not possible to specify an empty arg0 string.
lpexec python python /path/to/script.py arg1
lpexec python "" /path/to/script.py arg1
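The argv[0] rules for lexec and lpexec can be summarised with a short Python sketch. It only builds the argument vector and does not execute anything. The function names are hypothetical, `shutil.which` stands in for the agent's PATH search, and treating a missing arg0 like an empty one is an assumption.

```python
import shutil


def lexec_argv(path, args):
    """argv for lexec: args[n] becomes argv[n]; an empty arg0 becomes path."""
    argv = list(args) if args else [""]  # missing arg0 treated as "" (assumption)
    if argv[0] == "":
        argv[0] = path
    return argv


def lpexec_argv(path, args):
    """argv for lpexec: like lexec, but an empty arg0 becomes the resolved path."""
    # Fall back to the unresolved path if the PATH search finds nothing.
    resolved = shutil.which(path) or path
    argv = list(args) if args else [""]
    if argv[0] == "":
        argv[0] = resolved
    return argv


if __name__ == "__main__":
    print(lexec_argv("/usr/bin/python", ["", "/path/to/script.py", "arg1"]))
    print(lpexec_argv("python", ["python", "/path/to/script.py", "arg1"]))
```

In both cases an explicit, non-empty arg0 is passed through unchanged, which is what distinguishes these commands from exec.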
- When resolving paths, if no extension is specified, the Windows agent will add ".exe" to the path.
- See the lpExtension parameter in SearchPath()
For each variable declaration, an environment variable will be created with the name of the variable. In addition, the Nimrod/G agent exposes some other predefined environment variables for use at your leisure.
| Variable | Description | Parameter | Example |
|---|---|---|---|
| NIMROD_EXPNAME | The name of the current experiment. | | exp1 |
| NIMROD_JOBUUID | The UUID of the current job attempt. | | 96c0e1fa-866e-40dd-b948-39cd833640cc |
| NIMROD_TXURI | The file transfer root URI. | | sftp://user@hpc.uni.com/home/user |
| NIMROD_JOBINDEX | The index of the job. Starts at 1. | $jobindex | 1 |
| NIMROD_VAR_x | The value of the variable (parameter) with the name x. | $x | value-x-0 |
| NIMROD_VAR_y | The value of the variable (parameter) with the name y. | $y | value-y-0 |
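A job script written in Python could collect these variables as in the sketch below. The function name `job_context` and the shape of the returned dictionary are assumptions; only the variable names come from the table. Values are returned as strings, exactly as the environment provides them.

```python
import os

PREFIX = "NIMROD_VAR_"


def job_context(environ=None):
    """Collect the Nimrod-provided variables from an environment mapping."""
    if environ is None:
        environ = os.environ
    # Strip the NIMROD_VAR_ prefix to recover the parameter names.
    parameters = {
        key[len(PREFIX):]: value
        for key, value in environ.items()
        if key.startswith(PREFIX)
    }
    return {
        "experiment": environ.get("NIMROD_EXPNAME"),
        "jobindex": environ.get("NIMROD_JOBINDEX"),
        "parameters": parameters,
    }


if __name__ == "__main__":
    print(job_context())
```

Passing a plain dictionary instead of the real environment makes the function easy to try outside a Nimrod/G job.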