Planfile

Zane van Iperen edited this page Jan 21, 2021 · 10 revisions

Nimrod Plan File

Nimrod/G uses a simple declarative language to describe experiments. The description is usually written as a script file called a 'plan file'. A plan file has two sections: the first declares the parameters of your experiment, and the second describes the tasks that Nimrod/G must execute to complete a single instance (a job) of the experiment. The example below shows a fictional plan file for an experiment investigating wing performance.

parameter aircraft_model files select anyof "A3??.dat" "737-*.dat"
parameter AoA float range from -45 to 45 step 2.5
parameter winglets text select anyof "none" "fence" "blended" "raked"
parameter airspeed integer range from 50 to 600 step 50
parameter turbulence float random from 1 to 2

task main
	copy root:${aircraft_model} node:.
	copy root:wing_test.zip node:.
	exec unzip wing_test.zip
	shexec "./run_wing_test.sh ${aircraft_model} ${winglets} ${AoA} ${airspeed} ${turbulence} >> output.${jobindex}"
	shexec "zip results.${jobindex} *"
	copy node:results.${jobindex}.zip root:.
endtask

This is a somewhat contrived example that demonstrates various parameter types, but it illustrates the basic functionality: defining parameters for the main task, and handling input, execution and output for each parameter combination. The 'Parameters' and 'Tasks' sections below explain the corresponding definitions in the plan file.

Required Terminology

  • Identifier - A variable identifier - [a-zA-Z_]([0-9a-zA-Z_])*
  • Substitution - A block of the form ${identifier}. It is replaced with the value of identifier.
  • String Literal - A C-style quoted string. May contain substitutions. C11 escape sequences are supported.
  • Raw Literal - An unquoted block of the form (Substitution|Identifier|[./<>&?])+.
  • Literal - Either a string literal or a raw literal.
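
The identifier and substitution rules above can be sketched in Python. This is only an illustration, not the Nimrod parser: the function name substitute is hypothetical, and raising an error for an unknown identifier is an assumption.

```python
import re

# Identifier pattern as given in the terminology list above.
IDENTIFIER = r"[a-zA-Z_][0-9a-zA-Z_]*"
SUBSTITUTION = re.compile(r"\$\{(" + IDENTIFIER + r")\}")


def substitute(text, values):
    """Replace each ${identifier} in text with its value.

    Raising KeyError for an unknown identifier is an assumption of this
    sketch, not documented Nimrod behaviour.
    """
    return SUBSTITUTION.sub(lambda m: str(values[m.group(1)]), text)


expanded = substitute("results.${jobindex}.zip", {"jobindex": 7})
print(expanded)  # results.7.zip
```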

Parameters

Parameters define lists of values, constant single values or dynamic values (of various types). Each job is assigned a unique combination of parameter values, and each value is bound to a named identifier in the job environment. Nimrod/G experiments usually take the cross-product of all parameters to define the jobs that make up the experiment; we call this a full parameter sweep. Parameters can be passed to the executables or scripts that run your computation, usually on the command line (as in the example above), or your code can read them from the environment. The results of a computation might be a single file holding a single value, or multiple large files. You can name output files using your parameters or a $jobindex suffix. $jobindex is a special parameter that Nimrod sets automatically to give each job an integer ID that is unique within the experiment.
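
A full parameter sweep can be pictured as a cross-product over the expanded parameter domains. The Python sketch below is an illustration only: the function name sweep and the sample domains are hypothetical, and the order in which combinations receive their jobindex is an assumption, not documented Nimrod behaviour.

```python
import itertools

# Hypothetical, already-expanded parameter domains.
parameters = {
    "winglets": ["none", "fence"],
    "airspeed": [50, 100, 150],
}


def sweep(parameters):
    """Build one job per combination of parameter values."""
    names = list(parameters)
    jobs = []
    combos = itertools.product(*parameters.values())
    # Numbering jobs from 1 in product order is an assumption of this sketch.
    for index, combo in enumerate(combos, start=1):
        job = dict(zip(names, combo))
        job["jobindex"] = index
        jobs.append(job)
    return jobs


jobs = sweep(parameters)
print(len(jobs))  # 2 * 3 = 6 jobs
```

Adding a third parameter with four values would multiply the job count to 24, which is why sweeps over many parameters grow quickly.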

NOTE: The old Nimrod/G also provided an implicit parameter $jobname. This functionality is provided for compatibility purposes only and its use is discouraged.

The syntax for parameter lines in the plan file is:

parameter <name> [label <label>] [<type> <domain>]
  • name - The name of the parameter, must be unique.
  • label - Not used, provided for syntactic compatibility with old Nimrod.
  • type - The type of the parameter, must be one of {float, integer, text, files}
    • Types are only skin-deep: Nimrod does not enforce their validity and uses them only to expand the given domain into a list of values. If no type is specified, the parameter has no values.
  • domain - See below.

Domains

Default

A single value.

Syntax

<value>

Parameters

  • value - The value. Must match the type of the parameter.

Valid Types

  • float
  • integer
  • text

Range

Generate numbers in the range [start, end].

range from <start> to <end> step <number>
range from <start> to <end> points <number>
  • If the first form is used, values are generated from start in increments of number.
  • If the second form is used, number uniformly spaced points are generated.

Valid Types

  • float
  • integer
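
The two range forms can be sketched in Python as follows. This is an illustration under stated assumptions, not Nimrod's implementation: the function names are hypothetical, and the sketch assumes that the points form includes both endpoints and that the step form stops at the last value not exceeding end.

```python
def range_step(start, end, step):
    """Values start, start + step, ... that do not exceed end."""
    if step <= 0:
        raise ValueError("step must be positive in this sketch")
    values = []
    i = 0
    # Multiply rather than accumulate to limit floating-point drift.
    while start + i * step <= end:
        values.append(start + i * step)
        i += 1
    return values


def range_points(start, end, points):
    """Evenly spaced values, assuming both endpoints are included."""
    if points <= 0:
        return []
    if points == 1:
        return [start]
    gap = (end - start) / (points - 1)
    return [start + i * gap for i in range(points)]


aoa = range_step(-45, 45, 2.5)      # the AoA parameter from the example
airspeed = range_step(50, 600, 50)  # the airspeed parameter from the example
print(len(aoa), len(airspeed))      # 37 12
```

With these counts, the AoA and airspeed parameters alone contribute 37 * 12 = 444 combinations to the example experiment.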

Random

Generate count random numbers in the range [start, end].

Syntax

random from <start> to <end> [points <count>]

Parameters

  • start - The lower bound of the range.
  • end - The upper bound of the range. If this is less than start, no values will be generated.
  • count - The number of random points to generate. If omitted, defaults to one.

Valid Types

  • float
  • integer
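
A minimal Python sketch of the random domain follows. The function name random_domain and its kind and rng arguments are hypothetical, and a uniform distribution is an assumption, since the distribution is not stated above.

```python
import random


def random_domain(start, end, count=1, kind="float", rng=None):
    """Generate count random values in [start, end].

    A uniform distribution is assumed here.
    """
    rng = rng or random.Random()
    if end < start:
        return []  # no values are generated when end is less than start
    if kind == "integer":
        return [rng.randint(start, end) for _ in range(count)]
    return [rng.uniform(start, end) for _ in range(count)]


# Mirrors "parameter turbulence float random from 1 to 2" (one point by default).
turbulence = random_domain(1, 2, rng=random.Random(0))
print(turbulence)
```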

Anyof

A list of values.

anyof <value> [<value> [...]]
  • If the type is files, then each value may use glob(7) syntax and will be expanded in the current working directory.

Parameters

  • value - The value. Must be enclosed in quotes and escaped appropriately.

Valid Types

  • float
  • integer
  • text
  • files
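
For the files type, glob expansion can be approximated with Python's glob module, whose patterns are similar to (but not identical to) glob(7). The sketch below is an illustration only: the helper expand_files and the sample file names are hypothetical, and sorting the matches of each pattern is an assumption.

```python
import glob
import os
import tempfile


def expand_files(patterns, directory):
    """Expand each pattern against the files in directory."""
    matches = []
    for pattern in patterns:
        # Escape the directory so only the pattern itself is treated as a glob.
        full = os.path.join(glob.escape(directory), pattern)
        # Sorting the matches is an assumption of this sketch.
        matches.extend(os.path.basename(p) for p in sorted(glob.glob(full)))
    return matches


with tempfile.TemporaryDirectory() as workdir:
    for name in ("A320.dat", "A380.dat", "737-800.dat", "B747.dat"):
        open(os.path.join(workdir, name), "w").close()
    models = expand_files(["A3??.dat", "737-*.dat"], workdir)

print(models)  # ['A320.dat', 'A380.dat', '737-800.dat']
```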

Tasks

task <name>
	<command>
	<command>
	...
	...
	<command>
endtask
  • name - The name of the task. Must be one of the following:
| Name | Context | Required |
|---|---|---|
| main | Node | Yes |
| nodestart | Node | No |
  • main - The main attraction! Run by the agents, once per job.
  • nodestart - Run once per logical "resource" before any main tasks are executed on the resource. In cluster-based environments, this may only run once. In cloud-based environments, it may be run on each spawned VM.

Task Commands

On Error

Set the agent's on-error behaviour for subsequent commands.

onerror <fail|ignore>
Parameters
  • fail - Fail the job.
  • ignore - Ignore the execution failure and continue. This is best used for unimportant commands.

Redirect

Configure stdout/stderr redirection for subsequent commands.

redirect stdout|stderr off
redirect stdout|stderr [append] to <file>
Parameters
  • file - A literal that contains the path of the file to write to.
    • Paths are relative to the run's assigned working directory.

Copy

Copy a file from the specified source to the specified destination.

copy [<context>:]<source> [<context>:]<destination>
Parameters
  • context - The context of the source/destination. Defaults to node.
    • root denotes the root (master) node, i.e. where Nimrod/G is running.
      • Paths are relative to the run's assigned working directory.
    • node denotes the agent currently executing the job.
      • Paths are relative to the job's temporary working directory.
  • source - A literal containing the path to the source file/directory. Use POSIX path separators ('/').
  • destination - A literal containing the path to the destination file/directory. Use POSIX path separators ('/').
Example
  • copy root:wing_test.zip .
  • copy ${aircraft_model}-results.txt root:./results

Execute

Execute a command on the assigned resource node.

exec <path> [<arg1> [<arg2> [...]]]
Parameters
  • path - A literal that contains the path of the file to execute. If not found, search the system's PATH environment variable.
  • arg{n} - A literal that contains the n'th argument to path.
    • This is actually the n+1'th argument if you count argv[0].
  • argv[0] is set to the resolved path of the executable.
Example
  • exec python /path/to/script.py arg1

Execute (Shell)

Execute a command line on the system's default shell. Use this at your own risk.

shexec <command>

Parameters
  • command - The command line to execute. This is passed directly to the system's default shell.
WARNING

This uses the system's DEFAULT shell, e.g. cmd.exe, powershell.exe, or COMMAND.COM on Windows-based nodes, or /bin/bash, /bin/ash, or even just /bin/sh on *nix nodes.

Execute (L)

Execute a command on the assigned resource node, allowing argv[0] specification. Don't search the system's PATH.

This is equivalent to the execl system call on Unix systems.

lexec <path> [<arg0> [<arg1> [...]]]
Parameters
  • path - A literal containing the path of the file to execute. The system's PATH environment variable is NOT searched.
  • arg{n} - A literal that contains the n'th argument to path.
    • arg{n} = argv[n]
    • If arg0 is an empty string (given using ""), it is replaced with the value of path.
    • As a result, it is not possible to pass an empty string as argv[0].
Example
  • lexec /usr/bin/python /usr/bin/python /path/to/script.py arg1
  • lexec /usr/bin/python "" /path/to/script.py arg1

Execute (LP)

Execute a command on the assigned resource node, allowing argv[0] specification. Search the system's PATH if not found.

This is equivalent to the execlp system call on Unix systems.

lpexec <path> [<arg0> [<arg1> [...]]]
Parameters
  • path - A literal that contains the path of the file to execute. The system's PATH environment variable is searched.
  • arg{n} - A literal that contains the n'th argument to path.
    • arg{n} = argv[n]
    • If arg0 is an empty string (given using ""), it is replaced with the resolved path of the executable.
    • As a result, it is not possible to pass an empty string as argv[0].
Example
  • lpexec python python /path/to/script.py arg1
  • lpexec python "" /path/to/script.py arg1

Remarks

  • When resolving paths, if no extension is specified, the Windows agent will add ".exe" to the path.
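
The differences between exec, lexec and lpexec come down to how the executable is resolved and what ends up in argv[0]. The Python sketch below models the rules described above. It is not the agent's code: build_argv and its resolve argument are hypothetical, shutil.which only approximates the agent's PATH search, and treating a missing arg0 like an empty one is an assumption.

```python
import shutil


def build_argv(command, path, args, resolve=shutil.which):
    """Return (executable, argv) for exec, lexec or lpexec."""
    if command == "lexec":
        executable = path                   # PATH is never searched
    else:
        executable = resolve(path) or path  # exec and lpexec may search PATH
    if command == "exec":
        # exec: argv[0] is always the resolved path of the executable.
        return executable, [executable, *args]
    # lexec/lpexec: the first argument is argv[0].
    # Treating a missing arg0 like an empty one is an assumption.
    arg0, rest = (args[0], list(args[1:])) if args else ("", [])
    if arg0 == "":
        arg0 = executable  # path for lexec, resolved path for lpexec
    return executable, [arg0, *rest]


def fake_resolve(name):
    """Stand-in resolver so the example does not depend on the local PATH."""
    return "/usr/bin/" + name


print(build_argv("lpexec", "python", ["", "/path/to/script.py", "arg1"], fake_resolve))
```

The injected resolver keeps the example deterministic; with the default shutil.which the result depends on what is installed on the machine running it.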

Execution Environment

For each variable declaration, an environment variable will be created with the name of the variable. In addition, the Nimrod/G agent exposes some other predefined environment variables for use at your leisure.

| Variable | Description | Parameter | Example |
|---|---|---|---|
| NIMROD_EXPNAME | The name of the current experiment. | | exp1 |
| NIMROD_JOBUUID | The UUID of the current job attempt. | | 96c0e1fa-866e-40dd-b948-39cd833640cc |
| NIMROD_TXURI | The file transfer root URI. | | sftp://user@hpc.uni.com/home/user |
| NIMROD_JOBINDEX | The index of the job. Starts at 1. | $jobindex | 1 |
| NIMROD_VAR_x | The value of the variable (parameter) with the name x. | $x | value-x-0 |
| NIMROD_VAR_y | The value of the variable (parameter) with the name y. | $y | value-y-0 |
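
A job script can read these variables from its environment instead of taking them on the command line. The Python sketch below is one way to do so: the helper read_job_environment and the sample dictionary are hypothetical, and only the variable names come from the table above.

```python
import os


def read_job_environment(environ=os.environ):
    """Collect the Nimrod-provided settings from an environment mapping."""
    prefix = "NIMROD_VAR_"
    parameters = {
        key[len(prefix):]: value
        for key, value in environ.items()
        if key.startswith(prefix)
    }
    index = environ.get("NIMROD_JOBINDEX")
    return {
        "experiment": environ.get("NIMROD_EXPNAME"),
        "jobindex": int(index) if index is not None else None,
        "parameters": parameters,
    }


# Hypothetical environment, using the example values from the table.
sample = {
    "NIMROD_EXPNAME": "exp1",
    "NIMROD_JOBINDEX": "1",
    "NIMROD_VAR_x": "value-x-0",
    "NIMROD_VAR_y": "value-y-0",
    "HOME": "/home/user",
}
info = read_job_environment(sample)
print(info["parameters"])  # {'x': 'value-x-0', 'y': 'value-y-0'}
```

Environment values are always strings, so numeric parameters need converting (for example with float) before use.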