Use case: I was building a job that would have benefited from this feature this week. I need a job that runs a task for each object under an S3 prefix. For now, I made a Python script that discovers the S3 objects and then creates a job based on the discovered objects. I wanted to use Deadline's job bundle submitter with this job but can't, because the Python script needs to run before the template exists. It would be much better to have a self-contained job with a discovery step which finds the S3 objects, then a processing step which is parameterized over those objects.
Introduction
An early topic of discussion about Open Job Description templates was immutability vs. dynamic adjustments. An important case is when you don't know the full job structure at job submission time, either because that information is embedded in the input data files and hard to extract, or because it requires some kind of computation to determine.
This is an idea to add a new VAR_DATA_FLOW extension to the template specification. It depends on the EXPR extension (#79) for the expression language needed to reference step output variables. It also works well together with the idea for conditionally enabling/disabling a step via an expression (#81). A step's enableIf field could reference an output variable from an upstream step, allowing the job structure to adapt based on computed results.

Example 1: Video processing with a computed frame count
Consider a video processing job that handles a video frame-by-frame. The job submission context may or may not have the video file locally, and even if it does, it may not have a program that can accurately and rapidly determine the frame count. The step that does per-frame processing needs a parameter space like "1-{FrameCount}", with a structure something like this:
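The intended shape, as a rough Open Job Description step skeleton (step names are illustrative, and the computed frame-count range is the proposed extension, not part of the current spec):

```yaml
steps:
- name: VideoToFrames      # discovery: extracts frames and computes FrameCount
- name: ProcessFrames      # one task per frame, parameter space "1-{FrameCount}"
  dependencies:
  - dependsOn: VideoToFrames
- name: EncodeFrames       # reassembles the processed frames into a video
  dependencies:
  - dependsOn: ProcessFrames
```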
In this example, the first step produces a scalar value (FrameCount) that determines the size of the second step's parameter space. The job structure — three sequential steps — is known at submission time, but the task count of the middle step is not.
Example 2: Distributed spatial processing with a computed DAG
Consider a distributed spatial processing algorithm on a 3D scene, such as hierarchical lightmap baking, global illumination, or spatial partitioning. The scene graph needs to be analyzed first to determine how to break it into subparts and what order to process them in. This analysis produces a DAG (directed acyclic graph) of processing tasks with dependency edges — for example, leaf nodes of a spatial hierarchy can be processed in parallel, but parent nodes must wait for their children.
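As a concrete (hypothetical) illustration, a tiny hierarchy with a root node 0 over three leaves could be described by an adjacency list such as:

```yaml
# Entry i lists the nodes that node i depends on.
# Leaves 1, 2, and 3 have no dependencies and can run in parallel;
# root node 0 must wait for all three of its children.
GraphAdjList: [[1, 2, 3], [], [], []]
```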
We can run this kind of job by extending the idea from the video processing example — but here the first step produces not just a scalar, but an entire graph structure that defines both the task count and the interior task-task dependency structure of the next step.
The idea for an RFC to represent task-task dependencies (#82) proposes a dependsOnSubspace mechanism and shows a DAG example using a LIST[LIST[INT]] job parameter as an adjacency list. That discussion's example assumes the adjacency list is known at submission time and provided as a job parameter default. This proposal extends that: the adjacency list is computed by a prior step and flows along the step dependency edge. In the notation from #82, the second step would look something like:
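A sketch of that second step, guessing at the dependsOnSubspace notation from #82 (the Step.AnalyzeScene.* references are the proposed extension; the exact syntax and names here are illustrative assumptions):

```yaml
- name: ProcessSceneNodes
  dependencies:
  - dependsOn: AnalyzeScene
  parameterSpace:
    taskParameterDefinitions:
    - name: NodeIndex
      type: INT
      # Proposed: the node count is an output variable of AnalyzeScene.
      range: "0-{{Step.AnalyzeScene.NodeCount}}"
    # Per the #82 idea, each task depends on the subspace of tasks named by
    # its adjacency-list entry -- here read from AnalyzeScene's output
    # instead of a Param.GraphAdjList job parameter.
    dependsOnSubspace:
      NodeIndex: "{{Step.AnalyzeScene.GraphAdjList[Task.Param.NodeIndex]}}"
```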
This differs from the DAG example in #82 in a key way: there, Param.GraphAdjList is a job parameter with a known default. Here, both the adjacency list and the node count come from the output of a prior step. The job template author knows the shape of the job (analyze, then process as a DAG) but not the size or structure of the DAG until the analysis step runs. This example also shows that the data flowing along a step dependency isn't limited to simple scalars — it can be structured data like LIST[LIST[INT]], which requires the type extensions proposed in #79.

Idea
A running task sets an output variable by printing a line to stdout of the form openjd_output_var: var_name=<JSON value convertible to the var_name's type>. This is similar to how environments can set environment variables. Here's an end-to-end example of the video processing job (Example 1) showing all of these pieces working together. It also uses the expression language syntax from #79.
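A sketch of what that end-to-end template could look like (the outputVariables block, the openjd_output_var message, and the Step.VideoToFrames.FrameCount reference are the proposed extensions; the ffmpeg commands and file layout are illustrative assumptions):

```yaml
specificationVersion: jobtemplate-2023-09
name: FrameByFrameVideo
parameterDefinitions:
- name: VideoFile
  type: PATH
steps:
- name: VideoToFrames
  # Proposed VAR_DATA_FLOW extension: declare this step's output variable
  # so downstream references can be type-checked before any task runs.
  outputVariables:
  - name: FrameCount
    type: INT
  script:
    actions:
      onRun:
        command: bash
        args: ["{{Task.File.Run}}"]
    embeddedFiles:
    - name: Run
      type: TEXT
      data: |
        #!/bin/bash
        # Extract the frames and count them (tooling is illustrative).
        mkdir -p frames
        ffmpeg -i "{{Param.VideoFile}}" frames/frame-%06d.png
        N=$(ls frames | wc -l)
        # Proposed runtime mechanism: set the output variable via stdout,
        # analogous to how openjd_env sets environment variables.
        echo "openjd_output_var: FrameCount=$N"
- name: ProcessFrames
  dependencies:
  - dependsOn: VideoToFrames
  parameterSpace:
    taskParameterDefinitions:
    - name: Frame
      type: INT
      # Proposed value reference: cannot be evaluated (and tasks cannot be
      # created) until VideoToFrames has finished.
      range: "1-{{Step.VideoToFrames.FrameCount}}"
  script:
    actions:
      onRun:
        command: bash
        args: ["-c", "process-frame frames/frame-$(printf %06d {{Task.Param.Frame}}).png"]
- name: EncodeFrames
  dependencies:
  - dependsOn: ProcessFrames
  script:
    actions:
      onRun:
        command: bash
        args: ["-c", "ffmpeg -i frames/frame-%06d.png output.mp4"]
```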
The new pieces in this template are:
- outputVariables on the VideoToFrames step — declares that this step will produce an INT variable called FrameCount. This is part of the template's static structure, so a scheduler or validator can check that downstream references like Step.VideoToFrames.FrameCount are well-typed before any task runs.
- openjd_output_var: FrameCount=... in the script — the runtime mechanism by which the running task actually sets the value, analogous to how openjd_env sets environment variables in an Environment's onEnter action.
- Step.VideoToFrames.FrameCount in the ProcessFrames parameter space range — a new value reference that reads the output variable from a completed upstream step. The scheduler knows it cannot evaluate this expression (and therefore cannot create tasks for ProcessFrames) until VideoToFrames has finished.