Overview

DAGs expose several ways to define dependencies between operations:

  • Using the dependencies field.
  • Using a parameter reference.
  • Using an event reference.

In addition to defining dependencies, users can add a trigger and conditions to perform extra checks on the state of those dependencies.

Dependencies

The dependencies field is the simplest way to specify dependencies between operations in a DAG. It is explicit: you list the other tasks that the current task depends on.

If an operation must wait for other operations and does not expect any parameters from those operations, you can define the dependency manually:

run:
  kind: dag
  operations:
    - name: job1
      hubRef: component1:latest
      params:
        ...
    - name: job2
      hubRef: component1:2.1
      params:
        ...
    - name: job3
      urlRef: https://some_url.com
      dependencies: [job1, job2]

job1 and job2 will run in parallel and job3 will wait for both jobs to finish.

Note that when a dependency is defined via the dependencies field, the operation can only be triggered once all upstream operations have reached a final state, following the trigger definition.
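
The ordering implied by the dependencies field can be pictured as a topological sort. The snippet below is a minimal sketch using Python's standard graphlib module, not the scheduler's actual implementation; the dependencies mapping and the execution_waves helper are hypothetical names introduced for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical mapping: operation name -> names listed in its dependencies field.
dependencies = {
    "job1": [],
    "job2": [],
    "job3": ["job1", "job2"],
}

def execution_waves(deps):
    """Group operations into waves; operations in the same wave can run in parallel."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        # Every operation whose upstream operations are all marked done.
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves

print(execution_waves(dependencies))  # [['job1', 'job2'], ['job3']]
```

job1 and job2 land in the first wave because neither lists any dependencies, and job3 only becomes ready once both are marked done.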

Param dependencies

If an operation expects a parameter from an upstream operation, there is no need to list that upstream operation in the dependencies field: the dependency is inferred from the params definition.

run:
  kind: dag
  operations:
    - name: job1
      hubRef: component1:latest
      params:
        ...
    - name: job2
      hubRef: component1:2.1
      params:
        ...
    - name: job3
      urlRef: https://some_url.com
      dependencies: [job1]
      params:
        image:
          ref: ops.job2
          value: outputs.results

This is similar to the previous dependencies definition in the sense that job1 and job2 will run in parallel and job3 will wait for both jobs to finish.

The dependency between job2 and job3 is inferred from the params definition.
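
As a rough illustration of how such inference could work, the sketch below merges the explicit dependencies with any ops.<name> references found in params. It assumes the operation has already been parsed into a plain dictionary; upstream_names is a hypothetical helper, not part of any library API.

```python
def upstream_names(operation):
    """Collect explicit dependencies plus any 'ops.<name>' refs found in params."""
    names = list(operation.get("dependencies", []))
    for param in operation.get("params", {}).values():
        # Only params written as {ref: ..., value: ...} can carry a reference.
        ref = param.get("ref", "") if isinstance(param, dict) else ""
        if ref.startswith("ops."):
            name = ref[len("ops."):]
            if name not in names:
                names.append(name)
    return names

# job3 from the example above, as a parsed dictionary.
job3 = {
    "name": "job3",
    "dependencies": ["job1"],
    "params": {"image": {"ref": "ops.job2", "value": "outputs.results"}},
}

print(upstream_names(job3))  # ['job1', 'job2']
```

The explicit entry (job1) and the inferred entry (job2) end up in the same upstream list, which is why job3 waits for both.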

Trigger

To control how job3 is triggered based on the statuses of job1 and job2, we can use the trigger field. It determines whether a task should run based on the statuses of its upstream tasks.

- name: job3
  urlRef: https://some_url.com
  dependencies: [job1, job2]
  trigger: all_succeeded

Possible values: all_succeeded, all_failed, all_done, one_succeeded, one_failed, one_done
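
The trigger values can be read as simple predicates over the upstream statuses. The sketch below assumes each value means what its name suggests and that "done" means succeeded or failed; it ignores skipped operations and is not the platform's actual evaluation logic. trigger_met and FINAL_STATUSES are hypothetical names.

```python
# Assumed set of final statuses for this sketch; skipped operations are ignored here.
FINAL_STATUSES = {"succeeded", "failed"}

def trigger_met(trigger, statuses):
    """Return True if the upstream statuses satisfy the trigger (assumed semantics)."""
    succeeded = [s == "succeeded" for s in statuses]
    failed = [s == "failed" for s in statuses]
    done = [s in FINAL_STATUSES for s in statuses]
    checks = {
        "all_succeeded": all(succeeded),
        "all_failed": all(failed),
        "all_done": all(done),
        "one_succeeded": any(succeeded),
        "one_failed": any(failed),
        "one_done": any(done),
    }
    return checks[trigger]

upstream = ["succeeded", "failed"]  # statuses of job1 and job2
print(trigger_met("all_succeeded", upstream))  # False
print(trigger_met("all_done", upstream))       # True
print(trigger_met("one_failed", upstream))     # True
```

Under these assumptions, all_done suits cleanup-style operations that should run whatever the outcome, while one_failed suits operations that react to errors.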

skipOnUpstreamSkip

If true, this task is automatically skipped when any immediately upstream task is skipped, regardless of other conditions or the trigger. By default, this prevents tasks from attempting to use an incomplete context, one that was never populated because the upstream tasks did not run. If false, the task's trigger is evaluated with any skipped upstream operations counted as successes.

- name: job3
  urlRef: https://some_url.com
  dependencies: [job1, job2]
  trigger: all_succeeded
  skipOnUpstreamSkip: true
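
The interaction between skipOnUpstreamSkip and the trigger can be sketched as follows. This is a simplified illustration that only handles the all_succeeded trigger; resolve and its return values ("skip", "run", "hold") are hypothetical and not taken from the platform.

```python
def resolve(statuses, skip_on_upstream_skip):
    """Decide what happens to an operation that uses the all_succeeded trigger."""
    if skip_on_upstream_skip and "skipped" in statuses:
        # Flag on: a skipped upstream skips this operation, whatever the trigger says.
        return "skip"
    # Flag off: skipped upstream operations are counted as successes.
    effective = ["succeeded" if s == "skipped" else s for s in statuses]
    if all(s == "succeeded" for s in effective):
        return "run"
    return "hold"

upstream = ["succeeded", "skipped"]  # statuses of job1 and job2
print(resolve(upstream, skip_on_upstream_skip=True))   # skip
print(resolve(upstream, skip_on_upstream_skip=False))  # run
```

The same upstream statuses lead to opposite outcomes depending on the flag, which is the behavior described above.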

Conditions

Conditions are an advanced tool for resolving dependencies between operations. Conditions take advantage of information resolved in the context to decide if an operation can be started, and they can be used to define branching strategies.

- name: job3
  urlRef: https://some_url.com
  dependencies: [job1]
  params:
    image:
      ref: ops.job2
      value: outputs.results
  conditions: '{{ image == "some-value" }}'
  skipOnUpstreamSkip: true

In the example above, job3 will only run if the image param, resolved from job2's outputs.results, is equal to "some-value".

Event dependencies

Defining dependencies between operations using the dependencies field can be limiting, because it does not allow the user to specify which state of the upstream task to depend on.

For example, a task may only be relevant to run once the upstream task starts running, succeeds, fails, and so on.

Event dependencies provide this level of granularity, with the option to define which specific event(s) should trigger the operation.

Also, since events define references, the dependency is inferred automatically and does not need to be set manually.

- name: job3
  hubRef: "component:version"
  events:
    - ref: ops.job2
      kinds: [run_status_running]

In this example job3 will be scheduled as soon as job2 starts running.

Note: For more details, please check the events section.