Overview
DAGs expose several ways to define dependencies between operations:
- Using the dependencies field.
- Using a parameter reference.
- Using an event reference.
In addition to defining dependencies, users can add a trigger and conditions
to perform extra checks on the state of those dependencies.
Dependencies
The dependencies field is the simplest way to specify dependencies between operations in a DAG. It is explicit: you list the other operations that the current operation depends on.
If an operation must wait for other operations and does not expect any parameters from them, you can define the dependency manually:
run:
  kind: dag
  operations:
    - name: job1
      hubRef: component1:latest
      params:
        ...
    - name: job2
      hubRef: component1:2.1
      params:
        ...
    - name: job3
      urlRef: https://some_url.com
      dependencies: [job1, job2]

job1 and job2 will run in parallel and job3 will wait for both jobs to finish.
Note that when a dependency is defined via the dependencies field, the operation can only be triggered once all upstream operations reach a final state, following the trigger definition.
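To make the scheduling order concrete, here is a minimal sketch that models the three operations above as a plain dependency graph and groups them into waves that can run in parallel. It uses Python's standard `graphlib` module (Python 3.9+) purely as an illustration; it is not how the scheduler itself is implemented, and the `graph` and `waves` names are assumptions made for this example.

```python
from graphlib import TopologicalSorter

# Each key is an operation; each value is the set of operations it depends on.
# This mirrors `dependencies: [job1, job2]` on job3 in the example above.
graph = {"job1": set(), "job2": set(), "job3": {"job1", "job2"}}

sorter = TopologicalSorter(graph)
sorter.prepare()

waves = []
while sorter.is_active():
    # Every operation returned here has all of its upstream operations done.
    ready = sorted(sorter.get_ready())
    waves.append(ready)
    sorter.done(*ready)

print(waves)  # [['job1', 'job2'], ['job3']]
```

The first wave contains job1 and job2, which have no upstream operations, and job3 only becomes ready once both are marked as done.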
Param dependencies
If an operation expects a parameter from an upstream operation,
there is no need to list that upstream operation in the dependencies field:
the dependency is inferred from the params definition.
run:
  kind: dag
  operations:
    - name: job1
      hubRef: component1:latest
      params:
        ...
    - name: job2
      hubRef: component1:2.1
      params:
        ...
    - name: job3
      urlRef: https://some_url.com
      dependencies: [job1]
      params:
        image:
          ref: ops.job2
          value: outputs.results

This is similar to the previous dependencies definition in the sense that
job1 and job2 will run in parallel and job3 will wait for both jobs to finish.
The dependency between job2 and job3 is inferred from the params definition.
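The inference described above can be sketched as a small function that merges the explicit dependencies list with any `ops.<name>` references found in the params. This is only an illustration of the idea: the `upstream_of` helper and the dictionary layout are assumptions made for this example, not the library's actual resolution code.

```python
def upstream_of(operation):
    """Return the set of upstream operation names for one operation spec."""
    # Start from the explicitly declared dependencies, if any.
    upstream = set(operation.get("dependencies", []))
    for param in operation.get("params", {}).values():
        ref = param.get("ref", "") if isinstance(param, dict) else ""
        # A reference such as "ops.job2" points at another operation in the DAG.
        if ref.startswith("ops."):
            upstream.add(ref[len("ops."):])
    return upstream


# job3 from the example above, written as a plain dictionary.
job3 = {
    "name": "job3",
    "dependencies": ["job1"],
    "params": {"image": {"ref": "ops.job2", "value": "outputs.results"}},
}

print(sorted(upstream_of(job3)))  # ['job1', 'job2']
```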
Trigger
To define how job3 is triggered based on the statuses of job1 and job2,
we can use the trigger field.
It determines whether a task should run based on the statuses of its upstream tasks.
- name: job3
  urlRef: https://some_url.com
  dependencies: [job1, job2]
  trigger: all_succeeded

Possible values: all_succeeded, all_failed, all_done, one_succeeded, one_failed, one_done
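One plausible reading of these trigger values is sketched below. The status names (`succeeded`, `failed`, `running`) and the interpretation of "done" as "reached a final state" are assumptions made for this example; the authoritative semantics are those of the scheduler.

```python
# Statuses treated as final in this sketch (an assumption for illustration).
FINAL = {"succeeded", "failed"}


def trigger_met(trigger, statuses):
    """Check a trigger value against the statuses of the upstream operations."""
    succeeded = [s == "succeeded" for s in statuses]
    failed = [s == "failed" for s in statuses]
    done = [s in FINAL for s in statuses]
    rules = {
        "all_succeeded": all(succeeded),
        "all_failed": all(failed),
        "all_done": all(done),
        "one_succeeded": any(succeeded),
        "one_failed": any(failed),
        "one_done": any(done),
    }
    return rules[trigger]


print(trigger_met("all_succeeded", ["succeeded", "succeeded"]))  # True
print(trigger_met("one_failed", ["running", "failed"]))          # True
```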
skipOnUpstreamSkip
If true and any immediately upstream task is skipped,
this task is automatically skipped as well, regardless of other conditions or trigger.
By default, this prevents tasks from attempting to use an incomplete context
that won’t be populated from the upstream tasks that didn’t run.
If false, the task’s trigger is evaluated with any skipped upstream operations counted as successes.
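The two behaviours can be sketched as follows. For brevity the sketch only models an all_succeeded trigger; the `decide` helper, its return values, and the status names are hypothetical and chosen only for this illustration.

```python
def decide(upstream_statuses, skip_on_upstream_skip):
    """Decide what happens to a task with an all_succeeded trigger."""
    if skip_on_upstream_skip and "skipped" in upstream_statuses:
        # Any skipped upstream task causes this task to be skipped too.
        return "skip"
    # Otherwise skipped upstream tasks count as successes for the trigger.
    effective = ["succeeded" if s == "skipped" else s for s in upstream_statuses]
    return "run" if all(s == "succeeded" for s in effective) else "do_not_run"


print(decide(["succeeded", "skipped"], skip_on_upstream_skip=True))   # skip
print(decide(["succeeded", "skipped"], skip_on_upstream_skip=False))  # run
```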
- name: job3
  urlRef: https://some_url.com
  dependencies: [job1, job2]
  trigger: all_succeeded
  skipOnUpstreamSkip: true

Conditions
Conditions are an advanced tool for resolving dependencies between operations. Conditions take advantage of information resolved in the context to decide if an operation can be started, and they can be used to define branching strategies.
- name: job3
  urlRef: https://some_url.com
  dependencies: [job1]
  params:
    image:
      ref: ops.job2
      value: outputs.results
  conditions: '{{ image == "some-value" }}'
  skipOnUpstreamSkip: true

In the example above, job3 will only run if the param passed is equal to “some-value”.
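As a rough illustration of how conditions enable branching, the sketch below models each condition as a Python predicate over the resolved params. In a real DAG the condition is a template expression such as the one shown above, not a Python callable, and `job4` is a hypothetical second branch added only for this example.

```python
def select_branches(operations, context):
    """Return the names of the operations whose condition holds for the context."""
    return [op["name"] for op in operations if op["condition"](context)]


operations = [
    # Mirrors the condition on job3: image == "some-value".
    {"name": "job3", "condition": lambda ctx: ctx["image"] == "some-value"},
    # Hypothetical alternative branch, taken for any other value.
    {"name": "job4", "condition": lambda ctx: ctx["image"] != "some-value"},
]

print(select_branches(operations, {"image": "some-value"}))   # ['job3']
print(select_branches(operations, {"image": "other-value"}))  # ['job4']
```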
Event dependencies
Defining dependencies with the dependencies field can be limiting,
because it does not let you specify which state of the upstream task to depend on.
For example, a task may only be relevant once the task it depends on starts running, succeeds, or fails.
Event dependencies provide that granularity: you define which specific event(s) should trigger the operation.
Also, since events define references, the dependency is inferred automatically and does not need to be set manually.
- name: job3
  hubRef: "component:version"
  events:
    - ref: ops.job2
      kinds: [run_status_running]

In this example job3 will be scheduled as soon as job2 starts running.
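The matching of an emitted event against event dependencies can be sketched as below. The `ready_on_event` helper and the `subscriptions` layout are assumptions made for this example and only mirror the ref and kinds fields shown above.

```python
def ready_on_event(subscriptions, event_ref, event_kind):
    """Return the operations whose event dependencies match the emitted event."""
    return [
        name
        for name, events in subscriptions.items()
        if any(e["ref"] == event_ref and event_kind in e["kinds"] for e in events)
    ]


# job3 from the example above: it listens for job2 entering the running state.
subscriptions = {
    "job3": [{"ref": "ops.job2", "kinds": ["run_status_running"]}],
}

print(ready_on_event(subscriptions, "ops.job2", "run_status_running"))  # ['job3']
```

An event from a different operation, or of a kind that job3 does not list, returns an empty list, so job3 is not scheduled.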
Note: For more details, please check the events section