V1Operation
polyaxon._flow.operations.operation.V1Operation()
An operation is how Polyaxon executes a component by passing parameters, connections, and a run environment.
With an operation users can:
- Pass the parameters for required inputs or override the default values of optional inputs.
- Patch the definition of the component to set environments, initializers, and resources.
- Set termination logic and retries.
- Set trigger logic to start a component in a pipeline context.
- Parallelize or map the component over a matrix of parameters.
- Put an operation on a schedule.
- Subscribe a component to events to trigger executions automatically.
After resolution and compilation, Polyaxon will prepare an executable that will be scheduled on Kubernetes.
Args:
- version: str
- kind: str, should be equal to operation
- patch_strategy: str, optional, defaults to post_merge
- is_preset: bool, optional
- is_approved: bool, optional
- name: str, optional
- description: str, optional
- tags: List[str], optional
- presets: List[str], optional
- queue: str, optional
- namespace: str, optional
- cache: V1Cache, optional
- termination: V1Termination, optional
- plugins: V1Plugins, optional
- params: Dict[str, V1Param], optional
- schedule: Union[V1CronSchedule, V1IntervalSchedule, V1DateTimeSchedule], optional
- events: List[V1EventTrigger], optional
- build: V1Build, optional
- hooks: List[V1Hook], optional
- matrix: Union[V1Mapping, V1GridSearch, V1RandomSearch, V1Hyperband, V1Bayes, V1Hyperopt, V1Iterative], optional
- joins: List[V1Join], optional
- dependencies: dependencies, optional
- trigger: trigger, optional
- conditions: conditions, optional
- skip_on_upstream_skip: skip_on_upstream_skip, optional
- run_patch: Dict, optional
- hub_ref: str, optional
- dag_ref: str, optional
- url_ref: str, optional
- path_ref: str, optional
- component: V1Component, optional
- template: V1Template, optional
YAML usage
operation:
  version: 1.1
  kind: operation
  patchStrategy:
  isPreset:
  isApproved:
  name:
  description:
  tags:
  presets:
  queue:
  namespace:
  cache:
  termination:
  plugins:
  events:
  actions:
  hooks:
  params:
  build:
  runPatch:
  hubRef:
  dagRef:
  pathRef:
  component:
  template:
Python usage
from polyaxon.schemas import (
    V1Build, V1Cache, V1Component, V1Hook, V1Param, V1Plugins, V1Operation, V1Termination
)
from polyaxon.schemas import V1PatchStrategy
operation = V1Operation(
    patch_strategy=V1PatchStrategy.REPLACE,
    name="test",
    description="test",
    tags=["test"],
    presets=["test"],
    queue="test",
    namespace="test",
    cache=V1Cache(...),
    termination=V1Termination(...),
    plugins=V1Plugins(...),
    events=["event-ref1", "event-ref2"],
    hooks=[V1Hook(...)],
    params={"param1": V1Param(...), ...},
    build=V1Build(...),
    component=V1Component(...),
)
Fields
version
The polyaxon specification version to use to validate the operation.
operation:
  version: 1.1
kind
The kind signals to the CLI, client, and other tools that this is an operation.
If you are using the Python client to create an operation, this field is not required and is set by default.
operation:
  kind: operation
patchStrategy
Defines how the compiler should handle keys that are defined on the component, or how to merge multiple presets when using the override behavior -f.
There are four strategies:
- replace: replaces all keys with new values if provided.
- isnull: only applies new values if the keys have empty/None values.
- post_merge: applies deep merge where newer values are applied last.
- pre_merge: applies deep merge where newer values are applied first.
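The four strategies can be illustrated on plain dictionaries. The following is a minimal sketch, not Polyaxon's compiler code: the helper names `deep_merge` and `apply_patch` are hypothetical, and the treatment of "empty" values and of conflicting leaves under pre_merge (existing values win) is an assumption drawn from the descriptions above.

```python
from copy import deepcopy
from typing import Any, Dict


def deep_merge(base: Dict[str, Any], new: Dict[str, Any], new_wins: bool) -> Dict[str, Any]:
    """Recursively merge two dicts; `new_wins` decides conflicting leaves."""
    result = deepcopy(base)
    for key, value in new.items():
        if isinstance(result.get(key), dict) and isinstance(value, dict):
            result[key] = deep_merge(result[key], value, new_wins)
        elif key not in result or new_wins:
            result[key] = deepcopy(value)
    return result


def apply_patch(current: Dict[str, Any], patch: Dict[str, Any], strategy: str) -> Dict[str, Any]:
    """Hypothetical helper: apply `patch` to `current` under one strategy."""
    if strategy == "replace":
        # Every key present in the patch replaces the current value wholesale.
        return {**deepcopy(current), **deepcopy(patch)}
    if strategy == "isnull":
        # A patched value is used only where the current value is empty/None.
        result = deepcopy(current)
        for key, value in patch.items():
            if result.get(key) in (None, "", [], {}):
                result[key] = deepcopy(value)
        return result
    if strategy == "post_merge":
        # Newer values are applied last, so they win on conflicts.
        return deep_merge(current, patch, new_wins=True)
    if strategy == "pre_merge":
        # Newer values are applied first, so existing values win on conflicts.
        return deep_merge(current, patch, new_wins=False)
    raise ValueError(f"Unknown patch strategy: {strategy}")


component = {"queue": "default", "environment": {"labels": {"team": "ml"}}, "tags": None}
patch = {"queue": "gpu", "environment": {"labels": {"tier": "batch"}}, "tags": ["test"]}

for strategy in ("replace", "isnull", "post_merge", "pre_merge"):
    print(strategy, apply_patch(component, patch, strategy))
```

Comparing the printed results shows the practical difference: replace drops the existing `team` label, while both merge strategies keep it and differ only in which side wins for `queue`.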
isPreset
This is a flag to tell if this operation must be validated or is only a preset that will be used with the override behavior to inject extra information to the main operation specification.
For instance a user might want to define a scheduling behavior that applies to several operations. One way to do that is to set the environment section on every operation. But sometimes the same scheduling behavior makes sense for several operations and components. In that case, the user can define an operation preset to extract that logic:
isPreset: true
runPatch:
  environment:
    nodeSelector:
      node_label: node_value
and use the override behavior to inject that section dynamically:
polyaxon run -f component.yaml -f scheduling-preset.yaml
Note: Please check this in-depth section about presets.
name
The name to use for this operation run. If provided, it overrides the component’s name; otherwise, the component’s name is used if it exists.
operation:
  name: test
description
The description to use for this operation run. If provided, it overrides the component’s description; otherwise, the component’s description is used if it exists.
operation:
  description: test
tags
The tags to use for this operation run. If provided, they override the component’s tags; otherwise, the component’s tags are used if they exist.
operation:
  tags: [test]
presets
The presets to use for this operation run. If provided, they override the component’s presets; otherwise, the component’s presets are used if they exist.
operation:
  presets: [test]
queue
The queue to use for this operation run. If provided, it overrides the component’s queue; otherwise, the component’s queue is used if it exists.
operation:
  queue: agent-name/queue-name
If the agent name is not specified, Polyaxon will resolve the name of the queue based on the default agent.
operation:
  queue: queue-name
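The two accepted forms of the queue value can be sketched as a small parser. This is an illustration only: `split_queue` is a hypothetical helper, not part of the Polyaxon client, and returning `None` for the agent stands in for "resolve against the default agent".

```python
from typing import Optional, Tuple


def split_queue(queue: str) -> Tuple[Optional[str], str]:
    """Split a queue reference into (agent name, queue name).

    The agent name is None when only a queue name is given, meaning the
    default agent should be used to resolve the queue.
    """
    parts = queue.split("/")
    if len(parts) > 2 or not all(parts):
        raise ValueError(f"Invalid queue reference: {queue!r}")
    if len(parts) == 1:
        return None, parts[0]
    return parts[0], parts[1]


print(split_queue("agent-name/queue-name"))
print(split_queue("queue-name"))
```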
namespace
Note: Please note that this field is only available in some commercial editions.
The namespace to use for this operation run. If provided, it overrides the component’s namespace; otherwise, the component’s namespace is used if it exists, and the agent’s namespace is used as the final default.
operation:
  namespace: polyaxon
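The fallback order described above (operation, then component, then agent) can be written as a short resolution function. This is a sketch of the rule as stated, with a hypothetical helper name and hypothetical namespace values; it is not taken from Polyaxon's code.

```python
from typing import Optional


def resolve_namespace(
    operation_namespace: Optional[str],
    component_namespace: Optional[str],
    agent_namespace: str,
) -> str:
    """Return the first namespace that is set, falling back to the agent's."""
    for candidate in (operation_namespace, component_namespace):
        if candidate:
            return candidate
    return agent_namespace


# Operation value wins, then the component value, then the agent default.
print(resolve_namespace("polyaxon", "team-a", "agent-ns"))
print(resolve_namespace(None, "team-a", "agent-ns"))
print(resolve_namespace(None, None, "agent-ns"))
```

The same "operation first, component second" rule applies to the name, description, tags, presets, queue, cache, termination, and plugins fields, without the extra agent-level default.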
cache
The cache to use for this operation run. If provided, it overrides the component’s cache; otherwise, the component’s cache is used if it exists.
operation:
  cache:
    disable: false
    ttl: 100
termination
The termination to use for this operation run. If provided, it overrides the component’s termination; otherwise, the component’s termination is used if it exists.
operation:
  termination:
    maxRetries: 2
plugins
The plugins to use for this operation run. If provided, they override the component’s plugins; otherwise, the component’s plugins are used if they exist.
operation:
  name: debug
  ...
  plugins:
    auth: false
    collectLogs: false
    ...
params
The params to pass to the component. They will be validated against the inputs/outputs.
If a parameter is passed and the component does not define a corresponding input/output, a validation error will be raised unless the param has the contextOnly flag enabled.
operation:
  params:
    param1: {value: 1.1}
    param2: {value: test}
    param3: {ref: ops.upstream-operation, value: outputs.metric}
    ...
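The validation rule above can be sketched with plain dictionaries. This is an illustration under assumptions, not Polyaxon's validator: `validate_params` is a hypothetical helper, the declared inputs/outputs are reduced to a set of names, and context-only params are simply left out of the matched result.

```python
from typing import Any, Dict, Set


def validate_params(
    params: Dict[str, Dict[str, Any]], declared_io: Set[str]
) -> Dict[str, Dict[str, Any]]:
    """Return the params that map to declared inputs/outputs.

    A param with no matching input/output raises an error unless it
    carries the contextOnly flag.
    """
    matched = {}
    for name, param in params.items():
        if name in declared_io:
            matched[name] = param
        elif param.get("contextOnly", False):
            # Allowed without a matching input/output; not bound to one.
            continue
        else:
            raise ValueError(f"Param {name!r} has no corresponding input/output")
    return matched


declared = {"param1", "param2"}
params = {
    "param1": {"value": 1.1},
    "param2": {"value": "test"},
    "extra": {"value": "debug", "contextOnly": True},
}
print(validate_params(params, declared))
```

Removing the `contextOnly` flag from `extra` makes the call raise a `ValueError`, which mirrors the validation error described above.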
build
Note: Please check V1Build for more details.
This section defines if this operation should build a container before starting the main logic. If the build section is provided, Polyaxon will set the main operation to a pending state until the build is done and then it will use the resulting docker image for starting the main container.
operation:
  ...
  build:
    hubRef: kaniko
    ...
runPatch
The run patch provides a way to override information about the component’s run section, for example the container’s resources or the environment section.
The run patch is a dictionary that can modify most of the runtime information and will be resolved against the corresponding run kind:
- V1Job: for running batch jobs, model training experiments, data processing jobs, …
- V1Service: for running tensorboards, notebooks, streamlit, custom services or an API.
- V1TFJob: for running distributed TensorFlow training jobs.
- V1PytorchJob: for running distributed PyTorch training jobs.
- V1PaddleJob: for running distributed PaddlePaddle training jobs.
- V1MXJob: for running distributed MXNet training jobs.
- V1XGBoostJob: for running distributed XGBoost training jobs.
- V1MPIJob: for running distributed MPI jobs.
- V1RayJob: for running a Ray job.
- V1DaskJob: for running a Dask job.
- V1Dag: for running a DAG/workflow.
For example, if we define a generic component for running Jupyter Notebook:
version: 1.1
kind: component
name: notebook
run:
  kind: service
  ports: [8888]
  container:
    image: "jupyter/tensorflow-notebook"
    command: ["jupyter", "lab"]
    args: [
      "--no-browser",
      "--ip=0.0.0.0",
      "--port={{globals.ports[0]}}",
      "--allow-root",
      "--NotebookApp.allow_origin=*",
      "--NotebookApp.trust_xheaders=True",
      "--NotebookApp.token=",
      "--NotebookApp.base_url={{globals.base_url}}",
      "--LabApp.base_url={{globals.base_url}}"
    ]
This component is generic and does not define resource requirements. If, for instance, the component is hosted on GitHub and you want to request a GPU for the notebook without modifying the component itself, you can patch the run:
version: 1.1
kind: operation
urlRef: https://raw.githubusercontent.com/org/repo/master/components/notebook.yaml
runPatch:
  container:
    resources:
      limits:
        nvidia.com/gpu: 1
By applying a run patch you can effectively share components while having full control over customizable details.
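The effect of the GPU patch on the notebook component can be pictured as a deep merge of the runPatch into the component's run section. The sketch below uses plain dictionaries and a hypothetical `patch_run` helper; it assumes the patched values win on conflicts and does not reproduce how Polyaxon resolves the patch against the run kind.

```python
from copy import deepcopy
from typing import Any, Dict


def patch_run(run: Dict[str, Any], run_patch: Dict[str, Any]) -> Dict[str, Any]:
    """Deep-merge `run_patch` into a copy of `run`; patched values win."""
    patched = deepcopy(run)
    for key, value in run_patch.items():
        if isinstance(patched.get(key), dict) and isinstance(value, dict):
            patched[key] = patch_run(patched[key], value)
        else:
            patched[key] = deepcopy(value)
    return patched


# Trimmed version of the notebook component's run section.
notebook_run = {
    "kind": "service",
    "ports": [8888],
    "container": {
        "image": "jupyter/tensorflow-notebook",
        "command": ["jupyter", "lab"],
    },
}
run_patch = {"container": {"resources": {"limits": {"nvidia.com/gpu": 1}}}}

patched = patch_run(notebook_run, run_patch)
print(patched["container"])
```

The shared component stays untouched, and only the operation carries the GPU request.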
hubRef
Polyaxon provides a Component Hub for hosting versioned components with an access control system to improve the productivity of your team.
To run a component hosted on Polyaxon Component Hub, you can use hubRef:
version: 1.1
kind: operation
hubRef: myComponent:v1.1
...
dagRef
If you are building a DAG and you have a component that can be used by several operations, you can define a component and reuse it in all operations using dagRef.
Please check Polyaxon automation’s flow engine section for more details.
urlRef
You can host your components on an accessible URL, e.g. GitHub, and reference those components without downloading the data manually.
version: 1.1
kind: operation
urlRef: https://raw.githubusercontent.com/org/repo/master/components/my-component.yaml
...
Please note that you can only use this reference when using the CLI tool.
pathRef
In many situations, components can be placed in different folders within a project, e.g. data-processing, data-exploration, ml-modeling, …
You can define operations without the need to change the directory by referencing a path to that component:
version: 1.1
kind: operation
pathRef: ../data-processing/component-clean.yaml
...
Please note that you can only use this reference when using the CLI tool.
component
If you are still in the development phase or if you are building a singleton operation that can be executed in a unique way, you can define the component inline inside the operation:
version: 1.1
kind: operation
component:
  run:
    kind: job
    container:
      image: foo:latest
      command: train --lr=0.01
...
isApproved
This is a flag to trigger human validation before queuing and scheduling an operation.
The default behavior is True even when the field is not set, i.e. no validation is required.
To require human validation prior to scheduling an operation, you can set this field to False.
isApproved: false
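The default described above can be expressed as a one-line check on a parsed specification. This is a sketch on a plain dictionary with a hypothetical helper name; it only encodes the rule that an unset field behaves like True.

```python
from typing import Any, Dict


def requires_human_validation(operation: Dict[str, Any]) -> bool:
    """Only an explicit `isApproved: false` asks for human validation."""
    return operation.get("isApproved") is False


print(requires_human_validation({"kind": "operation"}))
print(requires_human_validation({"kind": "operation", "isApproved": False}))
```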
cost
A field to define the cost of running the operation. The value is a float and should either follow your team’s convention for cost estimation or map directly to the cost of using the environment where the operation is running.
cost: 2.2