V1Job

polyaxon.polyflow.run.job.V1Job(kind='job', environment=None, connections=None, volumes=None, init=None, sidecars=None, container=None)

Jobs are used to train machine learning models, process datasets, and execute generic tasks. They can serve a variety of purposes, from compiling a model to running an ETL operation.

YAML usage

run:
  kind: job
  environment:
  connections:
  volumes:
  init:
  sidecars:
  container:

Python usage

from polyaxon.polyflow import V1Environment, V1Init, V1Job
from polyaxon.k8s import k8s_schemas

job = V1Job(
    environment=V1Environment(...),
    connections=["connection-name1"],
    volumes=[k8s_schemas.V1Volume(...)],
    init=[V1Init(...)],
    sidecars=[k8s_schemas.V1Container(...)],
    container=k8s_schemas.V1Container(...),
)

Fields

kind

The kind signals to the CLI, client, and other tools that this component’s runtime is a job.

If you are using the Python client to create the runtime, this field is not required and is set by default.

run:
  kind: job

environment

Optional environment section that provides a way to inject pod-related information, such as labels, annotations, and node selectors.

run:
  kind: job
  environment:
    labels:
      key1: "label1"
      key2: "label2"
    annotations:
      key1: "value1"
      key2: "value2"
    nodeSelector:
      node_label: node_value
    ...
  ...

connections

A list of connection names to resolve for the job.

If you reference a connection, it must be configured. All referenced connections are checked:
  • They must be accessible in the context of this run's project.
  • The user running the operation must have access to them.

After the checks, the connections are resolved, and any volumes, secrets, configMaps, and environment variables they require are injected so that your main container functions correctly.

run:
  kind: job
  connections: [connection1, connection2]
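
The two checks described above can be illustrated with a small, self-contained sketch. This is not Polyaxon's implementation: the `Connection` class, the `check_connections` function, and the catalog layout are all hypothetical names introduced only to show the order of the checks (configured, accessible in the project, accessible to the user).

```python
from dataclasses import dataclass, field


@dataclass
class Connection:
    """Hypothetical stand-in for a configured connection (not a Polyaxon class)."""
    name: str
    projects: set = field(default_factory=set)  # projects allowed to use it
    users: set = field(default_factory=set)     # users allowed to use it


def check_connections(requested, catalog, project, user):
    """Return the requested connections, or raise if any check fails."""
    resolved = []
    for name in requested:
        conn = catalog.get(name)
        if conn is None:
            # A referenced connection must be configured.
            raise ValueError(f"connection '{name}' is not configured")
        if project not in conn.projects:
            # Check 1: accessible in the context of the run's project.
            raise PermissionError(
                f"connection '{name}' is not accessible in project '{project}'"
            )
        if user not in conn.users:
            # Check 2: the user running the operation has access.
            raise PermissionError(f"user '{user}' cannot access connection '{name}'")
        resolved.append(conn)
    return resolved


# Assumed example data: two connections usable by project "demo".
catalog = {
    "connection1": Connection("connection1", {"demo"}, {"alice"}),
    "connection2": Connection("connection2", {"demo"}, {"alice", "bob"}),
}
resolved = check_connections(["connection1", "connection2"], catalog, "demo", "alice")
resolved_names = [c.name for c in resolved]
```

In this sketch a failed check raises immediately, so nothing is resolved or injected for a run that references a connection it is not allowed to use.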

volumes

A list of Kubernetes Volumes to resolve and mount for your jobs.

This is an advanced use case for situations where configuring a connection is not an option.

When you add a volume, you must mount it manually in your container(s).

run:
  kind: job
  volumes:
    - name: volume1
      persistentVolumeClaim:
        claimName: pvc1
  ...
  container:
    name: myapp-container
    image: busybox:1.28
    command: ['sh', '-c', 'echo custom init container']
    volumeMounts:
    - name: volume1
      mountPath: /mnt/vol/path
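
Because volumes are not mounted automatically, it can help to compare the declared volumes with the `volumeMounts` of each container before submitting a run. The following is a minimal sketch that works on the `run` section loaded as a plain dictionary; the helper name `volume_mount_report` and the second volume (`volume2`) are assumptions made for illustration, and the helper is not part of Polyaxon.

```python
def volume_mount_report(run):
    """Compare declared volumes with the volumeMounts of the main container and sidecars."""
    declared = {v["name"] for v in run.get("volumes") or []}
    containers = [run.get("container") or {}] + list(run.get("sidecars") or [])
    mounted = {
        m["name"]
        for c in containers
        for m in c.get("volumeMounts") or []
    }
    return {
        # Volumes that no container mounts.
        "declared_but_not_mounted": sorted(declared - mounted),
        # Mounts that do not match any volume declared in this section.
        "mounted_but_not_declared": sorted(mounted - declared),
    }


# The run section from the example above, plus a hypothetical unused volume.
run = {
    "kind": "job",
    "volumes": [
        {"name": "volume1", "persistentVolumeClaim": {"claimName": "pvc1"}},
        {"name": "volume2", "emptyDir": {}},
    ],
    "container": {
        "name": "myapp-container",
        "image": "busybox:1.28",
        "volumeMounts": [{"name": "volume1", "mountPath": "/mnt/vol/path"}],
    },
}
report = volume_mount_report(run)
```

Here `report["declared_but_not_mounted"]` lists `volume2`, which is declared but never mounted. A mount reported as undeclared is not necessarily an error, since this sketch only inspects the `volumes` list and knows nothing about volumes injected by connections.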

init

A list of init handlers and containers to resolve for the job.

If you reference a connection, it must be configured. All referenced connections are checked:
  • They must be accessible in the context of this run's project.
  • The user running the operation must have access to them.

run:
  kind: job
  init:
    - artifacts:
        dirs: ["path/on/the/default/artifacts/store"]
    - connection: gcs-large-datasets
      artifacts:
        dirs: ["data"]
      container:
        resources:
          requests:
            memory: "256Mi"
            cpu: "500m"
    - container:
        name: myapp-container
        image: busybox:1.28
        command: ['sh', '-c', 'echo custom init container']

sidecars

A list of containers that will run as sidecars alongside the main container.

run:
  kind: job
  sidecars:
    - name: sidecar2
      image: busybox:1.28
      command: ['sh', '-c', 'echo sidecar2']
    - name: sidecar1
      image: busybox:1.28
      command: ['sh', '-c', 'echo sidecar1']
      resources:
        requests:
          memory: "128Mi"
          cpu: "500m"

container

The main Kubernetes Container that will run your experiment training or data processing logic.

run:
  kind: job
  init:
    - connection: my-tf-code-repo
  container:
    image: tensorflow:2.1
    command: ["python", "/plx-context/artifacts/my-tf-code-repo/model.py"]