Polyaxon provides a way to mount multiple volumes, so that users can choose which volume(s) to mount for a specific job or experiment.

Deployment

Let’s consider the following persistence example:

persistence:
  data:
    data1:
      mountPath: "/data/1"
      hostPath: "/path/to/data"
      readOnly: true
    data2:
      mountPath: "/data/2"
      existingClaim: "data-2-pvc"
    data-foo:
      mountPath: "/data/foo"
      existingClaim: "data-foo-pvc"
  outputs:
    outputs1:
      mountPath: "/outputs/1"
      hostPath: "/path/to/outputs"
      readOnly: true
    outputs2:
      mountPath: "/outputs/2"
      existingClaim: "outputs-2-pvc"
    outputs-foo:
      mountPath: "/outputs/foo"
      existingClaim: "outputs-foo-pvc"

In this example, Polyaxon will have three data volumes (two PVCs and one path on the host node) and three outputs volumes (likewise two PVCs and one host path).

[Figure: kubernetes-persistence-scheduling-1.png]

Scheduling

When the user defines multiple data and/or outputs volumes, Polyaxon applies a default behavior for mounting these volumes when it schedules jobs and experiments, unless the user overrides this default in the polyaxonfile.

Data Scheduling

If the polyaxonfile for running an experiment or a job does not define the data volume or volumes that it needs access to, Polyaxon will by default mount all the defined data volumes when it schedules the experiment or the job.

[Figure: data-scheduling]

These data volumes will be accessible to you as a dictionary {volume_name: path_to_data}, exported as an env variable POLYAXON_DATA_PATHS.

You can also use our helper library polyaxon-helper to access this env variable automatically.
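Reading the variable by hand might look like the sketch below. It assumes the dictionary is exported as a JSON-encoded string, which the text above does not state explicitly; `load_data_paths` is a hypothetical helper name and not part of polyaxon-helper, and the example values simply mirror the deployment example above.

```python
import json
import os


def load_data_paths(env=None):
    """Return the {volume_name: path_to_data} mapping from POLYAXON_DATA_PATHS.

    Assumption: the variable holds a JSON-encoded object.
    """
    env = os.environ if env is None else env
    raw = env.get("POLYAXON_DATA_PATHS")
    if not raw:
        # Variable not set (e.g. when running outside the cluster).
        return {}
    return json.loads(raw)


# Stand-in environment for illustration; inside a scheduled run you would
# call load_data_paths() with no argument to read the real environment.
example_env = {
    "POLYAXON_DATA_PATHS": json.dumps({"data1": "/data/1", "data-foo": "/data/foo"})
}
data_paths = load_data_paths(example_env)
print(data_paths["data1"])
```

Returning an empty dictionary when the variable is missing keeps the same script usable for local runs, where no volumes are mounted.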

If, on the other hand, you wish to mount only one volume or a subset of the volumes, you need to provide this information in the polyaxonfile, e.g.

environment:
  persistence:
    data: ['data1', 'data-foo']

[Figure: data-scheduling-2]

By providing this persistence subsection, Polyaxon will mount only these volumes, looking up their names among the defined volumes.
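The lookup described here can be pictured with a small sketch. This is an illustration only, not Polyaxon's scheduler code: `select_data_volumes` is a hypothetical function, and raising an error for an unknown name is an assumption about how a missing volume would be handled.

```python
# Data volumes from the deployment example, reduced to name -> mountPath.
DEFINED_DATA_VOLUMES = {
    "data1": "/data/1",
    "data2": "/data/2",
    "data-foo": "/data/foo",
}


def select_data_volumes(defined, requested=None):
    """Pick the volumes to mount for one job or experiment.

    requested=None models a polyaxonfile without a persistence subsection,
    in which case every defined data volume is mounted.
    """
    if requested is None:
        return dict(defined)
    missing = [name for name in requested if name not in defined]
    if missing:
        # Assumption: an unknown volume name is treated as an error.
        raise KeyError(f"Undefined data volumes: {missing}")
    return {name: defined[name] for name in requested}


default_mounts = select_data_volumes(DEFINED_DATA_VOLUMES)
subset_mounts = select_data_volumes(DEFINED_DATA_VOLUMES, ["data1", "data-foo"])
print(sorted(subset_mounts))
```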

Outputs Scheduling

Polyaxon mounts only one outputs volume for a particular experiment or job.

If the polyaxonfile for running an experiment or a job does not define the outputs volume, Polyaxon will, by default, mount one volume: either the first one or a random one from the list of defined outputs volumes.

[Figure: outputs-scheduling-1]

The outputs volume will be accessible to you as a string path_to_outputs_for_experiment, exported as an env variable POLYAXON_RUN_OUTPUTS_PATH.

You can also use our helper library polyaxon-helper to access this env variable automatically.
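As a rough sketch of using the outputs path without the helper library, the snippet below reads POLYAXON_RUN_OUTPUTS_PATH and builds a file path under it. The function name `outputs_file` and the example path value are hypothetical and chosen for illustration; the actual directory layout under the mounted volume is decided by Polyaxon.

```python
import os


def outputs_file(filename, env=None):
    """Build a path for `filename` inside the run's outputs directory."""
    env = os.environ if env is None else env
    base = env.get("POLYAXON_RUN_OUTPUTS_PATH")
    if not base:
        raise RuntimeError("POLYAXON_RUN_OUTPUTS_PATH is not set")
    return os.path.join(base, filename)


# Stand-in environment; the value below is a made-up example path.
example_env = {"POLYAXON_RUN_OUTPUTS_PATH": "/outputs/foo/my-experiment"}
checkpoint_path = outputs_file("model.ckpt", example_env)
print(checkpoint_path)
```

Failing loudly when the variable is unset avoids silently writing results to an unintended location.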

If, on the other hand, you wish to mount a particular volume, you need to provide this information in the polyaxonfile, e.g.

environment:
  persistence:
    outputs: 'outputs-foo'

[Figure: outputs-scheduling-2]

By providing this persistence subsection, Polyaxon will mount the volume by looking up the name from the defined volumes.