V1Init
polyaxon._flow.init.V1Init()
The Polyaxon init section lets users run init containers before the main container, which holds the logic for training models or processing data.
The init section is an extension of Kubernetes init containers.
It provides special handlers for several connection types, and users can also supply their own custom init containers, for example to run utilities or setup scripts that are not present in the main container.
By default, all built-in handlers mount and initialize data under the path /plx-context/artifacts/{{connection-name}} unless the user passes a custom path.
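The default-path rule above can be sketched in a few lines of Python. This is only an illustration of the stated behavior: `resolve_init_path` and `DEFAULT_ARTIFACTS_ROOT` are hypothetical names that are not part of the Polyaxon API, and the real handlers may apply further logic.

```python
from typing import Optional

# Assumption: this constant mirrors the default root quoted in the text above.
DEFAULT_ARTIFACTS_ROOT = "/plx-context/artifacts"


def resolve_init_path(connection_name: str, custom_path: Optional[str] = None) -> str:
    """Return the directory a built-in handler would initialize data under.

    Hypothetical helper for illustration only; not part of the Polyaxon API.
    """
    if custom_path:
        # A user-supplied path takes precedence over the default location.
        return custom_path
    return f"{DEFAULT_ARTIFACTS_ROOT}/{connection_name}"


print(resolve_init_path("gcs-large-datasets"))  # /plx-context/artifacts/gcs-large-datasets
print(resolve_init_path("s3-datasets", custom_path="/s3-path"))  # /s3-path
```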
- Args:
- paths: Union[List[str], List[[str, str]]], optional, list of subpaths or a list of [path from, path to].
- artifacts: V1ArtifactsType, optional
- git: V1GitType, optional
- dockerfile: V1DockerfileType, optional
- file: V1FileType, optional
- tensorboard: V1TensorboardType, optional
- lineage_ref: str, optional
- model_ref: str, optional
- artifact_ref: str, optional
- connection: str, optional
- path: str, optional
- container: Kubernetes Container, optional
YAML usage
You can use only one built-in handler per init entry; otherwise an exception will be raised. It’s possible to customize the container used by the default built-in handlers.
version: 1.1
kind: component
run:
  kind: job
  init:
  - artifacts:
      dirs: ["path/on/the/default/artifacts/store"]
  - lineageRef: "281081ab11794df0867e80d6ff20f960:artifactLineageRef"
  - artifactRef: "artifactVersion"
  - artifactRef: "otherProjectName:version"
  - modelRef: "modelVersion"
  - modelRef: "otherProjectName:version"
  - connection: gcs-large-datasets
    artifacts:
      dirs: ["data"]
    container:
      resources:
        requests:
          memory: "256Mi"
          cpu: "500m"
  - connection: s3-datasets
    path: "/s3-path"
    artifacts:
      files: ["data1", "path/to/data2"]
  - connection: repo1
  - git:
      revision: branch2
    connection: repo2
  - dockerfile:
      image: test
      run: ["pip install package1"]
      env: {'KEY1': 'en_US.UTF-8', 'KEY2': 2}
  - file:
      name: script.sh
      chmod: "+x"
      content: |
        echo test
  - container:
      name: myapp-container
      image: busybox:1.28
      command: ['sh', '-c', 'echo custom init container']
  container:
    ...
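The one-handler rule can be illustrated with a small standalone check. This is a sketch under stated assumptions, not Polyaxon's actual validation: `HANDLER_FIELDS`, `used_handlers`, and `check_single_handler` are hypothetical names, and the set of fields counted as handlers is inferred from the Args list (using the Python field names rather than the camelCase YAML keys).

```python
from typing import Any, Dict, List

# Assumption: the fields treated as built-in handlers, taken from the Args list.
# `connection`, `path`, and `container` are left out because the examples
# combine them with a handler in the same init entry.
HANDLER_FIELDS = (
    "paths",
    "artifacts",
    "git",
    "dockerfile",
    "file",
    "tensorboard",
    "lineage_ref",
    "model_ref",
    "artifact_ref",
)


def used_handlers(init_entry: Dict[str, Any]) -> List[str]:
    """List the handler fields that are set on a single init entry."""
    return [name for name in HANDLER_FIELDS if init_entry.get(name) is not None]


def check_single_handler(init_entry: Dict[str, Any]) -> None:
    """Raise ValueError when an init entry sets more than one handler."""
    used = used_handlers(init_entry)
    if len(used) > 1:
        raise ValueError(
            f"Only one built-in handler is allowed per init entry, got: {', '.join(used)}"
        )


# A connection combined with a single handler passes the check.
check_single_handler({"connection": "repo2", "git": {"revision": "branch2"}})

# Two handlers in the same entry are rejected.
try:
    check_single_handler({"git": {"revision": "branch2"}, "artifacts": {"dirs": ["data"]}})
except ValueError as exc:
    print(exc)
```

To use several handlers, split them across separate entries in the init list, as the example above does.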
Python usage
As in the YAML example, if you pass more than one handler to a single V1Init, an exception will be raised. It’s possible to customize the container used by the default built-in handlers.
from polyaxon.schemas import V1Component, V1Init, V1Job
from polyaxon.types import V1ArtifactsType, V1DockerfileType, V1FileType, V1GitType
from polyaxon import k8s

component = V1Component(
    run=V1Job(
        init=[
            # Artifacts from the default artifacts store (no connection given)
            V1Init(
                artifacts=V1ArtifactsType(dirs=["path/on/the/default/artifacts/store"])
            ),
            # Lineage, artifact, and model references
            V1Init(
                lineage_ref="281081ab11794df0867e80d6ff20f960:artifactLineageRef"
            ),
            V1Init(
                artifact_ref="artifactVersion"
            ),
            V1Init(
                artifact_ref="otherProjectName:version"
            ),
            V1Init(
                model_ref="modelVersion"
            ),
            V1Init(
                model_ref="otherProjectName:version"
            ),
            # Artifacts from named connections, with a customized init container
            V1Init(
                connection="gcs-large-datasets",
                artifacts=V1ArtifactsType(dirs=["data"]),
                container=k8s.V1Container(
                    resources=k8s.V1ResourceRequirements(requests={"memory": "256Mi", "cpu": "500m"}),
                )
            ),
            V1Init(
                path="/s3-path",
                connection="s3-datasets",
                artifacts=V1ArtifactsType(files=["data1", "path/to/data2"])
            ),
            # Git repos: default branch, then a specific revision
            V1Init(
                connection="repo1",
            ),
            V1Init(
                connection="repo2",
                git=V1GitType(revision="branch2")
            ),
            # Generated dockerfile
            V1Init(
                dockerfile=V1DockerfileType(
                    image="test",
                    run=["pip install package1"],
                    env={'KEY1': 'en_US.UTF-8', 'KEY2': 2},
                )
            ),
            # Generated file (uses the `file` handler, not `dockerfile`)
            V1Init(
                file=V1FileType(
                    name="test.sh",
                    content="echo test",
                    chmod="+x",
                )
            ),
            # Fully custom init container
            V1Init(
                container=k8s.V1Container(
                    name="myapp-container",
                    image="busybox:1.28",
                    command=['sh', '-c', 'echo custom init container']
                )
            ),
        ],
        container=k8s.V1Container(...)
    )
)
Understanding init section
In both the YAML and Python examples, we are telling Polyaxon to initialize:
- A directory path/on/the/default/artifacts/store from the default artifactsStore, because we did not specify a connection and we invoked an artifacts handler.
- An artifact lineage reference, based on the run that generated the artifact and its lineage name.
- Two model versions, one from the same project and a second from a different project (it’s possible to provide the FQN org_name/model_name:version).
- Two artifact versions, one from the same project and a second from a different project (it’s possible to provide the FQN org_name/model_name:version).
- A directory data from a GCS connection named gcs-large-datasets; we also customized the built-in init container with a new resources section.
- Two files, data1 and path/to/data2, from an S3 connection named s3-datasets; we specified that the two files should be initialized under /s3-path instead of the default path that Polyaxon uses.
- A repo configured under the connection name repo1, which will be cloned from the default branch.
- A repo configured under the connection name repo2, which will be cloned from the branch named branch2.
- A dockerfile, which will be generated with the specification that was provided.
- A file, which will be generated from the provided name, content, and chmod settings.
- A custom container, which will finally run our own custom code, in this case an echo command.
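The three reference forms mentioned above (a bare version, project:version, and the FQN org_name/model_name:version) can be told apart with a small parser. The sketch below only illustrates the string shapes: `VersionRef` and `parse_version_ref` are hypothetical names, and how Polyaxon itself resolves these references is not shown in this document.

```python
from typing import NamedTuple, Optional


class VersionRef(NamedTuple):
    """Parts of a model or artifact version reference (illustrative only)."""

    owner: Optional[str]    # organization, when an FQN is given
    project: Optional[str]  # None means "the current project" in this sketch
    version: str


def parse_version_ref(ref: str) -> VersionRef:
    """Split a reference into owner, project, and version.

    Hypothetical helper; assumes the last ':' separates the version and
    an optional '/' separates the owner from the project or model name.
    """
    entity, sep, version = ref.rpartition(":")
    if not sep:
        # No ':' present, so the whole string is a version in the same project.
        return VersionRef(None, None, ref)
    owner, _, project = entity.rpartition("/")
    return VersionRef(owner or None, project, version)


print(parse_version_ref("modelVersion"))
print(parse_version_ref("otherProjectName:version"))
print(parse_version_ref("org_name/model_name:version"))
```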