V1SparkReplica
polyaxon.polyflow.run.spark.replica.V1SparkReplica(replicas=None, environment=None, init=None, sidecars=None, container=None)
Spark replica is the specification for a Spark executor or driver.
- Args:
  - replicas: str, int
  - environment: V1Environment, optional
  - init: List[V1Init], optional
  - sidecars: List[sidecar containers], optional
  - container: Kubernetes Container
YAML usage
executor/driver:
  replicas:
  environment:
  init:
  sidecars:
  container:
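As a quick sanity check, the keys of an executor or driver section can be compared against the constructor arguments listed in the signature above. The following is a minimal sketch using plain dictionaries only; `REPLICA_FIELDS` and `unknown_fields` are hypothetical names introduced for illustration and are not part of Polyaxon.

```python
# Field names copied from the V1SparkReplica signature shown above.
REPLICA_FIELDS = {"replicas", "environment", "init", "sidecars", "container"}


def unknown_fields(section):
    """Return the keys of a replica section that are not V1SparkReplica arguments.

    Hypothetical helper for illustration; Polyaxon performs its own validation.
    """
    return sorted(set(section) - REPLICA_FIELDS)


# A well-formed section and one with a misspelled key.
executor = {"replicas": 2, "container": {"image": "busybox:1.28"}}
misspelled = {"replica": 2}

print(unknown_fields(executor))
print(unknown_fields(misspelled))
```

A check like this only catches misspelled top-level keys; it does not validate the nested environment, init, sidecars, or container values.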
Python usage
from polyaxon.polyflow import V1Environment, V1Init, V1SparkReplica
from polyaxon.k8s import k8s_schemas

replica = V1SparkReplica(
    replicas=2,
    environment=V1Environment(...),
    init=[V1Init(...)],
    sidecars=[k8s_schemas.V1Container(...)],
    container=k8s_schemas.V1Container(...),
)
Fields
replicas
The number of replica (executor/driver) instances.
executor:
  replicas: 2
environment
Optional environment section. It provides a way to inject pod-related information into the replica (executor/driver).
driver:
  environment:
    labels:
      key1: "label1"
      key2: "label2"
    annotations:
      key1: "value1"
      key2: "value2"
    nodeSelector:
      node_label: node_value
    ...
  ...
init
A list of init handlers and containers to resolve for the replica (executor/driver).
If you reference a connection, it must be configured. All referenced connections are checked for the following:
- They are accessible in the context of the project of this run.
- The user running the operation has access to them.
executor:
  init:
    - artifacts:
        dirs: ["path/on/the/default/artifacts/store"]
    - connection: gcs-large-datasets
      artifacts:
        dirs: ["data"]
      container:
        resources:
          requests:
            memory: "256Mi"
            cpu: "500m"
    - container:
        name: myapp-container
        image: busybox:1.28
        command: ['sh', '-c', 'echo custom init container']
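Because every referenced connection is checked, it can help to list the connection names an init section depends on before submitting a run. The sketch below does this over plain dictionaries that mirror the YAML example above; `referenced_connections` is a hypothetical helper for illustration, not a Polyaxon function, and it does not perform the access checks themselves.

```python
def referenced_connections(init):
    """Collect the distinct connection names used by init entries, in order.

    Hypothetical helper: entries without a `connection` key are skipped.
    """
    names = []
    for entry in init:
        name = entry.get("connection")
        if name is not None and name not in names:
            names.append(name)
    return names


# The same three entries as the YAML example, written as dictionaries.
init = [
    {"artifacts": {"dirs": ["path/on/the/default/artifacts/store"]}},
    {
        "connection": "gcs-large-datasets",
        "artifacts": {"dirs": ["data"]},
        "container": {
            "resources": {"requests": {"memory": "256Mi", "cpu": "500m"}}
        },
    },
    {
        "container": {
            "name": "myapp-container",
            "image": "busybox:1.28",
            "command": ["sh", "-c", "echo custom init container"],
        }
    },
]

print(referenced_connections(init))  # ['gcs-large-datasets']
```

Only the second entry names a connection; the first uses the default artifacts store and the third is a custom init container.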
sidecars
A list of containers that will be used as sidecars.
driver:
  sidecars:
    - name: sidecar2
      image: busybox:1.28
      command: ['sh', '-c', 'echo sidecar2']
    - name: sidecar1
      image: busybox:1.28
      command: ['sh', '-c', 'echo sidecar1']
      resources:
        requests:
          memory: "128Mi"
          cpu: "500m"
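Kubernetes requires container names within a pod to be unique, so sidecars that share a name will be rejected. A minimal sketch of a pre-submission check over plain dictionaries follows; `duplicate_names` is a hypothetical helper for illustration, and it only compares the sidecars with each other, not with any container that Polyaxon itself adds to the pod.

```python
from collections import Counter


def duplicate_names(containers):
    """Return the container names that appear more than once, sorted.

    Hypothetical helper: entries without a `name` key are ignored.
    """
    counts = Counter(c.get("name") for c in containers)
    return sorted(
        name for name, n in counts.items() if name is not None and n > 1
    )


# The two sidecars from the YAML example above.
sidecars = [
    {
        "name": "sidecar2",
        "image": "busybox:1.28",
        "command": ["sh", "-c", "echo sidecar2"],
    },
    {
        "name": "sidecar1",
        "image": "busybox:1.28",
        "command": ["sh", "-c", "echo sidecar1"],
        "resources": {"requests": {"memory": "128Mi", "cpu": "500m"}},
    },
]

print(duplicate_names(sidecars))  # []
```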
container
The main Kubernetes Container that will run your experiment training or data processing logic for the replica (executor/driver).
executor:
  init:
    - connection: my-tf-code-repo
  container:
    image: tensorflow:2.1
    command: ["python", "/plx-context/artifacts/my-tf-code-repo/model.py"]
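The command in this example points at a path under `/plx-context/artifacts/` named after the init connection. The sketch below composes such a path; `CONTEXT_ARTIFACTS_ROOT` and `init_path` are hypothetical names, and the directory layout is an assumption inferred only from the example above rather than a documented guarantee.

```python
import posixpath

# Assumption: root directory taken from the example command above.
CONTEXT_ARTIFACTS_ROOT = "/plx-context/artifacts"


def init_path(connection, *parts):
    """Build a container path for a file initialized from `connection`.

    Hypothetical helper; posixpath is used because container paths are POSIX
    paths regardless of the operating system that generates the spec.
    """
    return posixpath.join(CONTEXT_ARTIFACTS_ROOT, connection, *parts)


command = ["python", init_path("my-tf-code-repo", "model.py")]
print(command)  # ['python', '/plx-context/artifacts/my-tf-code-repo/model.py']
```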