
Runtime Concepts

Polyaxon relies on a set of concepts to manage the experimentation and automation processes. This section provides a high-level introduction to these concepts; more details are available in the pages dedicated to each concept.

Runtime of a Component

A component is the model that describes the discrete, containerized logic you want to run. Components optionally take inputs, perform some work, and optionally return some outputs.

Components can process data directly, train a model, or orchestrate external systems, and they can be built using any programming language. There are almost no restrictions on what a component can do.

Furthermore, before it runs, each component receives metadata about its environment and its upstream dependencies (if it is defined in a DAG). This metadata is called the context. A component receives it even when it takes no explicit data inputs, which gives it an opportunity to change its behavior depending on the context it is running in.

Since Polyaxon runs containers, it is agnostic to the code each component runs and there are no restrictions on what inputs and outputs can be.
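
The relationship between a component's declared inputs, its outputs, and the context can be sketched in a few lines of Python. This is only an illustrative model: the `IO` and `Component` classes, the `greet` function, and the shape of the `context` dictionary are all hypothetical and are not Polyaxon's actual schema or client API.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class IO:
    """One declared input or output (hypothetical model, not Polyaxon's schema)."""
    name: str
    is_optional: bool = False
    value: Any = None  # default value, meaningful only when is_optional is True


@dataclass
class Component:
    """A unit of work: declared inputs and outputs plus the logic to run."""
    name: str
    logic: Callable[[Dict[str, Any], Dict[str, Any]], Dict[str, Any]]
    inputs: List[IO] = field(default_factory=list)
    outputs: List[IO] = field(default_factory=list)


def greet(params: Dict[str, Any], context: Dict[str, Any]) -> Dict[str, Any]:
    # The context is available even when no explicit inputs are passed,
    # so the logic can adapt to where it is running.
    upstream = context.get("upstream", [])
    prefix = ("after " + ", ".join(upstream)) if upstream else "standalone"
    return {"message": f"{prefix}: hello {params.get('who', 'world')}"}


component = Component(
    name="greeter",
    logic=greet,
    inputs=[IO("who", is_optional=True, value="world")],
    outputs=[IO("message")],
)

# Same component, two different contexts.
standalone = component.logic({}, {"upstream": []})
in_dag = component.logic({"who": "team"}, {"upstream": ["build", "prepare"]})
```

In this sketch the same logic produces different output depending only on the context it receives, which is the behavior the paragraph above describes.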

Please refer to core/specification/component to learn about the component specification and management/Component Hub for details about the Component Hub.

Each component can have one runtime, which is specified in the run section of the component. Polyaxon supports several runtimes:

Jobs
Distributed Jobs
Services
DAGs

Job

A job is the execution of your code with data/connections and the provided parameters on the Kubernetes cluster.

A Job can be:

A machine learning experiment.
A data processing job.
An ETL task.
A container build job.

Please refer to experimentation/jobs for more details.

Distributed Jobs

Polyaxon supports distributed jobs for model training or data processing via several Kubernetes operators.

Please refer to experimentation/distributed-jobs for more details.

Service

A service lets you run dashboards, apps, and APIs.

A service can be:

A Tensorboard.
A Notebook.
A custom dashboard.
A Streamlit app.
A container exposing an API.

Please refer to experimentation/services for more details.

DAG

A DAG is a powerful tool for describing dependencies between operations. It lets you author a directed acyclic graph of operations with first-class support for state and artifact dependencies.
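
To make the idea of dependencies between operations concrete, here is a minimal sketch that uses Python's standard `graphlib` module to derive a valid execution order from a dependency map and to detect cycles. The operation names and the `execution_order` and `has_cycle` helpers are hypothetical; this illustrates the general concept of a directed acyclic graph, not how Polyaxon schedules operations internally.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical operations mapped to the set of operations they depend on.
dependencies = {
    "preprocess": set(),
    "train": {"preprocess"},
    "evaluate": {"train"},
    "report": {"train", "evaluate"},
}


def execution_order(deps):
    """Return one order in which every operation runs after its dependencies."""
    return list(TopologicalSorter(deps).static_order())


def has_cycle(deps):
    """A graph with a cycle is not a DAG and has no valid execution order."""
    try:
        execution_order(deps)
    except CycleError:
        return True
    return False


order = execution_order(dependencies)
```

Because each operation here depends on the one before it, only a single ordering is valid. Graphs with independent branches admit several valid orders, and those branches could run in parallel.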

Please refer to automation/flow-engine for more details.

Operation

An operation is how you execute your components. It allows you to:

pass the parameters for required inputs or override the default values of optional inputs.
patch the definition of the component to set environments, initializers, and resources.
set termination logic and retries.
set trigger logic to start a component in a pipeline context.
parallelize or map the component over a matrix of parameters.
put an operation on a schedule.
subscribe a component to events to trigger executions automatically.
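
The first item in the list above, passing parameters for required inputs and overriding the defaults of optional ones, can be sketched as follows. The `declared_inputs` layout and the `resolve_params` helper are hypothetical and only illustrate the resolution rule; they are not part of Polyaxon's API.

```python
# Hypothetical input declarations for a component: name -> spec.
declared_inputs = {
    "dataset": {"optional": False},
    "lr": {"optional": True, "default": 0.01},
    "epochs": {"optional": True, "default": 10},
}


def resolve_params(inputs, params):
    """Merge operation params with component defaults.

    Required inputs must be passed; optional inputs fall back to their
    default unless the operation overrides them.
    """
    unknown = set(params) - set(inputs)
    if unknown:
        raise ValueError(f"unknown params: {sorted(unknown)}")
    resolved = {}
    for name, spec in inputs.items():
        if name in params:
            resolved[name] = params[name]       # explicit value wins
        elif spec["optional"]:
            resolved[name] = spec["default"]    # fall back to the default
        else:
            raise ValueError(f"missing required input: {name}")
    return resolved


resolved = resolve_params(declared_inputs, {"dataset": "mnist", "lr": 0.1})
```

Here `dataset` is supplied because it is required, `lr` overrides its default, and `epochs` keeps the default declared by the component.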

Please refer to core/specification/operation to learn about the operation specification and management/Runs Dashboard for details about the Runs Dashboard.

Matrix

A matrix is an automatic and practical way to run a component with different parameters based on a mapping or a hyperparameter search algorithm.
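
As a rough illustration of running one component with different parameters, the sketch below expands a search space into one parameter set per run using an exhaustive grid, which is one of the simplest search strategies. The `grid` helper and the parameter names are hypothetical, and this is not Polyaxon's matrix implementation.

```python
from itertools import product


def grid(space):
    """Expand {name: [values, ...]} into one params dict per combination."""
    names = list(space)
    return [
        dict(zip(names, combo))
        for combo in product(*(space[name] for name in names))
    ]


# Hypothetical search space: two values for each of two parameters.
space = {"lr": [0.01, 0.1], "batch_size": [32, 64]}
runs = grid(space)
```

Each dictionary in `runs` would correspond to one execution of the same component. A smarter search algorithm would choose which combinations to try instead of enumerating them all.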

Please refer to automation/optimization-engine and automation/mapping for more details.
