Teams often have several environments with different resources and different levels of user access. Allocating operations to the right resources while ensuring fair queueing is important, especially as you scale your workload.
Polyaxon provides several interfaces designed to achieve fairness when a limited resource is shared, for example, to prevent a hyperparameter tuning job with a large search space or many parallel executions from consuming more cluster resources than other workflows and operations.
Polyaxon provides several tools to:
- Limit the number of concurrent operations a workflow can run.
- Prioritize important operations.
- Route operations that require special resources to the right node(s), namespace, or cluster.
- Split your workload over several nodes and clusters.
Several distinct features are involved in these scheduling strategies:
- Node scheduling: A feature that leverages the Kubernetes API to select nodes for running your operations.
- Resources scheduling: A feature that leverages the Kubernetes API to enable GPUs, TPUs, or other special resources for your operations.
- Queue priority: A feature to prioritize operations on a queue.
- Queue concurrency: A feature to throttle the number of concurrent operations on a queue.
- Queue agent: A feature to route operations on a queue to a specific namespace or cluster.
- Workflow concurrency: A feature to limit the number of operations queued from a single workflow or nested workflows.
- Run profile: A feature for injecting information into operations at compilation time to preset configuration for node scheduling, queue routing, resource requirements and definitions, connections, and access-level control.
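To make these features concrete, the following is a minimal sketch of a polyaxonfile that combines several of them: queue routing, workflow concurrency, node scheduling, and resources scheduling. The operation name, queue name, image, and node label are hypothetical, and the field placement follows the general shape of a Polyaxon v1 operation spec rather than a definitive reference.

```yaml
version: 1.1
kind: operation
name: tune  # hypothetical operation name
# Queue routing: send the runs to a dedicated queue (agent/queue form).
queue: agent-1/gpu-queue
# Workflow concurrency: run at most 2 of the grid's operations at a time.
matrix:
  kind: grid
  concurrency: 2
  params:
    lr:
      values: [0.01, 0.001, 0.0001]
component:
  run:
    kind: job
    environment:
      # Node scheduling: constrain the job to nodes carrying this label.
      nodeSelector:
        node_pool: gpu  # hypothetical node label
    container:
      image: my-image:latest  # hypothetical image
      command: ["python", "train.py", "--lr={{ lr }}"]
      resources:
        # Resources scheduling: request one GPU per operation.
        limits:
          nvidia.com/gpu: 1
```

Queue priority and run profiles are configured outside this file: priority is typically a property of the queue definition itself, and a run profile (preset) is referenced by name from the operation so its configuration is injected at compilation time.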