Note: Grid Search in Polyaxon CE is supported in eager mode only with no concurrency management.

grid

V1GridSearch

polyaxon.polyflow.matrix.grid_search.V1GridSearch(kind='grid', params=None, num_runs=None, seed=None, concurrency=None, early_stopping=None)

Grid search is essentially an exhaustive search through a manually specified set of hyperparameters.

You can limit the number of experiments, so that the whole search space is not traversed, by providing numRuns.

Grid search does not allow the use of distributions and requires all values in the params definition to be discrete.

  • Args:
    • kind: str, should be equal to grid
    • params: Dict[str, params]
    • concurrency: int, optional
    • num_runs: int, optional
    • early_stopping: List[EarlyStopping], optional
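
As a rough illustration of the behaviour described above, the sketch below enumerates a grid using only the standard library. It is not Polyaxon's implementation: `expand_grid` is a hypothetical helper, and it assumes that `num_runs` simply keeps the first N combinations of the Cartesian product.

```python
from itertools import islice, product
from typing import Any, Dict, List, Optional, Sequence


def expand_grid(
    params: Dict[str, Sequence[Any]],
    num_runs: Optional[int] = None,
) -> List[Dict[str, Any]]:
    """Enumerate the Cartesian product of discrete values (hypothetical helper)."""
    for name, values in params.items():
        # Grid search needs a finite list of values, not a distribution.
        if isinstance(values, (str, bytes)) or not hasattr(values, "__len__"):
            raise TypeError(f"{name!r} must be a discrete sequence of values")
    names = list(params)
    combos = product(*(params[name] for name in names))
    # Assumption: num_runs keeps only the first N combinations (None keeps all).
    limited = islice(combos, num_runs)
    return [dict(zip(names, combo)) for combo in limited]


full = expand_grid({"lr": [0.01, 0.1], "dropout": [0.2, 0.5]})
capped = expand_grid({"lr": [0.01, 0.1], "dropout": [0.2, 0.5]}, num_runs=3)
print(len(full), len(capped))
```

Two values for each of two parameters give four combinations, and the capped call stops after three of them.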

YAML usage

matrix:
  kind: grid
  concurrency:
  params:
  numRuns:
  earlyStopping:

Python usage

from polyaxon.polyflow import (
    V1GridSearch, V1HpLogSpace, V1HpChoice, V1FailureEarlyStopping, V1MetricEarlyStopping
)
matrix = V1GridSearch(
  concurrency=2,
  params={"param1": V1HpLogSpace(...), "param2": V1HpChoice(...), ... },
  num_runs=5,
  early_stopping=[V1FailureEarlyStopping(...), V1MetricEarlyStopping(...)]
)

Fields

kind

The kind signals to the CLI, client, and other tools that this matrix is a grid search.

If you are using the Python client to create the mapping, this field is not required and is set by default.

matrix:
  kind: grid

concurrency

An optional value to set the number of concurrent operations.

This value only makes sense if it is less than or equal to the total number of possible runs.

matrix:
  kind: grid
  concurrency: 2

For more details about concurrency management, please check the concurrency section.
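
To make the effect of a concurrency limit concrete, here is a minimal sketch using a thread pool. It only illustrates the idea of capping how many operations run at once; `run_grid` and `train` are hypothetical names, and nothing here reflects how Polyaxon schedules operations internally.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List, Tuple


def run_grid(
    suggestions: List[Dict[str, Any]],
    concurrency: int,
    train: Callable[[Dict[str, Any]], Any],
) -> Tuple[List[Any], int]:
    """Run every suggestion with at most `concurrency` in flight (hypothetical helper)."""
    # A limit above the number of runs has no extra effect, so clamp it.
    workers = max(1, min(concurrency, len(suggestions)))
    lock = threading.Lock()
    active = 0
    peak = 0

    def tracked(params: Dict[str, Any]) -> Any:
        nonlocal active, peak
        with lock:
            active += 1
            peak = max(peak, active)
        try:
            return train(params)
        finally:
            with lock:
                active -= 1

    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(tracked, suggestions))  # map keeps input order
    return results, peak


suggestions = [{"batch_size": b} for b in (32, 64, 128)]
results, peak = run_grid(suggestions, concurrency=2, train=lambda p: p["batch_size"] * 2)
print(results, peak)
```

The returned `peak` is the highest number of simultaneously active calls observed, which never exceeds the requested concurrency.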

params

A dictionary mapping each parameter name to a value generator that produces the candidate values for that parameter.

Grid search can only use discrete values.

The parameters generated will be validated against the component's inputs/outputs definition to check that the values can be passed and have valid types.

matrix:
  kind: grid
  params:
    param1:
      kind: ...
      value: ...
    param2:
      kind: ...
      value: ...
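
The validation step mentioned above can be pictured with a small sketch. This is an assumption-laden illustration, not Polyaxon's validator: `validate_params` and `TYPE_MAP` are hypothetical, only inputs are checked (not outputs), and only a handful of primitive types are covered.

```python
from typing import Any, Dict, List

# Assumption: a minimal mapping from declared input types to Python types.
TYPE_MAP = {"int": int, "float": (int, float), "str": str, "bool": bool}


def validate_params(params: Dict[str, Any], inputs: List[Dict[str, Any]]) -> None:
    """Raise ValueError if a generated value does not fit the declared inputs."""
    declared = {spec["name"]: spec for spec in inputs}
    for name, value in params.items():
        if name not in declared:
            raise ValueError(f"{name!r} is not an input of the component")
        type_name = declared[name]["type"]
        # bool is a subclass of int in Python, so reject it for numeric inputs.
        is_stray_bool = isinstance(value, bool) and type_name != "bool"
        if is_stray_bool or not isinstance(value, TYPE_MAP[type_name]):
            raise ValueError(f"{name!r} expects {type_name}, got {value!r}")


inputs = [
    {"name": "lr", "type": "float"},
    {"name": "dropout", "type": "float"},
]
validate_params({"lr": 0.01, "dropout": 0.2}, inputs)  # passes silently
```

A value such as `{"lr": "fast"}` or an undeclared key such as `{"momentum": 0.9}` would raise `ValueError` in this sketch.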

numRuns

Maximum number of runs to start based on the search space defined.

matrix:
  kind: grid
  numRuns: 5

earlyStopping

A list of early stopping conditions to check for terminating all operations managed by the pipeline. If one of the early stopping conditions is met, a signal will be sent to terminate all running and pending operations.

matrix:
  kind: grid
  earlyStopping: ...

For more details, please check the early stopping section.
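
The "any condition met" rule described above can be sketched as follows. The function `should_stop` is hypothetical, and the comparison semantics are an assumption: a `maximize` condition is treated as met once the metric reaches or exceeds its value, and a `minimize` condition once the metric falls to or below it.

```python
from typing import Any, Dict, List


def should_stop(metrics: Dict[str, float], conditions: List[Dict[str, Any]]) -> bool:
    """Return True as soon as one metric condition is met (hypothetical helper)."""
    for cond in conditions:
        current = metrics.get(cond["metric"])
        if current is None:
            continue  # metric not reported yet, nothing to compare
        # Assumption: thresholds are inclusive in both directions.
        if cond["optimization"] == "maximize" and current >= cond["value"]:
            return True
        if cond["optimization"] == "minimize" and current <= cond["value"]:
            return True
    return False


conditions = [
    {"metric": "accuracy", "value": 0.9, "optimization": "maximize"},
    {"metric": "loss", "value": 0.05, "optimization": "minimize"},
]
print(should_stop({"accuracy": 0.85, "loss": 0.2}, conditions))
print(should_stop({"accuracy": 0.93, "loss": 0.2}, conditions))
```

In a real pipeline a positive result would be followed by terminating all running and pending operations; the sketch only covers the decision itself.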

Example

This example defines 10 experiments from the Cartesian product of the possible lr and dropout values (5 logspace values × 2 choices).

version: 1.1
kind: operation
matrix:
  kind: grid
  concurrency: 2
  params:
    lr:
      kind: logspace
      value: 0.01:0.1:5
    dropout:
      kind: choice
      value: [0.2, 0.5]
  earlyStopping:
    - metric: accuracy
      value: 0.9
      optimization: maximize
    - metric: loss
      value: 0.05
      optimization: minimize
component:
  inputs:
    - name: batch_size
      type: int
      isOptional: true
      value: 128
    - name: lr
      type: float
    - name: dropout
      type: float
  container:
    image: image:latest
    command: [python3, train.py]
    args: ["--batch-size={{ batch_size }}", "--lr={{ lr }}", "--dropout={{ dropout }}"]
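
To see where the 10 experiments come from, the sketch below expands the example's search space and fills in the container arguments for each run. Several things here are assumptions made for illustration: `0.01:0.1:5` is read as start:stop:num, `logspace` follows numpy.logspace-style semantics (start and stop are base-10 exponents), and `render` is a naive string substitution rather than the template engine Polyaxon actually uses.

```python
from itertools import product
from typing import Any, Dict, List


def logspace(start: float, stop: float, num: int, base: float = 10.0) -> List[float]:
    # Assumption: numpy.logspace-style semantics, start and stop are exponents.
    if num == 1:
        return [base ** start]
    step = (stop - start) / (num - 1)
    return [base ** (start + i * step) for i in range(num)]


def render(args: List[str], context: Dict[str, Any]) -> List[str]:
    """Naively substitute `{{ name }}` placeholders (hypothetical helper)."""
    rendered = []
    for arg in args:
        for key, value in context.items():
            arg = arg.replace("{{ " + key + " }}", str(value))
        rendered.append(arg)
    return rendered


ARGS = ["--batch-size={{ batch_size }}", "--lr={{ lr }}", "--dropout={{ dropout }}"]
lr_values = logspace(0.01, 0.1, 5)   # 5 values
dropout_values = [0.2, 0.5]          # 2 values

runs = [
    render(ARGS, {"batch_size": 128, "lr": lr, "dropout": dropout})
    for lr, dropout in product(lr_values, dropout_values)
]
print(len(runs))
print(runs[0])
```

Whatever the exact logspace values are, the count is 5 × 2 = 10, and batch_size falls back to its default of 128 because the matrix does not vary it.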