Polyaxon provides several features for running parallel jobs and hyperparameter tuning.
Oftentimes you may want to create many experiments with different parameters and automatically manage their execution.
In order to make this tutorial usable for all Polyaxon users, we will run several configurations in parallel using the eager mode, and we will use algorithms supported in all Polyaxon distributions:
- Grid Search
- Random Search
- Mapping
In order to run the commands in this section with the eager mode, you need to install Polyaxon with the `numpy` extra:

```bash
pip install "polyaxon[numpy]"
```
To run these commands without the eager mode, and have a fully automated pipeline managing the executions and controlling concurrency limits and early stopping conditions, you need to have access to Polyaxon EE or Polyaxon Cloud.
Let's run another polyaxonfile, `hyperparams_grid.yml`, which contains a hyperparameter tuning definition with the grid search algorithm. This is the content of the file:
```yaml
version: 1.1
kind: operation
matrix:
  kind: grid
  params:
    learning_rate:
      kind: linspace
      value: 0.001:0.1:5
    dropout:
      kind: choice
      value: [0.25, 0.3]
    conv_activation:
      kind: choice
      value: [relu, sigmoid]
    epochs:
      kind: choice
      value: [5, 10]
urlRef: https://raw.githubusercontent.com/polyaxon/polyaxon-quick-start/master/experimentation/typed.yml
```
This is an operation based on the same component. Instead of defining a single set of params, similar to what we did in previous sections of this tutorial, this file defines a matrix, in this case, with the grid search algorithm.
It uses the same component; Polyaxon validates the generated search space against the inputs and outputs defined in the component. Polyaxon will generate multiple operations based on the search space and manage their execution using a pipeline.
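To build intuition for how many operations this grid generates, here's a minimal sketch of the expansion as a cartesian product. This is an illustration of the semantics, not Polyaxon's internal code; the parameter names match the polyaxonfile above.

```python
from itertools import product

# linspace 0.001:0.1:5 -> 5 evenly spaced values from 0.001 to 0.1
learning_rate = [0.001 + i * (0.1 - 0.001) / 4 for i in range(5)]
dropout = [0.25, 0.3]
conv_activation = ["relu", "sigmoid"]
epochs = [5, 10]

# Grid search expands the matrix into every combination of values
grid = [
    {"learning_rate": lr, "dropout": d, "conv_activation": a, "epochs": e}
    for lr, d, a, e in product(learning_rate, dropout, conv_activation, epochs)
]
print(len(grid))  # 5 * 2 * 2 * 2 = 40 operations
```

Each of the 40 dicts corresponds to one operation that Polyaxon would schedule as part of the pipeline.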
Starting a hyperparameter tuning is similar to any other operation:
```bash
$ polyaxon run --url https://raw.githubusercontent.com/polyaxon/polyaxon-quick-start/master/automation/hyperparams_grid.yml --eager
```
If you don't provide the `--eager` flag, Polyaxon will:
- Run the hyperparameter tuning pipeline in a managed mode if you have access to Polyaxon EE or Polyaxon Cloud.
- Raise an exception if you are using Polyaxon CE.
For more details, check the grid search reference.
The `hyperparams_random.yml` polyaxonfile is similar to the grid search polyaxonfile; the only difference is that it defines a random search matrix section:
```yaml
version: 1.1
kind: operation
matrix:
  kind: random
  numRuns: 10
  params:
    learning_rate:
      kind: linspace
      value: 0.001:0.1:5
    dropout:
      kind: choice
      value: [0.25, 0.3]
    conv_activation:
      kind: pchoice
      value: [[relu, 0.1], [sigmoid, 0.8]]
    epochs:
      kind: choice
      value: [5, 10]
urlRef: https://raw.githubusercontent.com/polyaxon/polyaxon-quick-start/master/experimentation/typed.yml
```
To run this polyaxonfile:
```bash
$ polyaxon run --url https://raw.githubusercontent.com/polyaxon/polyaxon-quick-start/master/automation/hyperparams_random.yml --eager
```
Random search also supports continuous distributions in addition to discrete distributions.
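The sampling semantics of the matrix above can be sketched as follows. This is an illustration only, not Polyaxon's internal sampler: `choice` picks uniformly, while `pchoice` is a weighted choice (here, `relu` at 0.1 and `sigmoid` at 0.8).

```python
import random

rng = random.Random(42)  # seeded for reproducibility of this sketch
linspace = [0.001 + i * (0.1 - 0.001) / 4 for i in range(5)]  # linspace 0.001:0.1:5

def sample():
    return {
        "learning_rate": rng.choice(linspace),
        "dropout": rng.choice([0.25, 0.3]),
        # pchoice: weighted choice; random.choices normalizes the weights
        "conv_activation": rng.choices(["relu", "sigmoid"], weights=[0.1, 0.8])[0],
        "epochs": rng.choice([5, 10]),
    }

# numRuns: 10 -> the pipeline draws 10 independent configurations
suggestions = [sample() for _ in range(10)]
```

Unlike grid search, the number of runs is fixed by `numRuns` rather than by the size of the search space, which is why random search can also handle continuous distributions.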
For more details, check the random search reference.
Sometimes you might want to run parallel executions and provide your own suggestions instead of using an algorithm provided by Polyaxon. Mapping is how you can provide a predefined space.
Mapping can also be used to parallelize a job for fetching data, or for loading information from a source to a destination concurrently.
The `mapping.yml` polyaxonfile defines all the values that we want to use for running our component:
```yaml
version: 1.1
kind: operation
matrix:
  kind: mapping
  values:
    - learning_rate: 0.001
      dropout: 0.25
      conv_activation: relu
      epochs: 5
    - learning_rate: 0.01
      dropout: 0.5
      conv_activation: sigmoid
      epochs: 5
urlRef: https://raw.githubusercontent.com/polyaxon/polyaxon-quick-start/master/experimentation/typed.yml
```
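The mapping semantics above can be sketched as follows: each entry in `values` becomes exactly one run with those params, and the runs can execute concurrently. This is an illustration, not Polyaxon's code; `train` is a hypothetical stand-in for the component's entrypoint.

```python
from concurrent.futures import ThreadPoolExecutor

# The predefined suggestions, mirroring the `values` section of mapping.yml
values = [
    {"learning_rate": 0.001, "dropout": 0.25, "conv_activation": "relu", "epochs": 5},
    {"learning_rate": 0.01, "dropout": 0.5, "conv_activation": "sigmoid", "epochs": 5},
]

def train(params):
    # Placeholder for the component's actual training logic
    return f"run with lr={params['learning_rate']}"

# No search algorithm is involved: the pipeline simply fans out one run per entry
with ThreadPoolExecutor(max_workers=2) as pool:
    results = list(pool.map(train, values))
```

This is the key difference from grid and random search: with mapping you supply the suggestions yourself instead of letting an algorithm generate them.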
Starting a mapping is also similar to any other operation:
```bash
$ polyaxon run --url https://raw.githubusercontent.com/polyaxon/polyaxon-quick-start/master/automation/mapping.yml --eager
```
For more details, check the mapping reference.
For users with Polyaxon EE or Polyaxon Cloud access, there are also tools to control caching for experiments with similar configurations and concurrency for managing the number of parallel jobs. Every pipeline in Polyaxon can also define early stopping strategies.
The repo contains more hyperparameter tuning examples in the automation folder.