Hyperparameter Optimization

Hyperparameters are one of the deciding factors in a machine learning model's performance. One approach to hyperparameter optimization is grid search: a method of finding the best set of hyperparameters for a learning algorithm by trying multiple combinations of them and comparing the results with a performance metric. For a detailed explanation, see Wikipedia.

In Neptune, you can perform a grid search over an experiment's parameters. To run a grid search experiment, you need to specify:

  • a set of values to try for each parameter (numeric or string);
  • a performance metric, based on one of the experiment's numeric channels.

Neptune then creates a group of experiments. Each experiment in the group is executed with a different combination of parameter values, and Neptune keeps track of the best experiment according to the defined metric.
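Conceptually, grid search just enumerates the Cartesian product of the parameter values and evaluates each combination. The following sketch (plain Python for illustration, not Neptune code; the evaluate function is a hypothetical stand-in for training a model) shows the idea:

# Conceptual illustration of grid search, not Neptune code.
import itertools

param_grid = {"x": [1, 10, 100], "y": [0.0, 0.1, 0.2]}

def evaluate(params):
    # Hypothetical stand-in for training a model and returning a metric value.
    return params["x"] * params["y"]

best = None
for values in itertools.product(*param_grid.values()):
    params = dict(zip(param_grid.keys(), values))
    score = evaluate(params)
    if best is None or score < best[0]:  # goal: minimize
        best = (score, params)

print(best)  # best (metric value, parameters) pair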

Parameter Values

For each parameter, you can pass a range or a list of values instead of a single value.

For example, to perform a grid search with x taking three values (1, 10, and 100) and y ranging from 0 to 10 with a step of 0.1, run one of the following commands:

# via ctx.params
neptune send -p 'x:[1, 10, 100]' -p 'y:(0.0, 10.0, 0.1)'

# via sys.argv
neptune send main.py --x '%[1, 10, 100]' --y '%(0.0, 10.0, 0.1)'
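Neptune expands the grid and launches each experiment in the group with one concrete value per parameter, so in the sys.argv variant the script only needs a standard argument parser. A minimal sketch (the flag names match the command above; the rest is an assumption about how main.py is written):

# Minimal sketch of the sys.argv variant: each experiment in the group
# receives concrete values, e.g. main.py --x 10 --y 0.3.
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--x", type=float)
parser.add_argument("--y", type=float)
args = parser.parse_args()

print(args.x, args.y)

In the ctx.params variant, the script reads the same values from the Neptune context (e.g. ctx.params.x) instead of parsing the command line.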

You can run a grid search with string parameters as well:

# via ctx.params
neptune send -p 'classifier:["SVM", "Naive Bayes", "Random Forest"]'

# via sys.argv
neptune send main.py --classifier '%["SVM", "Naive Bayes", "Random Forest"]'
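Inside the script, a string parameter arrives like any other value and can, for example, select the model class. A hypothetical sketch (the build_model helper is illustrative, not part of the example project):

# Hypothetical sketch: map the classifier parameter to a scikit-learn model.
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

def build_model(classifier):
    # 'classifier' is the string value passed for this experiment.
    return {
        "SVM": SVC(),
        "Naive Bayes": GaussianNB(),
        "Random Forest": RandomForestClassifier(),
    }[classifier]

model = build_model("SVM")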

Metric

The metric tells Neptune how to select the best experiment in a group. It consists of:

  • channel – the name of a numeric channel defined in the experiment. The newest value in that channel is always treated as the metric value;
  • goal – minimize or maximize.

You define the metric in the configuration file, like this:

metric:
  channel: some_channel_name
  goal: maximize
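The channel named in the metric is an ordinary numeric channel that the script writes to during the experiment. A minimal sketch, assuming the deepsense Neptune client API:

# Minimal sketch, assuming the deepsense Neptune client API.
from deepsense import neptune

ctx = neptune.Context()

# Each call appends a value to the channel; Neptune treats the newest
# value as the experiment's metric value.
ctx.channel_send('some_channel_name', 0.87)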

Full Example

Download the example project as a zip archive or clone it from the Neptune examples repo:

git clone https://github.com/neptune-ml/neptune-examples.git
cd neptune-examples/4-grid-search

The script main.py runs random forest regression on the diabetes dataset. Two parameters of the model, the number of trees (n_estimators) and the maximum tree depth (max_depth), are declared as experiment parameters in neptune.yaml. The same file also declares the metric:

metric:
  channel: mse
  goal: minimize

At the end, the script sends the MSE of its predictions to the mse channel, which is the channel declared as the metric above.
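Put together, main.py is roughly along these lines (a sketch under the assumptions above, not the exact script from the repository):

# Sketch of main.py; assumes the deepsense Neptune client API.
from deepsense import neptune
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

ctx = neptune.Context()

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestRegressor(
    n_estimators=int(ctx.params.n_estimators),
    max_depth=int(ctx.params.max_depth),
)
model.fit(X_train, y_train)

mse = mean_squared_error(y_test, model.predict(X_test))
ctx.channel_send('mse', mse)  # the newest value becomes the metric value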

Now let’s say we want to find the best parameters for this model. We can run:

neptune send main.py --max_depth '%(1, 5, 1)' --n_estimators '%[2, 4, 8, 16]'

This creates a group of 16 experiments (4 values of max_depth times 4 values of n_estimators). Each of them trains a random forest with a different combination of parameters.

In the Neptune Web UI, we can see the running experiments. Neptune live-tracks the best experiment, in this case the one with the lowest MSE:

[Screenshot: gridsearch_best – the best experiment highlighted in the Neptune Web UI]