Hyperparameters are one of the deciding factors in a machine learning model's performance. One approach to hyperparameter optimization is grid search: a method of finding the best set of hyperparameters for a learning algorithm by trying multiple combinations of them and comparing the results with a performance metric. If you're interested in a detailed explanation, please visit Wikipedia.
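As a minimal illustration of the idea (independent of Neptune), grid search can be sketched in a few lines of Python; the objective function here is a toy stand-in for a real training-and-evaluation run:

```python
from itertools import product

def objective(x, y):
    # Toy performance metric (lower is better); in practice this would
    # train a model and return its validation error.
    return (x - 10) ** 2 + (y - 0.5) ** 2

# Candidate values for each hyperparameter.
grid = {"x": [1, 10, 100], "y": [0.0, 0.5, 1.0]}

# Evaluate every combination and keep the best one.
best = min(
    (dict(zip(grid, combo)) for combo in product(*grid.values())),
    key=lambda params: objective(**params),
)
print(best)  # {'x': 10, 'y': 0.5}
```

Neptune automates exactly this loop: it launches one experiment per combination and tracks the best one for you.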
In Neptune, you can perform a grid search over an experiment's parameters. To run a grid search experiment, you need to specify:
- a set of values to try for each numeric parameter;
- a performance metric, based on one of numeric channels.
Neptune creates a group of experiments. Each experiment in the group is executed with a different combination of parameters. Neptune keeps track of the best experiment using the defined metric.
For each parameter, instead of a single value, you can pass ranges or lists of values.
For example, if you want to perform a grid search for x equal to three values (1, 10, and 100) and y between 0 and 10 with a step of 0.1, run the following command:
```shell
# via ctx.params
neptune send -p 'x:[1, 10, 100]' -p 'y:(0.0, 10.0, 0.1)'

# via sys.argv
neptune send main.py --x '%[1, 10, 100]' --y '%(0.0, 10.0, 0.1)'
```
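The list form enumerates values explicitly, while the (start, end, step) form is expanded into a sequence of values. A rough sketch of that expansion is below; the helper name is ours, and we assume the end of the range is exclusive (like Python's built-in range), which is consistent with the experiment count shown later in this document:

```python
def expand_range(start, end, step):
    """Expand a (start, end, step) spec into concrete values.

    The end is treated as exclusive; the small tolerance guards
    against floating-point drift when accumulating the step.
    """
    values = []
    v = start
    while v < end - 1e-9:
        values.append(round(v, 10))
        v += step
    return values

print(expand_range(1, 5, 1))               # [1, 2, 3, 4]
print(expand_range(0.0, 10.0, 0.1)[:3])    # [0.0, 0.1, 0.2]
```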
You can run a grid search with string parameters as well:
```shell
# via ctx.params
neptune send -p 'classifier:["SVM", "Naive Bayes", "Random Forest"]'

# via sys.argv
neptune send main.py --classifier '%["SVM", "Naive Bayes", "Random Forest"]'
```
A metric is a way for you to tell Neptune how to select the best experiment in a group. It consists of:
- channel – the name of a numeric channel defined in the experiment. The newest value in that channel will always be treated as the metric value;
- goal – either maximize or minimize, telling Neptune whether higher or lower values of the metric are better.
You define the metric in the configuration file, like this:
```yaml
metric:
  channel: some_channel_name
  goal: maximize
```
Download the example project from this zip or from the Neptune examples repo:
```shell
git clone https://github.com/neptune-ml/neptune-examples.git
cd neptune-examples/4-grid-search
```
main.py runs random forest regression on the diabetes dataset. Two parameters of the model, namely the number of trees and the max tree depth, are declared as the experiment's parameters.

At the end, the MSE of the prediction is sent to the mse channel. This channel is declared as the metric for this experiment in the configuration file:

```yaml
metric:
  channel: mse
  goal: minimize
```
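The actual main.py lives in the example repo; as a rough sketch of what such a script can look like (using scikit-learn; the train/test split, the random_state values, and the print in place of the Neptune channel call are our assumptions, not the example's exact code):

```python
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# In the real script these come in as the experiment's parameters
# (via the command line or the Neptune context).
n_estimators = 8   # number of trees
max_depth = 3      # max tree depth

# The diabetes dataset ships with scikit-learn, no download needed.
X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestRegressor(
    n_estimators=n_estimators, max_depth=max_depth, random_state=0
)
model.fit(X_train, y_train)

mse = mean_squared_error(y_test, model.predict(X_test))
# In the real script, this value is sent to the 'mse' channel here;
# for the sketch we just print it.
print(mse)
```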
Now let’s say we want to find the best parameters for this model. We can run:
```shell
neptune send main.py --max_depth '%(1, 5, 1)' --n_estimators '%[2, 4, 8, 16]'
```
This creates a group of 16 experiments. Each of them trains a random forest with a different combination of parameters.
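The count follows from the sizes of the two value sets: assuming the range end is exclusive, (1, 5, 1) yields 4 values for max_depth, and the list gives 4 values for n_estimators, so the grid has 4 × 4 = 16 combinations:

```python
from itertools import product

max_depth_values = [1, 2, 3, 4]      # from (1, 5, 1): start 1, end 5 exclusive, step 1
n_estimators_values = [2, 4, 8, 16]  # explicit list

combos = list(product(max_depth_values, n_estimators_values))
print(len(combos))  # 16
```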
In the Neptune Web UI, we can see the running experiments. Neptune live-tracks the best experiment - in this case, the one with the lowest MSE: