Chapter 4. Hyperparameter Tuning
In the realm of machine learning, hyperparameter tuning is a “meta” learning task. It happens to be one of my favorite subjects because it can appear like black magic, yet its secrets are not impenetrable. In this chapter, we’ll talk about hyperparameter tuning in detail: why it’s hard, and what kind of smart tuning methods are being developed to do something about it.
Model Parameters Versus Hyperparameters
First, let’s define what a hyperparameter is, and how it is different from a normal nonhyper model parameter.
Machine learning models are basically mathematical functions that represent the relationship between different aspects of data. For instance, a linear regression model uses a line to represent the relationship between “features” and “target.” The formula looks like this:
![]()
where x is a vector that represents features of the data and y is a scalar variable that represents the target (some numeric quantity that we wish to learn to predict).
This model assumes that the relationship between x and y is linear. The variable w is a weight vector that represents the normal vector for the line; it specifies the slope of the line. This is what’s known as a model parameter, which is learned during the training phase. “Training a model” involves using an optimization procedure to determine the best model parameter that “fits” the data.
There is another ...