Chapter 3. Offline Evaluation Mechanisms: Hold-Out Validation, Cross-Validation, and Bootstrapping
Now that we’ve discussed the metrics, let’s re-situate ourselves in the machine learning model workflow that we unveiled in Figure 1-1. We are still in the prototyping phase. This is the stage where we tweak everything: features, model types, training methods, and so on. Let’s dive a little deeper into model selection.
Unpacking the Prototyping Phase: Training, Validation, Model Selection
Each time we tweak something, we come up with a new model. Model selection refers to the process of selecting the right model (or type of model) that fits the data. This is done using validation results, not training results. Figure 3-1 gives a simplified view of this mechanism.
In Figure 3-1, hyperparameter tuning is illustrated as a “meta” process that controls the training process. We’ll discuss exactly how it is done in Chapter 4. Take note that the available historical dataset is split into two parts: training and validation. The model training process receives training data and produces a model, which is evaluated on validation data. The results from validation are passed back to the hyperparameter tuner, which tweaks some knobs and trains the model again.
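The loop just described can be sketched in a few lines of Python. This is a minimal illustration, not a recommended tuning setup: the dataset, the logistic regression model, and the grid of `C` values are all hypothetical stand-ins, and Chapter 4 covers how hyperparameter tuning is actually done.

```python
# A sketch of the split-train-validate-tune loop from Figure 3-1.
# The data, the estimator, and the hyperparameter grid are hypothetical;
# any scikit-learn estimator could stand in for LogisticRegression.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))            # hypothetical features
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # hypothetical labels

# Split the available historical dataset into training and validation parts.
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

best_score, best_C = -1.0, None
for C in [0.01, 0.1, 1.0, 10.0]:                    # knobs the tuner tweaks
    model = LogisticRegression(C=C).fit(X_tr, y_tr)  # train on training data
    score = model.score(X_val, y_val)                # evaluate on validation data
    if score > best_score:                           # keep the best knob setting
        best_score, best_C = score, C

print("best C:", best_C, "validation accuracy:", round(best_score, 3))
```

Note that the "score" driving the selection comes from the validation split, never from the training split; the next section explains why.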
The question is, why must the model be evaluated on two different datasets? ...