In high-dimensional settings, where many possible signals could be included in your model, you need to take care to select the model that best predicts future data and to avoid overfitting. To do this, you first use recipes that produce a good array of candidate models. You then choose among these candidates by minimizing an estimate of the error rate when predicting on new data. This chapter introduces the key tools for such high-dimensional modeling.
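As a rough illustration of this two-step workflow, the sketch below builds a small array of candidate models and then picks the one with the lowest error on held-out data. Everything in it is an assumption made for illustration: the data are simulated, the candidates come from a simple screening recipe (rank the signals by their absolute covariance with the response, then fit least squares on the top k), and a single train/validation split stands in for the estimate of error on new data. It is not the specific recipe or error estimate that this chapter develops.

```python
import random

def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def fit_ols(X, y, cols):
    """Least squares with an intercept, using only the columns in `cols`."""
    Z = [[1.0] + [row[j] for j in cols] for row in X]
    k = len(cols) + 1
    A = [[sum(z[a] * z[b] for z in Z) for b in range(k)] for a in range(k)]
    rhs = [sum(z[a] * yi for z, yi in zip(Z, y)) for a in range(k)]
    return solve(A, rhs)

def mse(beta, X, y, cols):
    """Mean squared prediction error of the fitted coefficients on (X, y)."""
    total = 0.0
    for row, yi in zip(X, y):
        pred = beta[0] + sum(bj * row[j] for bj, j in zip(beta[1:], cols))
        total += (yi - pred) ** 2
    return total / len(y)

# Simulated data (hypothetical): 30 signals, only the first three matter.
random.seed(0)
n, p = 200, 30
X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [2.0 * r[0] - 1.5 * r[1] + 1.0 * r[2] + random.gauss(0, 1) for r in X]
X_tr, y_tr, X_va, y_va = X[:100], y[:100], X[100:], y[100:]

def abs_cov(j):
    """Absolute (unnormalized) training covariance between signal j and y."""
    xbar = sum(r[j] for r in X_tr) / len(X_tr)
    ybar = sum(y_tr) / len(y_tr)
    return abs(sum((r[j] - xbar) * (yi - ybar) for r, yi in zip(X_tr, y_tr)))

# Step 1: a recipe for candidate models, here the top-k screened signals.
order = sorted(range(p), key=abs_cov, reverse=True)

# Step 2: estimate each candidate's error on data it was not fit to.
train_err, val_err = [], []
for k in range(21):
    cols = order[:k]
    beta = fit_ols(X_tr, y_tr, cols)
    train_err.append(mse(beta, X_tr, y_tr, cols))
    val_err.append(mse(beta, X_va, y_va, cols))

best_k = min(range(len(val_err)), key=val_err.__getitem__)
print("selected number of signals:", best_k)
print("in-sample MSE, largest model:", round(train_err[-1], 3))
print("held-out MSE, selected model:", round(val_err[best_k], 3))
```

Because the candidates are nested, the in-sample error can only fall as signals are added, so it cannot be used to choose among them. The held-out error is what makes the comparison meaningful.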
In the previous chapter, we introduced deviance as a measure of how tightly your model fits the training data. When you apply your models for prediction ...