Chapter 10. Resampling for Evaluating Performance

We have already covered several pieces that must be put together to evaluate the performance of a model. Chapter 9 described statistics for measuring model performance. Chapter 5 introduced the idea of data spending, and we recommended the test set for obtaining an unbiased estimate of performance. However, we usually need to understand the performance of a model or even multiple models before using the test set.


Typically we can’t decide on which final model to use with the test set before first assessing model performance. There is a gap between our need to measure performance reliably and the data splits (training and testing) we have available.

In this chapter, we describe an approach called resampling that can fill this gap. Resampling estimates of performance can generalize to new data in a similar way as estimates from a test set. Chapter 11 complements this one by demonstrating statistical methods that compare resampling results.

In order to fully appreciate the value of resampling, let’s first take a look the resubstitution approach, which can often fail.

The Resubstitution Approach

When we measure performance on the same data that we used for training (as opposed to new data or testing data), we say we have resubstituted the data. Let’s again use the Ames data to demonstrate these concepts. The end of Chapter 8 summarizes the current state of our Ames analysis. It includes a recipe object named ames_rec, a linear ...

Get Tidy Modeling with R now with the O’Reilly learning platform.

O’Reilly members experience live online training, plus books, videos, and digital content from nearly 200 publishers.