Chapter 19. When Should You Trust Your Predictions?
A predictive model can almost always produce a prediction, given input data. However, in plenty of situations it is inappropriate to produce such a prediction. When a new data point is well outside of the range of data used to create the model, making a prediction may be an inappropriate extrapolation. A more qualitative example of an inappropriate prediction would be when the model is used in a completely different context. The cell segmentation data used in Chapter 14 flags when human breast cancer cells can or cannot be accurately isolated inside an image. A model built from these data could be inappropriately applied to stomach cells for the same purpose. We can produce a prediction, but it is unlikely to be applicable to the different cell type.
This chapter discusses two methods for quantifying the potential prediction quality:
- Equivocal zones: This method uses the predicted values to alert the user that results may be suspect.
- Applicability: This method uses the predictors to measure the amount of extrapolation (if any) for new samples.
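To make the idea of measuring extrapolation from the predictors concrete, here is a minimal sketch in Python. It is not the chapter's method: it assumes a deliberately crude rule that flags a new sample when any predictor falls outside the range seen in the training data. The function names and the toy data are hypothetical, chosen only for illustration.

```python
# Illustrative sketch only: a per-predictor range check as a crude
# stand-in for an applicability measure. All names and data are hypothetical.

def training_ranges(rows):
    """Return a (min, max) pair for each predictor column of the training rows."""
    columns = list(zip(*rows))
    return [(min(col), max(col)) for col in columns]


def is_extrapolation(sample, ranges):
    """Return True if any predictor value lies outside its training range."""
    return any(value < lo or value > hi for value, (lo, hi) in zip(sample, ranges))


# Toy training set with two numeric predictors per row (assumed data).
train = [
    (1.0, 10.0),
    (2.0, 12.0),
    (3.0, 15.0),
]

ranges = training_ranges(train)

# Both predictors fall inside the training ranges.
inside = is_extrapolation((2.5, 11.0), ranges)

# The second predictor is far beyond anything in the training set.
outside = is_extrapolation((2.5, 40.0), ranges)

print(inside, outside)
```

A range check like this treats each predictor separately, so it can miss a sample that is unusual only in its combination of values. It is meant to show where the predictors enter the decision, not to serve as a complete applicability measure.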
Equivocal Results
If a model result indicated that you had a 51% chance of having contracted COVID-19, it would be natural to view the diagnosis with some skepticism. In fact, regulatory bodies often require many medical diagnostics to have an equivocal zone. This zone is a range of results in which the prediction should not be reported to patients, for example, some range of COVID-19 ...
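The notion of an equivocal zone can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a regulatory rule or the chapter's implementation: the 0.5 threshold, the 0.15 buffer on either side of it, and the function name are all hypothetical choices.

```python
# Illustrative sketch only: withhold a class prediction when the predicted
# probability is too close to the decision threshold. The threshold, the
# buffer width, and the function name are assumptions made for illustration.

def classify_with_equivocal_zone(prob, threshold=0.5, buffer=0.15):
    """Return 'positive', 'negative', or 'equivocal' for a predicted probability."""
    if abs(prob - threshold) <= buffer:
        # Too close to the threshold to report a result.
        return "equivocal"
    return "positive" if prob > threshold else "negative"


# A 51% probability sits inside the assumed zone, so no result is reported.
borderline = classify_with_equivocal_zone(0.51)

# Probabilities far from the threshold are reported as usual.
confident_positive = classify_with_equivocal_zone(0.90)
confident_negative = classify_with_equivocal_zone(0.20)

print(borderline, confident_positive, confident_negative)
```

Widening the buffer withholds more results in exchange for greater confidence in the ones that are reported, so in practice the width is a trade-off to be chosen for the application rather than a fixed constant.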