December 2018
A different perspective on the challenge of adapting an algorithm to data is the trade-off between bias and variance, the two sources of prediction error beyond the natural noisiness of the data. A model that is too simple to capture the relationships in the data will underfit and exhibit bias; that is, it will make systematically wrong predictions. A model that is too complex will overfit, learning the noise in addition to any signal, so that its predictions will vary widely across different samples.
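The contrast between the two failure modes can be illustrated with a small simulation. The following is only a minimal sketch under assumptions that are not from the text: the target is a noisy sine curve, the "too simple" model predicts the training mean everywhere, and the "too complex" model is a one-nearest-neighbor lookup that reproduces the noisy training labels. All function names and parameters (`true_fn`, `sample_training_set`, `bias_variance`, the sample size and the noise level) are hypothetical choices made for illustration. Each model is refit on many independently drawn training sets, and the squared bias and the variance of its predictions are estimated on a fixed grid of points.

```python
import math
import random


def true_fn(x):
    # Assumed ground-truth signal, used only for this illustration.
    return math.sin(2 * math.pi * x)


def sample_training_set(rng, n=30, noise_sd=0.3):
    # Draw n inputs uniformly from [0, 1) and add Gaussian noise to the signal.
    xs = [rng.random() for _ in range(n)]
    ys = [true_fn(x) + rng.gauss(0, noise_sd) for x in xs]
    return xs, ys


def fit_constant(xs, ys):
    # Very simple model: always predict the mean of the training targets.
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_nearest_neighbor(xs, ys):
    # Very flexible model: return the label of the closest training point.
    def predict(x):
        i = min(range(len(xs)), key=lambda j: abs(xs[j] - x))
        return ys[i]
    return predict


def bias_variance(fit, n_datasets=200, seed=0):
    # Refit the model on many training sets and collect its predictions
    # at a fixed grid of evaluation points.
    rng = random.Random(seed)
    grid = [i / 20 for i in range(21)]
    preds = [[] for _ in grid]
    for _ in range(n_datasets):
        model = fit(*sample_training_set(rng))
        for k, x in enumerate(grid):
            preds[k].append(model(x))

    bias_sq = 0.0
    variance = 0.0
    for x, p in zip(grid, preds):
        mean_p = sum(p) / len(p)
        # Squared gap between the average prediction and the true signal.
        bias_sq += (mean_p - true_fn(x)) ** 2
        # Spread of the predictions around their own average.
        variance += sum((v - mean_p) ** 2 for v in p) / len(p)
    return bias_sq / len(grid), variance / len(grid)


const_bias, const_var = bias_variance(fit_constant)
nn_bias, nn_var = bias_variance(fit_nearest_neighbor)
print(f"constant model:   bias^2={const_bias:.3f}  variance={const_var:.3f}")
print(f"nearest neighbor: bias^2={nn_bias:.3f}  variance={nn_var:.3f}")
```

In this setup, the constant model should show the larger squared bias and the nearest-neighbor model the larger variance, which mirrors the underfitting and overfitting cases described above. The exact figures depend on the assumed noise level, sample size and random seed.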
The key tool to diagnose this trade-off at any given iteration of the model selection and optimization process is the learning curve. It shows how training and validation errors depend on the sample size. This ...