Clojure for Data Science by Henry Garner

Bias and variance

Overfitting occurs when a machine learning algorithm produces very accurate results on a training dataset but fails to generalize well from what it has learned. We say that models that have overfit the data have very high variance. When we trained our decision tree on data that included the numeric age of passengers, we were overfitting the data.
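
To make high variance concrete, here is a minimal sketch in Python. It uses a handful of invented (age, survived) pairs, not the Titanic data discussed in the text, and the `fit_memorizer` helper is hypothetical: it is a stand-in for any model flexible enough to memorize each exact numeric age, not the decision tree built earlier. Such a model scores perfectly on the examples it was trained on, yet has learned nothing it can apply to an age it has not seen.

```python
def fit_memorizer(examples, default=False):
    """Return a predictor that memorizes the label seen for each exact age.

    Hypothetical helper for illustration only. Ages never seen in training
    fall back to `default`.
    """
    lookup = {age: survived for age, survived in examples}

    def predict(age):
        return lookup.get(age, default)

    return predict


def accuracy(predict, examples):
    """Fraction of (age, survived) examples that `predict` labels correctly."""
    correct = sum(1 for age, survived in examples if predict(age) == survived)
    return correct / len(examples)


# Invented toy data: (age, survived). Not taken from the Titanic dataset.
train = [(22.0, False), (38.0, True), (26.0, True), (35.0, False)]
test = [(27.0, True), (54.0, True), (2.0, True), (14.0, False)]

model = fit_memorizer(train)
train_accuracy = accuracy(model, train)  # every training age is memorized
test_accuracy = accuracy(model, test)    # no test age was seen in training
```

With this toy data the memorizer is right on all four training examples but on only one of the four unseen examples, which is the gap between training and test performance that characterizes an overfit model.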

Conversely, certain models may have very high bias. This is a situation where the model has a strong tendency towards a certain outcome, irrespective of training examples to the contrary. Recall our example of a classifier that always predicts that a passenger will perish. This classifier would perform well on a dataset with low survivor rates, ...
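
The effect of high bias can be sketched in a few lines of Python. The two datasets below are invented for illustration (they are not the Titanic data), and `always_perish` is a hypothetical name for the constant classifier recalled above. Because the prediction never changes, accuracy is simply the fraction of passengers who perished in whatever data the classifier is scored on.

```python
def always_perish(passenger):
    """A maximally biased classifier: ignores its input entirely."""
    return False  # False means "did not survive"


def accuracy(predict, examples):
    """Fraction of (passenger, survived) examples that `predict` gets right."""
    correct = sum(1 for passenger, survived in examples
                  if predict(passenger) == survived)
    return correct / len(examples)


# Invented toy datasets of (passenger, survived) pairs. The passenger
# features are left as None because the classifier never looks at them.
low_survival = [(None, False)] * 9 + [(None, True)] * 1   # 10% survived
high_survival = [(None, False)] * 1 + [(None, True)] * 9  # 90% survived

low_survival_accuracy = accuracy(always_perish, low_survival)
high_survival_accuracy = accuracy(always_perish, high_survival)
```

On these toy datasets the same classifier is correct for nine in ten passengers when survival is rare and for only one in ten when survival is common. No amount of training data would change its predictions.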
