Predictive Analytics Using Rattle and Qlik Sense by Ferran Garcia Pagans

Underfitting and overfitting

Underfitting and overfitting are problems not just for classifiers but for all supervised methods.

Imagine you have a classifier with just one rule that tries to distinguish between healthy and unhealthy patients. The rule is as follows:

If Temperature < 37 then Healthy

This classifier labels every patient with a temperature below 37 degrees as healthy, regardless of any other symptom. Because many illnesses do not cause a fever, it will have a very high error rate. The tree that represents this rule has only a root node and two branches, with a leaf at the end of each branch.

Underfitting occurs when the tree is too shallow to classify new observations correctly; its rules are too general.
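
To make the single-rule classifier concrete, here is a minimal sketch in Python. It is only an illustration: the patient records are invented toy data, and the names `patients`, `classify`, and `error_rate` are hypothetical helpers rather than anything taken from Rattle or Qlik Sense. It applies the temperature rule to each record and counts how often the prediction disagrees with the true label.

```python
# Hypothetical toy data: (temperature in degrees, true label).
# These records are invented purely for illustration.
patients = [
    (36.5, "Healthy"),
    (36.8, "Not healthy"),  # ill, but without a fever
    (36.6, "Not healthy"),  # ill, but without a fever
    (38.2, "Not healthy"),
    (36.9, "Healthy"),
    (36.4, "Not healthy"),  # ill, but without a fever
]


def classify(temperature):
    """Single-rule classifier: If Temperature < 37 then Healthy."""
    return "Healthy" if temperature < 37 else "Not healthy"


def error_rate(records):
    """Fraction of records whose predicted label differs from the true label."""
    errors = sum(1 for temp, label in records if classify(temp) != label)
    return errors / len(records)


rate = error_rate(patients)
print(f"Error rate: {rate:.2f}")
```

On this made-up sample the rule misclassifies every unhealthy patient who has no fever. One attribute and one threshold are too general to separate the two classes, which is the underfitting behavior described above.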

On the other hand, if we have a dataset with many attributes, and if we generate a very ...
