Best practice 16 – reducing overfitting

We touched on ways to avoid overfitting when discussing the pros and cons of algorithms in the previous practice. We summarize them formally here:

  • Cross-validation, a good habit that we have built over all of the chapters in this book.
  • Regularization. It adds penalty terms to the training objective to reduce the error caused by fitting the model too closely to the given training set.
  • Simplification, if possible. The more complex the model is, the higher the chance of overfitting. Complex models include a tree or forest with excessive depth, a linear regression with a high-degree polynomial transformation, and an SVM with a complicated kernel.
  • Ensemble learning, combining a collection of weak models to form a stronger one. ...
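
To make the first two items concrete, the following is a minimal sketch in plain Python, not a definitive implementation. It assumes a one-feature ridge regression without an intercept, a small made-up dataset, and contiguous (unshuffled) folds. The helper names `k_fold_indices`, `fit_ridge_1d`, and `cv_mse` are hypothetical and introduced only for illustration; in practice, a library such as scikit-learn would normally handle both steps.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous (train, test) index pairs."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        folds.append((train, test))
        start += size
    return folds


def fit_ridge_1d(xs, ys, alpha):
    """Minimize sum((y - w * x) ** 2) + alpha * w ** 2 in closed form."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + alpha)


def cv_mse(xs, ys, alpha, k=5):
    """Average the held-out mean squared error over k folds."""
    fold_errors = []
    for train, test in k_fold_indices(len(xs), k):
        w = fit_ridge_1d([xs[i] for i in train], [ys[i] for i in train], alpha)
        errors = [(ys[i] - w * xs[i]) ** 2 for i in test]
        fold_errors.append(sum(errors) / len(errors))
    return sum(fold_errors) / len(fold_errors)


# Made-up data for illustration only: y is roughly 2 * x plus small noise.
xs = [float(x) for x in range(1, 11)]
noise = [0.3, -0.2, 0.1, -0.4, 0.2, -0.1, 0.4, -0.3, 0.1, -0.1]
ys = [2 * x + e for x, e in zip(xs, noise)]

alphas = (0.0, 10.0, 100.0)
# A larger penalty shrinks the fitted weight toward zero.
weights = {alpha: fit_ridge_1d(xs, ys, alpha) for alpha in alphas}
# Cross-validation scores each penalty on data the model did not see.
scores = {alpha: cv_mse(xs, ys, alpha) for alpha in alphas}
best_alpha = min(scores, key=scores.get)

for alpha in alphas:
    print(f"alpha={alpha}: weight={weights[alpha]:.3f}, cv_mse={scores[alpha]:.3f}")
print("penalty with the lowest cross-validated error:", best_alpha)
```

The penalty strength is chosen by held-out error rather than training error, which is the point of combining the two techniques: a model that merely memorizes the training folds gains nothing on the test folds.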
