
Learn Python by Building Data Science Applications

by Philipp Kats, David Katz

August 2019

Beginner

482 pages

12h 56m

English

Packt Publishing


Content preview from Learn Python by Building Data Science Applications

Chapter 14

What is overfitting?

Many ML models (decision trees, for example) are fitted to perform as well as possible on the training set at hand. At some point, this fitting goes beyond the generalizable knowledge that's valuable for the task, and the model starts to pick up patterns that are specific to the training data and irrelevant to the test set. Learning those patterns is not only meaningless but also hurts the model's performance on other data. This phenomenon is known as overfitting, and there are ways to overcome it.
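The gap between training and test performance can be illustrated with a small, self-contained sketch. Everything in it is an assumption made for illustration and is not taken from the book: the synthetic dataset, the deterministic label noise, and the two toy models (`memorizer`, a 1-nearest-neighbor lookup standing in for a model that fits the training set too closely, and `simple_rule`, a fixed threshold).

```python
def make_data(offset, is_noisy):
    """Build 100 (x, label) pairs. The true label is 1 when x >= 0.5;
    it is flipped wherever is_noisy(i) is true (hypothetical label noise)."""
    data = []
    for i in range(100):
        x = (i + offset) / 100
        label = int(x >= 0.5)
        if is_noisy(i):
            label = 1 - label
        data.append((x, label))
    return data


# Training and test sets share the same underlying rule but carry different noise.
train = make_data(0.0, lambda i: i % 5 == 0)
test = make_data(0.25, lambda i: i % 7 == 3)


def memorizer(x):
    """Copy the label of the closest training point, noise included."""
    nearest = min(train, key=lambda pair: abs(pair[0] - x))
    return nearest[1]


def simple_rule(x):
    """Ignore the noise and keep only the general threshold."""
    return int(x >= 0.5)


def accuracy(model, data):
    """Share of points for which the model predicts the stored label."""
    return sum(model(x) == y for x, y in data) / len(data)


for name, model in [("memorizer", memorizer), ("simple_rule", simple_rule)]:
    print(name, accuracy(model, train), accuracy(model, test))
```

In this toy setup, the memorizer reproduces every training label, noise included, yet scores lower on the test set than the simple threshold rule. That pattern is the symptom described above.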

Why should we use cross-validation?

Cross-validation is a technique aimed at overcoming overfitting. In its basic form, it splits the training set into multiple folds, trains multiple models with the same settings on different combinations of those folds, and measures their performance ...
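The basic form described here can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the book's implementation: the helper names (`k_fold_indices`, `cross_validate`, `fit_mean`, `mean_abs_error`) are hypothetical, the folds are contiguous and unshuffled, and the "model" simply predicts the mean of its training targets.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validate(xs, ys, fit, score, k=5):
    """Train one model per fold on the remaining folds; score it on the held-out fold."""
    scores = []
    for held_out in k_fold_indices(len(xs), k):
        held = set(held_out)
        train_idx = [i for i in range(len(xs)) if i not in held]
        model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        scores.append(
            score(model, [xs[i] for i in held_out], [ys[i] for i in held_out])
        )
    return scores


def fit_mean(xs, ys):
    """Toy model (assumption): always predict the mean training target."""
    mean = sum(ys) / len(ys)
    return lambda x: mean


def mean_abs_error(model, xs, ys):
    """Average absolute difference between predictions and targets."""
    return sum(abs(model(x) - y) for x, y in zip(xs, ys)) / len(ys)


xs = list(range(10))
ys = [2 * x for x in xs]
scores = cross_validate(xs, ys, fit_mean, mean_abs_error, k=5)
print(scores, sum(scores) / len(scores))
```

Each score comes from data the corresponding model never saw during fitting, so the average is a less optimistic estimate than the training score alone. Practical implementations usually shuffle the rows before splitting; that step is left out here to keep the sketch deterministic.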


Publisher Resources

ISBN: 9781789535365
