Learn Python by Building Data Science Applications

by Philipp Kats, David Katz

August 2019

Beginner

482 pages

12h 56m

English

Packt Publishing


Understanding cross-validation

In the previous chapter, we built a model under certain assumptions and settings and measured its performance with accuracy (the overall share of correctly classified labels). To do this, we split our data randomly into training and test sets. While that approach is fundamental, it has its problems. Most importantly, if we keep fine-tuning the model against the same test set, we may improve the score on that particular dataset at the expense of performance on other data; in other words, the model may get worse even as the metric gets better. This phenomenon is called overfitting.
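As a reminder of what the single random split and the accuracy metric look like, here is a minimal sketch in plain Python. The helper names `train_test_split` and `accuracy`, the `test_ratio` and `seed` parameters, and the toy data are illustrative assumptions, not the code from the previous chapter.

```python
import random


def train_test_split(rows, test_ratio=0.25, seed=0):
    """Shuffle a copy of rows and split it into (train, test) lists.

    Hypothetical helper for illustration only.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_test = int(round(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(y_true, y_pred):
    """Share of predicted labels that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Toy data: 20 row ids, a quarter of them held out for testing.
train, test = train_test_split(range(20), test_ratio=0.25)
print(len(train), len(test))  # 15 5

# Three of the four predictions match the true labels.
print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))  # 0.75
```

Because every tuning decision is scored against the same held-out rows, a score obtained this way can drift upward without the model getting any better on unseen data.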

To combat this issue, we'll use a slightly more complex approach: cross-validation. In its basic form, cross-validation creates multiple ...
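The preview cuts off before describing the procedure, so the following is only a sketch of one common form, k-fold cross-validation, in which the data are divided into k folds and each fold serves once as the test set while the remaining folds are used for training. The function name `k_fold_indices` and its parameters are hypothetical, and the book's own example may differ.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, test_idx) index lists for k contiguous folds.

    Hypothetical helper: assumes the rows were already shuffled, and
    gives the first n_samples % k folds one extra row each.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    fold_sizes = [
        n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)
    ]
    start = 0
    for size in fold_sizes:
        stop = start + size
        test_idx = list(range(start, stop))
        train_idx = list(range(0, start)) + list(range(stop, n_samples))
        yield train_idx, test_idx
        start = stop


folds = list(k_fold_indices(10, k=3))
for train_idx, test_idx in folds:
    print(test_idx)
# [0, 1, 2, 3]
# [4, 5, 6]
# [7, 8, 9]
```

Training and scoring a model once per fold and averaging the scores gives an estimate that depends less on any single split than one fixed test set does.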


Publisher Resources

ISBN: 9781789535365

