Python Data Analysis Cookbook

by Ivan Idris

July 2016

Beginner to intermediate

462 pages

9h 14m

English

Packt Publishing

Recursively eliminating features

If we have many features (explanatory variables), it is tempting to include them all in our model. However, we then run the risk of overfitting: getting a model that works very well on the training data and very badly on unseen data. Such a model also tends to be slower and to require more memory. We have to weigh accuracy (or another metric) against speed and memory requirements.
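The overfitting risk described above can be illustrated with a small synthetic experiment. This is only a sketch under stated assumptions: the data is randomly generated, the one informative column and the 30 noise columns are invented for illustration, and scikit-learn's `LinearRegression` stands in for whatever model is actually in use.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic data (an assumption for illustration): one informative
# feature plus 30 pure-noise features that carry no signal.
rng = np.random.default_rng(0)
n_samples, n_noise = 80, 30
signal = rng.normal(size=(n_samples, 1))
noise = rng.normal(size=(n_samples, n_noise))
y = 3.0 * signal[:, 0] + rng.normal(scale=1.0, size=n_samples)


def train_test_r2(X):
    """Fit on half of the rows and return (train R^2, test R^2)."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.5, random_state=0
    )
    model = LinearRegression().fit(X_train, y_train)
    return model.score(X_train, y_train), model.score(X_test, y_test)


# Model using only the informative feature.
small_train, small_test = train_test_r2(signal)
# Model using the informative feature and all the noise features.
big_train, big_test = train_test_r2(np.hstack([signal, noise]))

print(f"1 feature:   train R^2={small_train:.3f}, test R^2={small_test:.3f}")
print(f"31 features: train R^2={big_train:.3f}, test R^2={big_test:.3f}")
```

With the extra noise columns, the training score cannot go down, while the score on held-out rows typically gets worse. That gap between the two scores is the symptom to watch for.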

We can try to ignore features or to create new, better compound features. For instance, in online advertising, it is common to work with ratios, such as the ratio of views to clicks for an ad. Common sense or domain knowledge can help us select features. In the worst-case scenario, we may have to rely on correlations ...
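As a rough illustration of the idea in this recipe's title, the following sketch uses scikit-learn's `RFE` class, which repeatedly fits an estimator and drops the weakest feature until a target number remains. The preview does not show the book's own code, so everything here is an assumption: the synthetic dataset from `make_regression`, the choice of `LinearRegression` as the estimator, and the decision to keep three features.

```python
from sklearn.datasets import make_regression
from sklearn.feature_selection import RFE
from sklearn.linear_model import LinearRegression

# Synthetic regression problem (an assumption for illustration):
# 10 features, of which only 3 actually influence the target.
X, y = make_regression(
    n_samples=200,
    n_features=10,
    n_informative=3,
    noise=0.1,
    random_state=0,
)

# Recursively eliminate one feature per iteration (step=1) until
# three remain. The estimator must expose coef_ or
# feature_importances_ so that RFE can rank the features.
selector = RFE(LinearRegression(), n_features_to_select=3, step=1)
selector.fit(X, y)

# support_ is a boolean mask of the kept features; ranking_ assigns
# rank 1 to kept features and higher ranks to those dropped earlier.
print("Kept features:", selector.support_)
print("Feature ranking:", selector.ranking_)

# Reduce the data to the selected columns only.
X_reduced = selector.transform(X)
print("Reduced shape:", X_reduced.shape)
```

If the right number of features is not known in advance, scikit-learn also provides `RFECV`, which chooses that number by cross-validation.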

ISBN: 9781785282287