Data Smart: Using Data Science to Transform Information into Insight

by John W. Foreman
November 2013
Beginner to intermediate
432 pages
10h 39m
English
Wiley

7

Ensemble Models: A Whole Lot of Bad Pizza

On the American version of the popular TV show The Office, the boss, Michael Scott, buys pizza for his employees. Everyone groans when they learn that he has unfortunately bought pizza from Pizza by Alfredo instead of Alfredo's Pizza. Although it's cheaper, apparently pizza from Pizza by Alfredo is awful.

In response to their protests, Michael asks his employees a question: is it better to have a small amount of really good pizza or a lot of really bad pizza?

For many practical artificial intelligence implementations, the answer is arguably the latter. In the previous chapter, you built a single, good model for predicting which households shopping at RetailMart were pregnant. What if, instead, you got democratic? What if you built a bunch of admittedly crappy models and let them vote on whether a customer was pregnant? The vote tally would then be used as a single prediction.
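The voting idea can be illustrated with a small simulation. This is a minimal sketch, not the book's method: it assumes each weak model is right 60% of the time and, unrealistically, that the models make their mistakes independently of one another. The function names (`weak_vote`, `ensemble_predict`, `simulate`) are hypothetical and exist only for this illustration.

```python
import random


def weak_vote(truth: bool, accuracy: float, rng: random.Random) -> bool:
    """Return a noisy guess that matches `truth` with probability `accuracy`."""
    return truth if rng.random() < accuracy else not truth


def ensemble_predict(votes: list[bool]) -> bool:
    """Majority vote: predict True only when more than half the models say True."""
    return sum(votes) > len(votes) / 2


def simulate(n_models: int, accuracy: float, n_trials: int, seed: int = 0) -> float:
    """Estimate how often the majority vote of `n_models` weak models is correct.

    Assumption (for illustration only): the models err independently.
    """
    rng = random.Random(seed)
    correct = 0
    for _ in range(n_trials):
        truth = rng.random() < 0.5  # e.g. "is this customer pregnant?"
        votes = [weak_vote(truth, accuracy, rng) for _ in range(n_models)]
        if ensemble_predict(votes) == truth:
            correct += 1
    return correct / n_trials


if __name__ == "__main__":
    print("one weak model :", simulate(1, 0.6, 2000))
    print("101 weak models:", simulate(101, 0.6, 2000))
```

Under the independence assumption, the vote of many mediocre models should be right far more often than any single one of them. Models trained on the same data tend to make correlated mistakes, so gains in practice are smaller than this toy simulation suggests.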

This type of approach is called ensemble modeling, and as you'll see, it turns simple observations into gold.

You'll be going over a type of ensemble model called bagged decision stumps, which is very close to an approach used constantly in industry called the random forest model. In fact, it's very nearly the approach I use daily in my own life here at MailChimp.com to predict when a user is about to send some spam.
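To make the term concrete, here is a rough sketch of bagged decision stumps in plain Python. It is not the author's implementation. It assumes binary (0/1) features and 0/1 labels, treats a decision stump as a one-split tree on a single feature, and trains each stump on a bootstrap resample of the rows using a random subset of the features. The function names, the default parameter values, and the "fewest misclassifications" rule for choosing the split are all assumptions made for illustration.

```python
import random
from collections import Counter


def train_stump(rows, labels, feature_ids):
    """Pick the one binary feature in `feature_ids` whose split misclassifies fewest rows."""
    overall = Counter(labels).most_common(1)[0][0]  # fallback for an empty side
    best = None
    for f in feature_ids:
        # Count labels on each side of the split (feature == 0, feature == 1).
        sides = {0: Counter(), 1: Counter()}
        for row, y in zip(rows, labels):
            sides[row[f]][y] += 1
        # Each side predicts its majority label.
        pred = {v: (c.most_common(1)[0][0] if c else overall) for v, c in sides.items()}
        errors = sum(pred[row[f]] != y for row, y in zip(rows, labels))
        if best is None or errors < best[0]:
            best = (errors, f, pred)
    _, f, pred = best
    return f, pred


def train_bagged_stumps(rows, labels, n_stumps=50, n_features=2, seed=0):
    """Train `n_stumps` stumps, each on a bootstrap sample and a random feature subset."""
    rng = random.Random(seed)
    n = len(rows)
    stumps = []
    for _ in range(n_stumps):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        feats = rng.sample(range(len(rows[0])), n_features)
        sample_rows = [rows[i] for i in idx]
        sample_labels = [labels[i] for i in idx]
        stumps.append(train_stump(sample_rows, sample_labels, feats))
    return stumps


def predict(stumps, row):
    """Return the fraction of stumps that vote 1 for this row (the vote tally)."""
    return sum(pred[row[f]] for f, pred in stumps) / len(stumps)
```

Calling `train_bagged_stumps(rows, labels)` on a list of 0/1 feature rows and then `predict(stumps, new_row)` returns a number between 0 and 1; a cutoff on that tally (0.5, say) turns it into a yes-or-no prediction.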

After bagging, you'll investigate another awesome technique called boosting. Both of these techniques find creative ways to use the training data over and over ...

Publisher Resources

ISBN: 9781118661468