Ensemble Models: A Whole Lot of Bad Pizza

On the American version of the popular TV show The Office, the boss, Michael Scott, buys pizza for his employees. Everyone groans when they learn that he has unfortunately bought pizza from Pizza by Alfredo instead of Alfredo's Pizza. Although it's cheaper, apparently pizza from Pizza by Alfredo is awful.

In response to their protests, Michael asks his employees a question: is it better to have a small amount of really good pizza or a lot of really bad pizza?

For many practical artificial intelligence implementations, the answer is arguably the latter. In the previous chapter, you built a single, good model for predicting which households shopping at RetailMart were pregnant. What if, instead, you got democratic? What if you built a bunch of admittedly crappy models and let them vote on whether a customer was pregnant? The vote tally would then be used as a single prediction.
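To make the voting idea concrete, here is a minimal Python sketch. The three "crappy models" and the purchase fields they look at (bought_folic_acid, bought_prenatal_vitamins, bought_wine) are hypothetical stand-ins invented for illustration, not the features or models from the previous chapter.

```python
def vote_tally(models, customer):
    """Return the fraction of models that vote 'pregnant' for one customer."""
    votes = [1 if model(customer) else 0 for model in models]
    return sum(votes) / len(votes)


# Three deliberately weak rules, each looking at a single purchase flag.
# The field names are hypothetical and exist only for this illustration.
weak_models = [
    lambda c: c["bought_folic_acid"],
    lambda c: c["bought_prenatal_vitamins"],
    lambda c: not c["bought_wine"],
]

customer = {
    "bought_folic_acid": True,
    "bought_prenatal_vitamins": False,
    "bought_wine": False,
}

# Two of the three rules vote "pregnant", so the tally is 2/3.
score = vote_tally(weak_models, customer)
print(score)
```

The tally is a score between 0 and 1 rather than a hard yes or no, so you are still free to pick whatever cutoff suits you when turning it into a final call.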

This type of approach is called ensemble modeling, and as you'll see, it turns simple observations into gold.

You'll be going over a type of ensemble model called bagged decision stumps, which is very close to an approach used constantly in industry called the random forest model. In fact, it's very nearly the approach I use daily in my own life here at MailChimp.com to predict when a user is about to send some spam.
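If you'd like a preview of what bagged decision stumps might look like in code, the sketch below is one rough way to do it in plain Python. It rests on several assumptions of mine: features are 0/1 flags, each stump is trained on a bootstrap sample (rows drawn with replacement) and a random handful of features, and a stump is just a single feature plus the value that signals "pregnant". The function names, the toy data, and the sampling details are all hypothetical, and the procedure worked through later in the chapter may differ.

```python
import random


def fit_stump(rows, labels, features):
    """Pick the (feature, polarity) pair with the highest accuracy on the sample.

    The stump predicts 1 whenever row[feature] == polarity.
    """
    best = None
    for j in features:
        for polarity in (0, 1):
            correct = sum(
                (row[j] == polarity) == bool(y) for row, y in zip(rows, labels)
            )
            if best is None or correct > best[0]:
                best = (correct, j, polarity)
    return best[1], best[2]


def fit_bagged_stumps(rows, labels, n_stumps=50, n_features=2, seed=0):
    """Train many stumps, each on its own random sample of rows and features."""
    rng = random.Random(seed)
    n = len(rows)
    stumps = []
    for _ in range(n_stumps):
        idx = [rng.randrange(n) for _ in range(n)]  # rows, with replacement
        feats = rng.sample(range(len(rows[0])), n_features)  # random features
        sample_rows = [rows[i] for i in idx]
        sample_labels = [labels[i] for i in idx]
        stumps.append(fit_stump(sample_rows, sample_labels, feats))
    return stumps


def predict_score(stumps, row):
    """Fraction of stumps voting 1 for this row."""
    return sum(row[j] == polarity for j, polarity in stumps) / len(stumps)


# Toy data (made up): columns are folic acid, prenatal vitamins, wine.
rows = [[1, 1, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 1], [1, 0, 1]]
labels = [1, 1, 1, 0, 0, 0]

stumps = fit_bagged_stumps(rows, labels)
print(predict_score(stumps, [1, 1, 0]))
```

No single stump here is worth much on its own. The point of bagging is that each one sees a slightly different slice of the data, so their individual mistakes tend not to line up when the votes are tallied.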

After bagging, you'll investigate another awesome technique called boosting. Both of these techniques find creative ways to use the training data over and over ...
