Chapter 7. Ensemble Models: A Whole Lot of Bad Pizza

On the American version of the popular TV show The Office, the boss, Michael Scott, buys pizza for his employees.

Perhaps you remember the episode. Everyone groans when they learn that he has bought pizza from Pizza by Alfredo instead of Alfredo's Pizza. Although it's cheaper, pizza from Pizza by Alfredo is apparently awful.

In response to their protests, Michael asks his employees a question: is it better to have a small amount of really good pizza or a lot of really bad pizza?

Take a moment to think about this one.

When it comes to AI, many implementations embody the latter. In the previous chapter, you built a single, good model for predicting pregnant households shopping at RetailMart. What if, instead, you took a democratic approach and trained a whole crowd of simple, mediocre models? These models would then vote on whether a customer was pregnant, and the vote tally becomes a single prediction. That's ensemble modeling for you, and as you'll see, it spins simple observations into gold.
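The voting idea can be sketched in a few lines of code. This is a minimal illustration, not the book's implementation: the three one-clue "models" and their feature names (prenatal vitamins, folic acid, maternity wear) are hypothetical stand-ins for the kind of RetailMart signals the previous chapter used.

```python
# A minimal sketch of the core ensemble idea: many weak classifiers
# each cast a vote, and the majority wins.
from collections import Counter

def ensemble_predict(models, customer):
    """Each model votes 0 (not pregnant) or 1 (pregnant);
    the majority of the vote tally becomes the single prediction."""
    votes = [model(customer) for model in models]
    return Counter(votes).most_common(1)[0][0]

# Three deliberately mediocre "models": each looks at one clue only.
# The feature names are hypothetical stand-ins for RetailMart data.
buys_prenatal_vitamins = lambda c: 1 if c.get("prenatal_vitamins") else 0
buys_folic_acid        = lambda c: 1 if c.get("folic_acid") else 0
buys_maternity_wear    = lambda c: 1 if c.get("maternity_wear") else 0

models = [buys_prenatal_vitamins, buys_folic_acid, buys_maternity_wear]
customer = {"prenatal_vitamins": True, "folic_acid": True,
            "maternity_wear": False}
print(ensemble_predict(models, customer))  # two of three vote yes -> 1
```

No single one of these one-clue models is any good on its own; the point is that their combined vote can be.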

This chapter first introduces bagged decision stumps, a type of ensemble method. In fact, it's nearly the approach used daily to predict when a user is about to send spam.
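Before digging in, here is a hedged, pure-Python sketch of the two moving parts the name describes: a decision *stump* (a one-split classifier) and *bagging* (training each stump on a bootstrap resample of the data, then letting the stumps vote). The toy binary data and the stump-fitting rule below are illustrative assumptions, not the book's spreadsheet implementation.

```python
# A minimal sketch of bagged decision stumps on binary features.
import random
from collections import Counter

def train_stump(rows):
    """Fit a one-split stump: pick the (feature, polarity) pair that
    best matches the labels on this training sample."""
    n_features = len(rows[0][0])
    best = None
    for j in range(n_features):
        for flip in (0, 1):
            correct = sum((x[j] ^ flip) == y for x, y in rows)
            if best is None or correct > best[0]:
                best = (correct, j, flip)
    _, j, flip = best
    return lambda x: x[j] ^ flip

def bag_stumps(rows, n_stumps=25, seed=0):
    """Bagging: train each stump on a bootstrap resample
    (drawn with replacement) of the training rows."""
    rnd = random.Random(seed)
    stumps = []
    for _ in range(n_stumps):
        sample = [rnd.choice(rows) for _ in rows]
        stumps.append(train_stump(sample))
    return stumps

def predict(stumps, x):
    """The ensemble's prediction is the majority vote of its stumps."""
    votes = Counter(s(x) for s in stumps)
    return votes.most_common(1)[0][0]

# Toy data: (features, label), where the label tracks the second feature.
data = [((0, 0, 1), 0), ((1, 1, 0), 1), ((0, 1, 1), 1),
        ((1, 0, 0), 0), ((0, 1, 0), 1), ((1, 0, 1), 0)]
stumps = bag_stumps(data)
print(predict(stumps, (0, 1, 0)))  # majority of stump votes -> 1
```

Because each stump sees a different bootstrap sample, the ensemble's members disagree just enough that their vote is more robust than any single stump.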

After bagging, we'll investigate another awesome technique called boosting. Both of these techniques find creative ways to use the training data over and over and over again (to train up an entire ensemble of classifiers). This methodology is similar to naïve Bayes in spirit: it represents a stupidity that, ...
