Predicting ad click-through with a decision tree

After working through several examples, it is time to predict ad click-through with the decision tree algorithm we have just learned and practiced. We will use the dataset from a Kaggle machine learning competition, Click-Through Rate Prediction (https://www.kaggle.com/c/avazu-ctr-prediction). The dataset can be downloaded from https://www.kaggle.com/c/avazu-ctr-prediction/data.

Only the train.gz file contains labeled samples, so it is the only file we need to download and unzip (this will take a while). In this chapter, we use only the first 300,000 samples from the train file unzipped from train.gz.
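As a rough illustration of limiting ourselves to the first samples, the following sketch reads only the first n data rows of a CSV file without loading the rest. The helper name read_first_rows is hypothetical, and the sketch assumes the unzipped file is a comma-separated file whose first line is a header; it is one possible way to do this, not necessarily the approach used later in the chapter.

```python
import csv
from itertools import islice


def read_first_rows(path, n_rows):
    """Return the first n_rows data rows of a CSV file as a list of dicts.

    Hypothetical helper: it assumes the first line of the file is a header
    row, which csv.DictReader uses as the dictionary keys.
    """
    with open(path, newline="") as f:
        reader = csv.DictReader(f)
        # islice stops after n_rows rows, so the rest of the file is never read
        return list(islice(reader, n_rows))
```

Assuming the unzipped file is named train and sits in the working directory, calling read_first_rows("train", 300000) would return the rows used in this chapter. Every value comes back as a string, so numeric fields would still need converting.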

The fields in the raw file are as follows:

We take a glance at the head of the file by running ...
