10 Learning from a Small Testing Sample and Prediction

To illustrate these recipes, we will use data from the JMP challenge presented at the ENBIS 2012 conference in Ljubljana. The recommended methods are regression and decision tree analysis; neural networks are another possibility.

10.1 Recipe 17: To Predict Demographic Signs (Like Sex, Age, Education and Income)

Industry: The recipe is relevant to everybody who needs a clear customer demographic profile to improve their business, for example, companies publishing banners and other groups preparing specific advertising offers.

Areas of interest: The recipe is relevant to marketing, sales, online promotions and strategic decisions.

Challenge: The challenge is to get a clear picture of the demographic distribution of customers and prospects or to replace missing values in the database (referred to as imputation). The latter is the more common situation. Even when fields like sex, age or income exist in a dataset, large parts of the data are often missing because the process supplying the values did not ensure that these fields were completed. But this ...
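
To make the imputation idea concrete, the following is a minimal sketch, not the procedure used in the recipe itself. It assumes a handful of invented customer records with two hypothetical behavioural fields (`basket` and `channel`) and a partly missing `sex` field. It learns a one-split decision tree (a decision stump) from the records where `sex` is known and uses it to fill the gaps. All field names, values and function names are illustrative assumptions; a real analysis would use a full decision tree or regression model and a proper validation sample.

```python
from collections import Counter, defaultdict

# Hypothetical toy records (invented for illustration); None marks a missing value.
customers = [
    {"basket": "cosmetics", "channel": "web",   "sex": "F"},
    {"basket": "cosmetics", "channel": "store", "sex": "F"},
    {"basket": "tools",     "channel": "web",   "sex": "M"},
    {"basket": "tools",     "channel": "store", "sex": "M"},
    {"basket": "cosmetics", "channel": "web",   "sex": "F"},
    {"basket": "tools",     "channel": "web",   "sex": "F"},
    {"basket": "cosmetics", "channel": "store", "sex": None},
    {"basket": "tools",     "channel": "store", "sex": None},
]


def fit_stump(rows, features, target):
    """Choose the single feature whose per-value majority vote
    misclassifies the fewest training rows."""
    best = None
    for feature in features:
        # Count target values separately for each value of the feature.
        votes = defaultdict(Counter)
        for row in rows:
            votes[row[feature]][row[target]] += 1
        # The rule maps each feature value to its most frequent target value.
        rule = {value: counts.most_common(1)[0][0] for value, counts in votes.items()}
        errors = sum(rule[row[feature]] != row[target] for row in rows)
        if best is None or errors < best[0]:
            best = (errors, feature, rule)
    _, feature, rule = best
    # Overall majority class, used for feature values never seen in training.
    fallback = Counter(row[target] for row in rows).most_common(1)[0][0]
    return feature, rule, fallback


def impute(rows, features, target):
    """Return the chosen split feature and a copy of rows with missing targets filled."""
    known = [row for row in rows if row[target] is not None]
    feature, rule, fallback = fit_stump(known, features, target)
    completed = []
    for row in rows:
        new_row = dict(row)
        was_missing = new_row[target] is None
        if was_missing:
            new_row[target] = rule.get(new_row[feature], fallback)
        # Flag imputed values so they are never mistaken for observed ones.
        new_row[target + "_imputed"] = was_missing
        completed.append(new_row)
    return feature, completed


split_feature, completed = impute(customers, ["basket", "channel"], "sex")
for row in completed:
    print(row)
```

Keeping an explicit `sex_imputed` flag is a deliberate choice in this sketch: predicted values are estimates, and later analyses may want to treat them differently from values that were actually recorded.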
