Chapter 29 Case Study, Part 1: Business Understanding, Data Preparation, and EDA
In Chapters 29–31 we shall bring together much of what we have learned in this book in a detailed Case Study: Predicting Response to Direct-Mail Marketing. Here in Chapter 29, we (i) enunciate our objectives in the Business Understanding Phase, (ii) get a feel for the data set in Part 1 of the Data Understanding Phase, (iii) prepare our data in the Data Preparation Phase, and (iv) extract some useful information in Part 2 of the Data Understanding Phase: exploratory data analysis (EDA). Then, in Chapter 30, we learn about possible segments in the customer database using cluster analysis, and we investigate relationships among the predictors using principal components analysis. Finally, in Chapter 31, we apply the rich assortment of classification techniques at our disposal in the Modeling Phase, and recommend which models to move forward with in the Evaluation Phase.
29.1 Cross-Industry Standard Process for Data Mining
The Case Study in Chapters 29–31 will be carried out using the cross-industry standard process for data mining (CRISP-DM). According to CRISP-DM, a given data mining project has a life cycle consisting of six phases, as illustrated in Figure 29.1. The details of CRISP-DM are discussed in Chapter 1; here, we merely recapitulate the outline of the process.