Chapter 29Case Study, Part 1: Business Understanding, Data Preparation, and EDA
In Chapter 29–31 we shall bring together much of what we have learned in this book in a detailed Case Study: Predicting Response to Direct-Mail Marketing. We follow the here in Chapter 29, we (i) enunciate our objectives in the Business Understanding Phase, (ii) get a feel for the data set in Part 1 of the Data Understanding Phase, prepare our data in the Data Preparation Phase, and extract some useful information in Part 2 of the Data Understanding Phase: exploratory data analysis (EDA). Then, in Chapter 30, we learn about possible segments in the customer database using clustering analysis and we investigate relationships among the predictors using principal components analysis. Finally, in Chapter 31, we apply the rich assortment of classification techniques at our disposal in the Modeling Phase, and make recommendations on which models to move forward with in the Evaluation Phase.
29.1 Cross-Industry Standard Practice for Data Mining
The Case Study in Chapter 29–31 will be carried out using the cross-industry standard process for data mining (CRISP-DM). According to CRISP-DM, a given data mining project has a life cycle consisting of six phases, as illustrated in Figure 29.1. The details of CRISP-DM are discussed in Chapter 1; here, we but recapitulate the outline of the process.
Get Data Mining and Predictive Analytics, 2nd Edition now with the O’Reilly learning platform.
O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.