In this chapter, I will attempt to answer three questions:
Our discussion focuses on two case studies, which will help us address these questions. One case study has a continuous dependent variable (“target” as Modeler users would call it), and the other has a binary dependent variable. Along the way, we will learn a number of tricks and tips. As you may have guessed, it is indeed possible to do data mining effectively in SPSS Statistics, but it is not always obvious how to perform all of the tasks, or even what the required tasks are.
My own definition of data mining has evolved slightly over the years, but this one has served me well:
Data mining uses historical data, accumulated during the normal course of doing business, and involves selecting, preparing, and analyzing the data, finding (and confirming) previously unknown patterns, building predictive models, and deploying the models on current data.
Each element of the definition is worth elaborating: