Embracing the Data-Mining Process
In This Chapter
Establishing a framework for data mining
Drilling into the CRISP-DM process
Establishing good habits
Data mining doesn’t have official rules. You have tremendous flexibility to define and refine your own work methods. Still, you’ll benefit from understanding and following the approaches that work well for others.
The Cross-Industry Standard Process for Data Mining (CRISP-DM) is the dominant process framework for data mining. It’s an open standard; anyone may use it. This chapter explains each phase of the process.
Whose Standard Is It, Anyway?
The CRISP-DM process model is a step-by-step approach to data mining that was created by data miners for data miners. Participants from over 200 organizations provided input to develop the framework. Most were businesses, a diverse group interested in using data mining internally or in promoting its wider use. The framework outlines key data-mining tasks in business terms and leaves users free to make their own choices about specific mathematical and computational approaches and other technical matters.