Chapter 5

Embracing the Data-Mining Process

In This Chapter

Establishing a framework for data mining

Drilling into the CRISP-DM process

Establishing good habits

Data mining doesn’t have official rules. You have tremendous flexibility to define and refine your own work methods. Still, you benefit from understanding and following approaches that have worked well for others.

The Cross-Industry Standard Process for Data Mining (CRISP-DM) is the dominant process framework for data mining. It’s an open standard; anyone may use it. This chapter explains each phase of the process.

Whose Standard Is It, Anyway?

The CRISP-DM process model is a step-by-step approach to data mining that was created by data miners for data miners. Participants from over 200 organizations, mainly a diverse group of businesses with an interest in using data mining internally or in promoting its far-reaching use, provided input to develop the framework. It outlines key data-mining tasks in business terms and leaves users free to make their own choices about specific mathematical and computational approaches and other technical matters.

The explanation of the CRISP-DM process in this chapter ...

