CHAPTER 3
Data Preparation
Data preparation is the first stage of the data mining process, and the
quality of the mining results depends heavily on the quality of the data
prepared beforehand. It involves many different tasks and cannot be fully
automated. Many data preparation activities are routine, tedious, and
time consuming. It has been estimated that data preparation accounts for
60 to 80 percent of the time spent on a data mining project. Figure
3.0.1 shows the main steps of data mining. From the figure, we can see that
data preparation plays an important role in data mining.
Data preparation is essential for ...