June 2011
Beginner to intermediate
744 pages
25h 11m
English
This chapter introduces the basic concepts of data preprocessing and organizes its methods into four categories: data cleaning, data integration, data reduction, and data transformation. Data have quality if they satisfy the requirements of their intended use. Many factors contribute to data quality, including accuracy, completeness, consistency, timeliness, believability, and interpretability. Data cleaning can be applied to remove noise and correct inconsistencies in data. Data integration merges data from multiple sources into a coherent data store such as a data warehouse. Data reduction can reduce data size by, for instance, aggregating, ...
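As a rough illustration of three of the categories named above (cleaning, integration, and reduction by aggregation), the following minimal sketch uses only the Python standard library. The records, field names, and helper functions (`clean`, `integrate`, `reduce_by_region`) are hypothetical and chosen for illustration; they are not taken from the chapter, and filling missing values with the mean is only one of several possible cleaning strategies.

```python
from statistics import mean

# Hypothetical toy records from two sources; all names and values are illustrative.
sales = [
    {"customer_id": 1, "region": "north", "amount": 120.0},
    {"customer_id": 2, "region": "North ", "amount": None},  # missing value, inconsistent label
    {"customer_id": 3, "region": "south", "amount": 80.0},
    {"customer_id": 4, "region": "SOUTH", "amount": 100.0},
]
customers = {1: "Ann", 2: "Bo", 3: "Cy", 4: "Di"}


def clean(records):
    """Data cleaning: normalize inconsistent labels and fill missing amounts with the mean."""
    known = [r["amount"] for r in records if r["amount"] is not None]
    fill = mean(known)
    return [
        {
            **r,
            "region": r["region"].strip().lower(),
            "amount": fill if r["amount"] is None else r["amount"],
        }
        for r in records
    ]


def integrate(records, names):
    """Data integration: join the sales records with a second source on customer_id."""
    return [{**r, "name": names[r["customer_id"]]} for r in records]


def reduce_by_region(records):
    """Data reduction: aggregate individual rows into one total per region."""
    totals = {}
    for r in records:
        totals[r["region"]] = totals.get(r["region"], 0.0) + r["amount"]
    return totals


cleaned = clean(sales)
merged = integrate(cleaned, customers)
totals = reduce_by_region(merged)
print(totals)  # {'north': 220.0, 'south': 180.0}
```

The steps are kept as separate functions so that each one maps onto a single preprocessing category; in practice the order and the choice of strategy depend on the data and its intended use.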