June 2011
Beginner to intermediate
744 pages
25h 11m
English
3.1 Data quality can be assessed in terms of several issues, including accuracy, completeness, and consistency. For each of the above three issues, discuss how data quality assessment can depend on the intended use of the data, giving examples. Propose two other dimensions of data quality.
3.2 In real-world data, tuples with missing values for some attributes are a common occurrence. Describe various methods for handling this problem.
3.3 Exercise 2.2 gave the following data (in increasing order) for the attribute age: 13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25, 30, 33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70.
(a) Use smoothing by bin means to smooth these data, using a bin depth of 3. Illustrate your steps. Comment ...
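As a rough illustration of the mechanics in part (a), the sketch below partitions the sorted age values into equal-frequency bins of depth 3 and replaces every value with the mean of its bin. The function name smooth_by_bin_means is a hypothetical helper introduced here, not something defined in the text, and rounding to two decimals is only a display choice.

```python
# Age data from Exercise 3.3, already sorted in increasing order.
ages = [13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25,
        30, 33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70]


def smooth_by_bin_means(sorted_values, depth):
    """Equal-frequency binning: replace each value with its bin's mean.

    Hypothetical helper for illustration. Assumes the input is sorted;
    a final bin shorter than `depth` is averaged over the values it has.
    """
    smoothed = []
    for start in range(0, len(sorted_values), depth):
        bin_values = sorted_values[start:start + depth]
        bin_mean = sum(bin_values) / len(bin_values)
        smoothed.extend([bin_mean] * len(bin_values))
    return smoothed


smoothed_ages = smooth_by_bin_means(ages, depth=3)

# Show each bin next to the mean that replaces its values.
for start in range(0, len(ages), 3):
    print(ages[start:start + 3], "->", round(smoothed_ages[start], 2))
```

With 27 values and a depth of 3, this yields nine bins. The last bin (46, 52, 70) has mean 56, so the value 70 pulls the smoothed values for 46 and 52 upward, which is the kind of effect the exercise asks you to comment on.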