Chapter 9

Graphics and Data Quality: How Good Are the Data?

Beauty is less important than quality.

Eugene Ormandy


Chapter 9 discusses data quality, in particular missing values and outliers: what they are and ways of identifying them.

9.1 Introduction

Good analyses depend on good data. Checking and cleaning data are basic tasks that always have to be carried out, yet they are rarely discussed explicitly. Textbooks, and indeed R packages, often present datasets only after they have been cleaned and filtered. The detailed work that has gone into getting them into their semi-pristine condition is passed over and sometimes even swept under the carpet. One of the problems is that there are so many different ways that data can be of poor quality, ...

