Chapter 9

Graphics and Data Quality: How Good Are the Data?

Beauty is less important than quality.

Eugene Ormandy

Summary

Chapter 9 discusses data quality, in particular missing values and outliers: what they are and how they can be identified.
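As a rough illustration of what identifying missing values and outliers can look like in practice, here is a minimal sketch in base R. It is not taken from the chapter: the built-in `airquality` dataset is only an assumed stand-in for the data being checked, and the boxplot rule (points more than 1.5 times the interquartile range beyond the hinges) is just one common convention for flagging possible outliers.

```r
# Assumption: the built-in airquality dataset stands in for the data to be checked.
data(airquality)

# Count the missing values (NA) in each variable.
na_counts <- colSums(is.na(airquality))
print(na_counts)

# Flag possible outliers in one variable with the boxplot rule:
# values more than 1.5 * IQR beyond the hinges. NAs are ignored.
ozone_out <- boxplot.stats(airquality$Ozone)$out
print(ozone_out)

# Simple graphical checks of the same two questions.
barplot(na_counts, ylab = "Number of missing values")
boxplot(airquality$Ozone, horizontal = TRUE, xlab = "Ozone")
```

Values flagged this way are candidates for a closer look, not necessarily errors. Whether a point is a genuine outlier depends on the context of the data.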

9.1 Introduction

Good analyses depend on good data. Checking and cleaning data are basic tasks that always have to be carried out and yet are rarely explicitly discussed. Textbooks—and indeed R packages—often present datasets only after they have been cleaned and filtered. The detailed work that has gone into getting them into their semi-pristine condition is passed over and sometimes even swept under the carpet. One of the problems is that there are so many different ways that data can be of poor quality, ...
