Chapter 9
Beauty is less important than quality.
Eugene Ormandy
Chapter 9 discusses data quality, focusing on missing values and outliers: what they are and how to identify them.
Good analyses depend on good data. Checking and cleaning data are basic tasks that always have to be carried out and yet are rarely explicitly discussed. Textbooks—and indeed R packages—often present datasets only after they have been cleaned and filtered. The detailed work that has gone into getting them into their semi-pristine condition is passed over and sometimes even swept under the carpet. One of the problems is that there are so many different ways that data can be of poor quality, ...