Chapter 2 Overview of Methods for Dealing with Missing Data
Whether or not data are missing, the goal of any analysis is to make valid and efficient inferences about the population of interest. Neyman and Pearson (1933) established criteria for evaluating any statistical procedure. These criteria include a small bias, where bias is the difference between the average sample estimate and the true value, and a small variance of the sample estimate (efficiency). Bias and variance can be combined in a single measure called the mean square error (MSE), so that bias, variance, and MSE together describe the behavior of the estimate. Using these criteria, we discuss the missing data methods that are available, each with its own strengths and limitations. This chapter gives a brief overview of the existing approaches and classifies them according to whether they remove observations with missing data, utilize all available data, or impute missing data. An excellent overview of missing data methods is provided by Schafer and Graham (2002).
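As a brief illustration of how bias and variance combine into the MSE, the standard decomposition can be written as follows. The symbols are notation introduced here for illustration and are not taken from the text: \theta denotes the true value and \hat{\theta} the sample estimate.

```latex
\[
\operatorname{Bias}(\hat{\theta}) = E[\hat{\theta}] - \theta,
\qquad
\operatorname{MSE}(\hat{\theta})
  = E\!\left[(\hat{\theta} - \theta)^2\right]
  = \operatorname{Var}(\hat{\theta}) + \operatorname{Bias}(\hat{\theta})^2 .
\]
```

Under this decomposition, a method can have a large MSE either because it is biased or because it is inefficient, which is why both properties are considered when comparing missing data methods.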
2.1 Methods that Remove Observations
Many approaches simplify the missing data problem by discarding data. We discuss how most of these approaches may lead to biased results, although some approaches try to directly address this issue. Discarding data also reduces the sample size, which in turn reduces efficiency and leads to larger standard errors for the parameters of interest.
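The two costs of discarding data can be seen in a small simulation. The sketch below is illustrative only: the outcome distribution, the deletion probabilities, the cutoff of 55, and the helper name mean_and_se are all assumptions made for this example, not values or methods from the text. It compares the sample mean and its standard error on the full data with two reduced data sets, one where values are dropped independently of their size and one where large values are more likely to be dropped.

```python
import math
import random
import statistics


def mean_and_se(values):
    """Return the sample mean and its estimated standard error."""
    n = len(values)
    return statistics.fmean(values), statistics.stdev(values) / math.sqrt(n)


rng = random.Random(0)  # fixed seed so the run is reproducible
n = 5000

# Hypothetical outcome with true mean 50 and standard deviation 10.
full = [rng.gauss(50.0, 10.0) for _ in range(n)]

# Scenario A (assumed): each value is dropped with probability 0.4,
# regardless of how large or small it is.
kept_random = [y for y in full if rng.random() > 0.4]

# Scenario B (assumed): values above 55 are dropped with probability 0.8,
# all other values are always kept.
kept_selective = [y for y in full if not (y > 55.0 and rng.random() < 0.8)]

# Collect (sample size, mean, standard error) for each data set.
results = {}
for label, sample in [
    ("full data", full),
    ("random deletion", kept_random),
    ("selective deletion", kept_selective),
]:
    m, se = mean_and_se(sample)
    results[label] = (len(sample), m, se)
    print(f"{label:>20}: n={len(sample):5d}  mean={m:6.2f}  se={se:.3f}")
```

In scenario A the estimated mean stays close to 50 but its standard error grows because fewer observations remain. In scenario B the mean is pulled well below 50, since the discarded observations differ systematically from the retained ones.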
2.1.1 Complete-Case ...