8
Missing data and imputation methods
Missing data are a pervasive problem in many data sets and seem especially widespread in social and economic studies, such as customer satisfaction surveys. Imputation is an intuitive and flexible way to handle the incomplete data sets that result. We discuss imputation, multiple imputation (MI), and other strategies to handle missing data, together with their theoretical background. Our focus is on MI, which is a statistically valid strategy for handling missing data, although we also review other valid approaches, such as direct maximum likelihood and Bayesian methods for estimating parameters, as well as less sound methods. The creation of multiply-imputed data sets is more challenging than their analysis, but still straightforward compared with other valid methods, and we discuss available software for MI. Some examples and advice on computation are provided using the ABC 2010 annual customer satisfaction survey. Ad hoc methods, including the use of singly-imputed data sets, almost always lead to invalid inferences and should be eschewed.
8.1 Introduction
Missing values are a common problem in many data sets and seem especially widespread in social and economic studies, including customer satisfaction surveys, where customers may fail to express their satisfaction level concerning their experience with a specific business because of lack of interest, unwillingness to criticize ...