Chapter 9

Dealing with Missing or Incomplete Data

In This Chapter

Seeing the different ways in which observations can be missing from a dataset

Understanding what types of problems can be caused by missing data

Learning how to overcome the problems caused by missing data

Missing data is a major problem in all areas of statistical analysis. Incomplete information can make many statistical techniques impossible to use; for example, a paired t-test cannot be run unless every observation of one variable has a matching observation of the other, so the two variables must have equal numbers of observations.

Technical Stuff: You use a paired t-test to test the hypothesis that the mean difference between paired observations is zero; that is, that two related populations (for example, the same subjects measured twice) have equal means.
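To make the pairing requirement concrete, here is a minimal sketch in Python of a paired t-statistic that drops any pair with a missing value before computing the test. The function name `paired_t_statistic`, the use of `None` to mark a missing observation, and the sample numbers are all assumptions made for illustration; they do not come from the book. Dropping incomplete pairs is only one simple way to handle the gaps, not a recommended default.

```python
import math
import statistics


def paired_t_statistic(first, second):
    """Return (t, degrees_of_freedom) for a paired t-test on complete pairs.

    None marks a missing observation (an assumption for this sketch).
    """
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")

    # Keep only the pairs in which both observations are present.
    diffs = [b - a for a, b in zip(first, second)
             if a is not None and b is not None]

    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two complete pairs")

    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation
    if sd_diff == 0:
        raise ValueError("differences have zero variance; t is undefined")

    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Hypothetical measurements; each list has one missing value.
first = [10.0, 12.0, None, 11.0, 13.0]
second = [11.0, 14.0, 12.0, None, 16.0]

t_stat, df = paired_t_statistic(first, second)
print(f"t = {t_stat:.3f}, df = {df}")
```

Only three of the five pairs are complete, so the test runs on three differences (1, 2, and 3) and reports t of about 3.464 with 2 degrees of freedom. Two missing values out of ten cost 40 percent of the usable pairs, which shows how quickly missing data erodes a paired analysis.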

Missing observations can severely distort the results of any statistical procedure, calling the validity of those results into question. Fortunately, many new techniques have been developed in recent years to manage the problem of missing data. The use of these techniques has accelerated, partly because of the development of highly sophisticated statistical software packages.

This chapter introduces several potential causes for missing ...
