Chapter 9
Dealing with Missing or Incomplete Data
In This Chapter
Seeing the different ways in which observations can be missing from a dataset
Understanding what types of problems can be caused by missing data
Learning how to overcome the problems caused by missing data
Missing data is a major problem in all areas of statistical analysis. Incomplete information can make it impossible to use many types of statistical techniques; for example, a paired t-test matches each observation of one variable with an observation of the other, so it cannot be run unless both variables have the same number of observations.
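To see why a paired t-test depends on complete pairs, consider the following minimal Python sketch. It assumes that a missing value is recorded as None and that incomplete pairs are simply dropped before the test statistic is computed; this is only one simple way to handle the gap, and not necessarily the best one. The function name paired_t_statistic and the sample values are hypothetical, made up for illustration.

```python
import math
import statistics


def paired_t_statistic(before, after):
    """Return (t, degrees_of_freedom, pairs_used) for paired samples.

    None marks a missing value. Only pairs in which both values are
    present are used (an assumption made for this sketch).
    """
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")

    # Keep the difference only when both members of the pair are observed.
    diffs = [y - x for x, y in zip(before, after)
             if x is not None and y is not None]

    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two complete pairs")

    sd_diff = statistics.stdev(diffs)  # sample standard deviation
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")

    t = statistics.mean(diffs) / (sd_diff / math.sqrt(n))
    return t, n - 1, n


# Hypothetical measurements; None marks a missing observation.
before = [72, 68, None, 80, 75, 66]
after = [70, 65, 71, None, 70, 64]

t, df, pairs_used = paired_t_statistic(before, after)
print(f"t = {t:.3f}, df = {df}, complete pairs = {pairs_used}")
```

Six subjects were measured, but only four contribute to the test, because the other two pairs are incomplete. Dropping pairs shrinks the sample and can bias the result if the values are not missing at random.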
Missing observations can also severely distort the results of any statistical procedure, calling their validity into question. Fortunately, many new techniques have been developed in recent years to manage the problem of missing data. The use of these techniques has accelerated, partly because of the development of highly sophisticated statistical software packages.
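A small Python sketch can show how missing observations distort even a simple statistic. The figures below are hypothetical (imagine incomes in thousands of dollars), and the sketch assumes that the two highest values went unreported and are recorded as None. The helper name observed_mean is also an assumption made for illustration.

```python
import statistics


def observed_mean(values):
    """Mean of the values that are present; None marks a missing value."""
    present = [v for v in values if v is not None]
    if not present:
        raise ValueError("no observed values")
    return statistics.mean(present)


# Hypothetical complete dataset.
complete = [20, 25, 30, 35, 40, 90, 120]

# The same dataset with the two largest values missing.
with_gaps = [20, 25, 30, 35, 40, None, None]

print("mean of complete data:", round(statistics.mean(complete), 2))
print("mean of observed data:", observed_mean(with_gaps))
```

Because the missing values are not a random subset of the data, the mean of the observed values understates the mean of the complete dataset. Any analysis built on the observed values alone would inherit that distortion.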
This chapter introduces several potential causes for missing ...