Multiple imputation in practice

There are a few steps to follow and decisions to make when using this powerful imputation technique:

  • Is the data missing at random (MAR)? And be honest! If the missingness mechanism is likely not MAR, then more complicated measures have to be taken.
  • Are there any derived terms, redundant variables, or irrelevant variables in the data set? Any of these will interfere with the regression step. Irrelevant variables, such as unique IDs, have no predictive power. Redundant variables (a column for weight in pounds alongside a column for the same weight in grams) and derived terms (a column for an area alongside the length and width columns) repeat information that other columns already carry, and interfere in the same way.
  • Convert all categorical variables to factors, ...
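
The checks in the list above can be sketched in base R. This is a minimal illustration on a made-up data frame, not a general-purpose screening routine: the column names, the ID heuristic (a discrete column whose values are all distinct), and the 0.999 correlation cutoff are all assumptions chosen for the example.

```r
# Hypothetical toy data; every column name here is illustrative only.
dat <- data.frame(
  id        = 1:6,                              # unique identifier
  weight_lb = c(150, NA, 180, 200, 165, 172),   # one missing value
  length    = c(2, 5, 3, 7, 4, 6),
  width     = c(1, 2, 2, 3, 3, 5),
  group     = c("a", "b", "a", "b", "a", "b"),
  stringsAsFactors = FALSE
)
dat$weight_g <- dat$weight_lb * 453.592   # redundant: same quantity in grams
dat$area     <- dat$length * dat$width    # derived from two other columns

# Heuristic (an assumption): a discrete column with no missing values and
# no repeated values is probably an identifier.
is_id_like <- function(x) {
  discrete <- is.integer(x) || is.character(x) || is.factor(x)
  discrete && !anyNA(x) && anyDuplicated(x) == 0
}
id_cols <- names(dat)[vapply(dat, is_id_like, logical(1))]

# Flag pairs of numeric columns that are (almost) perfectly linearly related.
# The 0.999 cutoff is arbitrary.
is_num <- vapply(dat, is.numeric, logical(1)) & !(names(dat) %in% id_cols)
r      <- cor(dat[, is_num, drop = FALSE], use = "pairwise.complete.obs")
hits   <- which(abs(r) > 0.999 & upper.tri(r), arr.ind = TRUE)
redundant_pairs <- data.frame(
  var1 = rownames(r)[hits[, "row"]],
  var2 = colnames(r)[hits[, "col"]]
)

# A product such as area is not a linear function of one column, so the
# correlation check misses it; it has to be tested against a known formula.
area_is_derived <- isTRUE(all.equal(dat$area, dat$length * dat$width))

# Drop the flagged columns, then convert character columns to factors.
to_drop <- c(id_cols, redundant_pairs$var2, if (area_is_derived) "area")
clean   <- dat[, !(names(dat) %in% to_drop), drop = FALSE]
clean[] <- lapply(clean, function(x) if (is.character(x)) factor(x) else x)

str(clean)
```

Which member of a redundant pair to drop is a judgment call; here the second one is removed only to keep the sketch short. Derived terms such as the area column generally have to be found from knowledge of how the data set was built, since no pairwise check will reveal them.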
