There are a few steps to follow and decisions to make when using this powerful imputation technique:
- Is the data MAR? And be honest! If the mechanism is likely not MAR, then more complicated measures have to be taken.
- Are there any derived terms, redundant variables, or irrelevant variables in the data set? Any of these will interfere with the regression process. Irrelevant variables, such as unique IDs, have no predictive power. Derived terms or redundant variables, such as a column for weight in pounds alongside one in grams, or a column for an area in addition to length and width columns, are fully determined by other columns, which makes the predictors collinear and destabilizes the regression step.
- Convert all categorical variables to factors, ...
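
The variable-screening question above can be partly automated. The following is a minimal sketch in plain Python, not part of any imputation library: the function names `pearson` and `screen_columns`, the column names, the sample values, and the 0.999 threshold are all hypothetical choices made for illustration. It flags columns that look like unique IDs and pairs of numeric columns that are almost perfectly linearly related (for example, the same weight in two units).

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # constant column: correlation is undefined, treat as 0
    return sxy / math.sqrt(sxx * syy)


def is_number(v):
    # bool is a subclass of int in Python, so exclude it explicitly
    return isinstance(v, (int, float)) and not isinstance(v, bool)


def screen_columns(data, threshold=0.999):
    """Heuristic screen; data maps column name -> list of values (None = missing).

    Returns (id_like, redundant):
      id_like   - non-float columns whose observed values are all distinct
      redundant - pairs of numeric columns with |r| >= threshold
    """
    id_like = []
    numeric = []
    for name, values in data.items():
        observed = [v for v in values if v is not None]
        if not observed:
            continue
        all_distinct = len(set(observed)) == len(observed)
        has_float = any(isinstance(v, float) for v in observed)
        if len(observed) > 1 and all_distinct and not has_float:
            id_like.append(name)  # assumption: IDs are strings or integers
        elif all(is_number(v) for v in observed):
            numeric.append(name)

    redundant = []
    for a, b in combinations(numeric, 2):
        # use only rows where both columns are observed
        pairs = [(x, y) for x, y in zip(data[a], data[b])
                 if x is not None and y is not None]
        if len(pairs) < 3:
            continue  # two points always correlate perfectly
        xs, ys = zip(*pairs)
        if abs(pearson(xs, ys)) >= threshold:
            redundant.append((a, b))
    return id_like, redundant


# Hypothetical example data with missing values marked as None
data = {
    "patient_id": ["a1", "a2", "a3", "a4", "a5"],
    "weight_lb": [150.0, 180.0, None, 200.0, 165.0],
    "weight_g": [68038.8, 81646.6, 77110.7, None, 74842.7],
    "height_cm": [170.0, 165.0, 180.0, 175.0, 160.0],
}

id_like, redundant = screen_columns(data)
print("ID-like columns:", id_like)
print("Redundant pairs:", redundant)
```

This check only catches linear redundancy such as a unit conversion. A derived term like area (length times width) is not a linear function of its inputs, so it may not be flagged and still needs to be identified from knowledge of the data. Flagged columns are candidates for review, not automatic removal.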