Substituting missing values using the mice package

Finding and removing missing values in your dataset is not always a viable option, for either practical or methodological reasons. It is often preferable to simulate plausible values for the missing data and integrate them with the observed data.

This recipe is based on the mice package (Multivariate Imputation by Chained Equations) by Stef van Buuren. It provides an efficient algorithm for substituting missing values, based on the multiple imputation technique.
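As a rough illustration of what this looks like in practice, here is a minimal sketch. It is not the recipe's own code: it assumes the mice package is installed, and it uses the nhanes example dataset that ships with mice. The number of imputations, the seed, and the regression model are arbitrary choices made for illustration.

```r
# Minimal sketch, assuming mice is installed: install.packages("mice")
library(mice)

# nhanes is a small example dataset bundled with mice; several cells are NA.
colSums(is.na(nhanes))

# Create m = 5 imputed versions of the data. The seed only makes the run
# reproducible; printFlag = FALSE silences the iteration log.
imp <- mice(nhanes, m = 5, seed = 123, printFlag = FALSE)

# Extract the first completed dataset (observed values plus imputed ones).
completed <- complete(imp, 1)

# Fit the same (illustrative) model on every imputed dataset,
# then pool the results into a single set of estimates.
fit <- with(imp, lm(bmi ~ age))
pooled <- pool(fit)
summary(pooled)
```

Fitting the model on all imputed datasets and pooling the estimates, rather than analysing a single completed dataset, lets the uncertainty about the missing values carry through to the final results.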

Note

Multiple imputation technique

The multiple imputation technique is a statistical solution to the problem of missing values.

The main idea behind this technique is to draw possible alternative values for each missing value and then, after a proper analysis of simulated values, ...
