Practical Predictive Analytics by Ralph Winters

Imputing missing values using the 'mice' package

In this example, we will use the mice package to impute missing values for the age variable in the all.df dataframe. The missing age values will be imputed from two other existing variables: gender and education.

To begin, install and load the mice package:

install.packages("mice")
library(mice)

We will now run the md.pattern() function, which shows how the missing values are distributed across the columns of the dataframe. Its output is useful for suggesting which variables might be good candidates for imputing the missing values:

md.pattern(all.df) 
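To make the idea of a missing-data pattern concrete, here is a minimal base R sketch that tabulates patterns by hand. It is not how md.pattern() is implemented. The toy.df dataframe and its values are hypothetical stand-ins for all.df, which is not shown in this excerpt.

```r
# Hypothetical toy data standing in for all.df (values are made up)
toy.df <- data.frame(
  age       = c(34, NA, 51, NA, 28, 45),
  gender    = c("F", "M", "F", "M", NA, "F"),
  education = c("HS", "BA", "BA", "HS", "MA", "BA")
)

# Code each cell as 1 = observed, 0 = missing
observed <- 1 * !is.na(toy.df)

# Collapse each row into a pattern string (column order: age, gender, education)
# and count how many rows share each pattern
pattern <- apply(observed, 1, paste, collapse = "")
pattern.counts <- table(pattern)

# Number of missing values in each column
missing.per.column <- colSums(is.na(toy.df))

print(pattern.counts)
print(missing.per.column)
```

Here the pattern "011" means that age is missing while gender and education are observed. Rows with that pattern are the ones where the other two variables are available to inform an imputed age.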

The output from the md.pattern() function is shown later. Each row shows a count of observation ...
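As a rough illustration of where this example is heading, the following sketch imputes age from gender and education using mice() and complete() from the mice package. The book's own call is not shown in this excerpt, so the simulated toy.df data, the number of imputations (m = 5), and the seed values are all assumptions made for illustration.

```r
library(mice)

# Hypothetical toy data standing in for all.df (simulated, not real data)
set.seed(42)
n <- 60
toy.df <- data.frame(
  age       = round(rnorm(n, mean = 40, sd = 12)),
  gender    = factor(sample(c("F", "M"), n, replace = TRUE)),
  education = factor(sample(c("HS", "BA", "MA"), n, replace = TRUE))
)

# Blank out 10 ages so there is something to impute
toy.df$age[sample(n, 10)] <- NA

# Run the imputation. With default settings, a numeric column such as age
# is imputed by predictive mean matching ("pmm"), and the remaining columns
# (gender and education) serve as predictors.
imp <- mice(toy.df, m = 5, seed = 123, printFlag = FALSE)

# Extract the first of the five completed datasets
completed <- mice::complete(imp, 1)

# Count of ages that are still missing after imputation
print(sum(is.na(completed$age)))
```

mice() creates several completed datasets rather than one. Taking only the first, as here, is a simplification, since a fuller analysis would normally fit the model on each completed dataset and pool the results.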
