Get full access to Data Analysis with R - Second Edition and 60K+ other titles, with a free 10-day trial of O'Reilly.

Visualizing missing data

To demonstrate how to visualize patterns of missing data, we first have to create some missing data. This will also be the dataset that we analyze later in the chapter. To showcase how to use multiple imputation in a semi-realistic scenario, we are going to create a version of the mtcars dataset with a few missing values.

Okay, let's set the seed (for deterministic randomness), and create a variable to hold our new marred dataset, using the following code:

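set.seed(2)            # fix the RNG seed so the random draws are reproducible
miss_mtcars <- mtcars  # work on a copy; the built-in mtcars stays untouched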
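To make "deterministic randomness" concrete, here is a minimal sketch (not taken from the book) showing that resetting the seed to the same value reproduces the same random draw. The variable names `first_draw` and `second_draw` are illustrative only.

```r
# Draw 7 random row indices out of 32, twice, resetting the seed each time
set.seed(2)
first_draw <- sample(32, 7)

set.seed(2)
second_draw <- sample(32, 7)

# Same seed, same call: the two draws match exactly
same_draw <- identical(first_draw, second_draw)
print(same_draw)
```

This is why the seed is set before any values are removed: anyone rerunning the code should end up with missing values in the same rows.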

First, we are going to create seven missing values in drat (about 20 percent), five missing values in the mpg column (about 15 percent), five missing values in the cyl column, three missing values in
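The excerpt breaks off before the book's own code appears, so the following is only a sketch of the first three steps described above, under the assumption that rows are chosen with `sample()` and set to `NA`. The book may well use a different approach, and the columns after cyl are omitted because the text does not name them here.

```r
# Hypothetical sketch, not the book's code; only the counts come from the text
set.seed(2)
miss_mtcars <- mtcars

n_rows <- nrow(miss_mtcars)  # mtcars has 32 rows

# For each column, pick distinct random rows and blank out those cells
miss_mtcars$drat[sample(n_rows, 7)] <- NA  # seven missing values in drat
miss_mtcars$mpg[sample(n_rows, 5)]  <- NA  # five missing values in mpg
miss_mtcars$cyl[sample(n_rows, 5)]  <- NA  # five missing values in cyl

# Count the missing values per column to confirm the result
na_counts <- colSums(is.na(miss_mtcars))
print(na_counts[c("drat", "mpg", "cyl")])
```

Because `sample()` draws without replacement by default, each column gets exactly the requested number of missing values, and the original `mtcars` is left unmodified.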
