Graphing Data with R

by John Jay Hilfiger
October 2015
Content level: Beginner to intermediate
250 pages
6h 26m
English
O'Reilly Media, Inc.
Content preview from Graphing Data with R

Chapter 13. High-Density Plots

Working with Large Datasets

Sometimes a large dataset makes techniques such as scatter plots hard to apply. Consider one such dataset from the car package: Vocab holds more than 21,000 observations, each with some basic demographic data and a score on a vocabulary test. Load the package and look at the first few rows (be careful to use the head() command; you do not want to print the entire dataset!):

> library(car)
> head(Vocab)

         year    sex education vocabulary
20040001 2004 Female         9          3
20040002 2004 Female        14          6
20040003 2004   Male        14          9
20040005 2004 Female        17          8
20040008 2004   Male        14          1
20040010 2004   Male        14          7
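
Besides head(), a few base R functions summarize a large data frame without printing it in full. The following is a minimal sketch on simulated data: vocab_sim is a hypothetical stand-in that reuses the four column names shown above, and its values and ranges are assumptions, not the real Vocab data.

```r
# Hypothetical stand-in for Vocab: same column names, simulated values.
# The value ranges below are assumptions chosen for illustration.
set.seed(1)
n <- 21000
vocab_sim <- data.frame(
  year       = sample(seq(2000, 2004, by = 2), n, replace = TRUE),
  sex        = factor(sample(c("Female", "Male"), n, replace = TRUE)),
  education  = sample(0:20, n, replace = TRUE),
  vocabulary = sample(0:10, n, replace = TRUE)
)

dim(vocab_sim)            # number of rows and columns
str(vocab_sim)            # column types and the first few values
summary(vocab_sim)        # per-column summaries
head(vocab_sim, n = 3)    # first three rows only
```

The same four calls should work unchanged on Vocab itself once the car package is loaded.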

It might be interesting to examine the relationship between vocabulary and education. Is it reasonable to expect that people with little education will have low vocabulary scores, and that scores will rise as the amount of education increases? A scatter plot should make this clear. Here’s how to create it:

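# Figure 13-1
library(car)                 # makes the Vocab dataset available
attach(Vocab)                # expose the columns by name
plot(education, vocabulary)  # education on x, vocabulary on y
detach(Vocab)                # remove Vocab from the search path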

The scatter plot in Figure 13-1 is anything but clear! There is not a simple line or band of points showing the relationship we thought we would see. There is a little whitespace at the upper left and the lower right, but every other place looks equally populated.

Figure 13-1. A scatter plot of education and vocabulary
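
One likely reason a plot like this looks uniformly filled is overplotting: when both variables take only whole-number values, thousands of observations land on the same few positions, and a position looks the same whether it holds one point or hundreds. The sketch below illustrates this on simulated data. The education and vocabulary vectors here are hypothetical stand-ins, and the positive trend built into them is an assumption made for illustration. It counts the occupied positions and then spreads the points with base R's jitter(), which is one common remedy and not necessarily the approach this chapter goes on to take.

```r
# Simulated stand-in data: integer-valued, with a built-in positive trend.
set.seed(42)
n <- 21000
education <- sample(0:20, n, replace = TRUE)
# Assumed relationship, for illustration only; scores are clamped to 0..10.
vocabulary <- pmin(10, pmax(0, round(1 + 0.4 * education + rnorm(n, sd = 2))))

# How many distinct (education, vocabulary) positions do n points occupy?
counts <- table(education, vocabulary)
occupied <- sum(counts > 0)
occupied        # cannot exceed 21 * 11 = 231 positions
max(counts)     # observations stacked on the busiest position

# Jitter both variables so stacked points spread out, and use a tiny
# plotting symbol so that dense regions show up as darker areas.
plot(jitter(education, factor = 2), jitter(vocabulary, factor = 2),
     pch = ".", xlab = "education", ylab = "vocabulary")
```

With only a couple of hundred possible positions for 21,000 points, the unjittered plot cannot show where the data are concentrated; the jittered version trades exact positions for a visible sense of density.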

The two ...

ISBN: 9781491922606