Chapter 18. Coplots (Conditioning Plots)

The Coplot

Sometimes the apparent relationship between two variables can be quite misleading. This may well be because one or both variables are strongly associated with a third variable. Consider the States dataset from the car package. It contains data about the SAT exam, a test that many students in the United States take as part of the college admissions process, along with several other variables about secondary education at the state level in 1992. Each of the 51 observations in this dataset represents one state or the District of Columbia. Figure 18-1 shows a scatter plot of average scores on the SATM, the math subtest of the SAT, against the amount of money (in thousands of dollars per student) spent on public education in each state.
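Before plotting, it can help to look at the data frame itself. The following is a minimal sketch, assuming the carData package is installed (current releases of car keep States there and load it automatically). The name n_obs is introduced here only for illustration.

```r
# Load States directly from carData.
# Assumption: carData is installed alongside car.
data("States", package = "carData")

n_obs <- nrow(States)   # one row per state, plus the District of Columbia
print(n_obs)

# Compact overview of all columns, including SATM and dollars
str(States)

# Numeric summaries of the two variables plotted in Figure 18-1
print(summary(States[, c("SATM", "dollars")]))
```

A quick look like this confirms the variable names and their units before they are passed to a plotting function.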

Figure 18-1. A scatter plot of the state average SATM scores and the average amount of state spending on public education, per student, in thousands of dollars. There are 51 points, one for each state and the District of Columbia.

Here is the code to produce Figure 18-1:

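# Figure 18-1
library(car)          # makes the States data frame available
attach(States)        # put its columns (dollars, SATM, ...) on the search path
plot(dollars, SATM,   # x = spending per student, y = average SATM score
     pch = 16,        # solid circles
     col = "maroon")
grid(lty = "solid")   # add solid grid lines to the plot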
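As a rough numeric companion to the plot, the sketch below computes the Pearson correlation between the two plotted variables and redraws the same scatter plot through the formula interface, so that attach() is not needed. This is an illustrative alternative rather than the book's own code. The name r_spend_satm is made up here, and the example assumes car (and with it States) is installed.

```r
library(car)  # makes the States data frame available

# Pearson correlation between spending and average SATM score;
# a negative value would match a downward-sloping cloud of points.
r_spend_satm <- cor(States$dollars, States$SATM)
print(r_spend_satm)

# Same scatter plot via the formula interface, with the data frame
# passed explicitly instead of being attached
plot(SATM ~ dollars, data = States,
     pch = 16,
     col = "maroon")
grid(lty = "solid")
```

Passing data = States keeps the column names out of the search path, where they could otherwise clash with other objects of the same name.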

Figure 18-1 seems to indicate that states that spent relatively little on education had high SATM scores, whereas higher-spending states had relatively low scores. This is completely counterintuitive! We would expect—or ...
