Chapter 12. Visualizing Associations Among Two or More Quantitative Variables

Many datasets contain two or more quantitative variables, and we may be interested in how these variables relate to each other. For example, we may have a dataset of quantitative measurements of different animals, such as the animals’ height, weight, length, and daily energy demands. To plot the relationship between just two of these variables, for example height and weight, we will normally use a scatterplot. If we want to show more than two variables at once, we may opt for a bubble chart, a scatterplot matrix, or a correlogram. Finally, for very high-dimensional datasets, it may be useful to perform dimension reduction, for example in the form of principal components analysis.

Scatterplots

I will demonstrate the basic scatterplot and several variations thereof using a dataset of measurements performed on 123 blue jay birds. The dataset contains information such as the head length (measured from the tip of the bill to the back of the head), the skull size (head length minus bill length), and the body mass of each bird. We expect that there are relationships between these variables. For example, birds with longer bills would be expected to have larger skull sizes, and birds with higher body mass should have larger bills and skulls than birds with lower body mass.

To explore these relationships, I begin with a plot of head length against body mass (Figure 12-1). In this plot, head length is shown along the ...
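
A plot of this kind can be sketched in a few lines of Python. The sketch below is only an illustration under stated assumptions: the blue jay measurements are not reproduced here, so the helper `make_synthetic_birds` (a hypothetical name) generates invented placeholder values, and the trend and noise levels are made up. It also assumes that matplotlib is installed, and it reads "head length against body mass" in the conventional way, with head length on the vertical axis and body mass on the horizontal axis.

```python
import random


def make_synthetic_birds(n=123, seed=0):
    """Return (body_mass_g, head_length_mm) as two lists of invented values.

    These numbers are placeholders for illustration only. They are NOT
    the blue jay measurements discussed in the text.
    """
    rng = random.Random(seed)
    body_mass = [rng.gauss(72.0, 4.0) for _ in range(n)]
    # Assumed loose positive trend plus noise, chosen arbitrarily.
    head_length = [40.0 + 0.22 * m + rng.gauss(0.0, 1.0) for m in body_mass]
    return body_mass, head_length


def plot_head_vs_mass(body_mass, head_length, path="head_vs_mass.png"):
    """Draw head length (y) against body mass (x) and save the plot to `path`."""
    # Imported here so the data helper above works without matplotlib.
    import matplotlib
    matplotlib.use("Agg")  # render to a file; no display is needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots(figsize=(5, 4))
    # Small, slightly transparent points keep overlapping birds visible.
    ax.scatter(body_mass, head_length, s=15, alpha=0.7)
    ax.set_xlabel("body mass (g)")
    ax.set_ylabel("head length (mm)")
    fig.savefig(path, dpi=150, bbox_inches="tight")
    plt.close(fig)
    return path
```

Calling `plot_head_vs_mass(*make_synthetic_birds())` writes the scatterplot to `head_vs_mass.png`. With real measurements, the two lists would simply be replaced by the corresponding columns of the dataset.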

