Chapter 3. Bar Graphs

Bar graphs are perhaps the most commonly used kind of data visualization. They’re typically used to display numeric values (on the y-axis) for different categories (on the x-axis). For example, a bar graph would be good for showing the prices of four different kinds of items. A bar graph generally wouldn’t be as good for showing prices over time, where time is a continuous variable, though it can be done, as we’ll see in this chapter.

There’s an important distinction you should be aware of when making bar graphs: sometimes the bar heights represent counts of cases in the data set, and sometimes they represent values in the data set. Keep this distinction in mind—it can be a source of confusion since they have very different relationships to the data, but the same term is used for both. In this chapter I’ll discuss this more, and present recipes for both types of bar graphs.
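To make the distinction concrete, here is a minimal sketch of the two cases. The `sales` and `prices` data frames are hypothetical toy data invented for illustration, not data sets from this book. In ggplot2, geom_bar() by default counts the rows in each category, while geom_col() plots the values in a column as they are.

```r
library(ggplot2)

# Hypothetical toy data: one row per case (here, one row per sale)
sales <- data.frame(
  item = c("apple", "apple", "pear", "pear", "pear", "plum")
)

# Bar heights are counts of cases: geom_bar() tallies the rows per item
p_counts <- ggplot(sales, aes(x = item)) +
  geom_bar()

# Hypothetical toy data: one row per category, value already in a column
prices <- data.frame(
  item  = c("apple", "pear", "plum"),
  price = c(1.20, 0.90, 2.50)
)

# Bar heights are values: geom_col() uses the y column directly
p_values <- ggplot(prices, aes(x = item, y = price)) +
  geom_col()
```

Printing `p_counts` or `p_values` draws the corresponding graph. The two plots look similar, but the first is computed from the number of rows and the second from the numbers stored in the `price` column.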

From this chapter on, this book will focus on using ggplot2 instead of base R graphics. Using ggplot2 will both keep things simpler and make for more sophisticated graphics.

3.1 Making a Basic Bar Graph

Problem

You have a data frame where one column represents the x position of each bar, and another column represents the vertical (y) height of each bar.

Solution

Use ggplot() with geom_col() and specify what variables you want on the x- and y-axes (Figure 3-1):

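library(ggplot2)    # Load ggplot2 for ggplot() and geom_col()
library(gcookbook)  # Load gcookbook for the pg_mean data set

# One bar per group, with bar height taken from the weight column
ggplot(pg_mean, aes(x = group, y = weight)) +
  geom_col()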
Figure ...
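If gcookbook is not installed, a similar graph can be sketched from R's built-in PlantGrowth data set. This assumes that pg_mean holds the mean weight for each group of PlantGrowth, which is an assumption here rather than something stated in this recipe. The name `pg_mean_sketch` is hypothetical, chosen to avoid clashing with the real data set.

```r
library(ggplot2)

# Assumption: pg_mean is equivalent to per-group mean weights of PlantGrowth.
# aggregate() returns a data frame with one row per group.
pg_mean_sketch <- aggregate(weight ~ group, data = PlantGrowth, FUN = mean)

# Same mapping as in the recipe: group on the x-axis, weight on the y-axis
p <- ggplot(pg_mean_sketch, aes(x = group, y = weight)) +
  geom_col()
```

Printing `p` draws one bar per group. Because the data frame already contains the value for each bar, geom_col() is the appropriate geom.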
