Chapter 3. Bar Graphs

Bar graphs are perhaps the most commonly used kind of data visualization. They’re typically used to display numeric values (on the y-axis) for different categories (on the x-axis). For example, a bar graph would be good for showing the prices of four different kinds of items. A bar graph generally wouldn’t be as good for showing prices over time, where time is a continuous variable, though it can be done, as we’ll see in this chapter.

There’s an important distinction you should be aware of when making bar graphs: sometimes the bar heights represent counts of cases in the data set, and sometimes they represent values in the data set. Keep this distinction in mind—it can be a source of confusion since they have very different relationships to the data, but the same term is used for both of them. In this chapter I’ll discuss this more, and present recipes for both types of bar graphs.
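To make the counts-versus-values distinction concrete, here is a minimal sketch. The data frames cases and prices are hypothetical and invented for illustration; they are not part of the gcookbook package. The first plot lets geom_bar() count rows per category, while the second plots numbers already stored in a column.

```r
library(ggplot2)

# Hypothetical raw data: one row per case, no numeric column to plot
cases <- data.frame(item = c("apple", "apple", "pear", "pear", "pear", "plum"))

# Hypothetical summary data: one row per category, with a value to plot
prices <- data.frame(item  = c("apple", "pear", "plum"),
                     price = c(1.20, 0.90, 1.50))

# Bar heights are counts of rows in each category (the default stat is "count")
p_counts <- ggplot(cases, aes(x = item)) + geom_bar()

# Bar heights are the values stored in the price column
p_values <- ggplot(prices, aes(x = item, y = price)) + geom_bar(stat = "identity")
```

Printing p_counts or p_values draws the corresponding graph. Both use the same geom, but only the first derives its bar heights from the number of rows in the data.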

Making a Basic Bar Graph

Problem

You have a data frame where one column represents the x position of each bar, and another column represents the vertical (y) height of each bar.

Solution

Use ggplot() with geom_bar(stat="identity") and specify what variables you want on the x- and y-axes (Figure 3-1):

library(ggplot2)   # For ggplot() and geom_bar()
library(gcookbook) # For the data set
ggplot(pg_mean, aes(x=group, y=weight)) + geom_bar(stat="identity")
Figure 3-1. Bar graph of values (with stat="identity") with a discrete x-axis
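If gcookbook is not installed, the same recipe can be tried on a small stand-in data frame. This is a sketch under stated assumptions: the data frame bars and its values are hypothetical and only mimic the shape the recipe expects (one row per bar, a categorical column and a numeric column). It also uses geom_col(), which recent versions of ggplot2 provide as a shorthand for geom_bar(stat="identity").

```r
library(ggplot2)

# Hypothetical stand-in for the recipe's data: one row per bar.
# The values are invented for illustration, not taken from pg_mean.
bars <- data.frame(group  = c("a", "b", "c"),
                   weight = c(4.2, 5.1, 3.8))

# The form used in the recipe
p_identity <- ggplot(bars, aes(x = group, y = weight)) + geom_bar(stat = "identity")

# Shorthand available in recent ggplot2 versions; should draw the same bars
p_col <- ggplot(bars, aes(x = group, y = weight)) + geom_col()
```

Printing either object draws the graph. Which form to use is mostly a matter of taste and of the ggplot2 version available.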

Discussion ...
