Chapter 15. QQ Plots

Comparing Sets of Numbers

It can be quite useful to compare the distributions of two sets of numbers; for example, two variables or two vectors. The sets of numbers might both be sets of measurements, or one might be a theoretical distribution. For example, we might want to see how a particular variable compares with the theoretical “normal” distribution.
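As a minimal sketch of the second case, comparing a sample with the theoretical normal distribution, base R's qqnorm() and qqline() can be used. The data here are simulated with rexp() purely for illustration; the sample, its size, and the seed are assumptions and have nothing to do with the dataset introduced next.

```r
# Simulated, right-skewed sample (illustrative only; not from the book)
set.seed(42)
x <- rexp(200, rate = 1)

# Compute the QQ pairs without drawing anything:
# qq$x holds the theoretical normal quantiles, qq$y the sample values
qq <- qqnorm(x, plot.it = FALSE)

# Draw the QQ plot and add a reference line through the quartiles
qqnorm(x, main = "Normal QQ plot of a skewed sample")
qqline(x, col = "blue", lwd = 2)
```

Points that stay close to the reference line suggest the sample is roughly normal; points that bend away from it, as they should for this skewed sample, indicate a departure from normality.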

In the United States and many other parts of the world, it is customary for customers to leave a tip for people who perform services. Just how much to give is a topic of frequent discussion among restaurant patrons. The reshape2 package includes a dataset, tips, compiled by a waiter from the tips his own customers gave him. Let’s take a look inside this interesting dataset:

> library(reshape2)
> attach(tips)
> head(tips)
  total_bill  tip    sex smoker day   time size
1      16.99 1.01 Female     No Sun Dinner    2
2      10.34 1.66   Male     No Sun Dinner    3
3      21.01 3.50   Male     No Sun Dinner    3
4      23.68 3.31   Male     No Sun Dinner    2
5      24.59 3.61 Female     No Sun Dinner    4
6      25.29 4.71   Male     No Sun Dinner    4

Now, we’ll try to learn more about the tip variable. First, how are the tips distributed? We could plot the density of tip to get an idea of that:

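# Figure 15-1a
library(reshape2)      # provides the tips dataset
attach(tips)           # lets us refer to tip directly
par(mfrow = c(3,2))    # 3 x 2 grid of panels; this plot fills the first
# Kernel density estimate of the tip amounts
plot(density(tip),
  main = "a. Density(tip)",
  col = "blue",
  lwd = 2)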

The plot in Figure 15-1a shows that the distribution is quite skewed; that is, it has a long tail to the right. In other words, a few patrons give relatively ...
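The right skew visible in the density plot can also be checked numerically. The following is a rough sketch rather than a formal test of skewness: in a right-skewed distribution the mean is typically pulled above the median, and the upper tail extends farther from the median than the lower tail does. The variable name tip_values is introduced here only for illustration.

```r
# Pull the tip column directly, without attach()
tip_values <- reshape2::tips$tip

# Five-number summary plus the mean
summary(tip_values)

# Informal skew checks: is the mean above the median, and is the
# upper tail longer than the lower tail?
mean_above_median <- mean(tip_values) > median(tip_values)
upper_tail <- max(tip_values) - median(tip_values)
lower_tail <- median(tip_values) - min(tip_values)

mean_above_median
upper_tail > lower_tail
```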


Graphing Data with R by
