O'Reilly logo

The R Book by Michael J. Crawley

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Plots for Single Samples

When we have a just one variable, the choice of plots is more restricted:

  • histograms to show a frequency distribution;
  • index plots to show the values of y in sequence;
  • time-series plots;
  • compositional plots like pie diagrams

Histograms

The commonest plots for a single sample are histograms and index plots. Histograms are excellent for showing the mode, the spread and the symmetry (skew) of a set of data. The R function hist is deceptively simple. Here is a histogram of 1000 random points drawn from a Poisson distribution with a mean of 1.7:

hist(rpois(1000,1.7),
           main="",xlab="random numbers from a Poisson with mean 1.7")

images

This illustrates perfectly one of the big problems with histograms: it is not clear what the bar of height 500 is showing. Is it the frequency of zeros, or the frequency of zeros and ones lumped together? What we really want in a case like this is a separate histogram bar for each integer value from 0 to 8. We achieve this by specifying the breaks on the x axis to be at −0.5, 0.5, 1.5, ... , like this:

hist(rpois(1000,1.7),breaks=seq(-0.5,9.5,1),
          main="",xlab="random numbers from a Poisson with mean 1.7")

That's more like it. Now we can see that the mode is 1 (not 0), and that 2s are substantially more frequent than 0s. The distribution is said to be ‘skew to the right’ (or ‘positively skew’) because the long tail is on the right-hand ...

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required