Sometimes, it can be helpful to look at summary information about a group of numbers instead of the numbers themselves. One type of graph that does this by breaking the data into well-defined ranges of numbers is the box plot. We will try this graph on a relatively large dataset, one with which our previous types of graphs do not work very well.
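As a small illustration of what "summary information" means here, the following sketch reduces a handful of numbers to five values (minimum, lower hinge, median, upper hinge, maximum), which are the kind of range boundaries a box plot draws. The vector scores is invented for this example and is not part of any dataset used in this chapter:

```r
# Hypothetical data, made up purely for illustration
scores <- c(2, 4, 5, 7, 8, 10, 11, 13, 20)

# fivenum() returns Tukey's five-number summary:
# minimum, lower hinge, median, upper hinge, maximum
five <- fivenum(scores)
five
```

Nine numbers have been replaced by five, and the same reduction applies no matter how many values the original vector holds.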
There are some interesting datasets in the nlme package. Get this package and load it by using these commands:
> install.packages("nlme")
> library(nlme)
Next, take a look at the MathAchieve dataset. With more than 7,000 rows, this is much larger than the datasets we have dealt with previously. What problems will this create for us if we want to examine the distribution of MathAch scores? Let’s see what happens with a strip chart of this data.
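Before plotting, one way to take that first look at the data is sketched here. It assumes only that nlme is installed, and uses standard base R functions to report the size of the data frame and a numeric summary of the MathAch column:

```r
library(nlme)                 # provides the MathAchieve data frame

dim(MathAchieve)              # number of rows and number of columns
head(MathAchieve)             # first six rows, to see the variables
summary(MathAchieve$MathAch)  # minimum, quartiles, mean, and maximum of the scores
```

The row count reported by dim() confirms how many points any one-dimensional plot of MathAch will have to display.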
In the code that produces Figure 5-1, as in many following examples, the mfrow argument is used in par() to make multiple graphs appear on one page. The format is mfrow = c(i,j), where i is the number of rows of graphs and j is the number of columns:
# Figure 5-1
library(nlme)
par(mfrow = c(2, 1))  # set up one graph above another: 2 rows/1 col
# pch = 19 (a number) draws a solid circle; the string "19" would plot only the character "1"
stripchart(MathAchieve$MathAch, method = "jitter",
  main = "a. Math Ach Scores, pch = 19",
  xlab = "Scores", pch = 19)
stripchart(MathAchieve$MathAch, method = "jitter",
  main = "b. Math Ach Scores, pch = '.'",
  xlab = "Scores", pch = ".")
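One side effect of par(mfrow = ...) is that the layout stays in force for later plots on the same device. A minimal sketch of saving and restoring the previous setting follows. The names old_par, layout_during, and layout_after are invented for this example, and the pdf(NULL) call is only there so that the sketch runs without opening a window or writing a file; in an interactive session it and the final dev.off() can be left out:

```r
pdf(NULL)                        # null graphics device: nothing is drawn or saved

old_par <- par(mfrow = c(2, 1))  # par() returns the settings it replaced
layout_during <- par("mfrow")    # layout now in force: 2 rows, 1 column

par(old_par)                     # put the earlier layout back
layout_after <- par("mfrow")     # back to one graph per page

invisible(dev.off())             # close the null device
```

Keeping the value returned by par() makes it possible to undo the change without having to remember what the earlier layout was.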
These strip charts show the results ...