With two variables (typically the *response variable* on the *y* axis and the *explanatory variable* on the *x* axis), the kind of plot you should produce depends on the nature of your explanatory variable. When the explanatory variable is continuous, such as length or weight or altitude, the appropriate plot is a **scatterplot**. When the explanatory variable is categorical, such as genotype or colour or gender, the appropriate plot is either a **box-and-whisker plot** (when you want to show the scatter in the raw data) or a **barplot** (when you want to emphasize the effect sizes).

The most frequently used plotting functions for two variables in R are the following:

• plot(x,y): scatterplot of y against x

• plot(factor, y): box-and-whisker plot of y at levels of factor

• barplot(y): heights from a vector of y values
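As a minimal sketch of the three calls listed above, the following uses made-up data; the names x, y, treatment and means are illustrative assumptions, not objects from the text. The barplot here shows the mean of y at each factor level, which is one common way of obtaining the vector of heights.

```r
# Illustrative data, invented for this sketch
set.seed(1)
x <- seq(1, 20, length.out = 30)                       # continuous explanatory variable
y <- 2 + 0.5 * x + rnorm(30)                           # response variable
treatment <- factor(rep(c("A", "B", "C"), each = 10))  # categorical explanatory variable

# Continuous explanatory variable: scatterplot of y against x
plot(x, y)

# Categorical explanatory variable: box-and-whisker plot of y at each level
plot(treatment, y)

# Barplot: heights come from a vector, here the mean of y for each level
means <- tapply(y, treatment, mean)
barplot(means)
```

Note that barplot is given the summary vector means rather than the raw y values, because it draws one bar per element of the vector it receives.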

The plot function draws axes and adds a scatterplot of points. Two extra functions, points and lines, *add* extra points or lines to an *existing* plot. There are two ways of specifying plot, points and lines and you should choose whichever you prefer:

• Cartesian: plot(x,y)

• formula: plot(y~x)
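The two styles can be compared side by side in a short sketch. The data and the names x2, y2 and trend are invented for illustration; each group of three calls should draw the same picture, first in Cartesian form and then in formula form.

```r
# Illustrative data, invented for this sketch
x <- 1:10
y <- c(2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 19.9)
x2 <- c(2.5, 5.5, 8.5)   # extra points to add to the existing plot
y2 <- c(8, 14, 5)
trend <- 2 * x           # y values for a line to add to the existing plot

# Cartesian style: x first, then y
plot(x, y)
points(x2, y2, pch = 16)
lines(x, trend)

# Formula style: response, tilde, explanatory
plot(y ~ x)
points(y2 ~ x2, pch = 16)
lines(trend ~ x)
```

The order of the arguments is the only difference: the Cartesian calls put the explanatory variable first, whereas the formula calls put the response first.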

The advantage of the formula-based plot is that the plot function and the model fit look and feel the same (response variable, tilde, explanatory variable). If you use Cartesian plots (eastings first, then northings, like the grid reference on a map) then the plot ...
