Chapter 20. Mosaic Plots
Graphing Categorical Data
Most of the graphs we have studied so far have been of quantitative variables. In a few cases, we have mixed quantitative and categorical variables, usually by making the distinct values of the categorical variable(s) define groups, each one having its own graph. Sometimes, however, all of the variables of interest are categorical. This requires special graphical methods.
Let’s consider a dataset in the epiDisplay package. You will need to install this package, as well as vcd, which includes some functions for working with categorical variables. Here’s how to do that:
> install.packages("epiDisplay")
> install.packages("vcd")
> library(epiDisplay)
> library(vcd)
We will be looking at the ANCdata dataset. You’ll need to get some information about this dataset:
> ?ANCdata
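Alongside the help page, the data frame itself can be inspected from the console. The lines below are a minimal sketch, assuming epiDisplay is already installed; the requireNamespace() guard is an addition for illustration, so that the snippet does nothing if the package is missing.

```r
# Assumption: epiDisplay is installed; otherwise the block is skipped.
if (requireNamespace("epiDisplay", quietly = TRUE)) {
  # Load only the dataset, without attaching the whole package
  data(ANCdata, package = "epiDisplay")

  # Show each variable's type and first few values
  str(ANCdata)

  # Count the distinct values in each variable
  print(sapply(ANCdata, function(x) length(unique(x))))
}
```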
This data is from a study of the types of care given to women with high-risk pregnancies in two clinics. There are three variables, all categorical, and each has only two values, or levels. We would like to know if perinatal mortality (i.e., a stillborn fetus or the death of a newborn within seven days) is related to the type of treatment or the clinic in which care was received. Let’s first look at the relationship between death and anc (treatment). The table() command shown in the following script will count the number of observations in each combination of the two variables:
# Table 20-1
library(epiDisplay)
library(vcd)
attach(ANCdata)
xtab1 = table(death,anc)
...
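To see what table() does with two categorical variables, it can help to try it on a tiny example where the counts are easy to check by hand. The sketch below uses a hypothetical six-row data frame named toy; its values are made up for illustration and are not taken from ANCdata. It also uses with() rather than attach(), which is a stylistic choice here, not the book's approach.

```r
# Hypothetical toy data, NOT the ANCdata values: six made-up observations.
toy <- data.frame(
  death = c("no", "no", "yes", "no", "yes", "no"),
  anc   = c("old", "new", "old", "new", "old", "old")
)

# Cross-tabulate: rows are the levels of death, columns the levels of anc.
# with() looks the variable names up inside toy, so attach() is not needed.
xtab_toy <- with(toy, table(death, anc))
print(xtab_toy)

# Add row and column totals to the table of counts
print(addmargins(xtab_toy))

# Convert counts to proportions within each column (each anc level)
print(prop.table(xtab_toy, margin = 2))
```

Each cell holds the number of rows that share that combination of levels, and a combination that never occurs still appears in the table with a count of zero.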