Most of the graphs we have studied so far have been of quantitative variables. In a few cases, we have mixed quantitative and categorical variables, usually by making the distinct values of the categorical variable(s) define groups, each one having its own graph. Sometimes, however, all of the variables of interest are categorical. This requires special graphical methods.
Let’s consider a dataset in the epicalc package. You will need to install this package, as well as vcd, which includes some functions for working with categorical variables. Here’s how to do that:
> install.packages("epicalc")
> install.packages("vcd")
> library(epicalc)
> library(vcd)
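If you run these commands more than once, you may prefer to install only what is missing. The following is a minimal sketch of that idea, not part of the original instructions. The helper name missing_packages is hypothetical; requireNamespace() is a base R function. The install call is left commented out so the snippet has no side effects until you choose to run it.

```r
# Packages used in this chapter
pkgs <- c("epicalc", "vcd")

# Hypothetical helper (not from the original text): returns the names
# of the packages in `pkgs` whose namespaces cannot be loaded.
missing_packages <- function(pkgs) {
  available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!available]
}

to_install <- missing_packages(pkgs)
print(to_install)

# Uncomment to install whatever is missing:
# if (length(to_install) > 0) install.packages(to_install)
```

Packages that are already present are skipped, so the check is safe to repeat at the top of a script.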
We will be looking at the ANCdata dataset. You’ll need to get some information about this dataset:
This data is from a study of the types of care given to women with high-risk pregnancies in two clinics. There are three variables, all categorical, and each has only two values, or levels. We would like to know if perinatal mortality (i.e., a stillborn fetus or death of a newborn within seven days) is related to the type of treatment or the clinic in which care was received. Let’s first look at the relationship between death (perinatal mortality) and anc (treatment). The table() command shown in the following script will count the number of observations in each combination of the two variables:
# Table 20-1
library(epicalc)
library(vcd)
attach(ANCdata)
xtab1 = table(death,anc) # make this ...
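To see what table() produces without loading any package, here is a small self-contained sketch. The data frame toy and its eight rows are made up purely for illustration; they are not the ANCdata values, and the level names are assumptions. Only base R functions (table(), prop.table()) are used.

```r
# Made-up illustrative data: NOT the real ANCdata observations
toy <- data.frame(
  death = c("no", "no", "yes", "no", "yes", "no", "no", "yes"),
  anc   = c("old", "new", "old", "new", "old", "old", "new", "new")
)

# Cross-tabulate: rows are levels of death, columns are levels of anc.
# dnn supplies the dimension names shown in the printed table.
xtab <- table(toy$death, toy$anc, dnn = c("death", "anc"))
print(xtab)

# Column proportions: the share of each outcome within each treatment
print(prop.table(xtab, margin = 2))
```

Each cell holds the number of rows with that combination of levels. Passing the columns explicitly as toy$death and toy$anc avoids the need for attach(), which some users prefer because it keeps the search path uncluttered.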