Using Log-linear Models for Simple Contingency Tables

It is worth repeating these simple examples with a log-linear model so that when we analyse more complex cases you have a feel for what the GLM is doing. Recall that the deviance for a log-linear model of count data (p. 516) is

$$D = 2\sum O \ln\left(\frac{O}{E}\right)$$

where O is a vector of observed counts and E is a vector of expected counts. Our first example had 29 males and 18 females and we wanted to know if the sex ratio was significantly male-biased:

observed<-c(29,18)
summary(glm(observed~1,poisson))

    Null deviance: 2.5985 on 1 degrees of freedom
Residual deviance: 2.5985 on 1 degrees of freedom
AIC: 14.547
Number of Fisher Scoring iterations: 4

Only the bottom part of the summary table is informative in this case. The residual deviance is compared with the chi-squared distribution on 1 d.f. to obtain a p value:

1-pchisq(2.5985,1)

[1] 0.1069649

We cannot reject the null hypothesis that the sex ratio is 50:50 (p = 0.107).
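As a check on what the GLM is doing here, the residual deviance can be reproduced directly from the deviance formula. This is a minimal sketch, assuming the null expectation is an equal split of the 47 individuals between the sexes; the names `expected` and `dev` are introduced for illustration only.

```r
# Observed counts: 29 males, 18 females
observed <- c(29, 18)

# Under a 50:50 sex ratio each class is expected to hold half the total
expected <- rep(sum(observed) / 2, 2)

# Poisson deviance: 2 * sum(O * log(O / E))
dev <- 2 * sum(observed * log(observed / expected))
dev

# Residual deviance of the intercept-only GLM, for comparison
deviance(glm(observed ~ 1, family = poisson))

# Upper-tail probability on 1 d.f., equivalent to 1 - pchisq(dev, 1)
pchisq(dev, df = 1, lower.tail = FALSE)
```

The intercept-only model fits the same mean to both classes, which is why its fitted values coincide with the 50:50 expectation used in the hand calculation.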

In the case of Mendel's peas we had a four-level categorical variable (i.e. four phenotypes) and the null hypothesis was a 9:3:3:1 distribution of traits:

observed<-c(315,101,108,32)

We need vectors of length 4 for the two seed traits, shape and colour:

shape<-factor(c("round","round","wrinkled","wrinkled"))
colour<-factor(c("yellow","green","yellow","green"))
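It may help to see how these three vectors line up as a two-way table before any model is fitted. The sketch below uses `xtabs` to cross-tabulate the counts by the two factors; the name `peas` is introduced here purely for illustration.

```r
observed <- c(315, 101, 108, 32)
shape <- factor(c("round", "round", "wrinkled", "wrinkled"))
colour <- factor(c("yellow", "green", "yellow", "green"))

# Cross-tabulate the counts by the two traits (a 2 x 2 contingency table)
peas <- xtabs(observed ~ shape + colour)
peas

# Append row and column totals for each trait
addmargins(peas)
```

Each element of `observed` is matched to the factor levels at the same position, so the order of the three vectors must agree.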

Now we fit a saturated model (model1) and a model without the interaction term (model2) ...
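The fitting step described here might look like the following sketch. The formulas used for `model1` and `model2` are an assumption based on the description (saturated versus no interaction), not a quotation of the original code, and comparing the two fits with `anova` and a chi-squared test is one standard way to assess the interaction.

```r
observed <- c(315, 101, 108, 32)
shape <- factor(c("round", "round", "wrinkled", "wrinkled"))
colour <- factor(c("yellow", "green", "yellow", "green"))

# Saturated model: main effects plus the shape:colour interaction (assumed form)
model1 <- glm(observed ~ shape * colour, family = poisson)

# Main-effects-only model: shape and colour act independently (assumed form)
model2 <- glm(observed ~ shape + colour, family = poisson)

# Change in deviance from dropping the interaction, tested on 1 d.f.
anova(model2, model1, test = "Chisq")
```

The saturated model reproduces the four counts exactly, so the change in deviance equals the residual deviance of `model2`. A small value relative to chi-squared on 1 d.f. would give no evidence of an interaction between seed shape and seed colour.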
