Using Log-linear Models for Simple Contingency Tables
It is worth repeating these simple examples with a log-linear model so that when we analyse more complex cases you have a feel for what the GLM is doing. Recall that the deviance for a log-linear model of count data (p. 516) is
$$\text{deviance} = 2 \sum O \ln\left(\frac{O}{E}\right)$$
where O is a vector of observed counts and E is a vector of expected counts. Our first example had 29 males and 18 females and we wanted to know if the sex ratio was significantly male-biased:
observed<-c(29,18)
summary(glm(observed~1,poisson))
Null deviance: 2.5985 on 1 degrees of freedom
Residual deviance: 2.5985 on 1 degrees of freedom
AIC: 14.547
Number of Fisher Scoring iterations: 4
Only the bottom part of the summary table is informative in this case. The residual deviance is compared with the chi-squared distribution on 1 d.f. to obtain a p value:
1-pchisq(2.5985,1)
[1] 0.1069649
We cannot reject the null hypothesis that the sex ratio is 50:50 (p = 0.107).
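To see where the residual deviance comes from, the figure can be reproduced directly from the counts. What follows is a minimal sketch, assuming the Poisson deviance takes its usual form 2 Σ O ln(O/E) and that the expected counts under the null hypothesis are an equal split of the total. The names expected, deviance_by_hand and p_value are illustrative and do not appear in the original.

```r
# Observed counts from the text: 29 males and 18 females
observed <- c(29, 18)

# Expected counts under a 50:50 sex ratio: an equal split of the total
expected <- rep(sum(observed) / 2, 2)

# Deviance computed by hand: 2 * sum(O * log(O / E))
deviance_by_hand <- 2 * sum(observed * log(observed / expected))
deviance_by_hand

# Upper-tail chi-squared probability on 1 d.f.
# (equivalent to 1 - pchisq(deviance_by_hand, 1))
p_value <- pchisq(deviance_by_hand, df = 1, lower.tail = FALSE)
p_value
```

The result should agree with the residual deviance of 2.5985 reported by glm, because the intercept-only model fits the same mean (23.5) to both counts.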
In the case of Mendel's peas we had a four-level categorical variable (i.e. four phenotypes) and the null hypothesis was a 9:3:3:1 distribution of traits:
observed<-c(315,101,108,32)
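Before any factors are introduced, the 9:3:3:1 hypothesis can be checked directly from these four counts. The sketch below is one possible route and not necessarily the one the text takes next. It assumes the counts are listed in the same order as the ratio (most common phenotype first), and the names ratio, expected, G and fit are illustrative. It computes the deviance by hand and then recovers the same value from a Poisson GLM in which the log of the hypothesised ratio enters as an offset.

```r
# Observed phenotype counts from the text
observed <- c(315, 101, 108, 32)

# Hypothesised 9:3:3:1 ratio (assumed to be in the same order as the counts)
ratio <- c(9, 3, 3, 1)

# Expected counts: the total shared out in proportion to the ratio
expected <- sum(observed) * ratio / sum(ratio)

# Deviance by hand: 2 * sum(O * log(O / E))
G <- 2 * sum(observed * log(observed / expected))

# Four categories with only the total fixed gives 3 d.f.
p_value <- pchisq(G, df = length(observed) - 1, lower.tail = FALSE)

# The same test as a log-linear model: the offset fixes the ratio,
# so the intercept only has to estimate the overall scale
fit <- glm(observed ~ 1 + offset(log(ratio)), family = poisson)
deviance(fit)      # residual deviance, should equal G
df.residual(fit)   # residual degrees of freedom
```

A large p value here would indicate that the counts are consistent with the 9:3:3:1 ratio.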
We need vectors of length 4 for the two seed traits, shape and colour:
shape<-factor(c("round","round","wrinkled","wrinkled"))
colour<-factor(c("yellow","green","yellow","green"))
Now we fit a saturated model (model1) and a model without the interaction term (model2) ...
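The excerpt stops before the model calls are shown, so the following is only a sketch of what fitting the two models might look like with the standard glm formula interface. The names model1 and model2 follow the text, but the exact calls and the comparison by anova are assumptions.

```r
# Data and factors as defined in the text
observed <- c(315, 101, 108, 32)
shape <- factor(c("round", "round", "wrinkled", "wrinkled"))
colour <- factor(c("yellow", "green", "yellow", "green"))

# Saturated model: both main effects plus the shape:colour interaction.
# It has one parameter per count, so it fits the data exactly.
model1 <- glm(observed ~ shape * colour, family = poisson)

# Main-effects model: the interaction is dropped, which corresponds to
# assuming that shape and colour are independent
model2 <- glm(observed ~ shape + colour, family = poisson)

# Compare the two fits; the change in deviance on 1 d.f. tests the interaction
anova(model2, model1, test = "Chisq")
```

Note that dropping the interaction tests whether the two traits are independent, which is not the same question as whether the counts follow a 9:3:3:1 ratio.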