CHAPTER 7 Models for Count Data

Many response variables have counts as their possible outcomes. Examples are the number of alcoholic drinks you had in the previous week, and the number of devices you own that can access the internet (laptops, smart cell phones, tablets, etc.). Counts also occur as entries in cells of contingency tables that cross-classify categorical variables, such as the number of people in a survey who are female, college educated, and agree that humans are responsible for climate change. In this chapter we introduce generalized linear models (GLMs) for count response variables.

Section 7.1 presents models that assume a Poisson distribution for a count response variable. The loglinear model, using a log link to connect the mean with the linear predictor, is most common. The model can be adapted to model a rate when the count is based on an index such as space or time. Section 7.2 shows how to use Poisson and related multinomial models for contingency tables to analyze conditional independence and association structure for a multivariate categorical response variable. For the Poisson distribution, the variance must equal the mean, and data often exhibit greater variability than this. Section 7.3 introduces GLMs that assume a negative binomial distribution, which handles such overdispersion in a natural way. Many datasets show greater frequencies of zero counts than standard models allow, often because some subjects can have a zero outcome by chance but some ...
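As a minimal sketch of the Poisson loglinear model for a rate described above, the following fits log(mu_i) = log(t_i) + b0 + b1 * x_i by Newton-Raphson using only the Python standard library. The function names `fit_poisson_loglinear` and `pearson_dispersion`, the single-predictor setup, and the small dataset are all hypothetical and chosen for illustration; they are not taken from the chapter. The Pearson statistic divided by its residual degrees of freedom is included as one common informal check for overdispersion, under the Poisson assumption that the variance equals the mean.

```python
import math

def fit_poisson_loglinear(x, y, exposure, n_iter=50, tol=1e-10):
    """Fit log(mu_i) = log(t_i) + b0 + b1 * x_i by Newton-Raphson.

    The log(t_i) term is an offset, so exp(b0 + b1 * x_i) models a rate.
    """
    offset = [math.log(t) for t in exposure]
    # Start from the intercept-only fit: overall rate, zero slope.
    b0, b1 = math.log(sum(y) / sum(exposure)), 0.0
    for _ in range(n_iter):
        mu = [math.exp(o + b0 + b1 * xi) for o, xi in zip(offset, x)]
        # Score vector (gradient of the Poisson log-likelihood).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Information matrix entries (negative Hessian).
        h00 = sum(mu)
        h01 = sum(xi * mi for xi, mi in zip(x, mu))
        h11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system explicitly.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1

def pearson_dispersion(y, mu, n_params):
    """Pearson chi-squared statistic divided by residual degrees of freedom."""
    chi2 = sum((yi - mi) ** 2 / mi for yi, mi in zip(y, mu))
    return chi2 / (len(y) - n_params)

# Hypothetical data: counts y observed over exposures t at predictor values x.
x = [0, 1, 2, 3, 4, 5]
y = [1, 3, 4, 8, 10, 18]
t = [12, 10, 9, 11, 10, 13]

b0, b1 = fit_poisson_loglinear(x, y, t)
mu = [ti * math.exp(b0 + b1 * xi) for ti, xi in zip(t, x)]
dispersion = pearson_dispersion(y, mu, n_params=2)
print(f"b0 = {b0:.4f}, b1 = {b1:.4f}, dispersion = {dispersion:.3f}")
```

Here exp(b1) is the estimated multiplicative change in the rate per unit increase in x. A dispersion value well above 1 would suggest more variability than the Poisson model allows, which is the situation that motivates alternatives such as the negative binomial model. In practice a statistical library would normally be used instead of a hand-written fit.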

