Chapter 1

Analysis of Over- and Underdispersed Data

Elizabeth Juarez-Colunga and C. B. Dean

1.1 Introduction

In the analysis of discrete data, such as count data analyzed under a Poisson model or binary data analyzed under a binomial model, the empirical variance quite often exceeds the theoretical variance under the presumed model. This phenomenon is called overdispersion. If overdispersion is ignored, standard errors of parameter estimates will be underestimated, and therefore p-values for hypothesis tests will be too small, which can lead to incorrectly declaring a predictor significant when in fact it may not be.
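As a rough illustration of what overdispersion looks like in count data, the sketch below simulates two samples with the same mean and compares the ratio of sample variance to sample mean, which should be close to 1 under a Poisson model. The gamma-mixing mechanism, the helper names (poisson_draw, dispersion_ratio), and all numeric settings are assumptions chosen for this example; they are not taken from the chapter.

```python
import math
import random
import statistics


def poisson_draw(rng, lam):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def dispersion_ratio(counts):
    """Sample variance divided by sample mean; about 1 under a Poisson model."""
    return statistics.variance(counts) / statistics.mean(counts)


rng = random.Random(2024)  # arbitrary seed, for reproducibility only
mean_count, shape, n = 5.0, 2.0, 20000  # illustrative values, not from the text

# Counts that really are Poisson with a common mean.
poisson_counts = [poisson_draw(rng, mean_count) for _ in range(n)]

# Counts whose Poisson mean varies from unit to unit (gamma-distributed with
# the same average), which inflates the variance beyond the Poisson value.
mixed_counts = [
    poisson_draw(rng, rng.gammavariate(shape, mean_count / shape))
    for _ in range(n)
]

print("Poisson sample:      ", round(dispersion_ratio(poisson_counts), 2))
print("Gamma-mixed sample:  ", round(dispersion_ratio(mixed_counts), 2))
```

Under these assumed settings the theoretical variance of the mixed counts is mean_count + mean_count²/shape = 17.5, so its ratio should come out near 3.5, while the pure Poisson sample should stay near 1. A ratio well above 1 is only a descriptive warning sign here, not a formal test.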

The Poisson and binomial distributions are simple models, but they carry strict assumptions. In particular, each assumes a specific mean-variance relationship because it is determined by a single parameter. By contrast, the normal distribution is determined by two parameters, the mean μ and the variance σ², which separately characterize the location of the data and their spread around the mean. In both the Poisson and binomial distributions, the variance is fixed once the mean or the probability of success has been specified.
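The fixed mean-variance relationship can be checked numerically. The following minimal sketch computes the mean and variance directly from the probability mass functions, showing that the Poisson variance equals its mean and that the binomial variance equals n·p·(1 − p). The function names, the truncation point for the Poisson support, and the parameter values are illustrative assumptions.

```python
import math


def moments(pmf):
    """Mean and variance of a distribution given as {value: probability}."""
    mean = sum(y * p for y, p in pmf.items())
    var = sum((y - mean) ** 2 * p for y, p in pmf.items())
    return mean, var


def poisson_pmf(mu, upper=200):
    """Poisson(mu) probabilities on 0..upper.

    The support is truncated at `upper`; the omitted tail is negligible
    for the small means used below.
    """
    pmf, p = {}, math.exp(-mu)
    for y in range(upper + 1):
        pmf[y] = p
        p *= mu / (y + 1)  # recursion: P(y + 1) = P(y) * mu / (y + 1)
    return pmf


def binomial_pmf(n, prob):
    """Binomial(n, prob) probabilities on 0..n."""
    return {
        y: math.comb(n, y) * prob**y * (1 - prob) ** (n - y)
        for y in range(n + 1)
    }


for mu in (0.5, 3.0, 10.0):
    mean, var = moments(poisson_pmf(mu))
    print(f"Poisson   mean={mean:.4f}  variance={var:.4f}")

n = 20
for prob in (0.1, 0.5, 0.8):
    mean, var = moments(binomial_pmf(n, prob))
    print(f"Binomial  mean={mean:.4f}  variance={var:.4f}  "
          f"n*p*(1-p)={n * prob * (1 - prob):.4f}")
```

In each case, choosing the mean (or the success probability, with n known) leaves no freedom in the variance, which is why data more variable than this cannot be accommodated without extending the model.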

Hilbe [25] provides a comprehensive discussion of what he calls apparent overdispersion: scenarios in which the data exhibit variation beyond what the model can explain, but the lack of fit is due to several “fixable” causes. These may include omitting important predictors from the model, the presence of outliers, ...
