CHAPTER 5 Models for Binary Data

For binary responses, analysts usually assume a binomial distribution for the random component of a generalized linear model (GLM). From its exponential dispersion representation (4.6) in Section 4.1.2, the binomial natural parameter is the log odds, the so-called logit. The canonical link function for binomial GLMs is therefore the logit, and the resulting model is referred to as logistic regression. This is the most important model for binary response data and has been used for a wide variety of applications. Early uses were in biomedical studies, for instance to model the effects of smoking, cholesterol, and blood pressure on the presence or absence of heart disease. The past 25 years have seen substantial use in social science research for modeling opinions (e.g., favor or oppose legalization of same-sex marriage) and behaviors, in marketing applications for modeling consumer decisions (e.g., a choice between two products), and in finance for modeling credit-related outcomes (e.g., whether a credit card bill is paid on time).
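To make the logit link concrete, the following is a minimal Python sketch of the log odds transformation and its inverse. The function names (`logit`, `inv_logit`, `success_probability`) and the single-predictor form with coefficients `beta0` and `beta1` are illustrative assumptions, not notation taken from the text.

```python
import math


def logit(p: float) -> float:
    """Return the log odds log(p / (1 - p)) for a probability p in (0, 1)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.log(p / (1.0 - p))


def inv_logit(eta: float) -> float:
    """Inverse of the logit: map a real-valued linear predictor to (0, 1)."""
    # Branch on the sign of eta so that exp() is never called on a large
    # positive argument, which would overflow.
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    e = math.exp(eta)
    return e / (1.0 + e)


def success_probability(beta0: float, beta1: float, x: float) -> float:
    """Hypothetical one-predictor logistic regression:
    logit(P(Y = 1 | x)) = beta0 + beta1 * x.
    """
    return inv_logit(beta0 + beta1 * x)
```

Under this sketch, a one-unit increase in `x` adds `beta1` to the log odds, which is the same as multiplying the odds of success by `exp(beta1)`.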

In this chapter we focus on logistic regression and other models for binary response data. Section 5.1 presents some link functions and a latent variable model that motivates particular cases. Section 5.2 shows properties of logistic regression models and interprets their parameters. In Section 5.3 we apply GLM methods to specify likelihood equations and then conduct inference based on the logistic regression model. Section ...
