Chapter 3 Regression with Categorical Outcome Variables

Linear regression is one of the most widely used (and best understood) statistical techniques. However, it is typically applied when the outcome variable is continuous. Many data analysis problems instead involve predicting the value of a nominal or an ordinal categorical outcome variable. For example, we may want to predict whether a student passes or fails a course, or we may want to predict a person’s level of product satisfaction. When the outcome variable is nominal, researchers quite often use either binary or multinomial logistic regression.

In addition, regression is normally associated with continuous predictor variables, so we need to create dummy variables to represent categorical predictors in a regression model. This, however, can become unmanageable when we have many categorical variables, because each predictor with k categories requires k - 1 dummy variables.
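To make the dummy-variable idea concrete, here is a minimal sketch in plain Python (not SPSS syntax). The `dummy_code` helper and the `region` data are hypothetical and exist only for illustration. The sketch assumes the usual convention of dropping one reference category, so that a variable with k categories yields k - 1 indicator columns.

```python
def dummy_code(values, reference=None):
    """Turn one categorical variable into 0/1 indicator columns.

    The reference category is dropped, so k categories give k - 1 columns.
    Returns (column_names, rows), with one row per observation.
    """
    levels = sorted(set(values))
    if reference is None:
        reference = levels[0]  # assumption: first level alphabetically is the reference
    kept = [level for level in levels if level != reference]
    names = [f"is_{level}" for level in kept]
    rows = [[1 if value == level else 0 for level in kept] for value in values]
    return names, rows


# Hypothetical predictor with three categories
region = ["north", "south", "west", "south", "north"]
names, rows = dummy_code(region)

print(names)
for value, row in zip(region, rows):
    print(value, row)
```

Observations in the reference category are coded 0 on every column. With several multi-category predictors, the column counts add up quickly, which is the manageability problem described above.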

When researchers have an ordinal categorical outcome variable, they typically use either linear regression or logistic regression, in both cases ignoring the variable’s level of measurement. With a categorical dependent variable, however, the assumptions of linear regression are violated, and the results may be poor or, for nominal variables, meaningless. In these situations, it would be more effective to leave the variables in their original categories, yet still use them directly in regression.
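One visible symptom of applying linear regression to a categorical outcome can be sketched in plain Python. The example fits an ordinary least-squares line to a pass/fail outcome coded 0/1. The `fit_line` helper and the `hours` and `passed` data are hypothetical, invented purely for illustration. Under these assumptions, the fitted values are not confined to the 0 to 1 range that a probability of passing would require.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x for one predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: hours studied and whether the student passed (1) or failed (0)
hours = [1, 2, 3, 4, 5, 6, 7, 8]
passed = [0, 0, 0, 0, 1, 1, 1, 1]

intercept, slope = fit_line(hours, passed)
fitted = [intercept + slope * h for h in hours]

for h, value in zip(hours, fitted):
    print(h, round(value, 3))

# Fitted values outside [0, 1] cannot be read as probabilities of passing
out_of_range = [h for h, value in zip(hours, fitted) if value < 0 or value > 1]
print("hours with out-of-range fitted values:", out_of_range)
```

With this data, the fitted values for the smallest and largest numbers of hours fall below 0 and above 1 respectively, which is one reason the linear model is a poor match for a binary outcome.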

Although logistic regression is certainly an extremely useful ...
