Categorical and Dummy Variables in Regression Models

SERGIO M. FOCARDI, PhD

Partner, The Intertek Group

FRANK J. FABOZZI, PhD, CFA, CPA

Professor of Finance, EDHEC Business School

Abstract: In regression analysis, there are many situations where either the dependent variable or one or more of the regressors are categorical variables. When one or more categorical variables are used as regressors, a financial modeler must understand how to code the data, how to test for the significance of the categorical variables, and how to interpret the estimated parameters given that coding. When the dependent variable is a categorical variable, the model is a probability model.

In applying regression analysis, the financial modeler will often need to include a categorical variable rather than a continuous variable as a regressor. Categorical variables are variables that represent group membership. For example, given a set of bonds, the rating is a categorical variable that indicates the category (AA, BB, and so on) to which each bond belongs. A categorical variable does not have a numerical value or a numerical interpretation in itself. Thus the fact that a bond is in category AA or BB does not, in itself, measure any quantitative characteristic of the bond, though quantitative attributes such as a bond’s yield spread can be associated with each category.
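As a minimal sketch of what coding a categorical regressor can look like, the Python fragment below turns a list of bond ratings into 0/1 indicator (dummy) columns. The ratings, the helper name `dummy_code`, and the choice of "A" as the omitted reference category are all assumptions made for illustration and are not taken from this entry. Omitting one category is a common convention that keeps the indicator columns from being perfectly collinear with a regression intercept.

```python
# Hypothetical bond ratings; illustrative values only, not real data.
ratings = ["AA", "BB", "AA", "A", "BB"]


def dummy_code(values, reference):
    """Build 0/1 indicator columns for every category except `reference`.

    Returns a pair (column_names, rows), where rows[i][j] is 1 if
    values[i] equals column_names[j] and 0 otherwise.
    """
    categories = sorted(set(values) - {reference})
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows


# "A" is the (assumed) reference category, so it gets no column of its own.
columns, dummies = dummy_code(ratings, reference="A")

for rating, row in zip(ratings, dummies):
    print(rating, dict(zip(columns, row)))
```

Under this coding, a bond in the reference category has zeros in every indicator column, so the indicator values carry group membership only and no ordering or magnitude.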

In this entry, we will discuss how to deal with regressors that are categorical variables in a ...
