10 LOGISTIC REGRESSION

In this chapter, we describe the popular and powerful classification method called logistic regression. Like linear regression, it relies on a specific model relating the predictors to the outcome. The user must specify which predictors to include and their form (e.g., including any interaction terms). Because the model structure is specified in advance, even small datasets can be used for building logistic regression classifiers, and once the model is estimated, it is computationally fast and cheap to classify even large samples of new records. We describe the logistic regression model formulation and its estimation from data. We also explain the concepts of “logit,” “odds,” and “probability” of an event that arise in the logistic model context and the relations among the three. We discuss variable importance and coefficient interpretation, variable selection for dimension reduction, and extensions to multi‐class classification.
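
As a preview of how probability, odds, and logit relate, the following minimal Python sketch converts among the three using the standard definitions: odds = p / (1 - p) and logit = log(odds). The function names are illustrative choices made here, not part of JMP or of any particular library.

```python
import math


def odds_from_probability(p: float) -> float:
    """Odds of an event with probability p: p / (1 - p)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1.0 - p)


def probability_from_odds(odds: float) -> float:
    """Invert the odds: p = odds / (1 + odds)."""
    if odds < 0.0:
        raise ValueError("odds must be non-negative")
    return odds / (1.0 + odds)


def logit_from_probability(p: float) -> float:
    """Logit is the natural log of the odds."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must satisfy 0 < p < 1")
    return math.log(p / (1.0 - p))


def probability_from_logit(z: float) -> float:
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0.0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)
```

For example, a probability of 0.8 corresponds to odds of 4 (four to one), and a probability of 0.5 corresponds to odds of 1 and a logit of 0. Probabilities are confined to the interval from 0 to 1, odds range over the non-negative numbers, and the logit can take any real value.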

Logistic Regression in JMP: Logistic regression models can be fit using the standard version of JMP. However, to compute validation statistics using a validation column or to fit regularized regression models, JMP Pro is required.

10.1 INTRODUCTION

Logistic regression extends the ideas of linear regression to the situation where the outcome variable Y is categorical. We can think of a categorical variable as dividing the records into classes. For example, if Y denotes a recommendation on holding/selling/buying a stock, we have a categorical variable ...
