12 DISCRIMINANT ANALYSIS

In this chapter, we describe the method of discriminant analysis, which is a model-based approach to classification. We discuss the main principle, in which classification is based on the distance of an observation from each of the class averages. We explain the underlying measure of “statistical distance,” which takes into account the correlation between predictors. A discriminant analysis procedure produces estimated “discriminant functions,” which are then used to compute discriminant scores that can be translated into classifications or propensities (probabilities of class membership). Finally, we discuss the underlying model assumptions, the method's practical robustness to violations of some of them, and the advantages of discriminant analysis when the assumptions are reasonably met (e.g., the sufficiency of a small training sample).
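To make the idea of distance-based classification concrete, the following is a minimal Python sketch (not JMP output, and not the book's own procedure). It assumes two predictors, two classes, and a shared covariance matrix; the class averages, the covariance values, and the new record are hypothetical toy numbers chosen only for illustration. The helper names `invert_2x2`, `statistical_distance_sq`, and `classify` are likewise invented for this sketch. The statistical distance used here is the squared Mahalanobis distance, (x - mean)' S^-1 (x - mean).

```python
# Toy illustration only: all numbers below are hypothetical, not from the book.

def invert_2x2(m):
    """Invert a 2x2 matrix given as [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

def statistical_distance_sq(x, mean, cov_inv):
    """Squared statistical (Mahalanobis) distance for two predictors."""
    d0, d1 = x[0] - mean[0], x[1] - mean[1]
    return (d0 * (cov_inv[0][0] * d0 + cov_inv[0][1] * d1)
            + d1 * (cov_inv[1][0] * d0 + cov_inv[1][1] * d1))

def classify(x, class_means, cov):
    """Assign x to the class whose average is closest in statistical distance."""
    cov_inv = invert_2x2(cov)
    distances = {label: statistical_distance_sq(x, mean, cov_inv)
                 for label, mean in class_means.items()}
    return min(distances, key=distances.get), distances

# Hypothetical class averages and a covariance matrix with correlated predictors.
class_means = {"A": (0.0, 0.0), "B": (3.0, 0.0)}
correlated_cov = [[1.0, 0.8], [0.8, 1.0]]
new_record = (2.0, 1.5)

label, distances = classify(new_record, class_means, correlated_cov)

# Same record, but ignoring correlation (identity covariance = Euclidean distance).
identity_cov = [[1.0, 0.0], [0.0, 1.0]]
euclid_label, euclid_distances = classify(new_record, class_means, identity_cov)

print(label, distances)
print(euclid_label, euclid_distances)
```

With these toy values the record is closer to the average of class B in ordinary Euclidean terms, yet closer to class A once the correlation between the two predictors is taken into account, which is why the two calls return different labels.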

Discriminant analysis in JMP: The methods discussed in this chapter are available in the standard version of JMP. JMP Pro is required, however, to compute validation statistics.

12.1 INTRODUCTION

Discriminant analysis is a classification method. Like logistic regression, it is a classical statistical technique that can be used for classification and profiling. It uses sets of measurements on different classes of records to classify new records into one of those classes (classification). Common uses of the method have been in classifying organisms into species and subspecies; classifying applications for loans, credit cards, and ...
