Data Mining and Predictive Analytics, 2nd Edition by Daniel T. Larose, Chantal D. Larose

Chapter 16 Cost-Benefit Analysis Using Data-Driven Costs

In Chapter 15, we were introduced to cost-benefit analysis and misclassification costs. Our goal in this chapter is to develop a methodology whereby the data themselves teach us what the misclassification costs should be; that is, cost-benefit analysis using data-driven misclassification costs. Before we can do so, however, we must turn to a more systematic treatment of misclassification costs and cost-benefit tables, deriving the following three important results:

  • Decision invariance under row adjustment
  • Positive classification criterion
  • Decision invariance under scaling
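To make the idea of a cost-benefit table concrete before these results are derived, the following is a minimal sketch in Python. The cost values, function names, and the expected-cost decision rule shown here are illustrative assumptions, not the book's specific criteria; the chapter develops the precise results listed above.

```python
# Hypothetical 2 x 2 cost-benefit table for a binary classifier.
# Key (i, j): cost incurred when a record of actual class i is classified as j.
# Negative costs represent benefits. These numbers are made up for illustration.
cost = {
    (0, 0): 0.0,    # true negative: no cost
    (0, 1): 10.0,   # false positive: cost of acting on a negative record
    (1, 0): 50.0,   # false negative: cost of missing a positive record
    (1, 1): -20.0,  # true positive: a benefit, expressed as negative cost
}

def expected_cost(predicted: int, p_positive: float) -> float:
    """Expected cost of classifying a record as `predicted`, given the
    model's confidence p_positive that the record is actually positive."""
    p_negative = 1.0 - p_positive
    return p_negative * cost[(0, predicted)] + p_positive * cost[(1, predicted)]

def classify(p_positive: float) -> int:
    """Choose the classification with the smaller expected cost."""
    return 1 if expected_cost(1, p_positive) < expected_cost(0, p_positive) else 0

if __name__ == "__main__":
    for p in (0.1, 0.3, 0.5, 0.9):
        print(f"confidence {p:.1f} -> classify as {classify(p)}")
```

With these particular (hypothetical) costs, the rule classifies a record as positive whenever the model's confidence exceeds 0.125, which illustrates how the entries of the cost-benefit table drive the classification threshold.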

16.1 Decision Invariance Under Row Adjustment

For a binary classifier, define p(i|x) to be the confidence (to be defined later) of the model for classifying a data record as i = 0 or i = 1. For example, p(1|x) represents the confidence that a given classification algorithm has in classifying a record as positive (1), given the data x. p(i|x) is also called the posterior probability of a given classification. By way of contrast, p(i) would represent the prior probability of a given classification; that is, the proportion ...
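As a rough illustration of this distinction, the sketch below uses scikit-learn with a synthetic dataset; the data, the choice of logistic regression, and the variable names are assumptions made for illustration, not taken from the text. It contrasts the posterior probability p(1|x), read from a fitted model's probability output for a particular record, with the prior probability p(1), the overall proportion of positive records.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in data; in practice X and y come from the problem at hand.
X, y = make_classification(n_samples=1000, n_features=5, random_state=0)

model = LogisticRegression().fit(X, y)

prior_p1 = y.mean()                          # p(1): overall proportion of positives
posterior_p1 = model.predict_proba(X)[:, 1]  # p(1|x): per-record confidence

print(f"prior p(1) = {prior_p1:.3f}")
print(f"posterior p(1|x) for the first record = {posterior_p1[0]:.3f}")
```

The prior is a single number computed before looking at any individual record, while the posterior varies from record to record according to the data x.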
