Chapter 16 Cost-Benefit Analysis Using Data-Driven Costs

In Chapter 15, we were introduced to cost-benefit analysis and misclassification costs. Our goal in this chapter is to derive a methodology whereby the data itself teaches us what the misclassification costs should be; that is, cost-benefit analysis using data-driven misclassification costs. Before we can do that, however, we must turn to a more systematic treatment of misclassification costs and cost-benefit tables, deriving the following three important results:

• Decision invariance under row adjustment
• Positive classification criterion
• Decision invariance under scaling

16.1 Decision Invariance Under Row Adjustment

For a binary classifier, define P(i|x) to be the confidence (to be defined later) of the model for classifying a data record as i = 0 or i = 1. For example, P(1|x) represents the confidence that a given classification algorithm has in classifying a record as positive (1), given the data x. P(i|x) is also called the posterior probability of a given classification. By way of contrast, P(i) would represent the prior probability of a given classification; that is, the proportion ...
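The contrast between prior and posterior can be made concrete with a small sketch. It assumes that the prior for each class is estimated as that class's proportion among a set of labeled records, and it writes the posterior for a single record as P(1|x), a notation assumed here. The label list, the helper name class_priors, and the confidence value 0.8 are all hypothetical and chosen only for illustration; how a model actually produces its confidence is not specified here.

```python
from collections import Counter


def class_priors(labels):
    """Estimate prior probabilities as class proportions in `labels`.

    Assumption: the prior P(i) is taken to be the share of records
    labeled i in the supplied data.
    """
    counts = Counter(labels)
    n = len(labels)
    return {cls: counts[cls] / n for cls in sorted(counts)}


# Hypothetical labels: 3 positives (1) among 10 records.
train_labels = [0, 0, 1, 0, 0, 1, 0, 0, 1, 0]
priors = class_priors(train_labels)

# Hypothetical model confidence that one particular record x is positive.
p1_given_x = 0.8
posterior = {0: 1 - p1_given_x, 1: p1_given_x}

print("prior:    ", priors)
print("posterior:", posterior)
```

The prior is the same for every record, since it depends only on the class balance, whereas the posterior changes from record to record because it is conditioned on that record's data.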

