Chapter 16: Cost-Benefit Analysis Using Data-Driven Costs
In Chapter 15, we were introduced to cost-benefit analysis and misclassification costs. Our goal in this chapter is to derive a methodology whereby the data itself teaches us what the misclassification costs should be; that is, cost-benefit analysis using data-driven misclassification costs. Before we can do so, however, we must turn to a more systematic treatment of misclassification costs and cost-benefit tables, deriving the following three important results:
- Decision invariance under row adjustment
- Positive classification criterion
- Decision invariance under scaling.
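To keep these results concrete, it helps to have a cost-benefit table in front of us. The sketch below is only an illustrative layout under assumed conventions (rows indexing the actual class, columns the predicted class), not necessarily the notation used in Chapter 15; each entry is the cost assigned to that classification outcome.

```latex
% Illustrative cost-benefit table for a binary classifier.
% Layout and symbols are assumptions for illustration: rows index the
% actual class, columns the predicted class, entries are outcome costs.
\[
\begin{array}{c|cc}
 & \text{Predicted } 0 & \text{Predicted } 1 \\ \hline
\text{Actual } 0 & \mathrm{Cost}_{\mathrm{TN}} & \mathrm{Cost}_{\mathrm{FP}} \\
\text{Actual } 1 & \mathrm{Cost}_{\mathrm{FN}} & \mathrm{Cost}_{\mathrm{TP}}
\end{array}
\]
```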
16.1 Decision Invariance Under Row Adjustment
For a binary classifier, define $p(i \mid \mathbf{x})$ to be the confidence (to be defined later) of the model for classifying a data record $\mathbf{x}$ as $i = 0$ or $i = 1$. For example, $p(1 \mid \mathbf{x})$ represents the confidence that a given classification algorithm has in classifying a record as positive (1), given the data. $p(i \mid \mathbf{x})$ is also called the posterior probability of a given classification. By way of contrast, $p(i)$ would represent the prior probability of a given classification; that is, the proportion ...
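The distinction between posterior and prior can be made concrete with a short sketch. The snippet below is an illustrative assumption, not the chapter's own example: it uses scikit-learn's LogisticRegression and a synthetic dataset as a stand-in for "a given classification algorithm," contrasting the model's posterior confidence $p(1 \mid \mathbf{x})$ for a single record with the prior $p(1)$, the overall proportion of positive records.

```python
# A minimal sketch (assumed dataset and model, not from the text) contrasting
# the posterior confidence p(1|x) with the prior probability p(1).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic binary data with roughly 20% positive records.
X, y = make_classification(n_samples=1000, n_features=5,
                           weights=[0.8, 0.2], random_state=0)

model = LogisticRegression().fit(X, y)

# Posterior p(1|x): the model's confidence, given the data record x,
# that the record belongs to the positive class (1).
posterior_pos = model.predict_proba(X[:1])[0, 1]

# Prior p(1): the overall proportion of positive records,
# ignoring the predictor values entirely.
prior_pos = y.mean()

print(f"posterior p(1|x) for the first record: {posterior_pos:.3f}")
print(f"prior p(1) (class proportion):         {prior_pos:.3f}")
```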