Chapter 12. Unbiased ≠ Fair: For Data Science, It Cannot Be Just About the Math

Doug Hague

As I have been thinking through the ethical implications in data science, one thing has become glaringly obvious to me: data scientists like math! Nothing very surprising there. But as we go about our work building models and making great predictions, we tend to reduce the conversation about ethics to mathematical terms. Is my prediction for Caucasian Americans the same as for African Americans? Are female predictions equivalent to male ones? We develop confusion matrices and measure the accuracy of our predictions. Or maybe the sensitivity (true positive rate) or the specificity (true negative rate) is important, so we balance that for various subgroups. Unfortunately, mathematicians have shown that while we may be able to balance the accuracy, specificity, or other measures of bias for real datasets, we cannot balance them all and make perfectly unbiased models. So we do the best we can within the framework we are provided and declare that our model is fair.
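To make the subgroup comparison above concrete, here is a minimal Python sketch that builds a confusion matrix per subgroup and reports accuracy, sensitivity, and specificity for each. The function name `group_rates`, the group labels "A" and "B", and the toy labels and predictions are all hypothetical and chosen only for illustration; they do not come from any real dataset or model.

```python
from collections import defaultdict


def group_rates(y_true, y_pred, groups):
    """Return accuracy, sensitivity (TPR), and specificity (TNR) per subgroup.

    Assumes binary labels and predictions encoded as 0 and 1.
    """
    # One confusion matrix (as four counters) per subgroup.
    counts = defaultdict(lambda: {"tp": 0, "fn": 0, "tn": 0, "fp": 0})
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth == 1:
            key = "tp" if pred == 1 else "fn"
        else:
            key = "tn" if pred == 0 else "fp"
        counts[group][key] += 1

    rates = {}
    for group, c in counts.items():
        positives = c["tp"] + c["fn"]
        negatives = c["tn"] + c["fp"]
        total = positives + negatives
        rates[group] = {
            "accuracy": (c["tp"] + c["tn"]) / total,
            # A rate is undefined when a subgroup has no such cases.
            "sensitivity": c["tp"] / positives if positives else None,
            "specificity": c["tn"] / negatives if negatives else None,
        }
    return rates


# Hypothetical toy data: four cases in each of two subgroups.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

rates = group_rates(y_true, y_pred, groups)
for group, metrics in sorted(rates.items()):
    print(group, metrics)
```

In this toy data the two subgroups have the same accuracy, yet their sensitivity and specificity differ, so a model can look balanced on one measure while remaining unbalanced on others.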

After studying the issues and applications, I assert that models that balance bias are not fair. Fairness really does not pay attention to mathematics. It pays attention to individual viewpoints, societal and cultural norms, and morals. In other words, fairness is defined ...
