CHAPTER 5
Model Interpretability: The What and the Why

In this chapter, we establish why model interpretability, and how to achieve or approximate it, is a key step in discovering the circumstances under which a model might unexpectedly generate harmful predictions. We begin with a real-world example that motivates the need for humans to better understand the operation of complex black-box models. We then present model interpretability as a way to diagnose fairness issues in models that, left unaddressed, would have gone on to produce harmful predictions. Over the remainder of the chapter, we discuss why model interpretability is necessary for responsible data science, as well as how black-box models, which are otherwise opaque to humans, can be made to generate meaningful explanations via post hoc interpretability methods.
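Although this opening is conceptual, it helps to see what "generating an explanation" for a black-box model can look like in practice. As a minimal sketch (not the book's own example), the snippet below uses scikit-learn's permutation importance, a model-agnostic interpretability method: it shuffles one feature at a time and measures how much the model's held-out score degrades, revealing which features the model actually relies on. The dataset here is synthetic, chosen purely for illustration.

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.inspection import permutation_importance
    from sklearn.model_selection import train_test_split

    # Synthetic stand-in for any tabular dataset scored by a black-box model.
    X, y = make_classification(n_samples=1_000, n_features=8, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

    # Permutation importance: shuffle each feature in turn and record how much
    # the test-set score drops. Large drops mark features the model depends on,
    # which is exactly the kind of signal used to spot a model leaning on a
    # sensitive attribute (or a proxy for one).
    result = permutation_importance(model, X_test, y_test, n_repeats=10,
                                    random_state=0)
    for i, importance in enumerate(result.importances_mean):
        print(f"feature {i}: importance {importance:.3f}")

An audit like this would not prove a model fair, but a suspiciously influential feature is a natural starting point for the kind of diagnosis this chapter develops.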

The Sexist Résumé Screener

Issues in models are often discovered after deployment by the very users or clients they are intended to serve. The history of harms caused by statistical methods, both unintentionally and intentionally, ought to have given data scientists ample guidance on how to avoid repeating them. Yet even multinational corporations at the forefront of data science continue to embroil themselves in controversies caused by harmful models. Why does this cycle keep repeating itself? And what can be done now and, ...
