21 RESPONSIBLE DATA SCIENCE

In this chapter, we go beyond technical considerations of model fitting, selection, and performance and discuss the potentially harmful effects of machine learning. The catalog of harms is now extensive, including a host of cases where AI has deliberately been put to ill purposes in service of Big Brother surveillance and state suppression of minorities. Our focus, however, is on cases where the model developer's intentions are good and the resulting bias or unfairness is unintentional. We review the principles of responsible data science (RDS) and discuss a concrete framework that can govern data science work to put those principles into practice. We also discuss some key elements of that framework: datasheets, model cards, and model audits.

21.1 INTRODUCTION

Machine learning and AI bring the promise of seemingly unlimited good. After all, the ability to ingest any set of arbitrarily sized, minimally structured data and produce predictions or explanations for these data is applicable to almost every domain. Our societal attention often focuses on the revolutionary future applications of this potential: cars that drive themselves, computers that can hold natural conversations with humans, precision medications tailored to our specific genomes, cameras that can instantly recognize any object, and software that can automatically generate new images or videos. Conversations about these benefits, though, too often ignore the harms that ...
