To help us think seriously about data ethics, we need case studies that we can discuss, argue about, and come to terms with as we engage with the real world. Good case studies give us the opportunity to think through problems before facing them in real life. They also show us that ethical problems aren't simple: they are multifaceted, and frequently there's no single right answer. And they help us recognize that few situations are free of ethical questions.
Princeton's Center for Information Technology Policy and Center for Human Values have created four anonymized case studies to promote the discussion of ethics. The first of these studies, Automated Healthcare App, discusses a smartphone app designed to help adult-onset diabetes patients. It raises issues like paternalism, consent, and even language choices. Is it OK to “nudge” patients toward healthier behaviors? What about automatically moderating the users’ discussion groups to emphasize scientifically accurate information? And how do you deal with minorities who don’t respond to treatment as well? Could the problem be the language used to discuss treatment?
The next case study, Dynamic Sound Identification, covers an application that can identify voices, raising issues about privacy, language, and even gender. How far should developers go in identifying potential harm that can be caused by an application? What are acceptable error rates for an application that can potentially do harm? How can a voice application handle people with different accents or dialects? And what responsibility do developers have when a small experimental tool is bought by a large corporation that wants to commercialize it?
The Optimizing Schools case study deals with the problem of finding at-risk children in school systems. Privacy and language are again issues; the case also raises the question of how decisions to use data are made. Who makes those decisions, and who needs to be informed about them? What are the consequences when people find out how their data has been used? And how do you interpret the results of an experiment? Under what conditions can you say that a data experiment has really yielded improved educational results?
The final case study, Law Enforcement Chatbots, raises issues about the tradeoff between liberty and security, entrapment, openness and accountability, and compliance with international law.
None of these issues are simple, and there are few (if any) "right answers." For example, it’s easy to react against perceived paternalism in a medical application, but the purpose of such an application is to encourage patients to comply with their treatment program. It’s easy to object to monitoring students in a public school, but students are minors, and schools by nature handle a lot of private personal data. Where is the boundary between what is, and isn’t, acceptable? What’s important isn’t reaching the correct answer on any issue, but making sure the issue is discussed and understood, and that we know what tradeoffs we are making. Equally important is getting practice in discussing ethical issues and putting that practice to work in our jobs. That’s what these case studies give us.