97 Things About Ethics Everyone in Data Science Should Know

by Bill Franks
August 2020
Beginner
344 pages
10h 23m
English
O'Reilly Media, Inc.
Content preview from 97 Things About Ethics Everyone in Data Science Should Know

Chapter 30. Anonymizing Data Is Really, Really Hard

Damian Gordon

Data analytics holds the promise of a more profound and complete understanding of the world around us. Many have claimed that, because data is now ubiquitous, it has finally become possible to automate everything from value creation to organizational adaptability. Achieving this requires large quantities of data about people and their behaviors. But there is a balance to be struck between the need for this very detailed data and the right of individuals to maintain their privacy. One approach to this challenge is to remove some of the key identifiers from a dataset, sometimes called the “name data,” which typically includes fields such as Name, Address, and Social Security Number. These fields appear to be the characteristics that uniquely identify an individual. Unfortunately, a wide range of techniques allows others to de-anonymize such data.
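To make the weakness of this approach concrete, here is a minimal Python sketch that strips the name data from a handful of records and then checks how many rows remain unique on the fields left behind. Everything in it is an assumption made for illustration: the records are invented, and the remaining fields (zip, birth_year, gender) and the helper names strip_name_data and unique_fraction are hypothetical, not taken from the chapter.

```python
from collections import Counter

# Invented records for illustration only; no real people or identifiers.
records = [
    {"name": "Person A", "address": "1 Elm St", "ssn": "000-00-0001",
     "zip": "02138", "birth_year": 1984, "gender": "F"},
    {"name": "Person B", "address": "2 Oak Ave", "ssn": "000-00-0002",
     "zip": "02138", "birth_year": 1984, "gender": "F"},
    {"name": "Person C", "address": "3 Pine Rd", "ssn": "000-00-0003",
     "zip": "02139", "birth_year": 1990, "gender": "M"},
    {"name": "Person D", "address": "4 Ash Ln", "ssn": "000-00-0004",
     "zip": "02138", "birth_year": 1975, "gender": "M"},
]

# The "name data" fields named in the text.
NAME_DATA = {"name", "address", "ssn"}


def strip_name_data(record):
    """Return a copy of the record without the name-data fields."""
    return {key: value for key, value in record.items() if key not in NAME_DATA}


def unique_fraction(rows, fields):
    """Fraction of rows whose combination of `fields` occurs exactly once."""
    combos = Counter(tuple(row[f] for f in fields) for row in rows)
    unique_rows = sum(1 for row in rows if combos[tuple(row[f] for f in fields)] == 1)
    return unique_rows / len(rows)


anonymized = [strip_name_data(r) for r in records]

# Assumed remaining fields; a real dataset may have different ones.
fraction = unique_fraction(anonymized, ["zip", "birth_year", "gender"])
print(f"{fraction:.0%} of rows are still unique after removing name data")
```

In this toy dataset, half of the rows are still singled out by the remaining fields alone, so anyone holding another dataset with the same fields plus names could link them back to individuals. The sketch only measures uniqueness; it does not perform such a linkage.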

Some datasets can be de-anonymized by very rudimentary means. For example, some individuals in a dataset of anonymous movie reviews were identified simply by searching for similarly worded reviews on websites that are not anonymous, such as IMDb. In another case, AOL released a list of 20 million web search queries it had collected, and two ...


Publisher Resources

ISBN: 9781492072652