Chapter 43. Imbalance of Factors Affecting Societal Use of Data Science

Nenad Jukić

There is no data science without data. For societal improvements, the most useful data is generated by and/or based on humans. Many factors affect the use of such data, including the need for privacy, the motivation for analysis, and the benefits of shared collective data. For the purpose of this essay, we will assume that the term “shared data” refers to data that combines entries produced by and/or based on multiple humans, as opposed to data that represents one individual.

In contemporary discussions about human-related data, privacy is often perceived as a standalone issue, more worthy of attention than other relevant issues. This state of affairs, where privacy is considered in isolation from all other issues, is an obstacle to progress in the societally beneficial use of data. There is no doubt that serious improvements in areas such as health care, education, and environmental protection, among others, would be possible if skilled data scientists with good intentions had access to more relevant data.

The issues of privacy and the issues of the use of shared data are treated very differently, depending on the motivation for data analysis as well as the ability to leverage the privilege ...
