CHAPTER 3 Alternative Data Risks and Challenges


Recently, new legislation, like the EU General Data Protection Regulation (GDPR),1 has been enacted. The aim of GDPR is to protect all EU citizens from privacy and data breaches and to give them control over their personal data. Hence, GDPR is already affecting how investors can obtain and use alternative data whenever that data contains what may be considered the personal data of individuals in the European Union. Indeed, many alternative datasets contain personal information (e.g. credit card panel data and location data). Therefore, their use for investing must always be preceded by due diligence checks.

Let's first look more rigorously at what GDPR defines as “personal data.” It differs from, and is broader than, the US definition of “personally identifiable information” (PII). In the EU, a key question to ask when defining “personal data” is whether a person can be identified based on that data. That is, whether it is possible to reverse-engineer the data, perhaps by combining it with other data sources, and so uniquely identify that person. Hence, according to the European Commission definition, “For data to be truly anonymized, the anonymization must be irreversible.” For example, if the name was removed from a dataset of individuals but the address remained, it would be fairly straightforward to derive the name (or at least narrow it down to a household) by joining with a dataset ...
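To make the re-identification risk concrete, here is a minimal Python sketch of such a join. Everything in it is an assumption made for illustration: the two toy datasets, the field names (`address`, `monthly_spend`, `name`), and the helper `reidentify` are hypothetical and do not come from any real data source or library.

```python
# Hypothetical toy data: all names, addresses, and amounts are invented.
# The spending records have had the name removed, but the address remains.
pseudonymized_spend = [
    {"address": "12 Elm Street", "monthly_spend": 1840.0},
    {"address": "7 Oak Avenue", "monthly_spend": 920.5},
    {"address": "3 Pine Road", "monthly_spend": 455.0},
]

# A second, hypothetical dataset that maps addresses to residents.
public_register = [
    {"address": "12 Elm Street", "name": "A. Example"},
    {"address": "7 Oak Avenue", "name": "B. Sample"},
    {"address": "7 Oak Avenue", "name": "C. Sample"},
]


def reidentify(records, register, key="address"):
    """Join the two datasets on `key` and attach candidate names to each record."""
    # Index the register by the join key; one address may map to several people.
    names_by_key = {}
    for row in register:
        names_by_key.setdefault(row[key], []).append(row["name"])
    # Attach every matching name; an empty list means no match was found.
    return [
        {**record, "candidate_names": names_by_key.get(record[key], [])}
        for record in records
    ]


linked = reidentify(pseudonymized_spend, public_register)
for row in linked:
    print(row["address"], row["monthly_spend"], row["candidate_names"])
```

In this toy case the first record resolves to a single candidate name, the second is narrowed down to a two-person household, and the third stays unmatched. Even so simple a join suggests why removing the name alone may fall short of the irreversibility that the definition requires.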

