In this episode of the Data Show, I spoke with Aurélie Pols of Mind Your Privacy, one of my go-to resources when it comes to data privacy and data ethics. This interview took place at Strata Data London, a couple of days before the EU General Data Protection Regulation (GDPR) took effect. I wanted her perspective on this landmark regulation, as well as her take on trends in data privacy and growing interest in ethics among data professionals.
Here are some highlights from our conversation:
GDPR is just the starting point
GDPR is not an end point. It's a starting point for a journey where a balance between companies and society and users of data needs to be redefined. Because when I look at my children, I look at how they use technology, I look at how smart my house might become or my car or my fridge, I know that in the long run this idea of giving consent to my fridge to share data is not totally viable. What are we going to build for the next generations?
... I've been teaching privacy and ethics in Madrid at the IE Business School, one of the top business schools in the world. I’ve been teaching in the big data and analytics graduate program. I see the evolution as well. Five years ago, they looked at me like, 'What is she talking about?' Three years ago, some of the people in the room started to understand. ... Last year it was like 'We get it.'
Privacy by design
It's defined as data protection by design and by default as well. The easy part is more the default settings: when you create systems, it's the question I ask 20 times a week: 'Great. I love your system. What data do you collect by default and what do you pass on by default?' Then you start turning things off and then we'll see who takes on the responsibility to turn things on again. That's a default. Privacy by design was pushed by Ann Cavoukian from Ottawa in Canada more than 10 years ago.
These principles are finding themselves within the legislation. Not only in GDPR—for example, Hong Kong is starting to talk about this and Japan as well. One of these principles is about positive-sum, not zero-sum. It's not 'I win and you lose.' It's 'we work together and we both win.' That's a very good principle.
There are interesting challenges within privacy by design to translate these seven principles into technical requirements. I think there are opportunities as well. It talks about traceability, visibility, transparency. Which then comes back again to, we're sitting on so much data; how much data do we want to surface and are data subjects or citizens ready to understand what we have, and are they able to make decisions based on that? ... Hopefully this generation of more ethically minded engineers or data scientists will start thinking in that way as well.
Related resources:
- "The data subject first?": Aurélie Pols draws a broad philosophical picture of the data ecosystem and then hones in on the right to data portability.
- "Managing risk in machine learning models": Andrew Burt and Steven Touw on how companies can manage models they cannot fully explain.
- "The real value of data requires a holistic view of the end-to-end data pipeline": Ashok Srivastava on the emergence of machine learning and AI for enterprise applications.
- "Bringing AI into the enterprise": Kris Hammond on business applications of AI technologies and educating future AI specialists.