16 Anonymous Data

Before we can start the real work of manipulating data to gain more information from it, we may first need to reduce its information content and anonymise it. This step should come before anything else; otherwise copies of sensitive data may linger and be found by people who should not have access to them. The golden rule to remember is: take only the data that you might need, and get rid of confidential or personal data as early in the process as possible.

It is best to anonymise data even before importing it into R. Typically, the data sits in an SQL database, where it is easy to scramble before it ever leaves that database.
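One way this could look is sketched below. It is an illustration under stated assumptions, not the book's own listing. It uses Python's built-in `sqlite3` module as a stand-in for the SQL database. The `customers` table, its columns, the `SALT` value and the `pseudonym` function are all hypothetical. The export query replaces the name with a salted hash and coarsens the birth date to a year, so the raw identifiers never appear in the exported rows. A salted hash is strictly speaking pseudonymisation: whoever knows the salt can test candidate names against the keys, so the salt should stay with the database owner.

```python
import hashlib
import sqlite3

# Hypothetical secret; keep it with the database owner, never in the export.
SALT = "replace-with-a-secret-value"


def pseudonym(value: str) -> str:
    """Return a salted SHA-256 digest, shortened to 16 hex characters."""
    return hashlib.sha256((SALT + value).encode("utf-8")).hexdigest()[:16]


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical customers table."""
    con = sqlite3.connect(":memory:")
    # Make the hashing function callable from SQL on this connection.
    con.create_function("pseudonym", 1, pseudonym)
    con.execute(
        "CREATE TABLE customers (name TEXT, birth_date TEXT, balance REAL)"
    )
    con.executemany(
        "INSERT INTO customers VALUES (?, ?, ?)",
        [
            ("Alice Example", "1980-05-17", 1200.0),
            ("Bob Example", "1975-11-02", -35.5),
        ],
    )
    return con


def export_anonymised(con: sqlite3.Connection) -> list:
    """Select only what the analysis needs, scrambled inside the query."""
    query = """
        SELECT pseudonym(name)                           AS customer_key,
               CAST(substr(birth_date, 1, 4) AS INTEGER) AS birth_year,
               balance
        FROM customers
        ORDER BY birth_year
    """
    return con.execute(query).fetchall()


if __name__ == "__main__":
    for row in export_anonymised(build_demo_db()):
        print(row)
```

Because the same name always maps to the same key, the exported rows can still be joined and aggregated per customer, which is usually all the later analysis needs.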
