Chapter 3. Stable Transformations
In this chapter, you will learn about data transformations and how they help you convert a non-private data analysis into a differentially private (DP) data analysis. Understanding whether a data transformation is stable will help you determine whether your data analysis can be converted into a DP data analysis.
Data transformations encompass any function from a data set to a data set. In the context of transformations, consider a data set to be any form of data that has not been made private. Transformations are mathematical abstractions that represent any manipulations, modifications, and computations performed on a data set.
Non-private data analysis pipelines can typically be broken down into three distinct phases: data preprocessing, a statistical query, and postprocessing. In the pipeline shown in Figure 3-1, data passes sequentially through each phase.
Figure 3-1. A data processing pipeline from both the non-DP perspective and the DP perspective
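The three phases of the non-private pipeline can be sketched as three composed functions. This is a minimal illustration, not code from the book: the `income` column, the sample rows, and the function names `preprocess`, `query`, and `postprocess` are all hypothetical.

```python
from statistics import mean

# Hypothetical microdata: one row per individual.
rows = [
    {"name": "a", "income": 40_000.0},
    {"name": "b", "income": None},
    {"name": "c", "income": 60_000.0},
]


def preprocess(rows):
    """Preprocessing (data set -> data set): drop rows with a missing income."""
    return [row for row in rows if row["income"] is not None]


def query(rows):
    """Statistical query (data set -> aggregate): mean income."""
    return mean(row["income"] for row in rows)


def postprocess(value):
    """Postprocessing (aggregate -> aggregate): round to the nearest thousand."""
    return round(value, -3)


# Data passes sequentially through each phase.
result = postprocess(query(preprocess(rows)))
print(result)
```

Note that only `preprocess` maps a data set to a data set, which is the shape of function this chapter calls a transformation.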
In a non-DP context, preprocessing consists of any modifications you make to microdata. Microdata is a data set in which each row corresponds to data from one individual.
An example of preprocessing is a function that modifies each record in a data set, one at a time, while preserving the dimensions of the data. Consider a preprocessing step that doubles the value of each numeric element in a column. In differential privacy, ...
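The record-by-record doubling described above can be sketched in a few lines. This is an illustrative example under assumed names: the helper `double_column`, the `score` column, and the sample data are hypothetical and do not come from the book or from any DP library.

```python
def double_column(rows, column):
    """Return a new data set in which `column` is doubled in every record.

    Each record is handled independently of the others, and the output has
    the same number of rows and the same columns as the input.
    """
    return [{**row, column: row[column] * 2} for row in rows]


# Hypothetical microdata: one row per individual.
data = [
    {"id": 1, "score": 3},
    {"id": 2, "score": 5},
]

doubled = double_column(data, "score")

# The dimensions are preserved: same row count, same columns.
print(len(data), len(doubled))
print([row["score"] for row in doubled])
```

Building new dictionaries instead of mutating the input keeps the original data set unchanged, which makes it easier to compare the data before and after the transformation.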