Chapter 2. Implementing Data Synthesis

The first decision is whether data synthesis is the best approach for providing data access, compared with alternative privacy-enhancing technologies (PETs). For a synthesis implementation to succeed, it must be aligned with the organization’s priorities. In this chapter we first present a decision framework for selecting data synthesis objectively and for deciding when it fits business priorities better than the alternatives.

Once data synthesis is selected as the appropriate solution, we can consider the implementation process.

There are two key components to implementing data synthesis at the enterprise level: the process and the structure. The process comprises the key steps and shows how to integrate synthesis into a data pipeline. The structure would typically be operationalized through a Synthesis Center of Excellence with dedicated skills and capacity to generate data for the organization and its customers, and to provide education and consulting on data synthesis to the rest of the organization. This chapter describes the process and structure in some detail to provide guidance and describe the critical success factors.

In practice, there are many possible scenarios where data synthesis capabilities will need to be deployed. For example, there will be large organizations as well as solo practitioners. Therefore, the following descriptions will need to be tailored ...
