Practical Synthetic Data Generation

by Khaled El Emam, Lucy Mosquera, Richard Hoptroff
May 2020
Beginner to intermediate
163 pages
4h 31m
English
O'Reilly Media, Inc.

Chapter 2. Implementing Data Synthesis

The first decision is whether data synthesis is the best approach for providing data access, compared to alternative privacy-enhancing technologies (PETs). For an implementation to succeed, synthesis must be aligned with the organization’s priorities. In this chapter we first present a decision framework for selecting data synthesis objectively and for deciding when it fits business priorities better than the alternatives.

Once data synthesis is selected as the appropriate solution, we can consider the implementation process.

There are two key components to implementing data synthesis at the enterprise level: the process and the structure. The process consists of the key steps and shows how to integrate synthesis into a data pipeline. The structure would typically be operationalized through a Synthesis Center of Excellence¹ that has dedicated skills and capacity to generate data for the organization and its customers, and that provides education and consulting on data synthesis to the rest of the organization. This chapter describes the process and structure in some detail to provide guidance and to identify the critical success factors.

In practice, there are many possible scenarios where data synthesis capabilities will need to be deployed. For example, there will be large organizations as well as solo practitioners. Therefore, the following descriptions will need to be tailored ...

ISBN: 9781492072737