Chapter 3. Synthetic Data Case Studies

While the technical concepts behind synthetic data generation have been around for a few decades, their practical use has picked up only recently. One reason is that this type of data solves some challenging problems that were previously hard to solve, or solves them more cost-effectively. All of these problems pertain to data access: sometimes it is simply hard to get access to real data.

In this chapter, we present a few application examples from various industries. These examples are intended to be illustrative rather than exhaustive. The same problem may also exist in multiple industries (for example, getting realistic data for software testing is a common problem that data synthesis can solve), so an application of synthetic data that solves it will be relevant in each of them. The fact that we discuss software testing, say, under only one heading does not mean that it is not relevant under another.

The first industry that we will examine is manufacturing and distribution. We then give examples from health care, financial services, and transportation. The industry examples span the types of synthetic data we discussed, from generating structured data from real individual-level and aggregate data, to using simulation engines to generate large volumes of synthetic data.

Manufacturing and Distribution

The use of AIML in industrial robots, coupled with improved sensor technology, is ...
