Chapter 12. Designing Data Products Using JSON Schema

Everything is designed. Few things are designed well.

Brian Reed, renowned industrial designer

Chapter 4 introduced the idea of a data product as a self-contained object with four facets: data, structure, meaning, and context. In practice, some of these facets tend to be ignored. For example, data scientists get collections of CSVs with inconsistent rows and insufficient information about what each column means, when the dataset was created, and so on. The lack of these facets introduces ambiguity, so extracting key insights from these badly designed data sources becomes extremely challenging, no matter how much expensive tooling or expertise is thrown at the task.
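To make the "inconsistent rows" problem concrete, here is a minimal sketch in Python that flags CSV rows whose number of fields differs from the header. The function name find_inconsistent_rows and the sample data are hypothetical, chosen only for illustration. Note what the check cannot do: it can detect a missing structure facet, but it cannot recover what a column means or when the dataset was created.

```python
import csv
import io


def find_inconsistent_rows(csv_text):
    """Return (line_number, field_count) for rows whose width differs from the header.

    Hypothetical helper for illustration; assumes the first row is the header
    and that no field contains an embedded newline.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, None)
    if header is None:
        # Empty input: there is nothing to compare against.
        return []
    problems = []
    # Data rows start on line 2 because line 1 is the header.
    for line_number, row in enumerate(reader, start=2):
        if len(row) != len(header):
            problems.append((line_number, len(row)))
    return problems


# Made-up sample: one row is missing a value and one has an extra value.
SAMPLE = (
    "id,email,signup\n"
    "1,a@example.com,2024-01-05\n"
    "2,b@example.com\n"
    "3,c@example.com,2024-02-11,extra\n"
)

print(find_inconsistent_rows(SAMPLE))  # [(3, 2), (4, 4)]
```

Even when every row passes a check like this, the file still says nothing about meaning or context, which is why the remaining facets need to be designed explicitly.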

To address this data problem, you learned a proven methodology for achieving alignment in your organization. You also learned the fundamental technologies that make this methodology applicable: JSON and JSON Schema. In this chapter, we will put all of it into practice by walking you through how to design a data product with a concept-first approach using JSON and JSON Schema. We will look at each facet of a data product in sequence, building upon the JSON Schema registry you deployed in Chapter 11.

First Facet: Data

The first facet of a data product we will look into is data itself. Chapter 10 introduced the idea of spectrums of success, using as an example a user journey through completing a website sign-up form. In such an example, the user starts ...
