Chapter 8. Mapping Data with Metadata Knowledge Graphs
Modern enterprises are rich in data, but that data is spread across silos, heterogeneous in format, and often of variable quality.
Being able to understand where your data resides, how it’s processed, and who consumes it is an important part of running an enterprise. It’s a key component of data governance and increasingly important for self-serve data consumers.
A metadata knowledge graph is an enterprise-wide map that records the shape and location of data, the systems that process that data, and its consumers. Importantly, a metadata knowledge graph links data, processes, and consumers so that the provenance of data is explicit and easy to reason about (e.g., for compliance and regulatory demands).
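To make the idea of linked data, processes, and consumers concrete, here is a minimal sketch in plain Python. It is an illustration only, not the implementation used later in this chapter: the class MetadataGraph, its methods, and every node name (crm.customers, nightly_etl, finance_dashboard, and so on) are hypothetical. In practice you would hold this structure in a graph database rather than in memory.

```python
from collections import defaultdict


class MetadataGraph:
    """Toy in-memory metadata graph (hypothetical, for illustration only)."""

    def __init__(self):
        self.kinds = {}                    # node name -> "dataset" | "process" | "consumer"
        self.upstream = defaultdict(set)   # node name -> nodes that feed it directly

    def add_node(self, name, kind):
        self.kinds[name] = kind

    def add_flow(self, source, target):
        """Record that `source` feeds `target`."""
        for name in (source, target):
            if name not in self.kinds:
                raise KeyError(f"unknown node: {name}")
        self.upstream[target].add(source)

    def provenance(self, name):
        """Return every node that directly or indirectly feeds `name`."""
        seen = set()
        stack = [name]
        while stack:
            current = stack.pop()
            for parent in self.upstream.get(current, ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen


# Assumed example landscape: two source datasets, one process, one derived
# dataset, and one consumer. All names are made up.
graph = MetadataGraph()
graph.add_node("crm.customers", "dataset")
graph.add_node("billing.invoices", "dataset")
graph.add_node("nightly_etl", "process")
graph.add_node("warehouse.revenue", "dataset")
graph.add_node("finance_dashboard", "consumer")

graph.add_flow("crm.customers", "nightly_etl")
graph.add_flow("billing.invoices", "nightly_etl")
graph.add_flow("nightly_etl", "warehouse.revenue")
graph.add_flow("warehouse.revenue", "finance_dashboard")

# Which datasets does the finance dashboard ultimately depend on?
lineage = graph.provenance("finance_dashboard")
source_datasets = sorted(n for n in lineage if graph.kinds[n] == "dataset")
print(source_datasets)
```

Because each flow is stored as an explicit link, answering a provenance question becomes a graph traversal instead of a search through documentation or code in several systems.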
In this chapter, you’ll learn about the challenges of stewarding data in the modern enterprise and how a metadata knowledge graph can help. You’ll see how the modern distributed data landscape can be (logically) reunified by a metadata knowledge graph and how complex systems architectures can be tamed with the same technique. This is reinforced by an end-to-end example designed to resemble a typical enterprise. But first, the challenge of data stewardship must be addressed.
The Challenge of Distributed Data Stewardship
As organizations evolve, different departments implement applications and processes to solve their needs. Individual departments may store some of the same information, so it is not uncommon for duplicate ...