Chapter 1. Introduction to Data Catalogs
In this chapter, you’ll learn how a data catalog works, who uses it, and why. First, we’ll go over the core functionality of a data catalog: how it creates an overview of your organization’s IT landscape, how it organizes the data, and how it makes searching for your data easy. Search is often underutilized and undervalued as part of a data catalog, which significantly limits what a data catalog can deliver. As such, we’ll talk about your data catalog as a search engine that will unlock the potential for success.
You’ll also learn about the benefits of a data catalog in an organization: it improves data discoverability, which in turn supports data governance and enhances data-driven innovation. Moreover, you’ll learn how to set up a data discovery team and who the users of your data catalog are. I’ll wrap up this chapter by explaining the roles and responsibilities in the data catalog.
OK, off we go.
The Core Functionality of a Data Catalog
At its core, a data catalog is an organized inventory of the data in your company. That’s it.
The data catalog provides an overview at a metadata level only, and thus no actual data values are exposed. This is the great advantage of a data catalog: you can let everyone see everything without fear of exposing confidential or sensitive data. In Figure 1-1, you can see a high-level description of a data catalog.
Figure 1-1. High-level view of a data catalog
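To make the idea of a metadata-only inventory concrete, here is a minimal sketch in Python. It is an illustration, not how any particular data catalog product is built: the `CatalogEntry` class, its fields, the `search` function, and the two sample entries are all hypothetical. Notice that each entry describes a data asset (its name, source system, owner, and columns) without holding a single data value.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CatalogEntry:
    """Hypothetical metadata record for one data asset; holds no data values."""
    name: str
    source_system: str
    owner: str
    description: str
    columns: dict[str, str] = field(default_factory=dict)  # column name -> type


def search(catalog: list[CatalogEntry], term: str) -> list[CatalogEntry]:
    """Return entries whose name, description, or column names contain term."""
    needle = term.lower()
    return [
        entry for entry in catalog
        if needle in entry.name.lower()
        or needle in entry.description.lower()
        or any(needle in col.lower() for col in entry.columns)
    ]


# Illustrative entries only: these assets and systems are made up.
catalog = [
    CatalogEntry(
        name="customer_orders",
        source_system="sales_db",
        owner="sales-analytics",
        description="One row per order placed by a customer",
        columns={"order_id": "integer", "customer_email": "string"},
    ),
    CatalogEntry(
        name="hr_salaries",
        source_system="hr_system",
        owner="people-ops",
        description="Monthly salary per employee",
        columns={"employee_id": "integer", "salary": "decimal"},
    ),
]

hits = search(catalog, "salary")
print([entry.name for entry in hits])
```

Anyone can search this inventory and learn that salary data exists and who owns it, yet no salary figure is ever visible, because the catalog stores descriptions of the data rather than the data itself.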
A data ...