Chapter 7. Metadata and Big Data

Data by itself means very little. After all, a piece of digital data is simply a collection of bits. To be discovered (found) or understood, this collection of bits needs something called metadata. In this chapter, we cover what metadata is and why it’s important to the self-service data model.

First, a basic definition: metadata is any information that describes other data. It’s data about data.

The Three Types of Metadata

There are different kinds of metadata that describe different aspects of the data with which they are associated. Specifically, metadata can be categorized as one of three types: descriptive, structural, or administrative, as illustrated in Figure 7-1.

Each type of metadata serves a different purpose. For example, business analysts use descriptive metadata to learn what exactly the data consists of. Structural metadata tells us how different datasets relate to one another. And administrative metadata informs us about ownership and rights management.

Figure 7-1. Classes of metadata
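
To make the three categories concrete, here is a minimal Python sketch of a single dataset's metadata split along those lines. The class names, field names, and sample values (such as `joins_with` and `sales-data-team`) are hypothetical and chosen only for illustration; they do not come from any particular catalog product.

```python
from dataclasses import dataclass, field
from enum import Enum


class MetadataType(Enum):
    """The three categories of metadata named in this chapter."""
    DESCRIPTIVE = "descriptive"
    STRUCTURAL = "structural"
    ADMINISTRATIVE = "administrative"


@dataclass
class DatasetMetadata:
    """Hypothetical metadata record for one dataset (illustrative fields only)."""
    descriptive: dict = field(default_factory=dict)
    structural: dict = field(default_factory=dict)
    administrative: dict = field(default_factory=dict)

    def lookup(self, kind: MetadataType) -> dict:
        # Each enum value matches the attribute name of the same category.
        return getattr(self, kind.value)


# Assumed example values; a real catalog would populate these from its own sources.
orders = DatasetMetadata(
    descriptive={"title": "Customer orders", "keywords": ["sales", "orders"]},
    structural={"joins_with": {"customers": "customer_id"}},
    administrative={"owner": "sales-data-team", "access": "internal"},
)

# What does the data consist of?
print(orders.lookup(MetadataType.DESCRIPTIVE)["title"])
# How does it relate to other datasets?
print(orders.lookup(MetadataType.STRUCTURAL)["joins_with"])
# Who owns it?
print(orders.lookup(MetadataType.ADMINISTRATIVE)["owner"])
```

Keeping the three categories in separate fields mirrors the way each one answers a different question: content, relationships, and ownership.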

Descriptive Metadata

Descriptive metadata describes a collection of bits, or piece of digital data, so that it can be cataloged, discovered, identified, explored, and so on. It can include elements such as title, abstract, author, and keywords. It is typically meant to be read by humans. ...
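
As a rough illustration of how descriptive metadata supports discovery, the sketch below stores the elements mentioned above (title, abstract, author, keywords) and runs a simple text search over them. The `DescriptiveMetadata` class, the `find_datasets` helper, and the sample catalog entries are all assumptions made for this example, not part of any specific tool.

```python
from dataclasses import dataclass, field


@dataclass
class DescriptiveMetadata:
    """Hypothetical descriptive record using the elements named in the text."""
    title: str
    abstract: str
    author: str
    keywords: list[str] = field(default_factory=list)


def find_datasets(catalog, term):
    """Return titles of entries whose title, abstract, or keywords mention term."""
    needle = term.lower()
    hits = []
    for entry in catalog:
        searchable = [entry.title, entry.abstract, *entry.keywords]
        # Case-insensitive substring match against every searchable field.
        if any(needle in text.lower() for text in searchable):
            hits.append(entry.title)
    return hits


# Invented sample entries for illustration only.
catalog = [
    DescriptiveMetadata(
        title="Web clickstream",
        abstract="Raw page-view events from the website.",
        author="A. Analyst",
        keywords=["web", "events"],
    ),
    DescriptiveMetadata(
        title="Quarterly revenue",
        abstract="Revenue by region and quarter.",
        author="F. Finance",
        keywords=["finance", "revenue"],
    ),
]

print(find_datasets(catalog, "revenue"))  # ['Quarterly revenue']
```

A production catalog would typically use a search index rather than a linear scan, but the principle is the same: the data is found through its description, not through its bits.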
