4

Metadata, Semantics, and Triples

Abstract

Data has no value unless it has been described, and described data has no meaning unless it has been associated with an identifier. The “triple,” consisting of a data value, its descriptor, and its associated identifier, is the basic unit of meaning (semantics) in information science. The concept of triples will be new to most readers, but it is simple, and it has enormous value whenever we work with complex types of data or need to merge and integrate data obtained from multiple sources. It is important that statisticians and researchers who are accustomed to working with small or simple sets of data become aware of the necessity for semantic rigor when designing and analyzing Big Data resources.
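As a rough illustration of the idea, a triple can be sketched in Python as a three-field record, and data from separate sources can be merged by grouping triples under a shared identifier. This is a minimal sketch only: the `Triple` class, the `merge_by_identifier` function, the identifiers (`obj-001`, `obj-002`), and every descriptor and value are hypothetical names chosen for this example and do not come from the chapter.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    """One unit of meaning: an identified object, a descriptor, and a value."""
    identifier: str  # unique identifier of the object being described
    descriptor: str  # metadata that says what the value means
    value: str       # the data value itself


def merge_by_identifier(*sources):
    """Group triples from any number of sources under their shared identifier.

    Assumption for this sketch: if two sources give the same descriptor
    for the same identifier, the later source overwrites the earlier one.
    """
    merged = defaultdict(dict)
    for source in sources:
        for triple in source:
            merged[triple.identifier][triple.descriptor] = triple.value
    return dict(merged)


# Two hypothetical sources that hold assertions about overlapping objects.
clinic = [
    Triple("obj-001", "name", "Jane Doe"),
    Triple("obj-001", "height_cm", "170"),
]
lab = [
    Triple("obj-001", "glucose_mg_dl", "85"),
    Triple("obj-002", "glucose_mg_dl", "92"),
]

records = merge_by_identifier(clinic, lab)
for identifier, assertions in records.items():
    print(identifier, assertions)
```

Because every value carries both its descriptor and its identifier, the two sources integrate without any shared table layout. A bare value such as "85" could not be merged this way, since nothing would say what it measures or which object it belongs to.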
