Data Lake Development with Big Data by Beulah Salome Purra, Pradeep Pasupuleti

Data Governance components

Data Governance comprises metadata management and lineage tracking, data security and privacy, and Information Lifecycle Management. These components cut across the Data Intake, Management, and Consumption tiers of the Data Lake. The following sections explore each of them in detail.

Metadata management and lineage tracking

Big Data often relies on extracting value from huge volumes of unstructured data. The first thing we do after this data enters the Data Lake is classify it and "understand" it by extracting its metadata. Metadata is the fundamental building block on which the success of any Data Governance endeavor depends.
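To make the idea of extracting metadata at intake more concrete, here is a minimal sketch in Python. It is an illustration only, not a method described in the book: the function name `extract_metadata`, the record fields, and the `source_system` argument are all hypothetical, and the sketch reads the whole file into memory, which would not suit very large files.

```python
import hashlib
import mimetypes
from datetime import datetime, timezone
from pathlib import Path


def extract_metadata(path, source_system):
    """Build a simple metadata record for one file entering the lake.

    All field names here are assumptions chosen for illustration.
    """
    file_path = Path(path)
    data = file_path.read_bytes()
    # guess_type returns (type, encoding); type is None for unknown extensions
    mime_type, _ = mimetypes.guess_type(file_path.name)
    return {
        "name": file_path.name,
        "size_bytes": len(data),
        # A content checksum lets later stages detect changes or duplicates
        "sha256": hashlib.sha256(data).hexdigest(),
        "format": mime_type or "unknown",
        # Hypothetical lineage field: where the file came from
        "source_system": source_system,
        "ingested_at": datetime.now(timezone.utc).isoformat(),
    }
```

A record like this could be stored in a metadata catalog alongside the raw file, so that the origin and content checksum of each dataset remain traceable as it moves through the tiers.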

Metadata captures vital information about the data ...
