Data Lake Development with Big Data

by Pradeep Pasupuleti, Beulah Salome Purra
November 2015
Content level: Beginner to intermediate
164 pages
4h 10m
English
Packt Publishing

Data Governance components

Data Governance comprises three components: metadata management and lineage tracking, Data Security and privacy, and Information Lifecycle Management. These components cut across the Data Intake, management, and consumption tiers of the Data Lake. The following sections explore each of them in detail.

Metadata management and lineage tracking

Big Data often relies on extracting value from huge volumes of unstructured data. The first thing we do after this data enters the Data Lake is classify it and "understand" it by extracting its metadata. Metadata is the fundamental building block on which the success of any Data Governance endeavor depends.
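To make the idea of extracting metadata at intake concrete, here is a minimal sketch in Python. It is not the book's implementation. The function name `extract_metadata`, the `source_system` field, and the choice of attributes (size, checksum, guessed MIME type, ingestion time) are assumptions made for illustration only.

```python
import hashlib
import mimetypes
import os
from datetime import datetime, timezone


def extract_metadata(path, source_system="unknown"):
    """Collect basic technical metadata for one file entering the lake.

    Hypothetical helper: the field names are illustrative, not a standard.
    """
    # Hash the file in 1 MiB chunks so large files do not fill memory.
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            digest.update(chunk)

    # Guess the type from the file extension only; the content is not inspected.
    mime_type, _ = mimetypes.guess_type(path)

    return {
        "file_name": os.path.basename(path),
        "size_bytes": os.path.getsize(path),
        "sha256": digest.hexdigest(),
        "mime_type": mime_type or "application/octet-stream",
        "source_system": source_system,  # assumed label supplied by the caller
        "ingested_at": datetime.now(timezone.utc).isoformat(),
    }
```

A record like this could be written to a metadata catalog alongside the raw file. The checksum gives later governance steps a stable way to identify the exact content that was ingested.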

Metadata captures vital information about the data ...

Publisher Resources

ISBN: 9781785888083