Data Lake Development with Big Data by Beulah Salome Purra, Pradeep Pasupuleti

Data Discovery and metadata

Data Discovery deals with identifying related data assets, making them discoverable, and guiding data consumers to relevant datasets.

The efficiency of Data Discovery depends on the amount and quality of the metadata captured as data moves across the tiers of the Data Lake. Metadata keeps track of all the data assets that reside in the Data Lake and helps data consumers find relevant data. It also identifies and maintains relationships between datasets, from the moment data is ingested through each stage in which it is enhanced, transformed, and evolved, and so guides consumers to related datasets that can be combined and integrated.
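To make the idea concrete, the following is a minimal sketch of a metadata catalog, not the book's implementation or any particular product's API. The class names (`DatasetRecord`, `MetadataCatalog`), the fields (`tier`, `tags`, `derived_from`), and the sample dataset names are all assumptions chosen for illustration. It shows two things described above: finding datasets through captured metadata, and following recorded relationships to related datasets.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DatasetRecord:
    """Metadata for one data asset (all field names are hypothetical)."""
    name: str
    tier: str                                   # e.g. "raw", "enhanced", "transformed"
    tags: set[str] = field(default_factory=set)
    derived_from: list[str] = field(default_factory=list)  # names of source datasets


class MetadataCatalog:
    """In-memory catalog; a real Data Lake would persist this metadata."""

    def __init__(self) -> None:
        self._records: dict[str, DatasetRecord] = {}

    def register(self, record: DatasetRecord) -> None:
        # Capture metadata when a dataset enters or moves between tiers.
        self._records[record.name] = record

    def find_by_tag(self, tag: str) -> list[str]:
        # Discovery: list every dataset carrying the given tag.
        return sorted(r.name for r in self._records.values() if tag in r.tags)

    def related(self, name: str) -> list[str]:
        # Relationships: direct sources of the dataset plus datasets derived from it.
        parents = set(self._records[name].derived_from)
        children = {r.name for r in self._records.values() if name in r.derived_from}
        return sorted(parents | children)


catalog = MetadataCatalog()
catalog.register(DatasetRecord("orders_raw", "raw", {"sales"}))
catalog.register(DatasetRecord("orders_clean", "enhanced", {"sales"}, ["orders_raw"]))
catalog.register(DatasetRecord("customers_raw", "raw", {"crm"}))
catalog.register(
    DatasetRecord(
        "sales_by_customer",
        "transformed",
        {"sales", "crm"},
        ["orders_clean", "customers_raw"],
    )
)

print(catalog.find_by_tag("sales"))
print(catalog.related("orders_clean"))
```

In this sketch the relationships are recorded at registration time through `derived_from`, so the quality of what `related` returns depends entirely on how consistently that metadata is captured as data moves between tiers.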

Semantic metadata captures the semantics of the data; semantics is the ability ...
