Cleaning Up the Data Lake with an Operational Data Hub

Book description

The data lake was once heralded as the answer to the flood of big data that arrived in a variety of structured and unstructured formats. But because of the ease of integration and a lack of governance, data lakes in many companies have devolved into unusable data swamps. This short ebook shows you how to solve this problem using an Operational Data Hub (ODH) to collect, store, index, cleanse, harmonize, and master data of all shapes and formats.

Gerhard Ungerer, CTO and co-founder of Random Bit LLC, explains how the ODH supports transactional integrity so that the hub can serve as an integration point for enterprise applications. You’ll also learn how the ODH helps you leverage the investment in your data lake (or swamp), so that the data trapped there can finally be ingested, processed, and provisioned.

With this ebook, you’ll learn how an ODH:

  • Allows you to focus on categorizing data for easy and fast retrieval
  • Provides flexible storage models, indexing support, query capabilities, security, and a governance framework
  • Delivers flexible storage models; support for indexing, scripting, and automation; query capabilities; transactional integrity; and security
  • Includes a governance model to help you access, ingest, harmonize, materialize, provision, and consume data

Product information

  • Title: Cleaning Up the Data Lake with an Operational Data Hub
  • Author(s): Gerhard Ungerer
  • Release date: March 2018
  • Publisher(s): O'Reilly Media, Inc.
  • ISBN: 9781492027379