Book description
The concept of a data lake is less than 10 years old, but data lakes are already widely implemented in large companies. Their goal is to deal efficiently with ever-growing volumes of heterogeneous data while also meeting a variety of sophisticated user needs. However, defining and building a data lake is still a challenge, as no consensus has been reached so far. Data Lakes presents recent outcomes and trends in the field of data repositories. The main topics discussed are the data-driven architecture of a data lake; the management of metadata – supplying key information about the stored data, master data and reference data; the roles of linked data and fog computing in a data lake ecosystem; and how gravity principles apply in the context of data lakes. A variety of case studies are also presented, providing the reader with practical examples of data lake management.
Table of contents
- Cover
- Preface
- 1 Introduction to Data Lakes: Definitions and Discussions
- 2 Architecture of Data Lakes
- 3 Exploiting Software Product Lines and Formal Concept Analysis for the Design of Data Lake Architectures
- 4 Metadata in Data Lake Ecosystems
- 4.1. Definitions and concepts
- 4.2. Classification of metadata by NISO
- 4.3. Other categories of metadata
- 4.4. Sources of metadata
- 4.5. Metadata classification
- 4.6. Why metadata are needed
- 4.7. Business value of metadata
- 4.8. Metadata architecture
- 4.9. Metadata management
- 4.10. Metadata and data lakes
- 4.11. Metadata management in data lakes
- 4.12. Metadata and master data management
- 4.13. Conclusion
- 5 A Use Case of Data Lake Metadata Management
- 6 Master Data and Reference Data in Data Lake Ecosystems
- 7 Linked Data Principles for Data Lakes
- 8 Fog Computing
- 8.1. Introduction
- 8.2. A little bit of context
- 8.3. Every machine talks
- 8.4. The volume paradox
- 8.5. The fog, a shift in paradigm
- 8.6. Constraint environment challenges
- 8.7. Calculations and local drift
- 8.8. Quality is everything
- 8.9. Fog computing versus cloud computing and edge computing
- 8.10. Concluding remarks: fog computing and data lake
- 9 The Gravity Principle in Data Lakes
- Glossary
- References
- List of Authors
- Index
- End User License Agreement
Product information
- Title: Data Lakes
- Author(s):
- Release date: June 2020
- Publisher(s): Wiley-ISTE
- ISBN: 9781786305855