Book description
The concept of a data lake is less than 10 years old, yet data lakes are already widely implemented within large companies. Their goal is to deal efficiently with ever-growing volumes of heterogeneous data while meeting a variety of sophisticated user needs. However, defining and building a data lake is still a challenge, as no consensus has been reached so far. Data Lakes presents recent outcomes and trends in the field of data repositories. The main topics discussed are the data-driven architecture of a data lake; the management of metadata – supplying key information about the stored data, master data and reference data; the roles of linked data and fog computing in a data lake ecosystem; and how gravity principles apply in the context of data lakes. A variety of case studies is also presented, providing the reader with practical examples of data lake management.
Table of contents
- Cover
- Preface
- 1 Introduction to Data Lakes: Definitions and Discussions
- 2 Architecture of Data Lakes
- 3 Exploiting Software Product Lines and Formal Concept Analysis for the Design of Data Lake Architectures
- 4 Metadata in Data Lake Ecosystems
  - 4.1. Definitions and concepts
  - 4.2. Classification of metadata by NISO
  - 4.3. Other categories of metadata
  - 4.4. Sources of metadata
  - 4.5. Metadata classification
  - 4.6. Why metadata are needed
  - 4.7. Business value of metadata
  - 4.8. Metadata architecture
  - 4.9. Metadata management
  - 4.10. Metadata and data lakes
  - 4.11. Metadata management in data lakes
  - 4.12. Metadata and master data management
  - 4.13. Conclusion
- 5 A Use Case of Data Lake Metadata Management
- 6 Master Data and Reference Data in Data Lake Ecosystems
- 7 Linked Data Principles for Data Lakes
- 8 Fog Computing
  - 8.1. Introduction
  - 8.2. A little bit of context
  - 8.3. Every machine talks
  - 8.4. The volume paradox
  - 8.5. The fog, a shift in paradigm
  - 8.6. Constraint environment challenges
  - 8.7. Calculations and local drift
  - 8.8. Quality is everything
  - 8.9. Fog computing versus cloud computing and edge computing
  - 8.10. Concluding remarks: fog computing and data lake
- 9 The Gravity Principle in Data Lakes
- Glossary
- References
- List of Authors
- Index
- End User License Agreement
Product information
- Title: Data Lakes
- Author(s):
- Release date: June 2020
- Publisher(s): Wiley-ISTE
- ISBN: 9781786305855