Chapter 1. Introduction to Lakehouse Architecture
All data practitioners, irrespective of their job profiles, perform two common and foundational activities: asking questions and finding answers! Any data person, whether a data engineer, data architect, data analyst, data scientist, or a data leader such as a chief information officer (CIO) or chief data officer (CDO), must be curious and ask questions.
Finding answers to complex questions is difficult, but asking the right questions is harder still. The “art of the possible” can only be explored by (1) asking the right questions and (2) uncovering answers by leveraging the data. However simple this might sound, an organization needs an entire data platform to enable users to perform these tasks. This platform must support data ingestion and storage, and it must provide tools that let users ask and discover new questions, perform advanced analysis, predict and forecast results, and generate insights. The data platform is the infrastructure that enables users to leverage data for business benefits.
To implement such data platforms, you need a robust data architecture—one that can help you define the core components of the data platform and establish the design principles for putting it into practice. Traditionally, organizations have used data warehouse or data lake architectures to implement their data platforms. Both of these architectural approaches have been widely adopted across industries. These architectures have ...