Data Mesh

by Zhamak Dehghani
Released April 2022
Publisher(s): O'Reilly Media, Inc.
ISBN: 9781492092391

Book description

Many enterprises are investing in a next-generation data lake, hoping to democratize data at scale to provide business insights and ultimately make automated intelligent decisions. In this practical book, author Zhamak Dehghani reveals that, despite the time, money, and effort poured into them, data warehouses and data lakes fail when applied at the scale and speed of today's organizations. A distributed data mesh is a better choice.

Dehghani guides architects, technical leaders, and decision makers on their journey from monolithic big data architecture to a sociotechnical paradigm that draws from modern distributed architecture. A data mesh considers domains as a first-class concern, applies platform thinking to create self-serve data infrastructure, treats data as a product, and introduces a federated and computational model of data governance. This book shows you why and how.

  • Examine the current data landscape from the perspective of business and organizational needs, environmental challenges, and existing architectures
  • Analyze the landscape's underlying characteristics and failure modes
  • Get a complete introduction to data mesh principles and their constituents
  • Learn how to design a data mesh architecture
  • Move beyond a monolithic data lake to a distributed data mesh

Table of contents

  1. I. Why Data Mesh?
  2. 1. The Inflection Point
    1. Great Expectations of Data
    2. The Great Divide of Data
      1. Operational Data
      2. Analytical Data
      3. Analytical and Operational Data Misintegration
    3. Scale, Encounter of a New Kind
    4. Beyond Order
    5. Approaching the Plateau of Return
    6. Recap
  3. 2. After The Inflection Point
    1. Embrace Change in a Complex, Volatile and Uncertain Business Environment
      1. Align Business, Tech and Now Analytical Data
      2. Close The Gap Between Analytical and Operational Data
      3. Localize Data Change to Business Domains
      4. Reduce Accidental Complexity of Pipelines and Copying Data
    2. Sustain Agility in the Face of Growth
      1. Remove Centralized and Monolithic Bottlenecks of the Lake or the Warehouse
      2. Reduce Coordination of Data Pipelines
      3. Reduce Coordination of Data Governance
      4. Enable Autonomy
    3. Increase the Ratio of Value from Data to Investment
      1. Abstract Technical Complexity with a Data Platform
      2. Embed Product Thinking Everywhere
      3. Go Beyond The Boundaries
    4. Recap
  4. 3. Before The Inflection Point
    1. Evolution of Analytical Data Architectures
      1. First Generation: Data Warehouse Architecture
      2. Second Generation: Data Lake Architecture
      3. Third Generation: Multimodal Cloud Architecture
    2. Characteristics of Analytical Data Architecture
    3. Monolithic
      1. Monolithic Architecture
      2. Monolithic Technology
      3. Monolithic Organization
      4. The complicated monolith
      5. Technically-Partitioned Architecture
      6. Activity-oriented Team Decomposition
    4. Recap
  5. II. What is Data Mesh
  6. 4. Principle of Domain ownership
    1. Apply DDD’s Strategic Design to Data
    2. Domain Data Archetypes
      1. Source-aligned Domain Data
      2. Aggregate Domain Data
      3. Consumer-aligned Domain Data
    3. Transition to Domain Ownership
      1. Push Data Ownership Upstream
      2. Define Multiple Connected Models
      3. Embrace the Most Relevant Domain, and Don’t Expect the Single Source of Truth
      4. Hide the Data Pipelines as Domains’ Internal Implementation
    4. Recap
  7. 5. Principle of Data as a Product
    1. Apply Product Thinking to Data
      1. Baseline usability characteristics of a data product
    2. Transition to Data as a Product
      1. Include Data Product Ownership in Domains
    3. Recap
  8. 6. Principle of Self-Serve Data Platform
    1. Data Mesh Platform, Compare and Contrast
      1. Serving Autonomous Domain-oriented Teams
      2. Managing Autonomous and Interoperable Data Products
      3. A Continuous Platform of Operational and Analytical Capabilities
      4. Designed for Generalists Majority
      5. Favoring Decentralized Technologies
      6. Domain Agnostic
    2. Data Mesh Platform Thinking
      1. Enable Autonomous Teams to Get Value from Data
      2. Exchange Value with Autonomous and Interoperable Data Products
      3. Accelerate Exchange of Value by Lowering the Cognitive Load
      4. Scale out Data Sharing
      5. Support a Culture Of Embedded Innovation
    3. Transition To Self-serve Data Mesh Platform
      1. Design the APIs and Protocols First
      2. Prepare for Generalists Adoption
      3. Create Higher Level APIs to Manage Data Products
      4. Converge Data and Operational Platforms, Where Possible
      5. Build Experiences, not Mechanisms
      6. Begin with the Simplest Foundation, then Harvest to Evolve
    4. Recap
  9. 7. Principle Of Federated Computational Governance
    1. Apply Systems Thinking To Data Mesh Governance
      1. Maintain Dynamic Equilibrium Between Domain Autonomy And Global Interoperability
      2. Embrace Dynamic Topology As A Default State
      3. Utilize Automation And The Distributed Architecture
    2. Apply Federation To The Governance Model
      1. Federated Team
      2. Guiding Values
      3. Policies
      4. Incentives
    3. Apply Computation To The Governance Model
      1. Standards As Code
      2. Policies As Code
      3. Automated Tests
      4. Automated Monitoring
    4. Transition To Federated Computational Governance
      1. Federate Accountability To Domains
      2. Embed Policy Execution In Each Data Product
      3. Automate Enablement And Monitoring Over Interventions
      4. Model The Gaps
      5. Measure The Network Effect
      6. Embrace Change Over Constancy
    5. Recap
