Data Mesh in Practice

Book description

The data mesh is poised to replace data lakes and data warehouses as the dominant architectural pattern in data and analytics. By promoting the concept of domain-focused data products that go beyond file sharing, data mesh helps you deal with data quality at scale by establishing true data ownership. This approach is so new, however, that misconceptions are widespread and practical experience in implementing data mesh is scarce.

With this report, you'll learn how to successfully overcome challenges in the adoption process. By drawing on their experience building large-scale data infrastructure, designing data architectures, and contributing to data strategies of large and successful corporations, authors Max Schultze and Arif Wider have identified the most common pain points along the data mesh journey.

You'll examine the foundations of the data mesh paradigm and gain both technical and organizational insights. This report is ideal for companies just starting to work with data, for organizations already in the process of transforming their data infrastructure landscape, as well as for advanced companies working on federated governance setups for a sustainable data-driven future.

This report covers:

  • Data mesh principles and practical examples for getting started
  • Typical challenges and solutions you'll encounter when implementing a data mesh
  • Data mesh pillars including domain ownership, data as a product, and infrastructure as a platform
  • How to move toward decentralized data products and build a data infrastructure platform

Table of contents

  1. Introduction
  2. I. What Is Data Mesh and Why Do We Need It?
  3. 1. Pain Points of Centralized Data Responsibility
    1. The Data Warehouse Approach
    2. The Data Lake Approach
    3. Centralized Data Responsibility
  4. 2. The Pillars of Data Mesh
    1. Decentralized Domain Ownership of Data
    2. Data as a Product
    3. Self-Serve Data Infrastructure as a Platform
    4. Federated Computational Data Governance
  5. II. The Data Mesh Journey
  6. 3. Getting Started: A Data Product–Centered Mindset Shift
    1. The Rise of Big Data Technologies
    2. Case Study: Pain Points of Unclear Data Ownership
    3. Moving Toward Decentralized Data Products
      1. Understanding the Different Perspectives
      2. Creating Incentives
    4. Where and How to Get Started
      1. Creating New Infrastructure Alongside the First Data Product
      2. Adapting Existing Infrastructure
  7. 4. Scaling the Mesh: Self-Serve Data Infrastructure
    1. Pain Points of Central Data Responsibility
    2. Case Study: Centralized Compute Capabilities
    3. Data Infrastructure Capabilities
  8. 5. Sustaining the Mesh: Federated Computational Data Governance
    1. Ensure Interoperability Through Semantic Cross-Domain Modeling
    2. Use Automation to Enforce Global Rules Without Centralization
  9. 6. Industry Practices
    1. Common Pitfalls
      1. Overloading Your People
      2. Creating a Platform with Central Data Responsibility
      3. Building the Perfect Platform Up Front
      4. Misunderstanding the Data Mesh Concept
    2. Best Practices
      1. Start Small, but with Commitment
      2. Define Your Domains Following Your Business Capabilities
      3. Evangelize Data Mesh
      4. Apply Product Thinking to Platform Development
  10. Closing Remarks
  11. Acknowledgments
  12. About the Authors

Product information

  • Title: Data Mesh in Practice
  • Author(s): Max Schultze, Arif Wider
  • Release date: December 2021
  • Publisher(s): O'Reilly Media, Inc.
  • ISBN: 9781098108496