Data Mesh in Practice

by Max Schultze, Arif Wider

December 2021

Beginner to intermediate

58 pages

1h 17m

English

O'Reilly Media, Inc.

Introduction
I. What Is Data Mesh and Why Do We Need It?
1. Pain Points of Centralized Data Responsibility
    The Data Warehouse Approach
    The Data Lake Approach
    Centralized Data Responsibility
2. The Pillars of Data Mesh
    Decentralized Domain Ownership of Data
    Data as a Product
    Self-Serve Data Infrastructure as a Platform
    Federated Computational Data Governance
II. The Data Mesh Journey
3. Getting Started: A Data Product–Centered Mindset Shift
    The Rise of Big Data Technologies
    Case Study: Pain Points of Unclear Data Ownership
    Moving Toward Decentralized Data Products
    Understanding the Different Perspectives
    Creating Incentives
    Where and How to Get Started
    Creating New Infrastructure Alongside the First Data Product
    Adapting Existing Infrastructure
4. Scaling the Mesh: Self-Serve Data Infrastructure
    Pain Points of Central Data Responsibility
    Case Study: Centralized Compute Capabilities
    Data Infrastructure Capabilities
5. Sustaining the Mesh: Federated Computational Data Governance
    Ensure Interoperability Through Semantic Cross-Domain Modeling
    Use Automation to Enforce Global Rules Without Centralization
6. Industry Practices
    Common Pitfalls
    Overloading Your People
    Creating a Platform with Central Data Responsibility
    Building the Perfect Platform Up Front
    Misunderstanding the Data Mesh Concept
    Best Practices
    Start Small, but with Commitment
    Define Your Domains Following Your Business Capabilities
    Evangelize Data Mesh
    Apply Product Thinking to Platform Development
Closing Remarks

Acknowledgments
About the Authors

Content preview from Data Mesh in Practice

Chapter 5. Sustaining the Mesh: Federated Computational Data Governance

Building a data mesh at company scale means addressing several different angles of working with data. We have already covered what it takes locally to start building a data product. We also introduced how infrastructure platform capabilities can support those local data product builders and ease their journey toward high-quality data products. What we have not yet addressed is how to make sure the various data products do not drift apart, and how to prevent different domains from becoming isolated silos of information.

Ensure Interoperability Through Semantic Cross-Domain Modeling

We wanted to build an integrated view of our company's sales data and the behavioral data collected from our customers along their journey on our platform. Both areas had high-quality data products that were well described, easy to find, provided strong guarantees, and had designated product people to contact about their usage. Unfortunately, the two data products resided in different source systems, and we heavily underestimated the effort of integrating them. After spending one month integrating each of the two systems into our usual analytics platform, we realized that these well-defined data products were not compatible at all. The identifiers used in each product not only had different data formats but also followed different semantics. ...
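
The kind of identifier mismatch described above can be sketched in a few lines of Python. This is only an illustration under invented assumptions, not the authors' actual systems: the record types, the field names `customer_id` and `visitor_id`, the sample values, and the mapping table are all hypothetical. It assumes one product keys its records by a numeric account ID while the other uses a string ID for a browsing session, so the identifiers differ in both format and meaning.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SalesRecord:
    customer_id: int   # hypothetical: numeric ID of a registered account
    revenue: float


@dataclass(frozen=True)
class BehaviorEvent:
    visitor_id: str    # hypothetical: string ID of a browsing session
    page: str


def naive_join(sales, events):
    """Join on the raw identifiers, ignoring their format and meaning."""
    by_id = {str(s.customer_id): s for s in sales}
    return [(by_id[e.visitor_id], e) for e in events if e.visitor_id in by_id]


def mapped_join(sales, events, visitor_to_customer):
    """Join through an explicit mapping between the two identifier spaces."""
    by_id = {s.customer_id: s for s in sales}
    joined = []
    for e in events:
        customer_id = visitor_to_customer.get(e.visitor_id)
        if customer_id in by_id:
            joined.append((by_id[customer_id], e))
    return joined


# Invented sample data for the two hypothetical data products.
sales = [SalesRecord(1001, 59.90), SalesRecord(1002, 120.00)]
events = [BehaviorEvent("v-7f3a", "/checkout"), BehaviorEvent("v-91bc", "/home")]

# Hypothetical cross-domain mapping: only one visitor is a known customer.
mapping = {"v-7f3a": 1001}

print(len(naive_join(sales, events)))            # no identifiers match
print(len(mapped_join(sales, events, mapping)))  # one pair via the mapping
```

In this sketch, the naive join finds nothing even though both inputs are clean and well described on their own. The join only works once the relationship between the two identifier spaces is modeled explicitly.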

Publisher Resources

ISBN: 9781098108502
