Data Mesh in Practice

How to set the foundations for federated data ownership

This event has ended.

What you’ll learn and how you can apply it

By the end of this live online course, you’ll understand:

  • The consequences of unclear data ownership
  • What a scalable structure of domain-driven, federated responsibilities looks like
  • How a shared data infrastructure platform can contribute

And you’ll be able to:

  • Facilitate steps toward federated data ownership in your company
  • Provide data in such a way that others can create value from it
  • Support data ownership by providing domain-agnostic infrastructure tooling

This live event is for you because…

  • You’re a software or data engineer.
  • You work with data production, infrastructure, or consumption.
  • You want to become a data product owner.

Prerequisites

  • Familiarity with distributed data processing
  • A basic understanding of Python

Recommended preparation:

Recommended follow-up:

Schedule

The timeframes are only estimates and may vary according to how the class is progressing.

Introduction to data mesh (25 minutes)

  • Presentation: What’s the data mesh paradigm? Why was it invented?
  • Exercise: Jupyter Notebook setup

The data consumer perspective (45 minutes)

  • Exercise: Calculate a set of business KPIs from a prepared, fairly undocumented dataset
  • Presentation: Overview of data mesh—product thinking for data, domain-driven design applied to distributed data, and platform thinking for data infrastructure; issues on the consumer side
  • Q&A

Break (5 minutes)

The data producer perspective (45 minutes)

  • Presentation: What to do on the data producer side; how to create a data product; how to think about domain boundaries
  • Exercise: Rewrite the introduced dataset with a proper column description; create a schema and dataset description
  • Presentation: Why building a good data product is hard
  • Q&A
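
As a rough illustration of the producer-side exercise, a dataset description with a per-column schema might be sketched in Python as follows. This is not the course's actual material: the class names, the example dataset, its owner, and its columns are all hypothetical and chosen only to show the idea of documenting every column.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: str
    dtype: str
    description: str  # human-readable meaning, including units


@dataclass(frozen=True)
class DatasetDescription:
    name: str
    owner: str        # the domain team accountable for the data
    description: str
    columns: tuple

    def undocumented_columns(self):
        """Return the names of columns whose description is empty."""
        return [c.name for c in self.columns if not c.description.strip()]


# Hypothetical example dataset; none of these names come from the course.
orders = DatasetDescription(
    name="orders",
    owner="checkout-team",
    description="One row per customer order.",
    columns=(
        Column("order_id", "string", "Unique identifier of the order."),
        Column("amount_eur", "float", "Order total in euros, including tax."),
        Column("ts", "timestamp", ""),  # deliberately left undocumented
    ),
)

print(orders.undocumented_columns())
```

A check like `undocumented_columns` is one simple way a producer could spot gaps before publishing a dataset to consumers.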

Break (5 minutes)

The data infrastructure platform perspective (45 minutes)

  • Exercise: Answer an access request by calling some prepared functions; then answer many access requests repeatedly
  • Presentation: What makes a good data infrastructure platform?—domain agnostic, self-service, etc.; the trap of taking centralized responsibility for data; platform thinking—multitenancy, how to enable interoperability, and how to stay out of domain responsibility
  • Demo: Build a platform capability (a self-service tool)
  • Q&A
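
To make the self-service idea concrete, here is a minimal sketch, assuming access decisions are defined by the domain owners and only applied by the platform. The names `AccessRequest`, `make_handler`, and the example policy are hypothetical and are not the prepared functions used in the course.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessRequest:
    requester: str  # team asking for access
    dataset: str    # dataset being requested


def make_handler(policy):
    """Build a request handler from a policy owned by the domain teams.

    `policy` maps a dataset name to the set of teams its owner allows.
    The platform only applies the policy; it does not decide its content.
    """
    def handle(request):
        return request.requester in policy.get(request.dataset, frozenset())
    return handle


# Hypothetical policy and requests, for illustration only.
policy = {"orders": frozenset({"analytics", "finance"})}
handle = make_handler(policy)

incoming = [
    AccessRequest("analytics", "orders"),
    AccessRequest("marketing", "orders"),
    AccessRequest("finance", "returns"),  # dataset with no policy entry
]
decisions = [handle(r) for r in incoming]
print(decisions)
```

Answering each request by hand does not scale; wrapping the decision in a reusable handler is one way the platform can stay domain agnostic while the domain teams keep ownership of the policy itself.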

Conclusion and wrap-up (10 minutes)

  • Presentation: The goal state; key learnings; what we did not talk about; follow-up suggestions
  • Q&A

Your Instructors

  • Max Schultze

    Max Schultze is a lead data engineer working on building a data lake at Zalando, Europe’s biggest online platform for fashion. His focus lies on building data pipelines at petabyte scale and productionizing Spark and Presto as analytical platforms inside the company. He graduated from the Humboldt University of Berlin, where he took part in the university’s initial development of Apache Flink.

  • Arif Wider

    Arif Wider is a professor of software engineering at HTW Berlin, Germany, and a lead technology consultant with Thoughtworks. At Thoughtworks, he worked with Zhamak Dehghani, who coined the term data mesh in 2019. Outside of teaching, Arif enjoys building scalable software that makes an impact, as well as building teams that create such software. More specifically, he is fascinated by applications of artificial intelligence and by how building such applications effectively requires data scientists and developers (like himself) to work closely together.
