Chapter 7. Architecting a Streaming Data Mesh

In Chapters 3 through 6, we covered the pillars of a streaming data mesh. Now we will use that knowledge to architect one. As mentioned earlier in this book, the term “mesh” in “data mesh” was borrowed from the term “service mesh” in microservice architectures. We build on that similarity and describe the parts of a streaming data mesh using the same terms used for the parts of a microservice architecture. Each part is explained as it is introduced, so knowledge of microservice architecture is not a prerequisite. We will also consider multiple streaming data mesh solutions and list their benefits and trade-offs. The outcome will be a clear framework that you can use to implement your own streaming data mesh.

Infrastructure

As stated in Chapter 1, we will be implementing a streaming data mesh with Kafka. Kafka itself is optional and can be replaced with Apache Pulsar or Redpanda; whichever you choose, we recommend a fully managed, serverless streaming platform so that you do not have to manage the infrastructure yourself. Likewise, we will use ksqlDB as the stream processing engine. It is also available as a fully managed or self-managed service. The following are some other stream processing options that are fully managed:

  • DeltaStream

  • Popsink

  • Decodable

  • Materialize

  • RisingWave

  • Timeplus

Both Kafka and ksqlDB are stream processing engines that use SQL as the primary way of building streaming data pipelines. ...
