Chapter 6. Mirroring Clusters with MirrorMaker

Copying data between two Kafka clusters is called mirroring. The term distinguishes this process from “replication,” which usually refers to the way data in Kafka is shared across brokers within a single cluster. In practice, however, the community often uses both terms when talking about copying data between clusters.

The idea of mirroring data between clusters is nearly as old as Kafka itself. At Kafka’s inception, mirroring was a feature of the broker; it was separated into its own tool in early 2012. That tool was a standalone application named MirrorMaker, but due to its initial design, it had a number of limitations and was hard to operate. So in 2019, a new mirroring tool based on Kafka Connect, called MirrorMaker2, was introduced via KIP-382. The initial MirrorMaker tool has been deprecated since Kafka 3.0 and will be removed in Kafka 4.0 (via KIP-720); the new tool is now often simply called MirrorMaker or MM2.

In this chapter we only cover the new tool, and we refer to it as MirrorMaker. We introduce use cases that rely on mirroring, explain how the MirrorMaker connectors work, and finally demonstrate how to use them through some examples.

Introduction to Mirroring

Kafka scales very well, and it’s possible to run a single cluster with an extremely large capacity. However, in many cases it’s preferable to have multiple smaller clusters. This could be to better serve different geographies, ...
