Skip to Content
Designing Data-Intensive Applications
book

Designing Data-Intensive Applications

by Martin Kleppmann
March 2017
Intermediate to advanced
616 pages
19h 39m
English
O'Reilly Media, Inc.
Book available
Content preview from Designing Data-Intensive Applications

Chapter 5. Replication

The major difference between a thing that might go wrong and a thing that cannot possibly go wrong is that when a thing that cannot possibly go wrong goes wrong it usually turns out to be impossible to get at or repair.

Douglas Adams, Mostly Harmless (1992)

Replication means keeping a copy of the same data on multiple machines that are connected via a network. As discussed in the introduction to Part II, there are several reasons why you might want to replicate data:

  • To keep data geographically close to your users (and thus reduce access latency)

  • To allow the system to continue working even if some of its parts have failed (and thus increase availability)

  • To scale out the number of machines that can serve read queries (and thus increase read throughput)

In this chapter we will assume that your dataset is so small that each machine can hold a copy of the entire dataset. In Chapter 6 we will relax that assumption and discuss partitioning (sharding) of datasets that are too big for a single machine. In later chapters we will discuss various kinds of faults that can occur in a replicated data system, and how to deal with them.

If the data that you’re replicating does not change over time, then replication is easy: you just need to copy the data to every node once, and you’re done. All of the difficulty in replication lies in handling changes to ...

Become an O’Reilly member and get unlimited access to this title plus top books and audiobooks from O’Reilly and nearly 200 top publishers, thousands of courses curated by job role, 150+ live events each month,
and much more.
Start your free trial

You might also like

Designing Data-Intensive Applications, 2nd Edition

Designing Data-Intensive Applications, 2nd Edition

Martin Kleppmann, Chris Riccomini

Publisher Resources

ISBN: 9781491903063Errata PageSupplemental Content