Chapter 10. Consistency and Consensus
An ancient adage warns, “Never go to sea with two chronometers; take one or three.”
Frederick P. Brooks Jr., The Mythical Man-Month: Essays on Software Engineering (1995)
Lots of things can go wrong in distributed systems, as discussed in Chapter 9. If we want a service to continue working correctly despite those things going wrong, we need to find ways of tolerating faults.
One of the best tools we have for fault tolerance is replication. However, as we saw in Chapter 6, having multiple copies of the data on multiple replicas increases the risk of inconsistencies. Reads might be handled by a replica that is not up to date, yielding stale results. If multiple replicas can accept writes, we have to deal with conflicts between values that were concurrently written on different replicas. At a high level, there are two competing philosophies for dealing with such issues:
- Eventual consistency
In this philosophy, the fact that a system is replicated is made visible to the ...