Chapter 13. Distributed State

A distributed system is one in which the failure of a computer you didn’t even know about can render your own computer unusable.

Leslie Lamport, DEC SRC Bulletin Board (May 1987)

Distributed state is at the heart of cloud computing. Well-designed distributed systems can provide higher levels of fault tolerance and availability and can more easily scale horizontally to handle larger volumes of data and better share the load. However, distributed state is an advanced subject, so up until now, we’ve pretty much avoided the topic altogether.

You might recall that in Chapters 5 and 7 I discussed the problem of “state” in distributed computing, explaining at some length why state is the enemy of scalability. At the time, I suggested externalizing any shared state to a database, or even simply eliminating it altogether. That advice isn’t always helpful, because externalizing or eliminating state isn’t always possible (or desirable).

This chapter seeks to remedy this injustice by directly addressing the difficult problem of “state” in cloud native systems. In this chapter, I’ll introduce the CAP theorem and its implications and offer some practical advice about how best to approach distributed state based on the performance priorities of your application. Finally, I’ll cover a few of the more common algorithms, and I’ll use one of those—Raft—to enhance our key-value store.

Distributed State Is Hard

It’s generally well-understood that implementing and managing distributed state can ...
