Chapter 3. Distributed Systems Essentials

As I described in Chapter 2, scaling a system naturally involves adding multiple independently moving parts. We run our software components on multiple machines and our databases across multiple storage nodes, all in the quest of adding more processing capacity. Consequently, our solutions are distributed across multiple machines in multiple locations, with each machine processing events concurrently, and exchanging messages over a network.

This fundamental nature of distributed systems has some profound implications on the way we design, build, and operate our solutions. This chapter provides the basic information you need to know to appreciate the issues and complexities of distributed software systems. I’ll briefly cover communications networks hardware and software, remote method invocation, how to deal with the implications of communications failures, distributed coordination, and the thorny issue of time in distributed systems.

Communications Basics

Every distributed system has software components that communicate over a network. If a mobile banking app requests the user’s current bank account balance, a (very simplified) sequence of communications occurs along the lines of:

  1. The mobile banking app sends a request over the cellular network addressed to the bank to retrieve the user’s bank balance.

  2. The request is routed across the internet to where the bank’s web servers are located.

  3. The bank’s web server authenticates the request ...

Get Foundations of Scalable Systems now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.