Chapter 7
HeartBeat
Show a server is available by periodically sending a message to all the other servers.
Problem
When multiple servers form a cluster, each server is responsible for storing some portion of the data, based on the partitioning and replication schemes used. Server failures must be detected promptly so that corrective action can be taken by making another server responsible for handling requests for the data on the failed server.
Solution
Periodically send a request to all the other servers indicating the liveness of the sending server (Figure 7.1). Select a request interval that is longer than the network round-trip time between the servers. All the listening servers wait for the timeout interval, which is a multiple of the request ...
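The receiving side of this scheme can be illustrated with a minimal sketch. It is not the book's implementation: the names `HeartbeatMonitor`, `record_heartbeat`, `failed_servers`, and `simulate` are hypothetical, and the timeout is assumed to be the heartbeat interval multiplied by an arbitrarily chosen factor of 3. Timestamps are passed in explicitly so that the behavior is deterministic.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class HeartbeatMonitor:
    """Receiver side: remembers when each peer was last heard from."""

    heartbeat_interval: float      # seconds between heartbeats (illustrative value)
    timeout_multiplier: int = 3    # assumption: the factor is chosen arbitrarily here
    last_seen: Dict[str, float] = field(default_factory=dict)

    @property
    def timeout(self) -> float:
        # Assumed relationship: timeout = multiplier * heartbeat interval.
        return self.heartbeat_interval * self.timeout_multiplier

    def record_heartbeat(self, server_id: str, now: float) -> None:
        """Called whenever a heartbeat arrives from server_id."""
        self.last_seen[server_id] = now

    def failed_servers(self, now: float) -> List[str]:
        """Return the peers whose last heartbeat is older than the timeout."""
        return sorted(
            server_id
            for server_id, seen_at in self.last_seen.items()
            if now - seen_at > self.timeout
        )


def simulate() -> List[str]:
    """Server 'a' sends a heartbeat every second; server 'b' stops after t=4."""
    monitor = HeartbeatMonitor(heartbeat_interval=1.0)
    for t in range(0, 11):
        monitor.record_heartbeat("a", float(t))
        if t <= 4:
            monitor.record_heartbeat("b", float(t))
    return monitor.failed_servers(now=10.0)


if __name__ == "__main__":
    print(simulate())
```

In a running server the `now` argument would typically come from a monotonic clock such as `time.monotonic()`, and a scheduled task would call `failed_servers` periodically. A larger multiplier tolerates more delayed or lost heartbeats at the cost of slower failure detection.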