Chapter 10. Architecting for Reliability

This chapter focuses on designing systems from the ground up with SLOs in mind. It explores the design and development of an example system through the lens of the system’s architecture. Examining the system at a high level, we look at the technical reasons behind the design choices made to meet potential SLOs.

As system architects, we begin this design exercise with the problem statement, or specification. We gather requirements, including the SLOs for the expected interactions with the system’s users. User journeys, which represent the same concept as SLIs (see Chapter 3), help us understand these interactions, as well as the implications for the user when the system does not meet its objectives. They also focus our attention on the path of a request from user to service and back.

To illustrate the importance of thinking about user journeys, consider the difference between a single-serving website that plays a recording of a sad trombone and a website for a multinational bank that provides access to funds and payment information in real time. A superficial distinction between these two lies in the implication that one is “serious,” handling our money, whereas the other is entertainment. Although this is indeed true, the deeper distinction is that failures in the money-handling system may have immediate, grave, and irreversible effects on its users, while this particular entertainment ...
