Chapter 2. Do We Know Why We Really Want Reliability?
Niall Murphy
Do we really understand reliability, or why we would want it?
This may seem like a strange question. It is an article of faith in this community that unreachable online services have no value. But even a moment’s thought will show you that’s simply not true. You yourself encounter intermittent computer failure almost every day. Some contexts even seem to expect it; with web services, users are highly accustomed to hitting refresh or (for more difficult problems) clearing cookies, restarting a browser, or restarting a machine. Even services themselves have retry protocols.
A certain amount of fudge is baked into every human–computer interaction. Even for longer outages, people almost always come back if you’re down for a few minutes, and they show even more patience the more unique the service is.
It’s anecdotal, but suggestive: a couple of years ago I had a conversation with a very well-known company in which they said they didn’t put any money into reliability because their particular customer base had nowhere else to go. Therefore, any time they spent on reliability would be time they didn’t spend on capturing revenue; it wasn’t worth it.
I gasped inwardly at the time, but I’ve thought about it often since, and I turn the question toward us, as a community, now: do we have any real argument ...