Chapter 5. Zero and the Building Blocks of Babel: How to forge the atoms of reliable infrastructure.

London Bridge is falling down,
Falling down, falling down.
London Bridge is falling down,
My fair lady.

Build it up with bricks and mortar,
Bricks and mortar, bricks and mortar,
Build it up with bricks and mortar,
My fair lady.

Set a man to watch all night,
Watch all night, watch all night,
Set a man to watch all night,
My fair lady.

Suppose the man should fall asleep,
Fall asleep, fall asleep,
Suppose the man should fall asleep?
My fair lady.

— Traditional nursery rhyme, excerpt

Until the early 2000s, the fact that computer software was riddled with potential instabilities was of little practical consequence. Software would crash, computers would be rebooted, and life would go on. Users learned to work defensively around the problems with their personal workstations by making backups and saving data regularly. Even when networking became common, bringing services like email and file sharing, tending computers by hand was enough to satisfy most requirements with only minor inconvenience: a reasonable equilibrium could be maintained between the ‘downtime’ caused by perturbations from a noisy environment and the repairs made by human custodians, known as system administrators. The approach resembled an emergency services model, like firefighting or an ambulance service. You wait until a problem is reported, then you sound an alarm; technicians rush to the rescue and try to ...
