Chapter 14. Two Mistakes High

Surely, this is OK…

Consider the following anecdote I once overheard:

We were wondering how changing a setting on our MySQL database might affect our performance, but we were worried that the change might cause our production database to fail. Because we didn’t want to bring down production, we decided to make the change on our backup (replica) database instead. After all, it wasn’t being used for anything at the moment.

Makes sense, right? Have you ever heard this rationale before?

Well, the problem here is that the database was being used for something. It was being used to provide a backup for production. Except, it couldn’t be used that way anymore.

You see, the backup database was essentially being used as an experimental playground for trying out different settings. As those settings changed over time, the backup database’s configuration drifted away from that of the primary production database.

Then, one day, the inevitable happened.

The production database failed.

The backup database initially did what it was supposed to do. It took over the job of the primary database. Except, it really couldn’t. The settings on the backup database had drifted so far from those on the primary database that it could no longer reliably handle the same traffic load.

The backup database slowly failed, and the site went down.

This is a true story. It’s a story about best intentions. You have a backup, ...
