3 Operational blindness
This chapter covers
- Making changes in operations functions
- Creating useful system metrics for your application
- Creating useful logging habits
When you launch a system, you expect it to perform a set of tasks, in a certain order, and produce a few expected results. Sometimes you anticipate an error in the process, and you’ll need to perform some sort of cleanup around that error. But the effort of getting the system to work in the best of times often leaves a lot of room for improvement in how it performs in the worst of times.
Tools that confirm work is happening the way you expected often get omitted, leaving you with no clear view of what’s happening in your system. Instead, teams rely on easily ...