Preface
I’ve been fortunate to get hired into medium-sized operations teams at large technology companies. All ops teams (a customary term for operations teams) share two interesting characteristics: compared to other engineering departments, they work under more pressure, and they attract negative attention much more easily than positive attention. Digital firefighting is the nature of the job. We might get noticed when things go awry and we fix them. If we don’t react fast enough, we definitely get noticed. If you know anyone in network operations, ask whether that’s how he or she feels about the job; I bet you’ll get an answer along those lines.
Working in ops is all about effectiveness: there is no time for re-engineering. We must get things right the first time and we have to act fast. We go through a lot of reprioritizing and context-switching. There is relatively little room for creativity, at least the kind that doesn’t love constraints. All this makes operations a great place to learn and grow.
This book is based on my experience of working in ops. I was extremely lucky to work with some of the smartest people in the industry. I would like this book to be a tribute to all the invisible ops guys who struggle daily to maintain the highest standards of service availability.
In my career, I’ve stared at all sorts of time series plots, and a lot of them. At one point it was my full-time job, no kidding. With time, I learned to extract meaning from data point fluctuations at a brief glance, ...