Preface
This is a book for practitioners of the scientific discipline of chaos engineering. Chaos engineering is part of the overall resilience engineering approach and serves the specific purpose of surfacing evidence of system weaknesses before those weaknesses result in crises such as system outages. If you care about how you, your colleagues, and your entire sociotechnical system collectively practice for and respond to threats to your system’s reliability, chaos engineering is for you!
Audience
This book is for people who are in some way responsible for their code in production. That could mean developers, operations staff, DevOps engineers, and others. When I say they “are in some way responsible,” I mean that they take responsibility for the availability, stability, and overall robustness of their system as it runs, and may even be part of the group assembled when there is a system outage.
Perhaps you’re a site reliability engineer (SRE) looking to improve the stability of the systems you are responsible for, or you’re working on a team practicing DevOps where everyone owns their code in production. Whatever your level of responsibility, if you care about how your code runs in production and about the bigger picture of how well production is running for your organization, this book aims to help you meet those challenges.
What This Book Is About
This is a practical guide to doing chaos engineering using free and open source tools, in particular the Chaos Toolkit (see “About the Samples”). Written ...