Chapter 4. Getting Tooled Up for Automated Chaos Engineering

Automated chaos experiments let you explore system weaknesses at any time. There are many tools for inducing turbulent conditions, from the original Netflix Chaos Monkey for infrastructure-level chaos to application-level tools such as the Chaos Monkey for Spring Boot. Here you are going to use the Chaos Toolkit and its ecosystem of extensions.

The Chaos Toolkit was chosen because it is free and open source and has a large ecosystem of extensions that allow you to fine-tune your chaos experiments to your own needs.1 The Chaos Toolkit also uses a chaos experiment format that you specify using YAML or JSON.2 As shown in the diagram in Figure 4-1, the toolkit takes your chaos experiment definition and orchestrates the experiment against your target system.

Figure 4-1. You use the Chaos Toolkit to orchestrate your chaos experiments against your target system
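The experiment format itself is covered properly in later chapters. As a rough sketch of its overall shape, the following Python builds a small experiment definition and serializes it to JSON. The title, activity names, URL, and echo command are hypothetical placeholders, and the field names should be treated as assumptions to confirm against the Chaos Toolkit documentation or with the chaos validate command.

```python
import json

# A hypothetical experiment definition. Every name, the URL, and the echo
# command are placeholders invented for illustration only.
experiment = {
    "version": "1.0.0",
    "title": "Does the service stay available?",
    "description": "Illustrative sketch of the overall shape of an experiment.",
    # What "normal" looks like, checked before and after the method runs.
    "steady-state-hypothesis": {
        "title": "The service responds normally",
        "probes": [
            {
                "type": "probe",
                "name": "service-must-respond",
                "tolerance": 200,
                "provider": {
                    "type": "http",
                    "url": "http://localhost:8080/",
                },
            }
        ],
    },
    # The turbulent conditions to apply (a harmless stand-in here).
    "method": [
        {
            "type": "action",
            "name": "placeholder-action",
            "provider": {
                "type": "process",
                "path": "echo",
                "arguments": "inject turbulence here",
            },
        }
    ],
    # Activities that would undo the method's changes (none in this sketch).
    "rollbacks": [],
}


def to_json(definition):
    """Serialize an experiment definition to indented JSON text."""
    return json.dumps(definition, indent=2)


if __name__ == "__main__":
    print(to_json(experiment))
```

Saving the printed text to a file gives you something to pass to the chaos command once it is installed.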

Don’t Sweat the Drivers

Don’t worry about the drivers mentioned in Figure 4-1 for now. You’ll learn all about Chaos Toolkit drivers in the following chapters when you use the toolkit to run experiments against various target systems. You’ll even learn how to extend your Chaos Toolkit by creating your own custom drivers (see Chapter 8).

Now it’s time to get the Chaos Toolkit installed. This means installing the Chaos Toolkit command-line interface (CLI), appropriately called chaos. With the chaos command installed, you can take control of your experiments by running them locally from your own machine.

Installing Python 3

The Chaos Toolkit CLI is written in Python and requires Python 3.5+. Make sure you have the right version of Python installed by running the following command and inspecting the output you get:

$ python3 -V
Python 3.6.4

If you don’t have Python 3 on your machine (it is not bundled with macOS, for example), instructions for how to install it for your operating system are in the Chaos Toolkit documentation.
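The same version requirement can also be checked from inside Python, which is handy in a setup script. This is a minimal sketch; the function name meets_minimum is made up for this example, and the (3, 5) minimum is simply the requirement stated above.

```python
import sys

MINIMUM = (3, 5)  # the minimum Python version stated in this chapter


def meets_minimum(version_info, minimum=MINIMUM):
    """Return True if version_info, e.g. (3, 6, 4), is at least minimum."""
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    found = ".".join(str(part) for part in sys.version_info[:3])
    if meets_minimum(sys.version_info):
        print("Python {} is new enough for the Chaos Toolkit CLI".format(found))
    else:
        print("Python {} is too old; Python 3.5+ is required".format(found))
```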

Installing the Chaos Toolkit CLI

The Chaos Toolkit CLI adds the chaos command to your system so that you can:

  • Discover and record what types of chaos and information can be sourced from your target system using the chaos discover command.

  • Initialize new chaos experiments using the chaos init command.

  • Execute your JSON- or YAML-formatted automated chaos experiments using the chaos run command.

  • Optionally produce a human-readable report of your experiment’s findings using the chaos report command (see “Creating and Sharing Human-Readable Chaos Experiment Reports”).

Using these commands, you have the workflow shown in Figure 4-2 at your fingertips.

Figure 4-2. Starting with Discover, or jumping straight to Run, this is the workflow for the Chaos Toolkit
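If you jump straight to Run, one way to automate that step is a small wrapper that validates an experiment file before running it. The sketch below relies only on the run and validate subcommands that the CLI's own help output lists. The function names and the experiment.json filename are hypothetical, and it defaults to a dry run that only prints the commands.

```python
import subprocess


def workflow_commands(experiment_path):
    """Return the chaos invocations to validate and then run one experiment."""
    return [
        ["chaos", "validate", experiment_path],
        ["chaos", "run", experiment_path],
    ]


def run_workflow(experiment_path, dry_run=True):
    """Print each command and execute them in order unless dry_run is True.

    Returns True if every executed command exited with status 0. A dry run
    executes nothing and therefore returns True.
    """
    for command in workflow_commands(experiment_path):
        print("$ " + " ".join(command))
        if dry_run:
            continue
        # Raises FileNotFoundError if the chaos command is not on your PATH.
        if subprocess.call(command) != 0:
            return False  # stop at the first failing step
    return True


if __name__ == "__main__":
    # experiment.json is a hypothetical filename used for illustration.
    run_workflow("experiment.json", dry_run=True)
```

Validating first means a malformed experiment file is rejected before anything is done to the target system.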

You should have Python successfully installed by now, so you’re ready to install the Chaos Toolkit’s chaos command. But first, to keep your environment neat and to avoid any Python module conflicts, it’s a good idea to create a Python virtual environment just for the Chaos Toolkit and its supporting dependencies.

To create a Python virtual environment3 (called chaostk in this example), use the following command:

$ python3 -m venv ~/.venvs/chaostk

Once the environment has been created, it needs to be activated. To activate the chaostk virtual environment, enter the following command:

$ source ~/.venvs/chaostk/bin/activate

Your command prompt will change to show that you are working in your new virtual environment by displaying the name of the virtual environment before the prompt:

(chaostk) $

Always Check Your Virtual Environment

I always check that I’m in the right virtual environment if the project I’m working on starts behaving oddly. It’s easy to forget to activate an environment and start working with the Python and dependencies that have been installed globally, especially when you’re changing between terminal windows, and even more so if you’re new to Python.
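When the prompt alone is not convincing, you can ask the interpreter which installation it belongs to. In an environment created with python3 -m venv, sys.prefix points at the environment, while sys.base_prefix points at the Python it was created from. The helper names below are made up for this sketch.

```python
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a venv environment."""
    # Outside a virtual environment the two prefixes are identical.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return sys.prefix != base_prefix


def describe_environment():
    """Build a one-line summary of which Python installation is in use."""
    if in_virtualenv():
        return "virtual environment: {}".format(sys.prefix)
    return "global Python: {}".format(sys.prefix)


if __name__ == "__main__":
    print(describe_environment())
```

Run with the chaostk environment activated, the printed path should end in the directory you passed to python3 -m venv.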

Finally, it’s time to install the Chaos Toolkit chaos command. You can do this using the pip command:

(chaostk) $ pip install chaostoolkit

After pip has successfully completed its work, you should have a shiny new chaos command to play with. For now, just check that all is present and correct by entering the chaos --help command:

(chaostk) $ chaos --help
Usage: chaos [OPTIONS] COMMAND [ARGS]...

Options:
  --version           Show the version and exit.
  --verbose           Display debug level traces.
  --no-version-check  Do not search for an updated version of the
                      chaostoolkit.
  --change-dir TEXT   Change directory before running experiment.
  --no-log-file       Disable logging to file entirely.
  --log-file TEXT     File path where to write the command's log.  [default:
                      chaostoolkit.log]
  --settings TEXT     Path to the settings file.  [default:
                      /Users/russellmiles/.chaostoolkit/settings.yaml]
  --help              Show this message and exit.

Commands:
  discover  Discover capabilities and experiments.
  init      Initialize a new experiment from discovered...
  run       Run the experiment loaded from SOURCE, either...
  validate  Validate the experiment at PATH.

chaos --help Output

The output of your chaos --help command may vary from the preceding output. Commands are rarely added to the default Chaos Toolkit CLI, but the list can change over time.
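If you would rather script this check than read the help text, the following sketch looks for the chaos executable on your PATH and asks it for its version using the --version option shown in the help output. The function names are hypothetical. Both functions return None rather than raising an error when chaos cannot be found, which is usually a sign that the virtual environment is not activated.

```python
import shutil
import subprocess


def find_chaos_cli():
    """Return the full path of the chaos executable on PATH, or None if absent."""
    return shutil.which("chaos")


def chaos_version():
    """Return the text printed by "chaos --version", or None if unavailable."""
    path = find_chaos_cli()
    if path is None:
        return None
    result = subprocess.run(
        [path, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip()


if __name__ == "__main__":
    version = chaos_version()
    if version is None:
        print("chaos was not found; is the chaostk environment activated?")
    else:
        print(version)
```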

Each of the commands in the Chaos Toolkit workflow contributes to the experimentation and validation phases of the Chaos Engineering Learning Loop (see Figure 4-3).

Figure 4-3. The Chaos Toolkit commands that support each of the phases of the Chaos Engineering Learning Loop

All of the commands that help you create and run chaos experiments (discover, init, run, report) are used when you are exploring and discovering weaknesses. The optional report command, which does not appear in the preceding help output and is discussed in Chapter 7, is there for when you are collaboratively analyzing a detected weakness. Finally, run and report are used again when you are validating that a weakness has been overcome.

Summary

Success! The Chaos Toolkit is now installed, and the chaos command is at your disposal. Now it’s time to use that command.

In the next chapter, you’ll bring to life a very simple system that will be a target for your chaos experiments. You’ll then run your first chaos engineering experiment to complete a full turn of the Chaos Engineering Learning Loop.

1 Disclaimer: the author cofounded the Chaos Toolkit project a couple of years ago.

2 The experiments used in this book will be in JSON format, but in the accompanying samples repository on GitHub you will find the equivalent YAML experiment definitions as well.

3 You can delve into what Python virtual environments are and how they work in the official Python documentation. A virtual environment is a nicely isolated sandbox in which you can control the Python dependencies and any other modules needed for a specific project without fighting module and dependency conflicts with other projects you are working on.
