Chapter 12. Maintenance
In this chapter, we look at some things you can do to keep your Cassandra cluster healthy. Our goal here is to provide an overview of the various maintenance tasks available. Because the specific procedures for these tasks tend to change slightly from release to release, consult the Cassandra documentation for the release you’re using to make sure you’re not missing any new steps.
Let’s put our operations hats on and get started!
There are some basic things that you’ll want to look for to ensure that nodes in your cluster are healthy:
Use nodetool status to make sure all of the nodes are up and reporting normal status. Check the load column for each node to make sure the cluster is well balanced. An uneven number of nodes per rack can lead to an imbalanced cluster.
Check nodetool tpstats on your nodes for dropped messages, especially mutations, as this indicates that data writes may be lost. A growing number of blocked flush writers indicates the node is ingesting data into memory faster than it can be flushed to disk. Both of these conditions can indicate that Cassandra is having trouble keeping up with the load. As is usual with databases, once these problems begin, they tend to continue in a downward spiral. Three things that can improve the situation are a decreased load, scaling up (adding more hardware), or scaling out (adding another node and rebalancing).
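The first of these checks is easy to automate. The following Python sketch scans the text printed by nodetool status and reports any node whose two-letter code is not UN (up and normal). The function name unhealthy_nodes, the sample addresses, and the shortened host IDs are illustrative assumptions, and the column layout can differ between releases, so treat this as a starting point rather than a finished monitoring tool.

```python
import re

# Node lines begin with a two-letter code: U or D (up/down) followed by
# N, L, J, or M (normal/leaving/joining/moving), then the node address.
NODE_LINE = re.compile(r"^([UD][NLJM])\s+(\S+)")


def unhealthy_nodes(status_output):
    """Return (address, code) pairs for nodes whose code is not UN."""
    flagged = []
    for line in status_output.splitlines():
        match = NODE_LINE.match(line.strip())
        if match and match.group(1) != "UN":
            flagged.append((match.group(2), match.group(1)))
    return flagged


# Hypothetical sample in the general shape of `nodetool status` output;
# the host IDs are placeholders, not real values.
SAMPLE = """\
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load        Tokens  Owns (effective)  Host ID  Rack
UN  127.0.0.1  103.82 KiB  256     33.3%             aaaa     rack1
DN  127.0.0.2  98.11 KiB   256     33.3%             bbbb     rack1
UJ  127.0.0.3  12.40 KiB   256     33.4%             cccc     rack1
"""

print(unhealthy_nodes(SAMPLE))
```

To run this against a live node, you could capture the real output with subprocess.run(["nodetool", "status"], capture_output=True, text=True).stdout and pass it to the function in place of the sample text.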
If these checks indicate issues, you may need ...