Chapter 12. Maintenance

In this chapter, we look at some things you can do to keep your Cassandra cluster healthy. Our goal here is to provide an overview of the various maintenance tasks available. Because the specific procedures for these tasks tend to change slightly from release to release, you’ll want to make sure to consult the Cassandra documentation for the release you’re using to make sure you’re not missing any new steps.

Let’s put our operations hats on and get started!

Health Check

There are some basic things that you’ll want to look for to ensure that nodes in your cluster are healthy:

  • Use nodetool status to make sure all of the nodes are up and reporting normal status. Check the load column for each node to make sure the cluster is well balanced. An uneven number of nodes per rack can lead to an imbalanced cluster.

  • Check nodetool tpstats on your nodes for dropped messages, especially mutations, as this indicates that data writes may be lost. A growing number of blocked flush writers indicates the node is ingesting data into memory faster than it can be flushed to disk. Both of these conditions can indicate that Cassandra is having trouble keeping up with the load. As is usual with databases, once these problems begin, they tend to continue in a downward spiral. Three things that can improve the situation are a decreased load, scaling up (adding more hardware), or scaling out (adding another node and rebalancing).

If these checks indicate issues, you may need ...

Get Cassandra: The Definitive Guide, 3rd Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.