To understand how your cluster is working, and whether it is working effectively, you should monitor a number of statistics that help you identify and diagnose problems. Couchbase Server exposes a wide range of statistics that provide detailed information on how your cluster is running.
Some of the statistics can be used on their own to provide advice and guidance; others need to be monitored alongside other statistics to give a clearer picture. Still others provide background information on how specific elements of your cluster are executing.
The architecture of Couchbase Server means that for the majority of statistics, you should monitor the cluster as a whole in the first instance.
Your monitoring should focus on:
Your whole cluster should be monitored to ensure that you are not running out of RAM, disk space, or I/O performance. A problem with any one of these items indicates that your cluster size should be increased.
You can monitor the overall performance from the main Cluster Overview page, shown in Figure 5-1.
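As a rough illustration of the cluster-wide check described above, the following sketch flags any resource whose usage crosses a warning threshold. The input layout (a dictionary with "ram" and "hdd" entries, each holding "total" and "used" values), the function name, and the 80 percent threshold are all assumptions made for this example; they are not taken from Couchbase Server and may not match what your server version reports.

```python
def cluster_headroom(storage_totals, warn_ratio=0.8):
    """Return (resource, usage_ratio) pairs for resources at or above warn_ratio.

    storage_totals is a hypothetical dict such as
    {"ram": {"total": ..., "used": ...}, "hdd": {"total": ..., "used": ...}}.
    """
    warnings = []
    for resource in ("ram", "hdd"):
        stats = storage_totals.get(resource, {})
        total = stats.get("total", 0)
        used = stats.get("used", 0)
        if total <= 0:
            # No usable figure for this resource, so skip it.
            continue
        ratio = used / total
        if ratio >= warn_ratio:
            warnings.append((resource, ratio))
    return warnings


# Made-up sample values: RAM is 90% used, disk is 20% used.
sample = {
    "ram": {"total": 10_000, "used": 9_000},
    "hdd": {"total": 500_000, "used": 100_000},
}

for resource, ratio in cluster_headroom(sample):
    print(f"{resource} usage is at {ratio:.0%}; consider increasing the cluster size")
```

A sustained warning for any one resource is the signal, in the terms used above, that the cluster size should be increased. I/O performance is left out of the sketch because it is not a simple used-versus-total figure.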
You should monitor your individual nodes to ensure that they are not experiencing spikes in CPU, RAM, or disk I/O. Such spikes may indicate that your cluster is failing to cope with RAM or I/O requirements, or that a node needs to be taken out of service (through a manual failover) because of a software or hardware issue.
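One simple way to spot a spike in a series of per-node samples is to compare each sample with the average of the samples just before it. The sketch below does that. The function name, the window size, and the multiplier are illustrative assumptions rather than anything Couchbase Server defines, and how the samples are collected is left out.

```python
from statistics import mean


def find_spikes(samples, window=5, factor=2.0):
    """Return the indexes of samples that exceed `factor` times the mean
    of the `window` samples immediately before them."""
    spikes = []
    for i in range(window, len(samples)):
        baseline = mean(samples[i - window:i])
        if baseline > 0 and samples[i] > factor * baseline:
            spikes.append(i)
    return spikes


# Made-up CPU utilization percentages for one node.
cpu = [20, 22, 19, 21, 20, 85, 23, 21]
print(find_spikes(cpu))
```

A single flagged sample is rarely a reason to act. Repeated spikes on one node while the others stay flat would point toward a problem with that node rather than with the cluster as a whole.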
You can monitor the key node statistics from ...