Chapter 8. Monitoring and Best Practices

As with every production system, you need to be able to identify and respond to issues quickly. You must monitor Aerospike so that your applications keep running smoothly despite hardware issues, software bugs, configuration problems, and networking issues. This chapter guides you toward a monitoring solution, discusses vital metrics to monitor, and reviews how best to respond to some problem scenarios. By the end of this chapter, you should have a grasp of how to monitor your Aerospike deployment, how to upgrade your servers running Aerospike, and which first steps to take in response to operational demands.

Monitoring

There are two sides to monitoring Aerospike: the client application that uses it and Aerospike itself. Both are important for a robust and complete monitoring solution. If you monitor only the Aerospike database and the metrics it reports, you won’t know when an operation fails to reach the server at all. Similarly, if you monitor only the client application, you may miss critical information from the cluster itself or warning signs that build toward an incident, such as dwindling storage space.

Application Metrics

Within your Aerospike client application, you’ll want to add monitoring around the latency and success rate of your calls, and you’ll want to subscribe to the Aerospike logging interface. The latency and success rate measurements are up to you to implement by adding stopwatch timers, catching ...
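
As a rough illustration of the stopwatch-timer idea, the following minimal sketch wraps an arbitrary call, records how long it took, and counts whether it succeeded or raised. It is not part of any Aerospike client library: `CallStats` and `timed_call` are hypothetical names introduced here, and the sketch assumes a failed call surfaces as a Python exception.

```python
import time
from dataclasses import dataclass, field


@dataclass
class CallStats:
    """Running counters for one kind of database call (hypothetical helper)."""
    successes: int = 0
    failures: int = 0
    latencies_ms: list = field(default_factory=list)

    def success_rate(self):
        """Return the fraction of successful calls, or None if none were made."""
        total = self.successes + self.failures
        return self.successes / total if total else None


def timed_call(stats, operation, *args, **kwargs):
    """Run operation, record its latency and outcome, and re-raise any error."""
    start = time.perf_counter()  # monotonic clock suited to measuring intervals
    try:
        result = operation(*args, **kwargs)
    except Exception:
        stats.failures += 1
        raise  # the caller still sees the original error
    else:
        stats.successes += 1
        return result
    finally:
        # Latency is recorded for successful and failed calls alike.
        stats.latencies_ms.append((time.perf_counter() - start) * 1000.0)
```

In an application, you would pass each client call through `timed_call` with one `CallStats` instance per operation type, then export the counters to whatever metrics system you already use. Keeping failed calls in the latency list is a deliberate choice in this sketch, because slow failures such as timeouts are often the first visible symptom of a problem.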
