Chapter 7. Node Failure and Post-Mortem Analysis

In the previous chapter, we used case studies with real-world examples to learn how to troubleshoot common Elasticsearch performance and reliability issues. This chapter explores some common causes of node and cluster failures. Specific topics covered are as follows:

  • How to determine the root cause of a failure
  • How to take corrective action for node failures
  • Case studies with real-world examples of diagnosing system failures

Diagnosing problems

Elasticsearch node failures can manifest in many different ways. Some of the symptoms of node failures are as follows:

  • A node crashes during heavy data indexing
  • The Elasticsearch process stops running for an unknown reason
  • A cluster won't recover ...
