In this chapter, we focus on the methodology and steps needed to perform successful first-level system debugging and analysis. We will use system logs and statistics to understand how a problem manifests.
Profile the system status
Previous chapters have taught us the models we need when approaching what may appear to be a problem in our environment. The idea is to carefully isolate the problem, reduce it to a minimal set of variables, and then use industry-accepted methods to prove or disprove our theories. Now, we will learn about the tools that can help us in this quest.
Typically, data center hosts are configured to ...