In this chapter, we focus on the methodology and steps needed to perform successful first-level system debugging and analysis. We will use system logs and statistics to understand how a problem manifests.
Profile the system status
Previous chapters have taught us the models we need when approaching what may appear to be a problem in our environment. The idea is to carefully isolate the problem, reduce it to a minimal set of variables, and then use industry-accepted methods to prove or disprove our theories. Now, we will learn about the tools that can help us in our quest.
Typically, data center hosts are configured to ...