This IBM Redbooks publication mainly provides information about IBM High Performance Computing (HPC) clusters. Therefore, in the rest of this book, unless otherwise mentioned, the term cluster refers to an IBM HPC cluster. A computer cluster is a group of connected computers that work together and, in many respects, act as a single system.
We describe the concepts, processes, and methodologies used to achieve and maintain a “healthy” state for an IBM HPC system, in both the pre-production and production stages. In the context ...
