Chapter 10. Management Software

Now that you have a cluster, you are going to want to keep it running, which will involve a number of routine system administration tasks. If you have done system administration before, then for the most part you won’t be doing anything new. The administrative tasks you’ll face are largely the same tasks you would face with any multiuser system. The difference is that these tasks are multiplied by the number of machines in your cluster. While creating 25 new accounts on a server may not sound too hard, when you have to duplicate those accounts on each node in a 200-node cluster, you’ll probably want some help.

For a small cluster with only a few users, you may be able to get by doing things the way you are used to doing them. But why bother? The tools in this chapter are easy to install and use. Mastering them, which won’t take long, will lighten your workload.

While a number of tools are available, this chapter describes two representative tools (or tool sets): the Cluster Command and Control (C3) tool set and Ganglia. C3 is a set of utilities that automate tasks across a cluster or multiple clusters, such as executing the same command on every machine or distributing files to every machine. Ganglia monitors the health of your cluster from a single node through a web-based interface.
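To make the idea of "executing the same command on every machine" concrete, here is a minimal Python sketch of that pattern. It is not how C3 is implemented, and it does not use any C3 command. It assumes passwordless ssh from the head node to each compute node, and the function names (build_ssh_command, run_everywhere) and the node names are hypothetical, chosen only for illustration.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_ssh_command(host, command):
    """Return the argv list that runs `command` on `host` over ssh.

    BatchMode=yes makes ssh fail rather than prompt for a password,
    so one misconfigured node cannot stall the whole run.
    """
    return ["ssh", "-o", "BatchMode=yes", host, command]


def run_everywhere(hosts, command, runner=None, max_workers=16):
    """Run `command` on every host in parallel.

    Returns a dict mapping each host to (return code, combined output).
    `runner` is a hypothetical hook: any callable that takes an argv list
    and returns (return code, output). It defaults to really invoking ssh.
    """
    hosts = list(hosts)  # allow generators; we iterate over hosts twice
    if runner is None:
        def runner(argv):
            done = subprocess.run(argv, capture_output=True, text=True)
            return done.returncode, done.stdout + done.stderr

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(
            lambda host: runner(build_ssh_command(host, command)), hosts
        )
        # pool.map yields results in the same order as `hosts`
        return dict(zip(hosts, results))


if __name__ == "__main__":
    # Hypothetical node names; substitute the names used in your cluster.
    nodes = ["node%02d" % i for i in range(1, 5)]
    for host, (code, output) in run_everywhere(nodes, "uptime").items():
        print(host, code, output.strip())
```

Running the script on the head node prints one line per node with the exit status and the output of uptime. A dedicated tool set adds what this sketch leaves out, such as a shared cluster configuration file, file distribution, and support for multiple clusters.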

C3

Cluster Command and Control is a set of about a dozen command-line utilities used to execute common management tasks. ...
