If Linux isn't built into your cluster software, the
first step is to decide which distribution and version of Linux you
want.
This decision will depend on
what clustering software you want to use. It doesn't
matter what the "best" distribution
of Linux (Red Hat, Debian, SUSE, Mandrake, etc.) or version (7.3,
8.0, 9.0, etc.) is in some philosophical sense if the clustering
software you want to use isn't available for that
choice. This book uses the Red Hat distribution because the
clustering software being discussed was known to work with that
distribution. This is not an endorsement of Red Hat; it was just a
pragmatic decision.
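
The same pragmatic check can be scripted. What follows is only a minimal sketch, assuming a current distribution that provides an /etc/os-release file (releases as old as the ones named above do not). The support matrix, the package name "examplecluster", and the versions listed in it are hypothetical placeholders; take the real values from your clustering software's documentation.

```python
def parse_os_release(text):
    """Parse os-release style KEY=value lines into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        info[key.strip()] = value.strip().strip('"').strip("'")
    return info


# Hypothetical support matrix: software -> distribution ID -> major versions.
# These entries are made up for illustration only.
SUPPORTED = {
    "examplecluster": {"rhel": {"8", "9"}, "debian": {"12"}},
}


def is_supported(software, os_info, matrix=SUPPORTED):
    """Return True if the distribution and major version appear in the matrix."""
    distro = os_info.get("ID", "")
    major = os_info.get("VERSION_ID", "").split(".")[0]
    return major in matrix.get(software, {}).get(distro, set())


if __name__ == "__main__":
    try:
        with open("/etc/os-release") as f:
            info = parse_os_release(f.read())
    except FileNotFoundError:
        info = {}  # file is absent on older or non-Linux systems
    print(info.get("ID"), info.get("VERSION_ID"),
          is_supported("examplecluster", info))
```

Keeping the matrix as plain data makes it easy to record what you learn as you read the documentation for each package you are considering.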
Keep in mind that your users typically won't be
logging onto the compute nodes to develop programs, etc., so the
version of Linux used there should be largely irrelevant to the
users. While users will be logging onto the head node, this is not a
general-purpose server. They won't be reading email,
writing memos, or playing games on this system (hopefully).
Consequently, many of the reasons someone might prefer a particular
distribution are irrelevant.
This same pragmatism should extend to
selecting the version as well as the distribution you use. In
practice, this may mean using an older version of Linux. Three issues
are involved in using an older version: compatibility with newer
hardware; bug fixes, patches, and continued support; and compatibility
with clustering software.
If you are using recycled hardware, using an older version
shouldn't be a problem since drivers should be
readily available for your older equipment. If you are using new
equipment, however, you may run into problems with older Linux
releases. The best solution, of course, is to plan ahead when buying
new hardware: put together a single test system and confirm that your
chosen release supports it before buying the bulk of the equipment.
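
On a test system, one rough way to spot hardware that the installed release does not support is to look for PCI devices with no kernel driver bound. The sketch below assumes a kernel that exposes sysfs at /sys/bus/pci/devices, where each device directory contains a "driver" link once a driver is bound. The function name is hypothetical, and some devices legitimately have no driver, so treat the output as a list of things to investigate rather than a list of failures.

```python
import os


def unbound_pci_devices(sysfs_root="/sys/bus/pci/devices"):
    """Return PCI addresses under sysfs_root that have no 'driver' entry."""
    if not os.path.isdir(sysfs_root):
        return []  # no sysfs PCI tree here (for example, not a Linux system)
    missing = []
    for addr in sorted(os.listdir(sysfs_root)):
        # A bound device has a 'driver' symlink inside its directory.
        if not os.path.lexists(os.path.join(sysfs_root, addr, "driver")):
            missing.append(addr)
    return missing


if __name__ == "__main__":
    for addr in unbound_pci_devices():
        print(addr)
```

Each address printed can then be matched against the hardware inventory to decide whether a missing driver matters for the cluster.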
With older versions, many of the problems are known. For bugs, this
is good news since someone else is likely to have already developed a
fix or workaround. With security holes, this is bad news since
exploits are probably well circulated. With an older version,
you'll need to review and install all appropriate
security patches. If you can isolate your cluster from outside
networks, this will be less of an issue.