High Performance Linux Clusters with OSCAR, Rocks, OpenMosix, and MPI

Chapter 4. Linux for Clusters

This chapter reviews some of the issues involved in setting up a Linux system for use in a cluster. While several key services are described in detail, for the most part the focus is more on the issues and rationales than on specifics. Even if you are an old pro at Linux system administration, you may still want to skim this chapter for a quick overview of the issues as they relate to clusters, particularly the section on configuring services. If you are new to Linux system administration, this chapter will probably seem very terse. What’s presented here is the bare minimum a novice system administrator will need to get started. The Appendix A lists additional sources.

This chapter covers material you’ll need when setting up the head node and a typical cluster node. Depending on the approach you take, much of this may be done for you. If you are building your cluster from the ground up, you’ll need to install the head node, configure the individual services on it, and build at least one compute node. Once you have determined how a compute node should be configured, you can turn to Chapter 8 for a discussion of how to duplicate systems in an efficient manner. It is much simpler with kits like OSCAR and Rocks.

With OSCAR, you’ll need to install Linux on the head system, but OSCAR will configure the services for you. It will also build the client, i.e., generate a system image and install it on the compute nodes. OSCAR will configure and install most of the packages you’ll need. The key to using OSCAR is to use a version of Linux that is known to be compatible with OSCAR. OSCAR is described in Chapter 6. With Rocks, described in Chapter 7, everything will be done for you. Red Hat Linux comes as part of the Rocks distribution.

This chapter begins with a discussion of selecting a Linux distribution. A general discussion of installing Linux follows. Next, the configuration of relevant network services is described. Finally, there is a brief discussion of security. If you are adding clustering software to an existing collection of workstations, presumably Linux is already installed on your machines. If this is the case, you can probably skim the first couple of sections. But while you won’t need to install Linux, you will need to ensure that it is configured correctly and all the services you’ll need are available.

Installing Linux

If Linux isn’t built into your cluster software, the first step is to decide what distribution and version of Linux you want.

Selecting a Distribution

This decision will depend on what clustering software you want to use. It doesn’t matter what the “best” distribution of Linux (Red Hat, Debian, SUSE, Mandrake, etc.) or version (7.3, 8.0, 9.0, etc.) is in some philosophical sense if the clustering software you want to use isn’t available for that choice. This book uses the Red Hat distribution because the clustering software being discussed was known to work with that distribution. This is not an endorsement of Red Hat; it was just a pragmatic decision.

Keep in mind that your users typically won’t be logging onto the compute nodes to develop programs, etc., so the version of Linux used there should be largely irrelevant to the users. While users will be logging onto the head node, this is not a general-purpose server. They won’t be reading email, writing memos, or playing games on this system (hopefully). Consequently, many of the reasons someone might prefer a particular distribution are irrelevant.

This same pragmatism should extend to selecting the version as well as the distribution you use. In practice, this may mean using an older version of Linux. There are basically three issues involved in using an older version—compatibility with newer hardware; bug fixes, patches, and continued support; and compatibility with clustering software.

If you are using recycled hardware, using an older version shouldn’t be a problem since drivers should be readily available for your older equipment. If you are using new equipment, however, you may run into problems with older Linux releases. The best solution, of course, is to avoid this problem by planning ahead if you are buying new hardware. This is something you should be able to work around by putting together a single test system before buying the bulk of the equipment.

With older versions, many of the problems are known. For bugs, this is good news since someone else is likely to have already developed a fix or workaround. With security holes, this is bad news since exploits are probably well circulated. With an older version, you’ll need to review and install all appropriate security patches. If you can isolate your cluster, this will be less of an issue.

Unfortunately, at some point you can expect support for older systems to be discontinued. However, a system will not stop working just because it isn’t supported. While not desirable, this is also something you can live with.

The final and key issue is software compatibility. Keep in mind that it takes time to develop software for use with a new release, particularly if you are customizing the kernel. As a result, the clustering software you want to use may not be available for the latest version of your favorite Linux distribution. In general, software distributed as libraries (e.g., MPI) are more forgiving than software requiring kernel patches (e.g., openMosix) or software that builds kernel modules (e.g., PVFS). These latter categories, by their very nature, must be system specific. Remember that using clustering software is the raison d'être for your cluster. If you can’t run it, you are out of business. Unless you are willing to port the software or compromise your standards, you may be forced to use an older version of Linux. While you may want the latest and greatest version of your favorite flavor of Linux, you need to get over it.

If at all feasible, it is best to start your cluster installation with a clean install of Linux. Of course, if you are adding clustering software to existing systems, this may not be feasible, particularly if the machines are not dedicated to the cluster. If that is the case, you’ll need to tread lightly. You’ll almost certainly need to make changes to these systems, changes that may not go as smoothly as you’d like. Begin by backing up and carefully documenting these systems.

Downloading Linux

With most flavors of Linux, there are several ways you can do the installation. Typically you can install from a set of CD-ROMs, from a hard disk partition, or over a network using NFS, FTP, or HTTP. The decision will depend in part on the hardware you have available, but for initial experimentation it is probably easiest to use CD-ROMs. Buying a boxed set can be a real convenience, particularly if it comes with a printed set of manuals. But if you are using an older version of Linux, finding a set of CD-ROMs to buy can be difficult. Fortunately, you should have no trouble finding what you need on the Internet.

Downloading is the cheapest and easiest way to go if you have a fast Internet connection and a CD-ROM burner. Typically, you download ISO images—disk images for CD-ROMs. These are basically single-file archives of everything on a CD-ROM. Since ISO images are frequently over 600 MB each and since you’ll need several of them, downloading can take hours even if you have a fast connection and days if you’re using a slow modem.

If you decide to go this route, follow the installation directions from your download site. These should help clarify exactly what you need and don’t need and explain any other special considerations. For example, for Red Hat Linux the place to start is http://www.redhat.com/apps/download/. This will give you a link to a set of directions with links to download sites. Don’t overlook the mirror sites; your download may go faster with them than with Red Hat’s official download site.

For Red Hat Linux 9.0, there are seven disks. (Earlier versions of Red Hat have fewer disks.) Three of these are the installation disks and are essential. Three disks contain the source files for the packages. It is very unlikely you’ll ever need these. If you do, you can download them later. The last disk is a documentation disk. You’d be foolish to skip this disk. Since the files only fill a small part of a CD, the ISO image is relatively small and the download doesn’t take very long.

It is a good idea to check the MD5SUM for each ISO you download. Run the md5sum program and compare the results to published checksums.

[root@cs sloanjd]# md5sum FC2-i386-rescuecd.iso
22f4bfca5baefe89f0e04166e738639f  FC2-i386-rescuecd.iso

This will ensure both that the disk image hasn’t been tampered with and that your download wasn’t corrupted.

Once you have downloaded the ISO images, you’ll need to burn your CD-ROMs. If you downloaded the ISO images to a Windows computer, you could use something like Roxio Easy Creator.^[1] If you already have a running Linux system, you might use X-CD-Roast.

Once you have the CD-ROMs, you can do an installation by following the appropriate directions for your software and system. Usually, this means booting to the first CD-ROM, which, in turn, runs an installation script. If you can’t boot from the CD-ROM, you’ll need to create a boot floppy using the directions supplied with the software. For Red Hat Linux, see the README file on the first installation disk.

What to Install?

What you install will depend on how you plan to use the machine. Is this a dedicated cluster? If so, users probably won’t log onto individual machines, so you can get by with installing the minimal software required to run applications on each compute node. Is it a cluster of workstations that will be used in other ways? If that is the case, be sure to install X and any other appropriate applications. Will you be writing code? Don’t forget the software development package and editors. Will you be recompiling the kernel? If so, you’ll need the kernel sources.^[2] If you are building kernel modules, you’ll need the kernel header files. (In particular, these are needed if you install PVFS. PVFS is described in Chapter 12.) A custom installation will give you the most control over what is installed, i.e., the greatest opportunity to install software that you don’t need and omit that which you do need.

Keep in mind that you can go back and add software. You aren’t trapped by what you include at this point. At this stage, the important thing is to remember what you actually did. Take careful notes and create a checklist as you proceed. The quickest way to get started is to take a minimalist approach and add anything you need later, but some people find it very annoying to have to go back and add software. If you have the extra disk space (2 GB or so), then you may want to copy all the packages to a directory on your server. Not having to mount disks and search for packages greatly simplifies adding packages as needed. You only need to do this with one system and it really doesn’t take that long. Once you have worked out the details, you can create a Kickstart configuration file to automate all this. Kickstart is described in more detail in Chapter 8.

Configuring Services

Once you have the basic installation completed, you’ll need to configure the system. Many of the tasks are no different for machines in a cluster than for any other system. For other tasks, being part of a cluster impacts what needs to be done. The following subsections describe the issues associated with several services that require special considerations. These subsections briefly recap how to configure and use these services. Remember, most of this will be done for you if you are using a package like OSCAR or Rocks. Still, it helps to understand the issues and some of the basics.

DHCP

Dynamic Host Configuration Protocol (DHCP) is used to supply network configuration parameters, including IP addresses, host names, and other information to clients as they boot. With clusters, the head node is often configured as a DHCP server and the compute nodes as DHCP clients. There are two reasons to do this. First, it simplifies the installation of compute nodes since the information DHCP can supply is often the only thing that is different among the nodes. Since a DHCP server can handle these differences, the node installation can be standardized and automated. A second advantage of DHCP is that it is much easier to change the configuration of the network. You simply change the configuration file on the DHCP server, restart the server, and reboot each of the compute nodes.

The basic installation is rarely a problem. The DHCP system can be installed as a part of the initial Linux installation or after Linux has been installed. The DHCP server configuration file, typically /etc/dhcpd.conf, controls the information distributed to the clients. If you are going to have problems, the configuration file is the most likely source.

The DHCP configuration file may be created or changed automatically when some cluster software is installed. Occasionally, the changes may not be done optimally or even correctly so you should have at least a reading knowledge of DHCP configuration files. Here is a heavily commented sample configuration file that illustrates the basics. (Lines starting with “#” are comments.)

# A sample DHCP configuration file.
   
# The first commands in this file are global, 
# i.e., they apply to all clients.
   
# Only answer requests from known machines,
# i.e., machines whose hardware addresses are given.
deny unknown-clients;
   
# Set the subnet mask, broadcast address, and router address.
option subnet-mask 255.255.255.0;
option broadcast-address 172.16.1.255;
option routers 172.16.1.254;
   
# This section defines individual cluster nodes.
# Each subnet in the network has its own section.
subnet 172.16.1.0 netmask 255.255.255.0 {
        group {
                # The first host, identified by the given MAC address,
                # will be named node1.cluster.int, will be given the
                # IP address 172.16.1.1, and will use the default router
                # 172.16.1.254 (the head node in this case).
                host node1{
                        hardware ethernet 00:08:c7:07:68:48;
                        fixed-address 172.16.1.1;
                        option routers 172.16.1.254;
                        option domain-name "cluster.int";
                }
                host node2{
                        hardware ethernet 00:08:c7:07:c1:73;
                        fixed-address 172.16.1.2;
                        option routers 172.16.1.254;
                        option domain-name "cluster.int";
                }
                # Additional node definitions go here.
        }
}
   
# For servers with multiple interfaces, this entry says to ignore requests
# on specified subnets.
subnet 10.0.32.0 netmask 255.255.248.0 {  not authoritative; }

As shown in this example, you should include a subnet section for each subnet on your network. If the head node has an interface for the cluster and a second interface connected to the Internet or your organization’s network, the configuration file will have a group for each interface or subnet. Since the head node should answer DHCP requests for the cluster but not for the organization, DHCP should be configured so that it will respond only to DHCP requests from the compute nodes.

NFS

A network filesystem is a filesystem that physically resides on one computer (the file server), which in turn shares its files over the network with other computers on the network (the clients). The best-known and most common network filesystem is Network File System (NFS). In setting up a cluster, designate one computer as your NFS server. This is often the head node for the cluster, but there is no reason it has to be. In fact, under some circumstances, you may get slightly better performance if you use different machines for the NFS server and head node. Since the server is where your user files will reside, make sure you have enough storage. This machine is a likely candidate for a second disk drive or raid array and a fast I/O subsystem. You may even what to consider mirroring the filesystem using a small high-availability cluster.

Why use an NFS? It should come as no surprise that for parallel programming you’ll need a copy of the compiled code or executable on each machine on which it will run. You could, of course, copy the executable over to the individual machines, but this quickly becomes tiresome. A shared filesystem solves this problem. Another advantage to an NFS is that all the files you will be working on will be on the same system. This greatly simplifies backups. (You do backups, don’t you?) A shared filesystem also simplifies setting up SSH, as it eliminates the need to distribute keys. (SSH is described later in this chapter.) For this reason, you may want to set up NFS before setting up SSH. NFS can also play an essential role in some installation strategies.

If you have never used NFS before, setting up the client and the server are slightly different, but neither is particularly difficult. Most Linux distributions come with most of the work already done for you.

Running NFS

Begin with the server; you won’t get anywhere with the client if the server isn’t already running. Two things need to be done to get the server running. The file /etc/exports must be edited to specify which machines can mount which directories, and then the server software must be started. Here is a single line from the file /etc/exports on the server amy:

/home    basil(rw) clara(rw) desmond(rw) ernest(rw) george(rw)

This line gives the clients basil, clara, desmond, ernest, and george read/write access to the directory /home on the server. Read access is the default. A number of other options are available and could be included. For example, the no_root_squash option could be added if you want to edit root permission files from the nodes.

Warning

Pay particular attention to the use of spaces in this file.

Had a space been inadvertently included between basil and (rw), read access would have been granted to basil and read/write access would have been granted to all other systems. (Once you have the systems set up, it is a good idea to use the command showmount -a to see who is mounting what.)

Once /etc/exports has been edited, you’ll need to start NFS. For testing, you can use the service command as shown here

[root@fanny init.d]# /sbin/service nfs start
Starting NFS services:                                     [  OK  ]
Starting NFS quotas:                                       [  OK  ]
Starting NFS mountd:                                       [  OK  ]
Starting NFS daemon:                                       [  OK  ]
[root@fanny init.d]# /sbin/service nfs status
rpc.mountd (pid 1652) is running...
nfsd (pid 1666 1665 1664 1663 1662 1661 1660 1657) is running...
rpc.rquotad (pid 1647) is running...

(With some Linux distributions, when restarting NFS, you may find it necessary to explicitly stop and restart both nfslock and portmap as well.) You’ll want to change the system configuration so that this starts automatically when the system is rebooted. For example, with Red Hat, you could use the serviceconf or chkconfig commands.

For the client, the software is probably already running on your system. You just need to tell the client to mount the remote filesystem. You can do this several ways, but in the long run, the easiest approach is to edit the file /etc/fstab, adding an entry for the server. Basically, you’ll add a line to the file that looks something like this:

amy:/home    /home    nfs    rw,soft    0 0

In this example, the local system mounts the /home filesystem located on amy as the /home directory on the local machine. The filesystems may have different names. You can now manually mount the filesystem with the mount command

[root@ida /]# mount /home

When the system reboots, this will be done automatically.

When using NFS, you should keep a couple of things in mind. The mount point, /home, must exist on the client prior to mounting. While the remote directory is mounted, any files that were stored on the local system in the /home directory will be inaccessible. They are still there; you just can’t get to them while the remote directory is mounted. Next, if you are running a firewall, it will probably block NFS traffic. If you are having problems with NFS, this is one of the first things you should check.

File ownership can also create some surprises. User and group IDs should be consistent among systems using NFS, i.e., each user will have identical IDs on all systems. Finally, be aware that root privileges don’t extend across NFS shared systems (if you have configured your systems correctly). So if, as root, you change the directory (cd) to a remotely mounted filesystem, don’t expect to be able to look at every file. (Of course, as root you can always use su to become the owner and do all the snooping you want.) Details for the syntax and options can be found in the nfs(5), exports(5), fstab(5), and mount(8) manpages. Additional references can be found in the Appendix A.

Automount

The preceding discussion of NFS describes editing the /etc/fstab to mount filesystems. There’s another alternative—using an automount program such as autofs or amd. An automount daemon mounts a remote filesystem when an attempt is made to access the filesystem and unmounts the filesystem when it is no longer needed. This is all transparent to the user.

While the most common use of automounting is to automatically mount floppy disks and CD-ROMs on local machines, there are several advantages to automounting across a network in a cluster. You can avoid the problem of maintaining consistent /etc/fstab files on dozens of machines. Automounting can also lessen the impact of a server crash. It is even possible to replicate a filesystem on different servers for redundancy. And since a filesystem is mounted only when needed, automounting can reduce network traffic. We’ll look at a very simple example here. There are at least two different HOWTOs (http://www.tldp.org/) for automounting should you need more information.

Automounting originated at Sun Microsystems, Inc. The Linux automounter autofs, which mimics Sun’s automounter, is readily available on most Linux systems. While other automount programs are available, most notably amd, this discussion will be limited to using autofs.

Support for autofs must be compiled into the kernel before it can be used. With most Linux releases, this has already been done. If in doubt, use the following to see if it is installed:

[root@fanny root]# cat /proc/filesystems
...

Somewhere in the output, you should see the line

nodev   autofs

If you do, you are in business. Otherwise, you’ll need a new kernel.

Next, you need to configure your systems. autofs uses the file /etc/auto.master to determine mount points. Each line in the file specifies a mount point and a map file that defines which filesystems will be mounted to the mount point. For example, in Rocks the auto.master file contains the single line:

/home auto.home --timeout 600

In this example, /home is the mount point, i.e., where the remote filesystem will be mounted. The file auto.home specifies what will be mounted.

In Rocks, the file /etc/auto.home will have multiple entries such as:

sloanjd  frontend.local:/export/home/sloanjd

The first field is the name of the subdirectory that will be created under the original mount point. In this example, the directory sloanjd will be mounted as a subdirectory of /home on the client system. The subdirectories are created dynamically by automount and should not exist on the client. The second field is the hostname (or server) and directory that is exported. (Although not shown in this example, it is possible to specify mount parameters for each directory in /etc/auto.home.) NFS should be running and you may need to update your /etc/exports file.

Once you have the configuration files copied to each system, you need to start autofs on each system. autofs is usually located in /etc/init.d and accepts the commands start, restart, status, and reload. With Red Hat, it is available through the /sbin/service command. After reading the file, autofs starts an automount process with appropriate parameters for each mount point and mounts filesystems as needed. For more information see the autofs(8) and auto.master(5) manpages.

Other Cluster File System

NFS has its limitations. First, there are potential security issues. Since the idea behind NFS is sharing, it should come as no surprise that over the years crackers have found ways to exploit NFS. If you are going to use NFS, it is important that you use a current version, apply any needed patches, and configure it correctly.

Also, NFS does not scale well, although there seems to be some disagreement about its limitations. For clusters, with fewer than 100 nodes, NFS is probably a reasonable choice. For clusters with more than 1,000 nodes, NFS is generally thought to be inadequate. Between 100 and 1,000 nodes, opinions seem to vary. This will depend in part on your hardware. It will also depend on how your applications use NFS. For a bioinformatics clusters, many of the applications will be read intensive. For a graphics processing cluster, rendering applications will be write intensive. You may find that NFS works better with the former than the latter. Other applications will have different characteristics, each stressing the filesystem in a different way. Ultimately, it comes down to what works best for you and your applications, so you’ll probably want to do some experimenting.

Keep in mind that NFS is not meant to be a high-performance, parallel filesystem. Parallel filesystems are designed for a different purpose. There are other filesystems you could consider, each with its own set of characteristics. Some of these are described briefly in Chapter 12. Additionally, there are other storage technologies such as storage area network (SAN) technology. SANs offer greatly improve filesystem failover capabilities and are ideal for use with high-availability clusters. Unfortunately, SANs are both expensive and difficult to set up. iSCSI (SCSI over IP) is an emerging technology to watch.

If you need a high-performance, parallel filesystems, PVFS is a reasonable place to start, as it is readily available for both Rocks and OSCAR. PVFS is discussed in Chapter 12.

SSH

To run software across a cluster, you’ll need some mechanism to start processes on each machine. In practice, a prerequisite is the ability to log onto each machine within the cluster. If you need to enter a password for each machine each time you run a program, you won’t get very much done. What is needed is a mechanism that allows logins without passwords.

This boils down to two choices—you can use remote shell (RSH) or secure shell (SSH). If you are a trusting soul, you may want to use RSH. It is simpler to set up with less overhead. On the other hand, SSH network traffic is encrypted, so it is safe from snooping. Since SSH provides greater security, it is generally the preferred approach.

SSH provides mechanisms to log onto remote machines, run programs on remote machines, and copy files among machines. SSH is a replacement for ftp, telnet, rlogin, rsh, and rcp. A commercial version of SSH is available from SSH Communications Security (http://www.ssh.com), a company founded by Tatu Ylönen, an original developer of SSH. Or you can go with OpenSSH, an open source version from http://www.openssh.org.

OpenSSH is the easiest since it is already included with most Linux distributions. It has other advantages as well. By default, OpenSSH automatically forwards the DISPLAY variable. This greatly simplifies using the X Window System across the cluster. If you are running an SSH connection under X on your local machine and execute an X program on the remote machine, the X window will automatically open on the local machine. This can be disabled on the server side, so if it isn’t working, that is the first place to look.

There are two sets of SSH protocols, SSH-1 and SSH-2. Unfortunately, SSH-1 has a serious security vulnerability. SSH-2 is now the protocol of choice. This discussion will focus on using OpenSSH with SSH-2.

Before setting up SSH, check to see if it is already installed and running on your system. With Red Hat, you can check to see what packages are installed using the package manager.

[root@fanny root]# rpm -q -a | grep ssh
openssh-3.5p1-6
openssh-server-3.5p1-6
openssh-clients-3.5p1-6
openssh-askpass-gnome-3.5p1-6
openssh-askpass-3.5p1-6

This particular system has the SSH core package, both server and client software as well as additional utilities. The SSH daemon is usually started as a service. As you can see, it is already running on this machine.

[root@fanny root]# /sbin/service sshd status
sshd (pid 28190 1658) is running...

Of course, it is possible that it wasn’t started as a service but is still installed and running. You can use ps to double check.

[root@fanny root]# ps -aux | grep ssh
root     29133  0.0  0.2  3520  328 ?        S    Dec09   0:02 /usr/sbin/sshd
...

Again, this shows the server is running.

With some older Red Hat installations, e.g., the 7.3 workstation, only the client software is installed by default. You’ll need to manually install the server software. If using Red Hat 7.3, go to the second install disk and copy over the file RedHat/RPMS/openssh-server-3.1p1-3.i386.rpm. (Better yet, download the latest version of this software.) Install it with the package manager and then start the service.

[root@james root]# rpm -vih openssh-server-3.1p1-3.i386.rpm
Preparing...                ########################################### [100%]
   1:openssh-server         ########################################### [100%]
[root@james root]# /sbin/service sshd start
Generating SSH1 RSA host key:                              [  OK  ]
Generating SSH2 RSA host key:                              [  OK  ]
Generating SSH2 DSA host key:                              [  OK  ]
Starting sshd:                                             [  OK  ]

When SSH is started for the first time, encryption keys for the system are generated. Be sure to set this up so that it is done automatically when the system reboots.

Configuration files for both the server, sshd_config, and client, ssh_config, can be found in /etc/ssh, but the default settings are usually quite reasonable. You shouldn’t need to change these files.

Using SSH

To log onto a remote machine, use the command ssh with the name or IP address of the remote machine as an argument. The first time you connect to a remote machine, you will receive a message with the remote machines’ fingerprint, a string that identifies the machine. You’ll be asked whether to proceed or not. This is normal.

[root@fanny root]# ssh amy
The authenticity of host 'amy (10.0.32.139)' can't be established.
RSA key fingerprint is 98:42:51:3e:90:43:1c:32:e6:c4:cc:8f:4a:ee:cd:86.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'amy,10.0.32.139' (RSA) to the list of known hosts.
root@amy's password: 
Last login: Tue Dec  9 11:24:09 2003
[root@amy root]#

The fingerprint will be recorded in a list of known hosts on the local machine. SSH will compare fingerprints on subsequent logins to ensure that nothing has changed. You won’t see anything else about the fingerprint unless it changes. Then SSH will warn you and query whether you should continue. If the remote system has changed, e.g., if it has been rebuilt or if SSH has been reinstalled, it’s OK to proceed. But if you think the remote system hasn’t changed, you should investigate further before logging in.

Notice in the last example that SSH automatically uses the same identity when logging into a remote machine. If you want to log on as a different user, use the -l option with the appropriate account name.

You can also use SSH to execute commands on remote systems. Here is an example of using date remotely.

[root@fanny root]# ssh -l sloanjd hector date
sloanjd@hector's password: 
Mon Dec 22 09:28:46 EST 2003

Notice that a different account, sloanjd, was used in this example.

To copy files, you use the scp command. For example,

[root@fanny root]# scp /etc/motd george:/root/
root@george's password: 
motd                 100% |*****************************|     0       00:00

Here file /etc/motd was copied from fanny to the /root directory on george.

In the examples thus far, the system has asked for a password each time a command was run. If you want to avoid this, you’ll need to do some extra work. You’ll need to generate a pair of authorization keys that will be used to control access and then store these in the directory ~/.ssh. The ssh-keygen command is used to generate keys.

[sloanjd@fanny sloanjd]$ ssh-keygen -b1024 -trsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/sloanjd/.ssh/id_rsa): 
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /home/sloanjd/.ssh/id_rsa.
Your public key has been saved in /home/sloanjd/.ssh/id_rsa.pub.
The key fingerprint is:
2d:c8:d1:e1:bc:90:b2:f6:6d:2e:a5:7f:db:26:60:3f sloanjd@fanny
[sloanjd@fanny sloanjd]$ cd .ssh
[sloanjd@fanny .ssh]$ ls -a
.  ..  id_rsa  id_rsa.pub  known_hosts

The options in this example are used to specify a 1,024-bit key and the RSA algorithm. (You can use DSA instead of RSA if you prefer.) Notice that SSH will prompt you for a passphrase, basically a multi-word password.

Two keys are generated, a public and a private key. The private key should never be shared and resides only on the client machine. The public key is distributed to remote machines. Copy the public key to each system you’ll want to log onto, renaming it authorized_keys2.

[sloanjd@fanny .ssh]$ cp id_rsa.pub authorized_keys2
[sloanjd@fanny .ssh]$ chmod go-rwx authorized_keys2
[sloanjd@fanny .ssh]$ chmod 755 ~/.ssh

If you are using NFS, as shown here, all you need to do is copy and rename the file in the current directory. Since that directory is mounted on each system in the cluster, it is automatically available.

Tip

If you used the NFS setup described earlier, root’s home directory/root, is not shared. If you want to log in as root without a password, manually copy the public keys to the target machines. You’ll need to decide whether you feel secure setting up the root account like this.

You will use two utilities supplied with SSH to manage the login process. The first is an SSH agent program that caches private keys, ssh-agent. This program stores the keys locally and uses them to respond to authentication queries from SSH clients. The second utility, ssh-add, is used to manage the local key cache. Among other things, it can be used to add, list, or remove keys.

[sloanjd@fanny .ssh]$ ssh-agent $SHELL
[sloanjd@fanny .ssh]$ ssh-add
Enter passphrase for /home/sloanjd/.ssh/id_rsa: 
Identity added: /home/sloanjd/.ssh/id_rsa (/home/sloanjd/.ssh/id_rsa)

(While this example uses the $SHELL variable, you can substitute the actual name of the shell you want to run if you wish.) Once this is done, you can log in to remote machines without a password.

This process can be automated to varying degrees. For example, you can add the call to ssh-agent as the last line of your login script so that it will be run before you make any changes to your shell’s environment. Once you have done this, you’ll need to run ssh-add only when you log in. But you should be aware that Red Hat console logins don’t like this change.

You can find more information by looking at the ssh(1), ssh-agent(1), and ssh-add(1) manpages. If you want more details on how to set up ssh-agent, you might look at SSH, The Secure Shell by Barrett and Silverman, O’Reilly, 2001. You can also find scripts on the Internet that will set up a persistent agent so that you won’t need to rerun ssh-add each time.

Warning

One last word of warning: If you are using ssh-agent, it becomes very important that you log off whenever you leave your machine. Otherwise, you’ll be leaving not just one system wide open, but all of your systems.

Other Services and Configuration Tasks

Thus far, we have taken a minimalist approach. To make like easier, there are several other services that you’ll want to install and configure. There really isn’t anything special that you’ll need to do—just don’t overlook these.

Apache

While an HTTP server may seem unnecessary on a cluster, several cluster management tools such as Clumon and Ganglia use HTTP to display results. If you will monitor your cluster only from the head node, you may be able to get by without installing a server. But if you want to do remote monitoring, you’ll need to install an HTTP server. Since most management packages like these assume Apache will be installed, it is easiest if you just go ahead and set it up when you install your cluster.

Network Time Protocol (NTP)

It is important to have synchronized clocks on your cluster, particularly if you want to do performance monitoring or profiling. Of course, you don’t have to synchronize your system to the rest of the world; you just need to be internally consistent. Typically, you’ll want to set up the head node as an NTP server and the compute nodes as NTP clients. If you can, you should sync the head node to an external timeserver. The easiest way to handle this is to select the appropriate option when you install Linux. Then make sure that the NTP daemon is running:

[root@fanny root]# /sbin/service ntpd status
ntpd (pid 1689) is running...

Start the daemon if necessary.

Virtual Network Computing (VNC)

This is a very nice package that allows remote graphical logins to your system. It is available as a Red Hat package or from http://www.realvnc.com/. VNC can be tunneled using SSH for greater security.

Multicasting

Several clustering utilities use multicasting to distribute data among nodes within a cluster, either for cloning systems or when monitoring systems. In some instances, multicasting can greatly increase performance. If you are using a utility that relies on multicasting, you’ll need to ensure that multicasting is supported. With Linux, multicasting must be enabled when the kernel is built. With most distributions, this is not a problem. Additionally, you will need to ensure that an appropriate multicast entry is included in your route tables. You will also need to ensure that your networking equipment supports multicast. This won’t be a problem with hubs; this may be a problem with switches; and, should your cluster span multiple networks, this will definitely be an issue with routers. Since networking equipment varies significantly from device to device, you need to consult the documentation for your specific hardware. For more general information on multicasting, you should consult the multicasting HOWTOs.

Hosts file and name services

Life will be much simpler in the long run if you provide appropriate name services. NIS is certainly one possibility. At a minimum, don’t forget to edit /etc/hosts for your cluster. At the very least, this will reduce network traffic and speed up some software. And some packages assume it is correctly installed. Here are a few lines from the host file for amy:

127.0.0.1               localhost.localdomain localhost
10.0.32.139             amy.wofford.int         amy
10.0.32.140             basil.wofford.int       basil
...

Notice that amy is not included on the line with localhost. Specifying the host name as an alias for localhost can break some software.

Cluster Security

Security is always a two-edged sword. Adding security always complicates the configuration of your systems and makes using a cluster more difficult. But if you don’t have adequate security, you run the risk of losing sensitive data, losing control of your cluster, having it damaged, or even having to completely rebuild it. Security management is a balancing act, one of trying to figure out just how little security you can get by with.

As previously noted, the usual architecture for a cluster is a set of machines on a dedicated subnet. One machine, the head node, connects this network to the outside world, i.e., the organization’s network and the Internet. The only access to the cluster’s dedicated subnet is through the head node. None of the compute nodes are attached to any other network. With this model, security typically lies with the head node. The subnet is usually a trust-based open network.

There are several reasons for this approach. With most clusters, the communication network is the bottleneck. Adding layers of security to this network will adversely affect performance. By focusing on the head node, security administration is localized and thus simpler. Typically, with most clusters, any sensitive information resides on the head node, so it is the point where the greatest level of protection is needed. If the compute nodes are not isolated, each one will need to be secured from attack.

This approach also simplifies setting up packet filtering, i.e., firewalls. Incorrectly configured, packet filters can create havoc within your cluster. Determining what traffic to allow can be a formidable challenge when using a number of different applications. With the isolated network approach, you can configure the internal interface to allow all traffic and apply the packet filter only to public interface.

This approach doesn’t mean you have a license to be sloppy within the cluster. You should take all reasonable precautions. Remember that you need to protect the cluster not just from external threats but from internal ones as well—whether intentional or otherwise.

Since a thorough discussion of security could easily add a few hundred pages to this book, it is necessary to assume that you know the basics of security. If you are a novice system administrator, this is almost certainly not the case, and you’ll need to become proficient as quickly as possible. To get started, you should:

Be sure to apply all appropriate security patches, at least to the head node, and preferably to all nodes. This is a task you will need to do routinely, not just when you set up the cluster.
Know what is installed on your system. This can be a particular problem with cluster kits. Audit your systems regularly.
Differentiate between what’s available inside the cluster and what is available outside the cluster. For example, don’t run NFS outside the cluster. Block portmapper on the public interface of the head node.
Don’t put too much faith in firewalls, but use one, at least on the head node’s public interface, and ensure that it is configured correctly.
Don’t run services that you don’t need. Routinely check which services are running, both with netstat and with a port scanner like nmap .
Your head node should be dedicated to the cluster, if at all possible. Don’t set it up as a general server.
Use the root account only when necessary. Don’t run programs as root unless it is absolutely necessary.

There is no easy solution to the security dilemma. While you may be able to learn enough, you’ll never be able to learn it all.

^[1]There is an appealing irony to using Windows to download Linux.

^[2]In general, you should avoid recompiling the kernel unless it is absolutely necessary. While you may be able to eke out some modest performance gains, they are rarely worth the effort.

Get High Performance Linux Clusters with OSCAR, Rocks, OpenMosix, and MPI now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.

Start your free trial

High Performance Linux Clusters with OSCAR, Rocks, OpenMosix, and MPI by Joseph D Sloan

Chapter 4. Linux for Clusters

Installing Linux

Selecting a Distribution

Downloading Linux

What to Install?

Configuring Services

DHCP

NFS

Running NFS

Warning

Automount

Other Cluster File System

SSH

Using SSH

Tip

Warning

Other Services and Configuration Tasks

Apache

Network Time Protocol (NTP)

Virtual Network Computing (VNC)

Multicasting

Hosts file and name services

Cluster Security

Don’t leave empty-handed

It’s yours, free.

Check it out now on O’Reilly