136 IBM eServer Cluster 1600 Managed by PSSP 3.5: What's New
3. After the migration, we re-established the switch communication and ran
verification tests. We did not have any problems with these steps.
4. We installed the new versions of VSD and GPFS and ran tests on the file
systems. For VSD, we installed the bos.clvm.enh fileset as a prerequisite.
6.3.2 Migrating PSSP 3.1.1 and AIX 4.3.3 to PSSP 3.5 and AIX 5.1
For this scenario, we chose a system with four 112 MHz SMP high nodes
connected by an SP Switch. AIX 4.3.3 with Maintenance Level 10 and PSSP
3.1.1 are installed on the CWS and on the nodes. We have the first node
configured as an NFS server with a file system that is mounted on the rest of the
nodes.
There is no direct migration path available from the software level we have on this
cluster. Therefore, we decided to move to PSSP 3.4 first on both the CWS and
the nodes, and then to AIX 5L Version 5.1 ML3 and PSSP 3.5. For the first part,
we followed the PSSP for AIX: Installation and Migration Guide, GA22-7347, for
PSSP 3.4, and then for PSSP 3.5.
We completed the following steps:
1. Migrate the CWS to PSSP 3.4:
Before migration, check for non-ASCII data in the SDR by running
SDRScan. Verify system configuration and connectivity.
After migration, we found that spmon and spsvrmgr did not return
information about the nodes. The following examples contain the related
error messages we found. General messages from SPdaemon.log are
shown in Example 6-4 on page 137.
Attention: PSSP 3.1.1 is not supported by IBM. Direct migration from PSSP
3.1.1 to PSSP 3.5 is not supported. We developed this section to show that it
is possible to use the latest level of PSSP software with older SP nodes as
well.
Important: To avoid the following problem in "Step 22: Run SDR and
system monitor verification test" in Chapter 4 of the PSSP for AIX:
Installation and Migration Guide, GA22-7347, reset the Hardware Monitor
daemon by running the hmreinit command. This is correctly documented
in step 20 of the PSSP for AIX: Installation and Migration Guide,
GA22-7347.
Chapter 6. Coexistence, migration, and integration 137
Example 6-4 Error message in /var/adm/SPlogs/SPdaemon.log
Oct 4 13:41:10 sp3cws hardmon[32052]:
LPP=PSSP,Fn=hm_tty.c,SID=1.21.4.15,L#=264, hardmon: 0026-850 Data length
mismatch in packet from tty /dev/tty0 (Frame 1): calculated = 15, received =
14.
Oct 4 13:42:09 sp3cws sphwlog[32106]:
LPP=PSSP,Fn=splogd.c,SID=1.16.7.3,L#=1537, 0026-107 Failure; Frame 1:0;
frPowerModCbad; Power module - DC power loss.
The SDR configuration messages are shown in Example 6-5.
Example 6-5 SDR_init log file: /var/adm/SPlogs/sdr/SDR_config.log
SDR_init: SDR_init was invoked at Fri Oct 4 15:36:04 EDT 2002 with flag values
of debug=0, log=1 and verbose=0.
SDR_init: 0016-082 An error has been encountered while internally executing the
command "/usr/lpp/ssp/bin/hmmon -Q -v type -r 1,5,9,13 2> /dev/null". The
return code from the command was 1. SDR_init is continuing.
SDR_init: 0016-082 An error has been encountered while internally executing the
command "/usr/lpp/ssp/bin/hmmon -Q -v type -r 1,5,9,13 2> /dev/null". The
return code from the command was 1. SDR_init is continuing.
SDR_init: 0016-705 Problem while attempting to read Hardmon data.
SDR_init: 0016-733 SDR_init completed unsuccessfully with a return code value
of 2.
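When reviewing a log like the one in Example 6-5, it can help to pull out which internally executed commands failed and with what return code. The following is a minimal sketch, assuming the 0016-082 message layout is exactly as it appears in Example 6-5. The function name failed_commands and the SAMPLE text are illustrative only and are not part of PSSP.

```python
import re

# Pattern inferred from the 0016-082 messages in Example 6-5; other
# SDR_init messages may be laid out differently (assumption).
FAILED_CMD = re.compile(
    r'0016-082 .*?command "(?P<cmd>[^"]+)"\. '
    r'The return code from the command was (?P<rc>\d+)'
)


def failed_commands(log_text):
    """Return (command, return_code) for every 0016-082 message found."""
    # Collapse wrapped lines so a message split across lines still matches.
    flat = " ".join(log_text.split())
    return [(m.group("cmd"), int(m.group("rc"))) for m in FAILED_CMD.finditer(flat)]


# Sample text copied from Example 6-5 (hypothetical variable name).
SAMPLE = """\
SDR_init: 0016-082 An error has been encountered while internally executing the
command "/usr/lpp/ssp/bin/hmmon -Q -v type -r 1,5,9,13 2> /dev/null". The
return code from the command was 1. SDR_init is continuing.
SDR_init: 0016-705 Problem while attempting to read Hardmon data.
"""

if __name__ == "__main__":
    for cmd, rc in failed_commands(SAMPLE):
        print(f"rc={rc}: {cmd}")
```

In the log shown here, every failing command is an hmmon query, which points at the Hardware Monitor rather than at the SDR itself.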
Hardmon-related messages are shown in Example 6-6.
Example 6-6 Hardmon daemon log file /var/adm/SPlogs/spmon/hmlogfile.277
hardmon: 0026-801I Hardware Monitor Daemon started at Fri Oct 4 13:20:31 2002
hardmon: 0026-802I Server port number is 8435, poll rate is 5.000000 seconds
hardmon: 0026-805I 1 frames have been configured.
hardmon: 0026-803I Entered main processing loop 0000001c
hardmon: 0026-808I Received command to quit from SIGTERM at sp3cws/0.
hardmon: 0026-801I Hardware Monitor Daemon started at Fri Oct 4 13:27:39 2002
hardmon: 0026-802I Server port number is 8435, poll rate is 5.000000 seconds
hardmon: 0026-805I 1 frames have been configured.
hardmon: 0026-803I Entered main processing loop 00000038
hardmon: 0026-808I Received command to quit from SIGTERM at sp3cws/0.
hardmon: 0026-850 Data length mismatch in packet from tty /dev/tty0 (Frame 1):
calculated = 15, received = 14.
After restarting the hardmon daemon with the
/usr/lpp/ssp/install/bin/hmreinit command on the CWS, everything
worked fine.
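The 0026-850 message appears in both SPdaemon.log and the hardmon log, so it is a convenient marker for this problem. The following is a minimal sketch of a scan for that message, assuming the layout seen in Examples 6-4 and 6-6. The function name length_mismatches is hypothetical, and the script only reports the message; it does not run hmreinit or any other PSSP command.

```python
import re

# Layout inferred from the 0026-850 lines in Examples 6-4 and 6-6 (assumption).
MISMATCH = re.compile(
    r"0026-850 Data length mismatch in packet from tty (?P<tty>\S+) "
    r"\(Frame (?P<frame>\d+)\): calculated = (?P<calc>\d+), received = (?P<recv>\d+)"
)


def length_mismatches(log_text):
    """Return one dict per 0026-850 message: tty, frame, calculated, received."""
    flat = " ".join(log_text.split())  # undo line wrapping inside a message
    return [
        {
            "tty": m.group("tty"),
            "frame": int(m.group("frame")),
            "calculated": int(m.group("calc")),
            "received": int(m.group("recv")),
        }
        for m in MISMATCH.finditer(flat)
    ]


# Sample text copied from Example 6-6 (hypothetical variable name).
HMLOG_SAMPLE = """\
hardmon: 0026-808I Received command to quit from SIGTERM at sp3cws/0.
hardmon: 0026-850 Data length mismatch in packet from tty /dev/tty0 (Frame 1):
calculated = 15, received = 14.
"""

if __name__ == "__main__":
    for hit in length_mismatches(HMLOG_SAMPLE):
        print(hit)
```

A non-empty result after a migration would be a reason to check whether the hmreinit step from the migration guide was carried out.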
