Netboot Over HFI Fails; Other HFI Issues - IBM Power Systems 775 Manual

For AIX and Linux HPC Solution

4.3.4 netboot over HFI fails

If the SMS ping is successful but rnetboot fails, check the following conditions or try the
following fixes:
Did bootp start on the service node?
Refresh inetd on the service node.
Ensure that the service node and xcatmaster of the node are set correctly.
Check mkdsklsnode for errors.
Confirm that the hfi_net nim object on the service node is set to 'hfi'.
Check that the HFI device drivers are the same as on the compute image.
Verify the Media Access Control (MAC) address.
Check lsnim, /etc/bootptab, and ls -l /tftpboot on the service node.
Is /install set to mount automatically, and is it mounted on the service node?
Is /install on a file system other than / on the EMS and service node?
Are any file systems full on the EMS or service node?
Do rpower and lshwconn return the expected values?
Are the directories in /etc/exports mountable?
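The file system checks in the list above can be partly scripted. The following Python sketch is illustrative only and is not part of the xCAT or AIX tooling: the function names and the 95 percent threshold are assumptions, and it presumes that a Python 3 interpreter is available on the EMS or service node. It reports whether /install is a mount point, whether it shares a file system with /, and whether /, /install, or /tftpboot is close to full. It does not verify that /install is configured to mount automatically.

```python
import os
import shutil


def same_file_system(path_a, path_b):
    """Return True when both paths are on the same file system (same device ID)."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


def percent_used(path):
    """Return the percentage of the file system that holds path that is in use."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:  # some pseudo file systems report a size of zero
        return 0.0
    return 100.0 * usage.used / usage.total


def check_install(install_dir="/install", threshold=95.0):
    """Return a list of findings; an empty list means no problem was detected.

    threshold is a hypothetical cutoff (percent used) chosen for illustration.
    """
    findings = []
    if not os.path.isdir(install_dir):
        findings.append(f"{install_dir} does not exist")
        return findings
    if not os.path.ismount(install_dir):
        findings.append(f"{install_dir} is not a mount point")
    if same_file_system(install_dir, "/"):
        findings.append(f"{install_dir} is on the same file system as /")
    # Check the directories that netboot depends on for available space.
    for path in ("/", install_dir, "/tftpboot"):
        if os.path.isdir(path):
            used = percent_used(path)
            if used >= threshold:
                findings.append(f"{path} is {used:.0f}% full")
    return findings


if __name__ == "__main__":
    for line in check_install() or ["no problems found"]:
        print(line)
```

Run the script on both the EMS and the service node. An empty result rules out only these checks; the remaining items in the list still must be checked by hand.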

4.3.5 Other HFI issues

In rare instances, HFI problems cause symptoms such as link connections that are lost and
recovered intermittently, or unexplained node or cluster performance issues.
When this type of situation is encountered, open a PMR (problem management record) and
gather the following data for review by the appropriate IBM teams:
From the EMS: Run /usr/bin/cnm.snap and provide the snap file of type snap.tar.gz
created in /var/opt/isnm/cnm/log/.
If applicable, from the affected node, run /usr/hfi/bin/hfi.snap and provide the snap file
of type hfi.snap.tar created in /var/opt/isnm/cnm/log/.
Depending on the nature of the problem, more data might be required, but this data must be
available when IBM support is contacted.
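After the snap commands run, the newest archive must be located before it is attached to the PMR. The following Python sketch shows one way to do that. It is an illustration under assumptions, not an IBM-provided tool: the helper name newest_snap is hypothetical, and the patterns assume that the archive names end in snap.tar.gz and hfi.snap.tar as described above.

```python
import glob
import os


def newest_snap(log_dir, pattern):
    """Return the most recently modified file in log_dir that matches pattern.

    Returns None when no regular file matches or the directory does not exist.
    """
    matches = glob.glob(os.path.join(log_dir, pattern))
    files = [m for m in matches if os.path.isfile(m)]
    return max(files, key=os.path.getmtime) if files else None


if __name__ == "__main__":
    log_dir = "/var/opt/isnm/cnm/log/"  # directory named in the text above
    # Assumed name patterns, derived from the file types named in the text.
    for pattern in ("*snap.tar.gz", "*hfi.snap.tar"):
        print(pattern, "->", newest_snap(log_dir, pattern))
```

A result of None for a pattern indicates that no matching archive exists in the directory, in which case the corresponding snap command might not have completed.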
Chapter 4. Troubleshooting problems