IBM RS/6000 SP Problem Determination Manual page 135

This soft copy for use by IBM employees only.
The format of this file is considerably different between the SP Switch
and the High Performance Switch. This in part reflects the difference
between the two switches in the way they detect and handle switch
faults.
dtbx.trace
This provides traces from switch diagnostics.
dtbx_failed.trace
This is created if any of the switch diagnostics fail. It is basically the
same as the dtbx.trace file, with the addition of error messages. If the
diagnostics run clean, this file will not be created.
daemon.stderr
General error messages. For example, if some nodes could not get
initialized when the Estart was issued, you may see the following:
Processing Estart. The following node(s) could not be initialized:
sp21n03
In this case the node in question (sp21n03) was not initialized because its
Worm daemon was not running.
daemon.stdout
This log is a detailed account of the switch initialization process. Most
switch problems do not require analysis. However, in some
circumstances it may prove useful to diagnose the problem.
In addition to these log files, there is always a possibility that errors will be
recorded in the error report (errpt). It is important to note the time of the failure
in order to correlate messages in the error report to the error.
There are also many normal, informational switch messages that you will
see in the error report. For instance, when you issue an Estart, you will
see a switch fault in the error report. Expect one of each of these messages
for each Estart command:
ERROR_ID TIMESTAMP  T CL Res Name ERROR_Description
34FFBE83 0502140496 T H  Worm     HPS Fault - detected by switch chip
C3189234 0502135796 T H  Worm     HPS Fault - not isolated
For many problems, looking in the log files does not provide sufficient
information to solve the problem.
In the directory /usr/lpp/ssp/css, there are a few helpful tools, such as:
css_dump
This will format trace entries relating to the cssdd. To run the command,
issue:
css_dump > /tmp/css_dump.out &
You will find the most recent entries at the bottom of the output file.
This information is helpful in conditions where the css driver code hangs
for unknown reasons. This command should be run on the primary node and
on any of the failing nodes.
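When correlating error report entries with the time of a failure, it may help to decode the TIMESTAMP column, which errpt prints in MMDDhhmmYY form. The following is a minimal sketch only; the function name decode_errpt_ts is a hypothetical helper introduced here for illustration and is not part of AIX or PSSP.

```shell
#!/bin/sh
# decode_errpt_ts: hypothetical helper (not an AIX or PSSP command).
# Splits an errpt summary timestamp (MMDDhhmmYY) into a readable
# "MM/DD/YY hh:mm" string so it can be compared with the failure time.
decode_errpt_ts() {
    ts=$1
    # Reject anything that is not exactly ten digits.
    case $ts in
        [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]) ;;
        *) echo "expected 10 digits (MMDDhhmmYY): $ts" >&2; return 1 ;;
    esac
    mm=$(printf '%s' "$ts" | cut -c1-2)    # month
    dd=$(printf '%s' "$ts" | cut -c3-4)    # day
    hh=$(printf '%s' "$ts" | cut -c5-6)    # hour
    mi=$(printf '%s' "$ts" | cut -c7-8)    # minute
    yy=$(printf '%s' "$ts" | cut -c9-10)   # two-digit year
    printf '%s/%s/%s %s:%s\n' "$mm" "$dd" "$yy" "$hh" "$mi"
}

# The two timestamps from the error report entries shown in this section:
decode_errpt_ts 0502140496   # prints 05/02/96 14:04
decode_errpt_ts 0502135796   # prints 05/02/96 13:57
```

Decoded this way, the two sample entries fall seven minutes apart on the same day, which is the kind of comparison needed when matching entries to a noted failure time.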
Chapter 4. The Switch
