Collecting Diagnostic Data - IBM Power AC922 8335-GTW Handbook

Problem analysis, system parts, and locations

Table 10. Determining a verification action for GPUs, PCIe adapters, and devices
Adapter types: devices that are not controlled by a RAID adapter, GPU, and network adapter. For each adapter type, complete the steps that are listed under its verification action.

Collecting diagnostic data

Learn how to collect diagnostic data to send to IBM service and support.
About this task
To collect diagnostic data, complete the following steps:
Procedure
1. Is the operating system available?
   Yes: Continue with step 2.
   No: Continue with step 3.
2. To collect diagnostic data from the operating system, complete the following steps:
Verification action

For devices that are not controlled by a RAID adapter, complete the following steps:
a. Install the smartmontools utility. If you have the Red Hat Enterprise Linux operating system, type yum install smartmontools at the command prompt of the operating system and press Enter. If you have the Ubuntu Linux operating system, type apt-get install smartmontools at the command prompt of the operating system and press Enter.
b. At the command prompt of the operating system, type smartctl --all /dev/sdx, where x is the letter that is associated with the drive.
c. Verify that the SMART health assessment passed.
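
The SMART health check can also be scripted. The following is a minimal sketch, not an IBM-provided tool. The helper name smart_health_passed is hypothetical, and the two result strings that it searches for are an assumption about what smartmontools typically prints (the first for ATA and NVMe drives, the second for SCSI and SAS drives).

```shell
# Hypothetical helper: reads `smartctl --all` output on standard input and
# succeeds only if the overall health line reports a passing result.
# Assumption: the health line uses one of the two phrasings matched below.
smart_health_passed() {
    grep -Eq 'SMART overall-health self-assessment test result: PASSED|SMART Health Status: OK'
}
```

For example, smartctl --all /dev/sda | smart_health_passed && echo "SMART health assessment passed", where /dev/sda stands in for the drive that you are checking. If neither phrasing is found, read the smartctl output directly rather than relying on the helper.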

For a GPU, complete the following steps:
a. Type nvidia-smi -L at the command prompt of the operating system and press Enter. Verify that the GPU is listed.
b. Type nvidia-smi -q at the command prompt of the operating system and press Enter. Verify that no errors are listed.
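
The listing check from nvidia-smi -L can be sketched as a count of GPU lines compared against the number of GPUs that you expect in the system. This is only an illustration. The function names are hypothetical, and the pattern assumes that each GPU is reported on a line that begins with "GPU", an index, and a colon. Reviewing the nvidia-smi -q output for errors is left as a manual step.

```shell
# Hypothetical helper: reads `nvidia-smi -L` output on standard input and
# prints the number of GPU lines found (prints 0 when there are none).
count_listed_gpus() {
    grep -c '^GPU [0-9][0-9]*:' || true
}

# Hypothetical helper: succeeds when the number of listed GPUs equals the
# expected count that is passed as the first argument.
gpus_all_listed() {
    gpu_expected="$1"
    gpu_found="$(count_listed_gpus)"
    [ "$gpu_found" -eq "$gpu_expected" ]
}
```

For example, nvidia-smi -L | gpus_all_listed 4 succeeds only when exactly four GPUs are listed. The value 4 is a placeholder for the number of GPUs that are installed in your system.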

For a network adapter, complete the following steps:
a. At the command prompt of the operating system, type ethtool ethx, where x is the number of the physical port that you are testing. Verify that the connection speed that is indicated in the output is correct.
b. Perform a ping test to verify the network connectivity.
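
The speed check can be sketched by extracting the value that ethtool reports after "Speed:" and comparing it with the speed that you expect. The helper name link_speed is hypothetical, and the example interface name, expected speed, and address in the usage note are placeholders. Depending on the Linux distribution, the interface might not be named ethx.

```shell
# Hypothetical helper: reads `ethtool <interface>` output on standard input
# and prints the text that follows "Speed:" on the speed line.
link_speed() {
    sed -n 's/^[[:space:]]*Speed:[[:space:]]*//p'
}
```

For example, [ "$(ethtool eth0 | link_speed)" = "10000Mb/s" ] succeeds when port eth0 reports a 10 Gb/s link. For the ping test, a command such as ping -c 3 192.0.2.10 sends three echo requests to the placeholder address and returns a nonzero exit status if no replies are received.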