Link Troubleshooting; Fault Isolation; Identifying The Origin Of Failure - IBM System Storage SAN384B-2 Installation, Service And User Manual

Link troubleshooting


Fault isolation

IBM SAN b-type directors and switches use the latest high-bandwidth Fibre
Channel technology and auto-negotiate to 16 Gbps, 8 Gbps, 4 Gbps, or 2 Gbps
based on the link data rate capability of the attached transceiver and the speed
supported by the switches and directors. Negotiation to 1 Gbps is not supported
unless 4 Gbps FC transceivers are used. Because 8 and 16 Gbps channels are more
sensitive to the condition of the existing multimode and single-mode cable plant,
it is very important to minimize connector reflections and maintain an acceptable
link loss budget.
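As a rough illustration of the negotiation described above, the following Python sketch picks the highest data rate that every element of a link supports. It is a simplified model, not the actual Fibre Channel negotiation procedure: the function name `negotiate_speed` and the capability sets in the example are hypothetical and are not taken from this guide.

```python
def negotiate_speed(*capabilities):
    """Return the highest speed (in Gbps) common to all capability sets.

    Each argument is an iterable of speeds supported by one element of
    the link (port, transceiver, attached device). Returns None when
    the elements share no common speed.
    """
    sets = [set(c) for c in capabilities]
    if not sets:
        return None
    common = set.intersection(*sets)
    return max(common) if common else None


# Hypothetical capability sets, chosen only for illustration.
switch_port = {2, 4, 8, 16}
transceiver = {4, 8, 16}
attached_device = {2, 4, 8}

print(negotiate_speed(switch_port, transceiver, attached_device))  # 8
```

In this model the link settles at the rate of its slowest element, which is why the transceiver and the attached device both need to be checked when a link comes up at a lower speed than expected.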
This section provides link troubleshooting advice on fault isolation and provides
guidance in the following areas:
• Dust and dirt contamination
• Link loss
• Attenuation on LWL connections
Since a job loss issue can be caused by a variety of problems, it is important to
employ a systematic fault isolation process to remedy the issue. Note that job
losses do not necessarily result from link errors. They may also be due to:
• Configuration issues
• Network overload
• Failures on a storage device, switch, or server
Assume for these procedures that the observed errors originate from link errors
and are not the result of configuration issues, network overload or network
equipment failures.
Whenever CRC errors are discovered on a particular link, it is easy to jump to the
conclusion that the link is causing the network issue. This might not be the case.
Since CRC errors are just symptoms of a link issue, we need to trace the
propagated error to where it originated.
Figure 52 shows a simplified network involving a server, a switch, and a storage
device. In this example, assume that the server experienced an error at port 1. This
observable error can potentially originate from links 1, 2, 3 or 4 and/or SFP 1, 2, 3
or 4.
[Figure: a server, a switch, and a storage device connected by links 1 through 4; each link joins a Tx port to an Rx port.]
Figure 52. Identifying the origin of failure
To determine the original failing link, the observable CRC error needs to be
traced back to the first occurrence of the CRC error. In this example, tracing
back shows that the CRC errors observed on link 4 were propagated from link 3,
and that those errors in turn originated on link 2.
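The trace-back described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the links are listed in the order the data travels toward the port where the error was observed, and a CRC error counter has already been collected for each link. The function name `find_origin`, the path order, and the counter values are hypothetical and are not taken from this guide.

```python
def find_origin(path, crc_errors):
    """Return the most upstream link in an unbroken run of CRC errors.

    path:       link names ordered from the data source to the port
                where the error was observed (assumed order).
    crc_errors: mapping of link name to CRC error count.

    Walks upstream from the observed link and stops at the first link
    that shows no CRC errors. Returns None when the observed link
    itself is clean.
    """
    origin = None
    for link in reversed(path):
        if crc_errors.get(link, 0) > 0:
            origin = link  # errors seen here, keep walking upstream
        else:
            break  # first clean link: errors began just downstream
    return origin


# Hypothetical counters matching the example in the text.
path = ["Link 1", "Link 2", "Link 3", "Link 4"]
counts = {"Link 1": 0, "Link 2": 12, "Link 3": 12, "Link 4": 12}

print(find_origin(path, counts))  # Link 2
```

The link returned is only the first place where the error appears. The link itself, or the SFPs at either end of it, may be the actual cause.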
