When To Run The Recover System Procedure - IBM Storwize V7000 Maintenance Manual

When to run the recover system procedure

Attempt the recover system procedure only after a complete and thorough
investigation of the cause of the system failure. First attempt to resolve the
underlying issues by using other service procedures.
Attention: If you experience failures at any time while you are running the
recover system procedure, call the IBM Support Center. Do not attempt to do
further recovery actions because these actions might prevent IBM Support from
restoring the system to an operational status.
Certain conditions must be met before you run the recovery procedure. Use the
following items to help you determine when to run the recovery procedure:
Note: It is important that you know the number of control enclosures in the
system, and when the instructions indicate that every node is checked, you must
check the status of both nodes in every control enclosure. For some system
problems or Fibre Channel network problems, you must run the service assistant
directly on the node to get its status.
v Check to see if any node in the system has a node status of active. This status
means that the system is still available. In this case, recovery is not necessary.
v Do not recover the system if the management IP address is available from
another node. Ensure that all service procedures have been run.
v Check the node status of every node canister that is part of this system. Resolve
all hardware errors except node error 578 or node error 550.
– All nodes must be reporting either a node error 578 or a node error 550.
These error codes indicate that the system has lost its configuration data. If
any nodes report anything other than these error codes, do not perform a
recovery. You can encounter situations where non-configuration nodes report
other node errors, such as a 550 node error. The 550 error can also indicate
that a node is not able to join a system.
– If any nodes show a node error 550, record the error data that is associated
with the 550 error from the service assistant.
- In addition to the node error 550, the report can show data that is
separated by spaces in one of the following forms:
v Node identifiers in the format: <enclosure_serial>-<canister slot ID> (7
characters, hyphen, 1 number), for example, 01234A6-2
v Quorum drive identifiers in the format: <enclosure_serial>:<drive slot
ID>[<drive 11S serial number>] (7 characters, colon, 1 or 2 numbers,
open square bracket, 22 characters, close square bracket), for example,
01234A9:21[11S1234567890123456789]
v Quorum MDisk identifier in the format: WWPN/LUN (16 hexadecimal
digits followed by a forward slash and a decimal number), for example,
1234567890123456/12
- If the error data contains a node identifier, ensure that the node that is
referred to by the ID is showing node error 578. If the node is showing a
node error 550, ensure that the two nodes can communicate with each
other. Verify the SAN connectivity, and if the 550 error is still present,
restart one of the two nodes by clicking Restart Node from the service
assistant.
- If the error data contains a quorum drive identifier, locate the enclosure
with the reported serial number. Verify that the enclosure is powered on
and that the drive in the reported slot is powered on and functioning. If the
node canister that is reporting the fault is in the I/O group of the listed
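
The three error data forms described above can be told apart mechanically. The following is a minimal sketch only, not an IBM tool: the function names and the returned labels are hypothetical, and it assumes that the "characters" in the serial numbers are alphanumeric, which the text does not state explicitly.

```python
import re

# Patterns mirror the three forms listed for node error 550 data.
# Assumption: serial-number "characters" are letters or digits.
NODE_ID = re.compile(r"[0-9A-Za-z]{7}-\d")                          # 01234A6-2
QUORUM_DRIVE = re.compile(r"[0-9A-Za-z]{7}:\d{1,2}\[[0-9A-Za-z]{22}\]")
QUORUM_MDISK = re.compile(r"[0-9A-Fa-f]{16}/\d+")                   # WWPN/LUN


def classify_token(token):
    """Return a hypothetical label for one space-separated item."""
    if NODE_ID.fullmatch(token):
        return "node"
    if QUORUM_DRIVE.fullmatch(token):
        return "quorum_drive"
    if QUORUM_MDISK.fullmatch(token):
        return "quorum_mdisk"
    return "unknown"


def classify_error_data(error_data):
    """Split the recorded error data on spaces and label each item."""
    return [(token, classify_token(token)) for token in error_data.split()]
```

For example, passing the recorded string "01234A6-2 1234567890123456/12" to classify_error_data labels the first item as a node identifier and the second as a quorum MDisk identifier. Anything labeled "unknown" is best read manually from the service assistant.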
