When To Run The Recover System Procedure; Fix Hardware Errors - IBM Storwize V7000 Unified Problem Determination Manual

When to run the recover system procedure

Attempt the recover system procedure only after a complete and thorough
investigation of the cause of the system failure. First attempt to resolve those
issues by using other service procedures.
Attention: If you experience failures at any time while running the recover
system procedure, call the IBM Support Center. Do not attempt to do further
recovery actions, because these actions might prevent support from restoring the
system to an operational status.
Certain conditions must be met before you run the recovery procedure. Use the
following items to help you determine when to run the recovery procedure:
1. Check that no node in the system is active and that the management IP is not
   accessible. If any node has active status, it is not necessary to recover the
   system.
2. Resolve all hardware errors in nodes so that only node errors 578 or 550 are
   present. If this is not the case, go to "Fix hardware errors."
3. Ensure all backend storage that is administered by the system is present before
   you run the recover system procedure.
4. If any nodes have been replaced, ensure that the WWNN of the replacement
   node matches that of the replaced node, and that no prior system data remains
   on this node.

Fix hardware errors

Before running a system recovery procedure, it is important to identify and fix the
root cause of the hardware issues.
Identifying and fixing the root cause can help recover a system, if these are the
faults that are causing the system to fail. The following are common issues which
can be easily resolved:
• The node has been powered off or the power cords were unplugged.
• Check the node status of every node canister that is part of this system. Resolve
  all hardware errors except node error 578 or node error 550.
  – All nodes must be reporting either a node error 578 or a node error 550.
    These error codes indicate that the system has lost its configuration data. If
    any nodes report anything other than these error codes, do not perform a
    recovery. You can encounter situations where non-configuration nodes report
    other node errors, such as a 550 node error. The 550 error can also indicate
    that a node is not able to join a system.
  – If any nodes show a node error 550, record the error data that is associated
    with the 550 error from the service assistant.
    - In addition to the node error 550, the report can show data that is
      separated by spaces in one of the following forms:
      • Node identifiers in the format: <enclosure_serial>-<canister slot ID>
        (7 characters, hyphen, 1 number), for example, 01234A6-2
      • Quorum drive identifiers in the format: <enclosure_serial>:<drive slot
        ID>[<drive 11S serial number>] (7 characters, colon, 1 or 2 numbers,
        open square bracket, 22 characters, close square bracket), for example,
        01234A9:21[11S1234567890123456789]
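
When recording the 550 error data, it can help to split the space-separated string into its parts. The following Python sketch is illustrative only and is not part of any IBM tool. The function name parse_550_data and both patterns are assumptions derived solely from the format descriptions above, and the assumption that serial characters are alphanumeric comes from the two examples.

```python
import re

# Assumed patterns, built only from the format descriptions above:
# node ID = 7 characters, hyphen, 1 digit
NODE_ID = re.compile(r"([0-9A-Za-z]{7})-([0-9])")
# quorum drive ID = 7 characters, colon, 1 or 2 digits, [22 characters]
QUORUM_ID = re.compile(r"([0-9A-Za-z]{7}):([0-9]{1,2})\[([0-9A-Za-z]{22})\]")


def parse_550_data(text):
    """Sort space-separated 550 error data into node IDs, quorum drive IDs,
    and tokens that match neither form (hypothetical helper)."""
    nodes, drives, unknown = [], [], []
    for token in text.split():
        node = NODE_ID.fullmatch(token)
        if node:
            nodes.append({
                "enclosure_serial": node.group(1),
                "canister_slot": int(node.group(2)),
            })
            continue
        drive = QUORUM_ID.fullmatch(token)
        if drive:
            drives.append({
                "enclosure_serial": drive.group(1),
                "drive_slot": int(drive.group(2)),
                "drive_serial": drive.group(3),
            })
            continue
        # Keep anything unrecognized so that no recorded data is lost.
        unknown.append(token)
    return nodes, drives, unknown


if __name__ == "__main__":
    sample = "01234A6-2 01234A9:21[11S1234567890123456789]"
    print(parse_550_data(sample))
```

Tokens that match neither form are returned separately rather than discarded, so the complete error data can still be passed to support.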
