IBM Storwize V7000 Maintenance Manual

IBM Storwize V7000
Version 6.3.0
Troubleshooting, Recovery, and
Maintenance Guide
GC27-2291-02

  Summary of Contents for IBM Storwize V7000

  • Page 1 IBM Storwize V7000 Version 6.3.0 Troubleshooting, Recovery, and Maintenance Guide GC27-2291-02...
  • Page 2 Before using this information and the product it supports, read the general information in “Notices” on page 143, the information in the “Safety and environmental notices” on page ix, as well as the information in the IBM Environmental Notices and User Guide on the documentation DVD.
  • Page 3: Table Of Contents

    When to use the service CLI (page 33)
    Accessing the service CLI (page 33)
    USB key and Initialization tool interface (page 33)
    When to use the USB key
    Emphasis (page xii)
    Storwize V7000 library and related publications (page xii)
    How to order IBM publications (page xv)
  • Page 4 Replacing a 2.5" drive assembly or blank carrier (page 98); Replacing an enclosure end cap (page 99); Index (page 151)
  • Page 5: Figures

    ... (page 14)
    SAS ports and LEDs in rear of expansion enclosure (page 16)
    Removing the 2.5" drive (page 99)
    SAS cable (page 100)
    Removing a rail assembly from a rack cabinet (page 108)
    © Copyright IBM Corp. 2010, 2011
  • Page 7: Tables

    10 Gbps Ethernet port LEDs (page 13)
    SAS port LEDs on the node canister (page 13)
    Node canister LEDs (page 14)
    Error event IDs and error codes (page 120)
    Message classification number range
  • Page 9: Safety And Environmental Notices

    A danger notice indicates the presence of a hazard that has the potential of causing death or serious personal injury. (D002) 2. Locate IBM Storwize V7000 Safety Notices with the user publications that were provided with the Storwize V7000 hardware.
  • Page 11: About This Guide

    V7000. The chapters that follow introduce you to the hardware components and to the tools that assist you in troubleshooting and servicing the Storwize V7000, such as the management GUI and the service assistant. The troubleshooting procedures can help you analyze failures that occur in a Storwize V7000 system.
  • Page 12: Emphasis

    Storwize V7000. Storwize V7000 Information Center The IBM Storwize V7000 Information Center contains all of the information that is required to install, configure, and manage the Storwize V7000. The information center is updated between Storwize V7000 product releases to provide the most current documentation.
  • Page 13 Storwize V7000 library Unless otherwise noted, the publications in the Storwize V7000 library are available in Adobe portable document format (PDF) from the following website: Support for Storwize V7000 website at www.ibm.com/storage/support/storwize/v7000 Each of the PDF publications in Table 1 is available in this information center by clicking the number in the “Order number”...
  • Page 14: Storwize V7000 Library

    Machine Code contains the License Agreement for Machine Code for the Storwize V7000 product.
    Other IBM publications
    Table 2 on page xv lists IBM publications that contain information related to the Storwize V7000.
  • Page 15: How To Order Ibm Publications

    Some publications are available for you to view or download at no charge. You can also order publications. The publications center displays prices in your local currency. You can access the IBM Publications Center through the following website: www.ibm.com/e-business/linkweb/publications/servlet/pbi.wss...
  • Page 16 To submit any comments about this book or any other Storwize V7000 documentation, go to the feedback page on the website for the Storwize V7000 Information Center at publib.boulder.ibm.com/infocenter/storwize/ic/index.jsp?topic=/com.ibm.storwize.v7000.doc/feedback.htm. There you can use the feedback page to enter and submit comments or browse to the topic and use the feedback link in the running footer of that page to identify the topic for which you have a comment.
  • Page 17: Chapter 1. Storwize V7000 Hardware Components

    A Storwize V7000 system consists of one or more machine type 2076 rack-mounted enclosures. There are several model types. The main differences among the model types are the following items:
      • The number of drives that an enclosure can hold. Drives are located on the front of the enclosure.
  • Page 18: Components In The Front Of The Enclosure

    The LED color is the same for both drives. The LEDs for the 3.5-inch drives are placed vertically above and below each other. The LEDs for the 2.5-inch drives are placed next to each other at the bottom.
  • Page 19: Led Indicators On A Single 3.5" Drive

      • If the LED is on, a fault exists on the drive.
      • If the LED is off, no known fault exists on the drive.
      • If the LED is flashing, the drive is being identified. A fault might or might not exist.
  • Page 20: Enclosure End Cap Indicators

    The left enclosure end cap contains no controls or connectors. The right enclosure end cap for both enclosures has no controls, indicators, or connectors.
    Figure 5. 12 drives and two end caps
    Figure 6. Left enclosure end cap
  • Page 21: Components In The Rear Of The Enclosure

    2076-324 control enclosure with the 10 Gbps Ethernet port (5). Figure 9 on page 6 shows the rear of an expansion enclosure.
    Figure 7. Rear view of a model 2076-112 or a model 2076-124 control enclosure
  • Page 22: Power Supply Unit And Battery For The Control Enclosure

    Battery power is required only if both power supply units stop operating. Figure 10 on page 7 shows the location of the LEDs (1) in the rear of the power supply unit.
  • Page 23: Power Supply Unit For The Expansion Enclosure

    The two power supply units in the enclosure are installed with one unit top side up and the other inverted. The power supply unit for the expansion enclosure has four LEDs, two fewer than the power supply unit for the control enclosure.
  • Page 24: Node Canister Ports And Indicators

    The node canister has indicators and ports but no controls.
    Fibre Channel ports and indicators
    The Fibre Channel port LEDs show the speed of the Fibre Channel ports and activity level.
  • Page 25: Fibre Channel Ports On The Node Canisters

    LEDs for the Fibre Channel ports on canister 1. Each LED points to the associated port. The first and second LEDs in each set show the speed state, and the third and fourth LEDs show the link state.
    Figure 13. LEDs on the Fibre Channel ports
  • Page 26: Fibre Channel Port Led Status Descriptions

    The WWPNs are derived from the worldwide node name (WWNN) that is allocated to the Storwize V7000 node in which the ports are installed. The WWNN for each node is stored within the enclosure. When you replace a node canister, the WWPNs of the ports do not change.
  • Page 27: Usb Ports On The Node Canisters

    Two LEDs are associated with each port. Note: The reference to the left and right locations applies to canister 1, which is the upper canister. The port locations are inverted for canister 2, which is the lower canister.
  • Page 28: Ethernet Ports On The 2076-112 And 2076-124

    Figure 16 shows the location of the 10 Gbps Ethernet ports.
    Figure 16. 10 Gbps Ethernet ports on the 2076-312 and 2076-324 node canisters
    Table 11 on page 13 provides a description of the LEDs.
  • Page 29: Node Canisters

    Figure 17. SAS ports on the node canisters. SAS ports must be connected to Storwize V7000 enclosures only. See “Problem: SAS cabling not valid” on page 44 for help in attaching the SAS cables. Four LEDs are located with each port. Each LED describes the status of one data channel within the port.
  • Page 30: Leds On The Node Canisters

    It is not able to perform I/O in a system. When the node is in either of these states, it can be removed. Do not remove the canister unless directed by a service procedure.
  • Page 31: Expansion Canister Ports And Indicators

    The port locations are inverted for canister 2, which is the lower canister.
    Expansion canister SAS ports and indicators
    Two SAS ports are located in the rear of the expansion canister.
  • Page 32: Sas Ports And Leds In Rear Of Expansion

    The two LEDs are located in a vertical row on the left side of the canister. Figure 20 on page 17 shows the LEDs (1) in the rear of the expansion canister.
  • Page 33: Leds On The Expansion Canisters

      • If the LED is on, a fault exists.
      • If the LED is off, no fault exists.
      • If the LED is flashing, the canister is being identified. This status might or might not be a fault.
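The three fault LED states described above map naturally onto a small lookup table. The following Python sketch is illustrative only: the names `FaultLed`, `FAULT_LED_MEANING`, and `describe_fault_led` are hypothetical and are not part of any Storwize V7000 tool.

```python
from enum import Enum


class FaultLed(Enum):
    """Observed state of the expansion canister fault LED (hypothetical model)."""
    ON = "on"
    OFF = "off"
    FLASHING = "flashing"


# Meanings taken from the fault LED description in this section.
FAULT_LED_MEANING = {
    FaultLed.ON: "A fault exists.",
    FaultLed.OFF: "No fault exists.",
    FaultLed.FLASHING: "The canister is being identified; "
                       "this status might or might not be a fault.",
}


def describe_fault_led(state: str) -> str:
    """Return the documented meaning for an observed LED state string."""
    return FAULT_LED_MEANING[FaultLed(state.strip().lower())]
```

An unrecognized state string raises `ValueError`, which keeps a mistyped observation from being silently treated as "no fault".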
  • Page 35: Chapter 2. Best Practices For Troubleshooting

    ... (the default is admin)
    The control enclosure management IP address
    Control enclosure service IP address: node canister 1
    Control enclosure service IP address: node canister 2
    The control enclosure superuser password (the default is passw0rd)
  • Page 36: Follow Power Management Procedures

    IBM automatically opens a problem report, and if appropriate, contacts you to verify if replacement parts are required. If you set up Call Home to IBM, ensure that the contact details that you configure are correct and kept up to date as personnel change.
  • Page 37: Back Up Your Data

    IP addresses, are not sent. The inventory email is sent on a regular basis. Based on the information that is received, IBM can inform you if the hardware or software that you are using requires an upgrade because of a known issue.
  • Page 38: Keep Your Records Up To Date

    Know your IBM warranty and maintenance agreement details If you have a warranty or maintenance agreement with IBM, know the details that must be supplied when you call for support. Have the phone number of the support center available. When you call support, provide the machine type (always 2076) and the serial number of the enclosure that has the problem.
  • Page 39: Chapter 3. Understanding The Storwize V7000 Battery Operation For The Control Enclosure

    Note: Storwize V7000 expansion canisters do not cache volume data or store state information in volatile memory. They, therefore, do not require battery power. If ac power to both power supplies in an expansion enclosure fails, the enclosure powers off.
  • Page 40: Maintenance Discharge Cycles

    Important: Although Storwize V7000 is resilient to power failures and brownouts, always install Storwize V7000 in an environment where there is reliable and consistent ac power that meets the Storwize V7000 requirements. Consider uninterruptible power supply units to avoid extended interruptions to data access.
  • Page 41 This condition results in the system entering service state while the one remaining battery performs a maintenance discharge. I/O operations are not permitted during this process. This activity takes approximately 10 hours.
  • Page 43: Chapter 4. Understanding The Medium Errors And Bad Blocks

    A storage system returns a medium error response to a host when it is unable to successfully read a block. The Storwize V7000 response to a host read follows this behavior.
  • Page 44 These bad blocks are corrected when the application writes data to these areas. Before the correction happens, the bad block records continue to use up the available bad block space.
  • Page 45: Chapter 5. Storwize V7000 User Interfaces For Servicing Your System

    Storwize V7000 provides a number of user interfaces to troubleshoot, recover, or maintain your system. The interfaces provide various sets of facilities to help resolve situations that you might encounter. The interfaces for servicing your system connect through the 1 Gbps Ethernet ports that are accessible from port 1 of each canister.
  • Page 46: When To Use The Management Gui

    You must use a supported web browser. Verify that you are using a supported web browser from the following website: Support for Storwize V7000 website at www.ibm.com/storage/support/storwize/v7000
    You can use the management GUI to manage your system as soon as you have created a clustered system.
  • Page 47: Service Assistant Interface

    Use the service assistant in the following situations:
      • When you cannot access the system from the management GUI and you cannot access the storage Storwize V7000 to run the recommended actions
      • When the recommended action directs you to use the service assistant.
  • Page 48: Accessing The Service Assistant

    You must use a supported web browser. Verify that you are using a supported and an appropriately configured web browser from the following website: Support for Storwize V7000 website at www.ibm.com/storage/support/storwize/v7000
    To start the application, perform the following steps: 1.
  • Page 49: Cluster (System) Command-Line Interface

    Accessing the cluster (system) CLI
    Follow the steps that are described in the “Command-line interface” topic in the “Reference” section of the Storwize V7000 Information Center to initialize and use a CLI session.
    Service command-line interface
    Use the service command-line interface (CLI) to manage a node canister in a control enclosure using the task commands and information commands.
  • Page 50: When To Use The Usb Key

    Using the initialization tool
    The initialization tool is a graphical user interface (GUI) application. You must have Microsoft Windows XP Professional or higher to run the application.
  • Page 51: Satask.txt Commands

    The physical access to the node canister is required and is used to authenticate the action.
    Syntax:
    satask chserviceip -serviceip ipv4 -gw ipv4 -mask ipv4 -resetpassword
    satask chserviceip -serviceip_6 ipv6 -gw_6 ipv6 -prefix_6 int -resetpassword
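As an illustration of the IPv4 form of the syntax above, the following Python sketch assembles a `satask chserviceip` line that could be saved in a `satask.txt` file on the USB key. The helper name `chserviceip_line` is hypothetical, and treating `-resetpassword` as an optional flag is an assumption; check the command reference for which parameters are required.

```python
import ipaddress


def chserviceip_line(service_ip: str, gateway: str, mask: str,
                     reset_password: bool = False) -> str:
    """Build an IPv4 'satask chserviceip' command line (illustrative helper)."""
    # ipaddress raises ValueError if any value is not a dotted-quad IPv4 string.
    for value in (service_ip, gateway, mask):
        ipaddress.IPv4Address(value)
    parts = ["satask", "chserviceip",
             "-serviceip", service_ip,
             "-gw", gateway,
             "-mask", mask]
    # Assumption: -resetpassword is an optional flag appended when requested.
    if reset_password:
        parts.append("-resetpassword")
    return " ".join(parts)


if __name__ == "__main__":
    # Example addresses are placeholders, not defaults of the product.
    print(chserviceip_line("192.168.70.121", "192.168.70.1", "255.255.255.0"))
```

Validating the addresses before writing the file is a design choice for this sketch: a malformed line would otherwise only be discovered after the USB key has been inserted into the node canister.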
  • Page 52 Use this command to obtain service assistant access to a node canister even if the current state of the node canister is unknown. The physical access to the node canister is required and is used to authenticate the action.
    Syntax: satask resetpassword
    Parameters: None.
  • Page 53 (Optional) Overrides prerequisite checking and forces installation of the software.
    Description: This command copies the file from the USB key to the upgrade directory on the node canister. This command calls the satask installsoftware command.
  • Page 54 Parameters: None.
    Description: This command writes the output from each node canister to the USB key. This command calls the sainfo lsservicenodes command, the sainfo lsservicestatus command, and the sainfo lsservicerecommendation command.
  • Page 55: Chapter 6. Resolving A Problem

    The management GUI provides extensive facilities to help you troubleshoot and correct problems on your system. You can connect to and manage a Storwize V7000 system as soon as you have created a clustered system. If you cannot create a clustered system, see the problem that contains information about what to do if you cannot create one.
  • Page 56: Problem: Storage System Management Ip Address Unknown

    Consider the following possibilities if you are unable to connect to the management GUI:
      • You cannot connect if the system is not operational with at least one node online. If you know the service address of a node canister, go to “Procedure:
  • Page 57: Problem: Unable To Log On To The Storage System Management Gui

    Getting node canister and system information using the service assistant” on page 48; otherwise, go to “Procedure: Getting node canister and system information using a USB key” on page 49 and obtain the state of each of the node canisters from the data that is returned. If there is not a node canister with a state of active, resolve the reason why it is not in active state.
  • Page 58: Problem: Unknown Service Address Of A Node Canister

    If you know the service address of any node canister in the system, follow a similar procedure to the one described previously. Rather than using control enclosure management IP address/service to start the service assistant, use the service address that you know.
  • Page 59: Problem: Cannot Connect To The Service Assistant

    You cannot connect to the service assistant if the node canister is not able to start the Storwize V7000 code. To verify that the LEDs indicate that the code is active, see “Procedure: Understanding the system status using the LEDs” on page 49.
  • Page 60: Problem: Management Gui Or Service Assistant Does Not Display Correctly

    Support for Storwize V7000 website at www.ibm.com/storage/support/storwize/v7000
    Switch to using a supported web browser. If the problem continues, contact IBM Support.
    Problem: Node canister location error
    The node error that is listed on the service assistant home page or in the event log can indicate a location error.
  • Page 61: Problem: New Expansion Enclosure Not Detected

    “Configuring” topic of the information center. There must be a zone that includes all ports from all node canisters.
      • The existing system and the nodes in the enclosure that are not detected have Storwize V7000 6.2 or later installed.
  • Page 62: Problem: Mirrored Volume Copies No Longer Identical

    If there is a status output for the time the USB key was used, then the satask.txt file was not found. Check that the file was named correctly. The satask.txt file is automatically deleted after it has been processed.
  • Page 63: Procedure: Resetting Superuser Password

    The enclosure ID is unique within a Storwize V7000 system. However, if you have more than one Storwize V7000 system, the same ID can be used within more than one system. The serial number is always unique.
  • Page 64: Procedure: Checking The Status Of Your System

    To obtain the information, connect to and log on to the service assistant using the starting service assistant procedure. For more information, go to “Accessing the service assistant” on page 32. 1. Log on to the service assistant.
  • Page 65: Procedure: Getting Node Canister And System Information Using A Usb Key

    2. View the information about the node canister that you connected to or the other node canister in the same enclosure or to any other node in the same system that you are able to access over the SAN. Note: If the node that you want to see information about is not the current node, change it to the current node from the home page.
  • Page 66 51 shows the LEDs on the power supply unit for the 2076-112 or 2076-124. The LEDs on the power supply units for the 2076-312 and 2076-324 are similar, but they are not shown here.
  • Page 67: Leds On The Power Supply Units Of The Control

    Figure 21. LEDs on the power supply units of the control enclosure
    Table 18. Power-supply unit LEDs
    Columns: Power supply, ac failure, dc failure, failure, Status, Action
    Status: Communication failure between the power supply unit and...
    Action: Replace the power supply unit. If failure is still present, replace the enclosure chassis.
  • Page 68: Power-Supply Unit Leds

    There is no power to the canister. Try reseating the canister. Go to “Procedure: Reseating a node canister” on page 60. If the state persists, follow the hardware replacement procedures for the parts in the following order: node canister, enclosure chassis.
  • Page 69: Leds On The Node Canisters

    Table 19. Power LEDs (continued)
    Columns: Power LED status, Description
    Slow flashing (1...: Power is available, but the canister is in standby mode. Try to start the node canister by reseating it. Go to “Procedure: Reseating a node canister” on page...
    Fast...: The canister is running its power-on self-test (POST).
  • Page 70 The battery is either charging or a maintenance discharge is being performed. Nonrecoverable battery fault. Replace the battery. If replacing the battery does not fix the issue, replace the power supply unit.
  • Page 71: Procedure: Finding The Status Of The Ethernet Connections

    Table 21. Control enclosure battery LEDs (continued)
    Columns: Battery Good, Battery Fault, Description, Action
    Flashing: Recoverable battery fault. Action: None.
    Flashing, Flashing: The battery cannot be used because the firmware for the power supply unit is being downloaded. Action: None.
    Procedure: Finding the status of the Ethernet connections
    This procedure explains how to find the status of the Ethernet connections when you cannot connect.
  • Page 72: Procedure: Removing System Data From A Node Canister

    6. Power each enclosure off and on before creating a system.
    Procedure: Fixing node errors
    This procedure describes how to fix a node error that is detected on one of the node canisters in your system.
  • Page 73: Procedure: Changing The Service Ip Address Of A Node Canister

    Node errors are reported when an error is detected that affects a specific node canister. 1. Use the service assistant to view the current node errors on any node. 2. If available, use the management GUI to run the recommended action for the alert.
  • Page 74: Procedure: Initializing A Clustered System With A Usb Key Without Using The Initialization Tool

    Check that there were no errors returned by the command. If there is insufficient battery charge to protect the system, the clustered system is created successfully, but it does not start immediately. In the results, look for the
  • Page 75: Procedure: Initializing A Clustered System Using The Service Assistant

    time_to_charge field for the battery. The results provide an estimate of the time, in minutes, before the system can start. If the time is not 0, wait for the required time. Check that the node canister that you inserted the USB key into has its clustered-state LED on permanently.
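The check on the time_to_charge field can be sketched in Python as follows. This is a hedged example: it assumes the results are plain text with one space-separated "field value" pair per line, which is an assumption about the output layout, and the function names `minutes_until_start` and `can_start_now` are hypothetical.

```python
from typing import Optional


def minutes_until_start(status_text: str) -> Optional[int]:
    """Return the time_to_charge value in minutes, or None if the field is absent.

    Assumes one 'field value' pair per line (an assumption about the layout).
    """
    for line in status_text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0] == "time_to_charge":
            return int(fields[1])
    return None


def can_start_now(status_text: str) -> bool:
    """True only when time_to_charge is present and equal to 0."""
    return minutes_until_start(status_text) == 0
```

A missing field is reported as `None` rather than 0, so the absence of battery information is never mistaken for a fully charged battery.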
  • Page 76: Procedure: Reseating A Node Canister

    3. Grasp the handle between the thumb and forefinger. 4. Squeeze them together to release the handle.
  • Page 77: Procedure: Powering Off Your System

    10. Verify that the LEDs are on.
    Procedure: Powering off your system
    Use this procedure to power off your Storwize V7000 system when it must be serviced or to permit other maintenance actions in your data center. To power off your Storwize V7000 system, use the following steps: 1.
  • Page 78: Procedure: Rescuing Node Canister Software From Another Node (Node Rescue)

    The procedures that are provided here help you solve problems on the Storwize V7000 system and its connection to the storage area network (SAN). SAN failures might cause Storwize V7000 drives to be inaccessible to host systems. Failures can be caused by SAN configuration changes or by hardware failures in SAN components.
  • Page 79: Fibre Channel Link Failures

    1. Ensure that the Fibre Channel cable is securely connected at each end. 2. Replace the Fibre Channel cable. 3. Replace the SFP transceiver for the failing port on the Storwize V7000 node. Note: Storwize V7000 nodes are supported with both longwave SFP transceivers and shortwave SFP transceivers.
  • Page 81: Chapter 7. Recovery Procedures

    3. Performing actions to get your environment operational
      • Recovering from offline VDisks (volumes) by using the CLI
      • Checking your system, for example, to ensure that all mapped volumes can be accessed by the hosts.
  • Page 82: When To Run The Recover System Procedure

    Attention: If you experience failures at any time while you are running the recover system procedure, call the IBM Support Center. Do not attempt to do further recovery actions because these actions might prevent IBM Support from restoring the system to an operational status.
  • Page 83 Note: If after resolving all these scenarios, half or greater than half of the nodes are reporting node error 578, it is appropriate to run the recovery procedure. You can also call IBM Support for further assistance. – For any nodes that are reporting a node error 550, ensure that all the missing hardware that is identified by these errors is powered on and connected without faults.
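The node error 578 threshold in the note above can be expressed as a simple count. The Python sketch below is illustrative only: the function name and the input shape (a mapping from node name to the set of node error codes it currently reports) are assumptions, and the function covers only the counting condition, not the prerequisite of first resolving the other scenarios.

```python
from typing import Dict, Set


def recovery_threshold_met(node_errors: Dict[str, Set[int]]) -> bool:
    """True when half or more of the nodes report node error 578.

    node_errors maps a node name to the node error codes it reports
    (hypothetical input shape chosen for this sketch).
    """
    total = len(node_errors)
    if total == 0:
        return False
    reporting_578 = sum(1 for errors in node_errors.values() if 578 in errors)
    # "Half or greater than half" of the nodes.
    return 2 * reporting_578 >= total
```

For example, a two-node system in which one node reports 578 and the other reports 550 meets the threshold, while a three-node set with a single 578 does not.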
  • Page 84: Fix Hardware Errors

    If any node canisters were modified or replaced, use the service assistant to verify the levels of software, and where necessary, to upgrade or downgrade the level of software.
  • Page 85 See “Recovering from offline VDisks using the CLI” on page 70 for details.
      • T3 failed. Call IBM Support. Do not attempt any further action.
    The recovery can be run from any node canister in the system. The node canisters must not have participated in any other system.
  • Page 86: Recovering From Offline Vdisks Using The Cli

    You can perform this task by disconnecting and reconnecting the Fibre Channel cables to each host bus adapter (HBA) port.
      • Verify that all mapped volumes can be accessed by the hosts.
      • Run file system consistency checks.
  • Page 87: Backing Up And Restoring The System Configuration

    No zoning changes have been made on the Fibre Channel fabric that would prevent communication between the Storwize V7000 and any storage controllers that are present in the configuration.
      • For configurations with more than one I/O group, if a new system is created on which the configuration data is to be restored, the I/O groups for the other control enclosures must be added.
  • Page 88: Backing Up The System Configuration Using The Cli

    Center for Fabric, VERITAS Volume Manager, and any other programs that record this information. The Storwize V7000 analyzes the backup configuration data file and the system to verify that the required disk controller system nodes are available. Before you begin, hardware recovery must be complete. The following hardware must be operational: hosts, Storwize V7000, drives, the Ethernet network, and the SAN fabric.
  • Page 89 where ssh_private_key_file is the name of the SSH private key file for the superuser and cluster_ip is the IP address or DNS name of the clustered system for which you want to back up the configuration. 4. Issue the following CLI command to remove all of the existing configuration backup and restore files that are located on your configuration node in the /tmp directory.
  • Page 90: Restoring The System Configuration

    1. Verify that all nodes are available as candidate nodes before you run this recovery procedure. You must remove errors 550 or 578 to put the node in candidate state. For all nodes that display these errors, perform the following steps:
  • Page 91 a. Point your browser to the service IP address of one of the nodes, for example, https://node_service_ip_address/service/. b. Log on to the service assistant. c. From the System page, put the node into service state if it is not already in that state.
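The service assistant address pattern in step a, https://node_service_ip_address/service/, can be built programmatically. This Python sketch is an assumption-labeled illustration: the helper name `service_assistant_url` is hypothetical, and the bracketed form for IPv6 literals follows general URL syntax rather than anything stated in this guide.

```python
import ipaddress


def service_assistant_url(address: str) -> str:
    """Build the service assistant URL for a node service address (illustrative)."""
    try:
        ip = ipaddress.ip_address(address)
    except ValueError:
        # Not an IP literal; treat the value as a DNS name and use it as-is.
        host = address
    else:
        # IPv6 literals must be enclosed in brackets inside a URL.
        host = f"[{ip}]" if ip.version == 6 else str(ip)
    return f"https://{host}/service/"


if __name__ == "__main__":
    # Placeholder address, not a product default.
    print(service_assistant_url("192.168.70.121"))
```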
  • Page 92 Allow a suitable time to elapse and try the svcconfig restore -prepare command again. 13. Issue the following command to copy the log file to another server that is accessible to the system:
  • Page 93: Deleting Backup Configuration Files Using The Cli

    If there are errors, correct the condition that caused the errors and reissue the command. You must correct all errors before you can proceed to step 16.
      • If you need assistance, contact the IBM Support Center.
    16. Issue the following CLI command to restore the configuration:...
  • Page 94 IP address or DNS name of the clustered system from which you want to delete the configuration. 2. Issue the following CLI command to erase all of the files that are stored in the /tmp directory: svcconfig clear -all
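One way to issue the clear command remotely is to pass it to an SSH client. The Python sketch below only assembles the argument list; it is a minimal example under stated assumptions: the function name is hypothetical, logging in as `superuser@cluster_ip` with the OpenSSH `-i` identity-file option is an assumed invocation form rather than the exact command from this guide, and the parameter names follow the `ssh_private_key_file` and `cluster_ip` placeholders used in the text.

```python
from typing import List


def clear_backup_files_command(ssh_private_key_file: str, cluster_ip: str) -> List[str]:
    """Return an argv list that runs 'svcconfig clear -all' over SSH (illustrative).

    Assumption: the superuser logs in with a private key via an OpenSSH client.
    """
    return [
        "ssh",
        "-i", ssh_private_key_file,   # OpenSSH identity (private key) file
        f"superuser@{cluster_ip}",    # assumed login form
        "svcconfig", "clear", "-all",
    ]
```

The list can be handed to `subprocess.run(command, check=True)` when you are ready to run it; building it as a list avoids shell quoting problems with unusual key file paths.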
  • Page 95: Chapter 8. Removing And Replacing Parts

    Preparing to remove and replace parts Before you remove and replace parts, you must be aware of all safety issues. First, read the safety precautions in the IBM Storwize V7000 Safety Notices. These guidelines help you safely work with the Storwize V7000.
  • Page 96: Rear Of Node Canisters That Shows The Handles

    The handle with the finger grip on the left removes the lower canister (2).
    Figure 24. Rear of node canisters that shows the handles.
    6. Squeeze them together to release the handle.
  • Page 97: Replacing An Expansion Canister

    Figure 25. Removing the canister from the enclosure
    7. Pull out the handle to its full extension.
    8. Grasp the canister and pull it out.
    9. Insert the new canister into the slot with the handle pointing towards the center of the slot. Insert the unit in the same orientation as the one that you removed.
  • Page 98: Rear Of Expansion Canisters That Shows The

    (2).
    Figure 26. Rear of expansion canisters that shows the handles.
    5. Squeeze them together to release the handle.
  • Page 99: Replacing An Sfp Transceiver

    Figure 27. Removing the canister from the enclosure
    6. Pull out the handle to its full extension.
    7. Grasp the canister and pull it out.
    8. Insert the new canister into the slot with the handle pointing towards the center of the slot. Insert the unit in the same orientation as the one that you removed.
  • Page 100: Sfp Transceiver

    Figure 28. SFP transceiver 5. Reconnect the optical cable. 6. Confirm that the error is now fixed. Either mark the error as fixed or restart the node depending on the failure indication that you originally noted.
  • Page 101: Replacing A Power Supply Unit For A Control Enclosure

    Electrical voltage and current from power, telephone, and communication cables are hazardous. To avoid a shock hazard: Connect power to this unit only with the IBM provided power cord. Do not use the IBM provided power cord for any other product.
  • Page 102 Power supply unit 1 is top side up, and power supply unit 2 is inverted. a. Depress the black locking catch from the side with the colored sticker as shown in Figure 29 on page 87.
  • Page 103: Directions For Lifting The Handle On The Power

    Figure 29. Directions for lifting the handle on the power supply unit b. Grip the handle to pull the power supply out of the enclosure as shown in Figure 30. Figure 30. Using the handle to remove a power supply unit 6.
  • Page 104 10. Turn on the power switch to the power supply unit. If required, return the power supply. Follow all packaging instructions, and use any packaging materials for shipping that are supplied to you.
  • Page 105: Replacing A Power Supply Unit For An Expansion Enclosure

    Electrical voltage and current from power, telephone, and communication cables are hazardous. To avoid a shock hazard: Connect power to this unit only with the IBM provided power cord. Do not use the IBM provided power cord for any other product.
  • Page 106 Power supply unit 1 is top side up, and power supply unit 2 is inverted. a. Depress the black locking catch from the side with the colored sticker as shown in Figure 31 on page 91.
  • Page 107: Directions For Lifting The Handle On The Power

    Figure 31. Directions for lifting the handle on the power supply unit b. Grip the handle to pull the power supply out of the enclosure as shown in Figure 32. Figure 32. Using the handle to remove a power supply unit 6.
  • Page 108 10. Turn on the power switch to the power supply unit. If required, return the power supply. Follow all packaging instructions, and use any packaging materials for shipping that are supplied to you.
  • Page 109: Replacing A Battery In A Power Supply Unit

    Electrical voltage and current from power, telephone, and communication cables are hazardous. To avoid a shock hazard: Connect power to this unit only with the IBM provided power cord. Do not use the IBM provided power cord for any other product.
  • Page 110 The battery is a lithium ion battery. To avoid possible explosion, do not burn. Exchange only with the IBM-approved part. Recycle or discard the battery as instructed by local regulations. In the United States, IBM has a process for the collection of this battery. For information, call 1-800-426-4333. Have the IBM part number for the battery unit available when you call.
  • Page 111: Removing The Battery From The Control Enclosure Power-Supply Unit

    Figure 33. Removing the battery from the control enclosure power-supply unit a. Press the catch to release the handle (1). b. Lift the handle on the battery (2). c. Lift the battery out of the power supply unit (3). 4.
  • Page 112: Releasing The Cable Retention Bracket

    To replace the drive assembly or blank carrier, perform the following steps: 1. Read the safety information to which “Preparing to remove and replace parts” on page 79 refers. 2. Unlock the assembly by squeezing together the tabs on the side.
  • Page 113: Unlocking The 3.5" Drive

    Figure 34. Unlocking the 3.5" drive 3. Open the handle to the full extension. Figure 35. Removing the 3.5" drive 4. Pull out the drive. 5. Push the new drive back into the slot until the handle starts to move. 6.
  • Page 114: Replacing A 2.5" Drive Assembly Or Blank Carrier

    1. Read the safety information to which “Preparing to remove and replace parts” on page 79 refers. 2. Unlock the module by squeezing together the tabs at the top. Figure 36. Unlocking the 2.5" drive 3. Open the handle to the full extension.
  • Page 115: Replacing An Enclosure End Cap

    Figure 37. Removing the 2.5" drive 4. Pull out the drive. 5. Push the new drive back into the slot until the handle starts to move. 6. Finish inserting the drive by closing the handle until the locking catch clicks into place.
  • Page 116: Replacing A Control Enclosure Chassis

    The procedures for replacing a control enclosure chassis are different from those procedures for replacing an expansion enclosure chassis. For information about replacing an expansion enclosure chassis, see “Replacing an expansion enclosure chassis” on page 105.
  • Page 117 Electrical voltage and current from power, telephone, and communication cables are hazardous. To avoid a shock hazard: Connect power to this unit only with the IBM provided power cord. Do not use the IBM provided power cord for any other product.
  • Page 118 Attention: Perform this procedure only if instructed to do so by a service action or the IBM support center. If you have a single control enclosure, this procedure requires that you shut down your system to replace the control enclosure. If you...
  • Page 119 For each of the canisters, verify the status of the system status LED. If the LED is lit on either of the canisters, do not continue because the system is still online. Determine why the node canisters did not shut down in step 3 on page 102 or step 4 on page 102.
  • Page 120 It is offline and managed. The new enclosure has a new enclosure ID. It is online and unmanaged. 27. Select the original enclosure in the tree view. Verify that it is offline and managed and that the serial number is correct.
  • Page 121: Replacing An Expansion Enclosure Chassis

    28. From the Actions menu, select Remove enclosure and confirm the action. The physical hardware has already been removed. You can ignore the messages about removing the hardware. Verify that the original enclosure is no longer listed in the tree view. 29.
  • Page 122 Electrical voltage and current from power, telephone, and communication cables are hazardous. To avoid a shock hazard: Connect power to this unit only with the IBM provided power cord. Do not use the IBM provided power cord for any other product.
  • Page 123 Attention: If your system is powered on and performing I/O operations, go to the management GUI and follow the fix procedures. Performing the replacement actions without the assistance of the fix procedures can result in loss of data or access to data. Even though many of these procedures are hot-swappable, these procedures are intended to be used only when your system is not up and running and performing I/O operations.
  • Page 124: Replacing The Support Rails

    Figure 39. Removing a rail assembly from a rack cabinet 4. Working from the front of the rack cabinet, remove the clamping screw from the rail assembly on both sides of the rack cabinet.
  • Page 125: Storwize V7000 Replaceable Units

    15. Tighten the screw to secure the rail to the rack from the back side. 16. Repeat the steps to secure the opposite rail to the rack cabinet. Storwize V7000 replaceable units The Storwize V7000 consists of several replaceable units. Generic replaceable units are cables, SFP transceivers, canisters, power supply units, battery assemblies, and enclosure chassis.
  • Page 126 Customer replaced
    2.80 m jumper cable | 39M5376 | Customer replaced
    2.8 m power cord (India) | 39M5226 | Customer replaced
    4.3 m power cord (Japan) | 39M5200 | Customer replaced
    2.8 m power cord (Korea) | 39M5219 | Customer replaced
  • Page 127 Table 22. Replaceable units (continued)
    Part | Part number | Applicable models | FRU or customer replaced
    2.5" SSD, 300 GB, in carrier assembly | 85Y5861 | 124, 224, 324 | Customer replaced
    2.5" 10 K, 300 GB, in carrier assembly | 85Y5862 | 124, 224, 324 | Customer replaced
    2.5"...
  • Page 129: Chapter 9. Event Reporting

    If any service activity is required, a notification is sent. Event reporting process The following methods are used to notify you and the IBM Support Center of a new event: If you enabled Simple Network Management Protocol (SNMP), an SNMP trap is sent to an SNMP manager that is configured by the customer.
  • Page 130: Describing The Fields In The Event Log

    Event notifications Storwize V7000 can use Simple Network Management Protocol (SNMP) traps, syslog messages, and Call Home email to notify you and the IBM Support Center when significant events are detected. Any combination of these notification methods can be used simultaneously. Notifications are normally sent immediately after an event is raised.
  • Page 131: Power-On Self-Test

    Each event that Storwize V7000 detects is assigned a notification type of Error, Warning, or Information. When you configure notifications, you specify where the notifications should be sent and which notification types are sent to that recipient.
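The relationship between notification types and recipients described above can be sketched in a few lines. This is only an illustrative model, not the product's configuration interface: the `Recipient` class, the `recipients_for` helper, and the recipient names are hypothetical; only the three notification types come from the guide.

```python
from dataclasses import dataclass

# The three notification types named in the guide.
NOTIFICATION_TYPES = ("Error", "Warning", "Information")


@dataclass(frozen=True)
class Recipient:
    """Hypothetical recipient: a name plus the types it subscribes to."""
    name: str
    types: frozenset


def recipients_for(event_type: str, recipients: list) -> list:
    """Return the names of recipients configured to receive this type."""
    if event_type not in NOTIFICATION_TYPES:
        raise ValueError(f"unknown notification type: {event_type!r}")
    return [r.name for r in recipients if event_type in r.types]


if __name__ == "__main__":
    configured = [
        Recipient("snmp-manager", frozenset({"Error", "Warning"})),
        Recipient("syslog-server", frozenset(NOTIFICATION_TYPES)),
    ]
    print(recipients_for("Information", configured))
```

Each recipient carries its own set of types, which mirrors the idea that the destination and the types sent to it are configured together.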
  • Page 132: Understanding The Error Codes

    (FRUs), and the service actions that might be needed to solve the problem. Event IDs The Storwize V7000 software generates events, such as informational events and error events. An event ID or number is associated with the event and indicates the reason for the event.
  • Page 133 Table 25. Informational events (continued)
    Event ID | Notification type | Description
    980350 | | The node is now a functional member of the cluster (system).
    980351 | | A noncritical hardware error occurred.
    980352 | | Attempt to automatically recover offline node starting.
    980370 | | Both nodes in the I/O group are available.
    980371 | | One node in the I/O group is unavailable.
  • Page 134 All the expanders on the strand were reset.
    984509 | | The component firmware update paused to allow the battery charging to finish.
    984511 | | The update for the component firmware paused because the system was put into maintenance mode.
  • Page 135 All thin-provisioned volume copy data in a node is unpinned.
    986010 | | The thin-provisioned volume copy import has failed and the new volume is offline; either upgrade the Storwize V7000 software to the required version or delete the volume.
    986011 | | The thin-provisioned volume copy import is successful.
  • Page 136: Error Event Ids And Error Codes

    An overnight maintenance procedure has failed to complete. Resolve any hardware and configuration problems that you are experiencing on the cluster (system). If the problem persists, contact your IBM service representative for assistance. 988300 An array MDisk is offline because it has too many missing members.
  • Page 137 Table 26. Error event IDs and error codes (continued)
    Event ID | Notification type | Condition | Error code
    009173 | | The FlashCopy feature has exceeded the amount that is licensed. | 3032
    009174 | | The Metro Mirror or Global Mirror feature has exceeded the amount that is licensed. | 3032
    009175 | | The usage for the thin-provisioned volume is not... | 3033
  • Page 138
    010059 | | A solid-state drive (SSD) is offline due to excessive errors. | 1311
  • Page 139 Table 26. Error event IDs and error codes (continued)
    Event ID | Notification type | Condition | Error code
    010060 | | A solid-state drive (SSD) exceeded the warning temperature threshold. | 1217
    010061 | | A solid-state drive (SSD) exceeded the offline temperature threshold. | 1218
    010062 | | A drive exceeded the warning temperature threshold. | 1217
    010063 | | Drive medium error...
  • Page 140 1260
    045018 | | A SAS cable was excluded because frames were dropped. | 1260
    045019 | | A SAS cable was excluded because the enclosure discovery timed out. | 1260
    045020 | | A SAS cable is not present. | 1265
  • Page 141 Table 26. Error event IDs and error codes (continued)
    Event ID | Notification type | Condition | Error code
    045021 | | A canister was removed from the system. | 1036
    045022 | | A canister has been in a degraded state for too long and cannot be recovered. | 1034
    045023 | | A canister is encountering communication problems...
  • Page 142 The thin-provisioned volume copy is offline because there is insufficient space. | 1865
    060002 | | The thin-provisioned volume copy is offline because the metadata is corrupt. | 1862
    060003 | | The thin-provisioned volume copy is offline because the repair has failed. | 1860
  • Page 143 Table 26. Error event IDs and error codes (continued)
    Event ID | Notification type | Condition | Error code
    062001 | | Unable to mirror medium error during volume copy synchronization | 1950
    062002 | | The mirrored volume is offline because the data cannot be synchronized. | 1870
    062003 | | The repair process for the mirrored disk has stopped because there is a difference between the copies... | 1600
  • Page 144
    073310 | | A duplicate Fibre Channel frame has been detected, which indicates that there is an issue with the Fibre Channel fabric. Other Fibre Channel errors might also be generated. | 1203
  • Page 145 Table 26. Error event IDs and error codes (continued)
    Event ID | Notification type | Condition | Error code
    074001 | | Unable to determine the vital product data (VPD) for an FRU. This is probably because a new FRU has been installed and the software does not recognize that FRU. The cluster (system) continues to operate;... | 2040
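Table 26 pairs each error event ID with an error code, and several event IDs can share one code. A minimal lookup sketch follows, populated only with pairs that are visible in the pages above. The dictionary name and the `error_code_for` helper are hypothetical, and the table is far from complete.

```python
from typing import Optional

# Event ID -> error code, copied from the Table 26 rows shown above.
# IDs are kept as strings so that leading zeros are preserved.
ERROR_CODES = {
    "009173": "3032",
    "009174": "3032",
    "009175": "3033",
    "010059": "1311",
    "010060": "1217",
    "010061": "1218",
    "010062": "1217",
    "045018": "1260",
    "045019": "1260",
    "045020": "1265",
    "045021": "1036",
    "045022": "1034",
    "060002": "1862",
    "060003": "1860",
    "062001": "1950",
    "062002": "1870",
    "062003": "1600",
    "073310": "1203",
    "074001": "2040",
}


def error_code_for(event_id: str) -> Optional[str]:
    """Return the error code for an event ID, or None if it is not listed."""
    return ERROR_CODES.get(event_id.strip())


if __name__ == "__main__":
    print(error_code_for("045020"))
```

Returning `None` for unknown IDs avoids guessing at rows that this extract truncates or omits.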
  • Page 146: Node Error Code Overview

    Table 27 lists the number range for each message classification.
    Table 27. Message classification number range
    Message classification | Range
    Node errors: Critical node errors | 500-699
    Node errors: Noncritical node errors | 800-899
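The two node error ranges in Table 27 translate directly into a range check. The sketch below is illustrative only: the function name and the `"unclassified"` label for codes outside the two listed ranges are assumptions, because this extract shows no other classifications.

```python
def classify_node_error(code: int) -> str:
    """Classify a node error code using the two ranges shown in Table 27."""
    if 500 <= code <= 699:
        return "critical node error"
    if 800 <= code <= 899:
        return "noncritical node error"
    # Assumption: codes outside the listed ranges are left unclassified.
    return "unclassified"


if __name__ == "__main__":
    for sample in (550, 578, 690, 850):
        print(sample, classify_node_error(sample))
```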
  • Page 147: Node Errors

    IBM technical support instead: a. You require the volume data on the system from Possible Cause-FRUs or other:...
  • Page 148 1. Review the saved location information of the node canister and the saved location information of the other node canister in the enclosure (the partner canister). ...chassis procedure. 3. If this action does not resolve the issue, contact IBM technical support. They will work with you to...
  • Page 149 IBM technical support for the WWNNs to use. Possible Cause-FRUs or other: Possible Cause-FRUs or other:...
  • Page 150 2. If reseating the canister does not resolve the situation, follow the hardware remove and replace canister procedures to replace the canister. Note: If you are able to reestablish the system's operation, you will be able to use the extra node...
  • Page 151 WWNN is updated. 6. If you are unable to find a Storwize V7000 node canister with the same WWNN as the node canister... A duplicate WWNN has been detected. Explanation: The node canister has detected another...
  • Page 152 This may involve fixing hardware issues on other nodes or fixing connectivity issues between nodes. 3. If you are able to reestablish the cluster, remove the... 1. Follow the procedure to run a node rescue. 2. If the error occurs again, contact IBM technical support.
  • Page 153 671 • 690 4. If all nodes have either node error 578 or 550, follow the cluster recovery procedures. 5. Attempt to determine what caused the nodes to shut down. Possible Cause-FRUs or other: None. User response: 1. Wait for the node to automatically fix the error when sufficient charge becomes available.
  • Page 154 If the specified checks do not resolve the problem, and you have not created the cluster yet, continue with cluster creation on the... ...a boot failure has occurred, or the PCIe link is broken.
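The condition in step 4 (every node reports node error 578 or 550) is an all-of check. A small sketch under stated assumptions: the input is a hypothetical mapping from a node name to the node error that node currently reports, and gathering those values is outside the scope of the example.

```python
# Node errors that, when reported by every node, point to cluster recovery.
RECOVERY_NODE_ERRORS = frozenset({550, 578})


def all_nodes_need_recovery(node_errors: dict) -> bool:
    """Return True only if every node reports node error 578 or 550.

    `node_errors` maps a node name (hypothetical) to its reported error code.
    """
    if not node_errors:
        # No nodes reported: treat as "condition not met", not vacuously true.
        return False
    return all(code in RECOVERY_NODE_ERRORS for code in node_errors.values())


if __name__ == "__main__":
    print(all_nodes_need_recovery({"node1": 578, "node2": 550}))
```

The explicit empty-input guard matters because `all()` over an empty collection is true in Python, which would wrongly suggest recovery when no node status is known.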
  • Page 155: Cluster Recovery And States

    2. Reconfigure your SAN zoning so that only the Storwize V7000 nodes, the host system ports, and the storage system ports to which you want to connect are visible to the node that is reporting the... 1. Determine status of other node. 2. Restart or replace the node if it has failed (should be node error on partner).
  • Page 157: Appendix. Accessibility

    – Press Enter to launch the action. For filter panes: – Press Tab to navigate to the filter panes. – Press the Up or Down Arrow keys to change the filter or navigation for nonselection.
  • Page 158 – Press Tab to navigate to the fields that are available for editing. – Type your edit and press Enter to issue the change command. Accessing the publications You can find the HTML version of the IBM Storwize V7000 information at the following website: publib.boulder.ibm.com/infocenter/storwize/ic/index.jsp You can access this information using screen-reader software and a digital speech synthesizer to hear what is displayed on the screen.
  • Page 159: Notices

    Consult your local IBM representative for information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM product, program, or service may be used. Any functionally equivalent product, program, or service that does not infringe any IBM intellectual property right may be used instead.
  • Page 160 IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring any obligation to you. Licensees of this program who wish to have information about it for the purpose of enabling: (i) the exchange of information between independently created...
  • Page 161: Trademarks

    IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these programs. The sample programs are provided "AS IS", without warranty of any kind. IBM shall not be liable for any damages arising out of your use of the sample programs.
  • Page 162: Industry Canada Compliance Statement

    Properly shielded and grounded cables and connectors must be used in order to meet FCC emission limits. IBM is not responsible for any radio or television interference caused by using other than recommended cables and connectors, or by unauthorized changes or modifications to this equipment.
  • Page 163: Germany Electromagnetic Compatibility Directive

    ...complies with Class A. To ensure this, the devices must be installed and operated as described in the manuals. Furthermore, only cables recommended by IBM may be connected. IBM assumes no responsibility for compliance with the protection requirements if the product is modified without the consent of IBM or...
  • Page 164: Japan Vcci Council Class A Statement

    This apparatus is manufactured to the International Safety Standard EN60950 and as such is approved in the U.K. under approval number NS/G/1234/J/100003 for indirect connection to public telecommunications systems in the United Kingdom. Korean Communications Commission (KCC) Class A Statement
  • Page 165: Russia Electromagnetic Interference (Emi) Class A Statement

    Fax: 0049 (0)711 785 1283 Email: tjahn@de.ibm.com Taiwan Contact Information This topic contains the product service contact information for Taiwan. IBM Taiwan Product Service Contact Information: IBM Taiwan Corporation 3F, No 7, Song Ren Rd., Taipei Taiwan Tel: 0800-016-888...
  • Page 167: Index

    China 148 indicators 4 location information 22 enclosure hardware 1 subscribe components 5 contact information notifications 22 identification 47 European 149 troubleshooting 19 enclosure end cap Taiwan 149 warranty agreement replacing 99 maintenance agreement 22
  • Page 168 16 status 55 rear Fibre Channel ports 9 Fibre Channel ports 9 node canister 14 errors 68 parts rear-panel indicators 9 fixing removing system status 49 node errors 57 overview 79 preparing 79
  • Page 169 62 restore 65 power-on self-test 115 reseating servicing 63 powering off node canister 60 Storwize V7000 library system 61 reset service assistant password 36 related publications xii problem reset service IP address 35 summary of changes xi, xii...
  • Page 170 CLI 70 warranty agreement best practices 22 when to use cluster (system) CLI 33 management GUI interface 30 service assistant 31 service CLI 33 USB key 34 worldwide port names (WWPNs) description 10
  • Page 172 Printed in USA GC27-2291-02...