IBM Storwize V7000 Unified Problem Determination Manual

IBM Storwize V7000 Unified
Problem Determination Guide
GA32-1057-07

Summary of Contents for IBM Storwize V7000 Unified

  • Page 1 IBM Storwize V7000 Unified Problem Determination Guide GA32-1057-07...
  • Page 2 Before using this information and the product it supports, read the general information in “Notices” on page 309, the information in the “Safety and environmental notices” on page xi, as well as the information in the IBM Environmental Notices and User Guide, which is provided on a DVD.
  • Page 3: Table Of Contents

    Emphasis; Storwize V7000 Unified library and related publications; Removing and replacing file module components; Resolving hard disk drive problems; Monitoring memory usage on a file module.
  • Page 4 Working with NFS clients that fail to mount NFS shares after a client IP change; Procedure: Fixing node errors; Procedure: Changing the service IP address of a node canister. Storwize V7000 Unified: Problem Determination Guide Version...
  • Page 5 Working with file modules that report a stale NFS file handle; Appendix. Accessibility features for IBM Storwize V7000 Unified; File module-related issues; Restoring System x firmware (BIOS) settings; Notices.
  • Page 7: Figures

    ServeRAID M1000 advanced feature key and M1015 adapter; ServeRAID M5000 advanced feature key and M5014 adapter; SAS cable; Removing a rail assembly from a rack cabinet. © Copyright IBM Corp.
  • Page 9: Tables

    Error code port location mapping; Fibre Channel cabling from the file module to the control enclosure; LED states and associated actions; Storwize V7000 Unified logical devices and physical port locations; Hostname and service IP reference.
  • Page 11: Safety And Environmental Notices

    DANGER A danger notice indicates the presence of a hazard that has the potential of causing death or serious personal injury. (D002) 2. Locate IBM Systems Safety Notices with the user publications that were provided with the Storwize V7000 Unified hardware.
  • Page 12 Les sikkerhetsinformasjonen (Safety Information) før du installerer dette produktet. Antes de instalar este produto, leia as Informações sobre Segurança. Antes de instalar este producto, lea la información de seguridad. Läs säkerhetsinformationen innan du installerar den här produkten.
  • Page 13: Safety Statements

    Safety statements Each caution and danger statement in this document is labeled with a number. This number is used to cross reference an English-language caution or danger statement with translated versions of the caution or danger statement in the Safety Information document.
  • Page 14 Statement 2 CAUTION: When replacing the lithium battery, use only IBM Part Number 33F8354 or an equivalent type battery recommended by the manufacturer. If your system has a module containing a lithium battery, replace it only with the same module type made by the same manufacturer.
  • Page 15 DANGER Some laser products contain an embedded Class 3A or Class 3B laser diode. Note the following. Laser radiation when open. Do not stare into the beam, do not view directly with optical instruments, and avoid direct exposure to the beam. Class 1 Laser Product Laser Klasse 1 Laser Klass 1...
  • Page 16 240 V under any distribution fault condition. Important: This product is not suitable for use with visual display workplace devices according to Clause 2 of the German Ordinance for Work with Visual Display Units.
  • Page 17: Sound Pressure

    Sound pressure Attention: Depending on local conditions, the sound pressure can exceed 85 dB(A) during service operations. In such cases, wear appropriate hearing protection.
  • Page 19: About This Guide

    V7000 Unified. The chapters that follow introduce you to the hardware components and to the tools that assist you in troubleshooting and servicing the Storwize V7000 Unified, such as the management GUI and the service assistant. The troubleshooting procedures can help you analyze failures that occur in a Storwize V7000 Unified system.
  • Page 20: Storwize V7000 Unified Library And Related Publications

    Storwize V7000 Unified library Unless otherwise noted, the publications in the Storwize V7000 Unified library are available in Adobe portable document format (PDF) from the following website: www.ibm.com/storage/support/storwize/v7000/unified Each of the PDF publications in Table 1 is available in this information center by clicking the number in the “Order number”...
  • Page 21 ... Machine Code, SC28-6872 (contains Z125-5468): contains the License Agreement for Machine Code for the Storwize V7000 Unified product. Other IBM publications Table 2 on page xxii lists IBM publications that contain information related to the Storwize V7000 Unified.
  • Page 22: How To Order Ibm Publications

    Some publications are available for you to view or download at no charge. You can also order publications. The publications center displays prices in your local currency. You can access the IBM Publications Center through the following website: www.ibm.com/e-business/linkweb/publications/servlet/pbi.wss...
  • Page 23: Sending Your Comments

    To submit any comments about this book or any other Storwize V7000 Unified documentation: • Go to the feedback page on the website for the Storwize V7000 Unified Information Center at publib.boulder.ibm.com/infocenter/storwize/unified_ic/index.jsp?topic=/com.ibm.storwize.v7000.unified.doc/feedback_ifs.htm. There you can use the feedback page to enter and submit comments or browse to the topic and use the feedback link in the running footer of that page to identify the topic for which you have a comment.
  • Page 25: Chapter 1. Storwize V7000 Unified Hardware Components

    Chapter 1. Storwize V7000 Unified hardware components A Storwize V7000 Unified system consists of one or more machine type 2076 rack-mounted enclosures and two machine type 2073 rack-mounted file modules. There are several model types for the 2076 machine type. The main differences among the model types are the following items: • The number of drives that an enclosure can hold.
  • Page 27: Chapter 2. Best Practices For Troubleshooting

    Use this address if the control enclosure CLI is not working. These addresses are not set during the installation of a Storwize V7000 Unified system, but you can set these IP addresses later by using the chserviceip CLI command.
  • Page 28: Follow Power Management Procedures

    RAID arrays for the disk system. The Storwize V7000 Unified system uses a pair of file modules for redundancy. Follow the appropriate power down procedures to minimize impacts to the system operations.
  • Page 29: Back Up Your Data

    IBM automatically opens a problem report, and if appropriate, contacts you to verify if replacement parts are required. If you set up Call Home to IBM, ensure that the contact details that you configure are correct and kept up to date as personnel change.
  • Page 30: Keep Your Software Up To Date

    Know your IBM warranty and maintenance agreement details If you have a warranty or maintenance agreement with IBM, know the details that must be supplied when you call for support. Have the phone number of the support center available. When you call support, provide the machine type and the serial number of the enclosure or file module that has the problem.
  • Page 31 Support personnel also ask for your customer number, machine location, contact details, and the details of the problem.
  • Page 33: Chapter 3. Getting Started Troubleshooting

    If users or applications are having trouble accessing data that is held on the Storwize V7000 Unified system, or if the management GUI is not accessible or is running slowly, the Storwize V7000 control enclosure might have a problem.
  • Page 34: Installation Troubleshooting

    169; otherwise, see “Checking the GPFS file system mount on each file module” on page 171. If you have lost access to the files, but there is no sign that anything is wrong with the Storwize V7000 Unified system, see “Host to file modules connectivity” on page 25. Installation troubleshooting This topic provides information for troubleshooting problems encountered during the installation.
  • Page 35 – Product Family: Disk Systems – Product: IBM Storwize V7000 Unified – Release: All – Platform: All Before loading the USB flash drive, verify that it has a FAT32 formatted file system. Plug the USB flash drive into the laptop. Go to Start (My Computer) and right-click the USB drive.
  • Page 36: Installed

    SONAS_results.txt file and open it. Check for errors and corrective actions (refer to Storwize V7000 Unified Problem Determination Guide PDF on the CD). If no errors are listed, reboot both file modules, allow file modules to boot completely, reinsert the USB flash drive as originally instructed and try again.
  • Page 37: Installation Error Codes

    3. Refer to Table 5 to match the code (A-H) to the recommended action. Follow the suggested action, in order, completing one before trying the next. 4. If the recommended action or actions fail, call the IBM Support Center. Table actions defined This table serves as a legend for defining the precise action to follow.
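The ordering rule above (match the code A-H, then try the actions in the order given) can be sketched in Python. This is only an illustration: the function name `parse_action_key` is hypothetical, and it assumes that an action key is written as single letters joined by the word "then", as in the "D then A then B" entry that appears later in the error table.

```python
import re

# Action codes A-H, as named in the legend table (assumption: one letter each).
VALID_ACTIONS = set("ABCDEFGH")


def parse_action_key(action_key: str) -> list[str]:
    """Split an action key such as 'D then A then B' into ordered codes."""
    parts = re.split(r"\bthen\b", action_key, flags=re.IGNORECASE)
    codes = [part.strip().upper() for part in parts]
    for code in codes:
        if code not in VALID_ACTIONS:
            raise ValueError(f"unknown action code: {code!r}")
    return codes


if __name__ == "__main__":
    # Work through the actions in order, completing one before the next.
    for step, code in enumerate(parse_action_key("D then A then B"), start=1):
        print(f"step {step}: perform action {code}")
```

Keeping the result as an ordered list mirrors the instruction to complete one action before trying the next.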
  • Page 38 Verify that the Ethernet cabling connections are seated properly between the Storwize V7000 Unified control enclosure and the customer network, as well as the file modules cabling to the customer network. Then reinsert the USB flash drive into the original file module.
  • Page 39
    0AAF Unable to get node roles from VPD.
    0AB0 Error opening /etc/sysconfig/rsyslog.
    0AB1 Error writing to /etc/sysconfig/rsyslog.
    0AB2 Error reading /etc/rsyslog.conf.
    0AB3 Unable to open /opt/IBM/sonas/etc/rsyslog_template_mgmt.conf.
    0AB4 Unable to open /opt/IBM/sonas/etc/rsyslog_template_int.conf.
    0AB5 Unable to open /opt/IBM/sonas/etc/rsyslog_template_strg.conf.
    0AB6 Unknown node roles.
  • Page 40
    Trying to install management stack on non-management node.
    0AF9 Invalid site ID. Curently only 'st001' is supported on physical systems.
    0AFA This node is already a part of a cluster. Unable to configure.
  • Page 41 Table 6. Error messages and actions (continued)
    Error code | Error message | Action key
    0AFB Unable to generate public/private keys.
    0AFC Unable to copy user SSH keys.
    0AFD Unable to copy host SSH keys.
    0AFE Unable to set the system's timezone.
    0AFF Unable to write clock file.
  • Page 42
    Storage controllers may be cabled incorrectly or UUIDs might not be set properly.
    0B95 Invalid parameters.
    0B96 Failed to configure the management processes on mgmt001st001. Action key: D then A then B
    0B97 IP is invalid.
    0B98 Netmask is invalid.
  • Page 43 Table 6. Error messages and actions (continued)
    Error code | Error message | Action key
    0B99 IP, gateway, and netmask are not a valid combination.
    0B9A There was an internal error.
    0B9B Invalid NAS private key file.
    0B9C Unable to copy the NAS private key file.
    0B9D Internal error setting permissions on NAS private key file.
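Before rerunning the installation after error 0B97, 0B98, or 0B99, it can help to check the planned addresses offline. The sketch below uses Python's standard `ipaddress` module. The function name `check_ip_settings` is hypothetical, and the rule it applies (the gateway must lie in the subnet defined by the IP and netmask and must differ from the IP itself) is an assumption about what "a valid combination" means, not the installer's documented check.

```python
import ipaddress


def check_ip_settings(ip: str, netmask: str, gateway: str) -> list[str]:
    """Return a list of problems found; an empty list means none were found."""
    try:
        addr = ipaddress.IPv4Address(ip)
    except ValueError:
        return ["IP is invalid"]
    try:
        # strict=False lets us pass a host address rather than a network address.
        network = ipaddress.IPv4Network(f"{ip}/{netmask}", strict=False)
    except ValueError:
        return ["Netmask is invalid"]
    combo = "IP, gateway, and netmask are not a valid combination"
    try:
        gw = ipaddress.IPv4Address(gateway)
    except ValueError:
        return [combo]
    # Assumed rule: gateway inside the subnet and not equal to the host address.
    if gw not in network or gw == addr:
        return [combo]
    return []


if __name__ == "__main__":
    # Example addresses are hypothetical.
    print(check_ip_settings("192.168.1.10", "255.255.255.0", "192.168.1.1"))
    print(check_ip_settings("192.168.1.10", "255.255.255.0", "10.0.0.1"))
```

The returned strings reuse the wording of the error messages in the table so that a failed check is easy to match to its error code.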
  • Page 44: Problems Reported By The Cli Commands During Software Configuration

    Use this information when troubleshooting problems reported by the CLI commands during software configuration. The following table contains error messages that might be displayed when running the CLI commands during software configuration.
  • Page 45: Management Gui Wizard Failure

    Table 7. CLI command problems
    CLI Command: mkfs
    Symptom/Message: SG0002C Command exception found : Disk <arrayname> might still belong to file system
    Action: This message indicates that the arrays listed in the error message appear to already be part of a file system.
  • Page 46: Gui Access Issues

    1. Does the GUI launch and are there problems logging into the system? • Yes: Check that the user ID being used was set up to access the GUI. Refer to “Authentication basic concepts” in the IBM Storwize V7000 Unified Information Center.
  • Page 47: Health Status And Recovery

    – Yes: Run the CLI command lshealth. Reference the active management node Hostname (mgmt001st001 or mgmt002st001) obtained from the lsnode command. Ensure that HOST_STATE, SERVICE, and NETWORK from lshealth are set to OK. Sample Output: mgmt001st001 HOST_STATE SERVICE All services are running OK CTDB CTDBSTATE_STATE_ACTIVE GPFS...
  • Page 48 About this task Within the Storwize V7000 Unified system, the system Health Status is based on a set of predefined software and hardware health status sensors that are reflected in the System Details page under the Status section for the corresponding logical host name.
  • Page 49: Connectivity Issues

    a. Review the Sensor column and the Level column for Critical Error, Major Warning, or Minor Warning items. If the problem that caused the Level item is resolved, right-click the event and select the Mark Event as Resolved action. b. Follow the online instructions to complete the change. c.
  • Page 50: Ethernet Connectivity Between File Modules

    108 and “Installing a PCI adapter in a PCI riser-card assembly” on page 109. Ethernet connectivity between file modules This topic covers troubleshooting Ethernet connectivity issues between the file modules. These connections are used for internal management operations between...
  • Page 51: File Module Ethernet Direct Connections

    They make use of the Internal IP address range that you provided during initializing the Storwize V7000 Unified system. About this task This procedure is used to troubleshoot Ethernet connectivity between the file modules. These network paths are used for all internal file system communication.
  • Page 52 It is always possible that somebody at your site could set up another machine to use one or more IP addresses that your Storwize V7000 Unified system is already using. Use the management GUI to check which four IP addresses the file modules are currently using to communicate with each other.
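Once the four internal addresses are known, comparing them with the addresses of other machines on the network is a simple set intersection. The following Python sketch shows that comparison. The function name `find_conflicts` and every address in the example are hypothetical, and the list of other machines' addresses must come from your own network inventory, because nothing here queries the system.

```python
import ipaddress


def find_conflicts(file_module_ips, other_ips):
    """Return the addresses that appear in both lists, as sorted strings."""
    in_use = {ipaddress.ip_address(ip) for ip in file_module_ips}
    others = {ipaddress.ip_address(ip) for ip in other_ips}
    return [str(addr) for addr in sorted(in_use & others)]


if __name__ == "__main__":
    # Hypothetical internal addresses used by the two file modules.
    internal = ["10.254.8.1", "10.254.8.2", "10.254.8.3", "10.254.8.4"]
    # Hypothetical addresses collected from other machines at the site.
    site = ["10.254.8.3", "192.168.0.5"]
    print(find_conflicts(internal, site))
```

Parsing each entry with `ipaddress.ip_address` also rejects malformed addresses early, before they are compared.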
  • Page 53: Ethernet Connectivity From File Modules To The Control Enclosure

    If you cannot stop other machines on your network from using these IP addresses and must change the internal IP address range, contact IBM Remote Technical Support to help you put your file modules back to an out-of-box state so that you can choose a different internal IP address range.
  • Page 54 USB flash drive to discover the state and settings of the Storwize V7000. Make sure that there is no satask.txt file on the USB flash drive before you plug it into the control enclosure.
  • Page 55 CLI command). Otherwise, you may have plugged the USB flash drive into the wrong control enclosure (such as one that is not part of this Storwize V7000 Unified system). The node_status should be active for each node canister in the cluster under sainfo lsservicestatus. Otherwise, follow the service action under sainfo lsservicerecommendation.
  • Page 56 Update the file module's record of the control enclosure system IP: To find the file module's current record of the control enclosure system IP address, use the Storwize V7000 Unified management CLI to issue the lsstoragesystem command. Here is an example: >ssh admin@<management_IP>...
  • Page 57 Verify that communication from the file module to the control enclosure is now possible by running the lssystemip command on the Storwize V7000 Unified management CLI:
    >ssh admin@<management IP address>
    [kd01ghf.ibm]$ lssystemip
    Changing the cluster IP of the file modules: If the cluster IP address of the file modules is not known, or has been incorrectly set, the value can be changed by logging into the system using a console.
  • Page 58: Fibre Channel Connectivity Between File Modules And Control Enclosure

    Each file module has a dual port Fibre Channel adapter card located in PCI slot 2. Both ports are used to connect to the Storwize V7000 control enclosure with a connection going to each control canister.
  • Page 59: Diagram Shows How To Connect The File Modules To The Control Enclosure Using Fibre Channel Cables. (A) Is File Module 1 And (B) Is File Module 2. (C) Is The Control Enclosure

    Figure 3. Diagram shows how to connect the file modules to the control enclosure using Fibre Channel cables. (A) is file module 1 and (B) is file module 2. (C) is the control enclosure. The figure carries the label: CAUTION: Disconnect all supply power for complete isolation.
  • Page 60: Error Code Port Location Mapping

    2. Run the command: locatenode #HOSTNAME on #SECONDS. #HOSTNAME is the hostname associated with the error, either mgmt001st001 or mgmt002st001. #SECONDS is the number of seconds for the LED indicator to be turned on. Physical connection and repair:
  • Page 61: Fibre Channel Cabling From The File Module To The Control Enclosure

    Each file module has a dual port Fibre Channel adapter card located in PCI slot 2. Both ports are used to connect to the Storwize V7000 system with a connection going to each Storwize V7000 node canister. Table 12. Fibre Channel cabling from the file module to the control enclosure. File Module Node # 1 File Module Storage Node # 2 PCI slot #2, port 1...
  • Page 62: Understanding Led Hardware Indicators

    [Figure: front view of the file module. Labels: hard disk drive status LED (amber), rack release latches, hard disk drive bays (Bay 0, Bay 11), CD/DVD drive activity LED, CD/DVD (optical drive) eject button.]
  • Page 63 2. To view the light path diagnostics panel, slide the latch to the left on the front of the operator information panel and pull the panel forward. This reveals the light path diagnostics panel. Lit LEDs on this panel indicate the type of error that has occurred.
  • Page 64 12v channel error LEDs indicate an overcurrent condition. Refer to the procedure “Solving power problems” in the “Troubleshooting the System x3650” in the IBM Storwize V7000 Unified Information Center to identify the components that are associated with each power channel, and the order in which to troubleshoot the components.
  • Page 65 Light path diagnostics LEDs LEDs on the light path diagnostics panel of a Storwize V7000 Unified file module indicate the cause of a problem. About this task Table 15 shows suggested actions to correct detected problems. Note: Check the system-event log for additional information before you replace a FRU.
  • Page 66: Led Indicators, Corresponding Problem Causes, And Corrective Actions

    LINK Reserved. An error message has been written to the system-event log. Action: Check the system logs for information about the error. Replace any components that are identified in the error logs.
  • Page 67 The power supplies are using more power than their maximum rating. Action: ... on “Power problems” in the appropriate server guide in “Troubleshooting the System x3650” in the IBM Storwize V7000 Unified Information Center. (For the location of power channel error LEDs, see the section on “Internal...
  • Page 68
    • One power supply
    • Power cord
    • Three cooling fans
    • One PCI riser-card assembly in PCI riser connector 2
    The following illustration shows the locations of the power-supply LEDs.
  • Page 69 Refer to “Removing and replacing parts” on page 85 to determine which components are customer replaceable units (CRU) and which components are field replaceable units (FRU). • Go to the IBM support website at www.ibm.com/storage/support/storwize/v7000/unified to check for technical information, hints, tips, and new device drivers, or to submit a request for information.
  • Page 70: Enclosure Hardware Indicators

    Refer to “Removing and replacing parts” on page 85 to determine which components are customer replaceable units (CRU) and which components are field replaceable units (FRU). • Go to the IBM support website at www.ibm.com/storage/support/storwize/v7000/unified to check for technical information, hints, tips, and new device drivers, or to submit a request for information.
  • Page 71: Leds On The Power Supply Units Of The Control

    Figure 4. LEDs on the power supply units of the control enclosure
    Table 17. Power-supply unit LEDs
    Power supply ac failure dc failure failure Status Action
    Status: Communication failure between the power supply unit and enclosure chassis. Action: Replace the power supply unit. If failure is still present, replace the enclosure chassis.
  • Page 72: Power-Supply Unit Leds

    LEDs also flash. Table 18 on page 49 shows the three canister status LEDs on each of the node canisters. Figure 5 on page 49 shows the LEDs on the node canister.
  • Page 73: Leds On The Node Canisters

    Figure 5. LEDs on the node canisters Table 18. Power LEDs Power LED status Description There is no power to the canister. Try reseating the canister. Go to “Procedure: Reseating a node canister” on page 222. If the state persists, follow the hardware replacement procedures for the parts in the following order: node canister, enclosure chassis.
  • Page 74
    Battery Good | Battery Fault | Description | Action
    Description: Battery is good and fully charged. Action: None.
    LED state: Flashing. Description: Battery is good but not fully charged. The battery is either charging or a maintenance discharge is being performed. Action: None.
  • Page 75: Management Gui Interface

    Table 20. Control enclosure battery LEDs (continued)
    Battery Good | Battery Fault | Description | Action
    Description: Nonrecoverable battery fault. Action: Replace the battery. If replacing the battery does not fix the issue, replace the power supply unit.
    LED state: Flashing. Description: Recoverable battery fault. Action: None.
    LED states: Flashing, Flashing. Action: None. Description: The battery cannot be used because the firmware for the...
  • Page 76: When To Use The Management Gui

    GUI to resolve the problem. Always use the fix procedures for both system configuration problems and hardware failures. The fix procedures analyze the system to ensure that the required changes do not cause volumes to be...
  • Page 77: Accessing The Storwize V7000 Unified Management Gui

    You can use fix procedures to diagnose and resolve problems with the Storwize V7000 Unified. About this task For example, to repair a Storwize V7000 Unified system, you might perform the following tasks: • Analyze the event log • Replace failed components...
  • Page 78 Many of the file module fix procedures are not automated. In these cases, you are directed to a documented procedure in the Storwize V7000 Unified Information Center. The example uses the management GUI to repair a Storwize V7000 Unified system. Perform the following steps to start the fix procedure: Procedure 1.
  • Page 79: Chapter 4. File Module

    3. The node reboot restarts all services that were previously running. Removing a file module to perform a maintenance action You can remove an IBM Storwize V7000 Unified file module to perform maintenance. The procedure that you follow differs slightly, depending on whether you must unplug the power cables.
  • Page 80 Removing a file module and disconnecting power You can remove an IBM Storwize V7000 Unified file module and disconnect it from its power line cords before performing a maintenance action that requires the file module to have no power.
  • Page 81 To remove the mgmt001st001 file module from the system, for example, issue the following command:
    # suspendnode mgmt001st001
    3. Wait for the Storwize V7000 Unified system to stop the file module at the clustered trivial database (CTDB) level. The command does not unmount any mounted file systems.
  • Page 82: Removing And Replacing File Module Components

    FRUs must be installed by trained service technicians. About this task Installation guidelines To help you work safely with IBM Storwize V7000 Unified file modules, read the safety information in “Safety” on page xi, “Safety statements” on page xiii, and these guidelines.
  • Page 83
    • Do not attempt to lift an object that you think is too heavy for you. If you have to lift a heavy object, observe the following precautions:
      – Make sure that you can stand safely without slipping.
      – Distribute the weight of the object equally between your feet.
      –...
  • Page 84 Take additional care when handling devices during cold weather. Heating reduces indoor humidity and increases static electricity. Returning a device or component When returning a device or component, follow all packaging instructions and use any supplied packaging materials for shipping.
  • Page 85: Resolving Hard Disk Drive Problems

    Resolving hard disk drive problems Use this information to address various hard disk drive issues. About this task • Before running a procedure, refer to “Removing a file module to perform a maintenance action” on page 55. • Follow the suggested actions for a Symptom in the order in which they are listed in the Action column until the problem is solved.
  • Page 86 Turn on the server and observe the activity of the hard disk drive LEDs. Displaying node mirror and hard drive status The Storwize V7000 Unified system provides a method to check the node mirror status and hard drive status for each file module.
  • Page 87: Selecting A File Module To Display Node Status

    1. Ensure that you are logged into the file module as root.
    2. To display mirror status and hard drive status, run the following perl script:
       # /opt/IBM/sonas/bin/cnrspromptnode.pl -a -c "/opt/IBM/sonas/bin/cnrsQueryNodeDrives.pl"
       File modules in this Storwize V7000 Unified Cluster
       Node Node Name Node Details
       --------------------------------------------------------------------------------
       1.
  • Page 88: Displaying Node Status

    The volume is Active. The user data is not fully protected due to a configuration change or drive failure.
    Rebuilding (RBLD) or Resyncing (RSY): A data resynchronization or rebuild might be in progress.
  • Page 89: Re-Synchronizing

    Table 21. Status of volume (continued)
    Status of volume | Description
    Inactive, Okay (OKY): The volume is inactive and the drives are functioning correctly. The user data is protected if the current RAID level is RAID 1 (IM) or RAID 1E (IME).
    Inactive, Degraded (DGD): The volume is inactive and the user data is not fully protected due...
  • Page 90: Example That Shows That Mirroring Is

    SMART ASCQ : none Figure 8. Example that shows that mirroring is re-synchronizing If a drive were not synchronized, the status might appear like the status shown in Figure 9 on page 67:
  • Page 91: Example That Shows That A Drive Is Not Synchronized

    The mirror is not created/configured. If the mirror is not created, refer to “Troubleshooting the System x3650” in the IBM Storwize V7000 Unified Information Center for information on launching the LSI configuration tool.
  • Page 92: Example That Shows That The Mirror Is Not Created

    ASC/ASCQ error of 05/00. For isolation and the repair of hard disk problems, refer to “Troubleshooting the System x3650” in the IBM Storwize V7000 Unified Information Center. For a list of SMART (ASC/ASCQ) error codes and their descriptions, go to “SMART ASC/ASCQ error codes and messages”...
  • Page 93
    Device is a Hard disk
    Enclosure #
    Slot #
    Connector ID
    Target ID
    State : Online (ONL)
    Size (in MB)/(in sectors) : 286102/585937500
    Manufacturer : IBM-ESXS
    Model Number : MBD2300RC
    Firmware Revision : SB19
    Serial No : D009P9A01SJC
    Drive Type : SAS
    Protocol...
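The two numbers in the Size field can be cross-checked against each other, which is a quick way to confirm that the output was read correctly. The Python sketch below does that under two assumptions that the guide does not state: that a sector is 512 bytes, and that "MB" here means 1024 × 1024 bytes. The sample values agree with those assumptions. The function names are hypothetical.

```python
SECTOR_BYTES = 512  # assumption: 512-byte sectors
MB_BYTES = 1024 * 1024  # assumption: "MB" in the output means 2**20 bytes


def parse_size_field(value: str) -> tuple[int, int]:
    """Parse a 'Size (in MB)/(in sectors)' value such as '286102/585937500'."""
    mb_text, sectors_text = value.split("/")
    return int(mb_text), int(sectors_text)


def sectors_to_mb(sectors: int) -> int:
    """Convert a sector count to whole megabytes, rounding down."""
    return sectors * SECTOR_BYTES // MB_BYTES


if __name__ == "__main__":
    reported_mb, sectors = parse_size_field("286102/585937500")
    # 585937500 sectors * 512 bytes = 300,000,000,000 bytes
    print(sectors * SECTOR_BYTES)
    print(sectors_to_mb(sectors) == reported_mb)
```

With these assumptions, the sample drive works out to 300,000,000,000 bytes.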
  • Page 94: Smart Asc/Ascq Error Codes And Messages

    NO REFERENCE POSITION FOUND
    MULTIPLE PERIPHERAL DEVICES SELECTED
    LOGICAL UNIT COMMUNICATION FAILURE
    LOGICAL UNIT COMMUNICATION TIME-OUT
    LOGICAL UNIT COMMUNICATION PARITY ERROR
    LOGICAL UNIT COMMUNICATION CRC ERROR (ULTRA-DMA/32)
    UNREACHABLE COPY TARGET
    TRACK FOLLOWING ERROR
  • Page 95 Table 23. SMART ASC/ASCQ error codes and messages (continued)
    ASCQ Description
    HEAD SELECT FAULT
    ERROR LOG OVERFLOW
    WARNING
    WARNING - SPECIFIED TEMPERATURE EXCEEDED
    WARNING - ENCLOSURE DEGRADED
    WARNING - BACKGROUND SELF-TEST FAILED
    WARNING - BACKGROUND PRE-SCAN DETECTED MEDIUM ERROR
    WARNING - BACKGROUND MEDIUM SCAN DETECTED MEDIUM ERROR
    WARNING - NON-VOLATILE CACHE NOW VOLATILE
    WARNING - DEGRADED POWER TO NON-VOLATILE CACHE...
  • Page 96
    RECOVERED DATA WITH ERROR CORR. & RETRIES APPLIED
    RECOVERED DATA - DATA AUTO-REALLOCATED
    RECOVERED DATA - RECOMMEND REASSIGNMENT
    RECOVERED DATA - RECOMMEND REWRITE
    RECOVERED DATA WITH ECC - DATA REWRITTEN
    DEFECT LIST ERROR
  • Page 97 Table 23. SMART ASC/ASCQ error codes and messages (continued)
    ASCQ Description
    DEFECT LIST NOT AVAILABLE
    DEFECT LIST ERROR IN PRIMARY LIST
    DEFECT LIST ERROR IN GROWN LIST
    PARAMETER LIST LENGTH ERROR
    SYNCHRONOUS DATA TRANSFER ERROR
    DEFECT LIST NOT FOUND
    PRIMARY DEFECT LIST NOT FOUND
    GROWN DEFECT LIST NOT FOUND
    MISCOMPARE DURING VERIFY OPERATION
    MISCOMPARE VERIFY OF UNMAPPED LBA...
  • Page 98
    COMMAND SEQUENCE ERROR
    ILLEGAL POWER CONDITION REQUEST
    PREVIOUS BUSY STATUS
    PREVIOUS TASK SET FULL STATUS
    PREVIOUS RESERVATION CONFLICT STATUS
    ORWRITE GENERATION DOES NOT MATCH
    COMMANDS CLEARED BY ANOTHER INITIATOR
    COMMANDS CLEARED BY POWER LOSS NOTIFICATION
  • Page 99 Table 23. SMART ASC/ASCQ error codes and messages (continued)
    ASCQ Description
    COMMANDS CLEARED BY DEVICE SERVER
    INCOMPATIBLE MEDIUM INSTALLED
    CANNOT READ MEDIUM - UNKNOWN FORMAT
    CANNOT READ MEDIUM - INCOMPATIBLE FORMAT
    CLEANING CARTRIDGE INSTALLED
    CANNOT WRITE MEDIUM - UNKNOWN FORMAT
    CANNOT WRITE MEDIUM - INCOMPATIBLE FORMAT
    CANNOT FORMAT MEDIUM - INCOMPATIBLE MEDIUM
    CLEANING FAILURE...
  • Page 100
    SCSI PARITY ERROR
    DATA PHASE CRC ERROR DETECTED
    SCSI PARITY ERROR DETECTED DURING ST DATA PHASE
    INFORMATION UNIT IUCRC ERROR DETECTED
    ASYNCHRONOUS INFORMATION PROTECTION ERROR DETECTED
    PROTOCOL SERVICE CRC ERROR
    PHY TEST FUNCTION IN PROGRESS
  • Page 101 Table 23. SMART ASC/ASCQ error codes and messages (continued)
    ASCQ Description
    SOME COMMANDS CLEARED BY ISCSI PROTOCOL EVENT
    INITIATOR DETECTED ERROR MESSAGE RECEIVED
    INVALID MESSAGE ERROR
    COMMAND PHASE ERROR
    DATA PHASE ERROR
    INVALID TARGET PORT TRANSFER TAG RECEIVED
    TOO MUCH WRITE DATA
    ACK/NAK TIMEOUT
    NAK RECEIVED
    DATA OFFSET ERROR...
  • Page 102 DATA CHANNEL IMPENDING FAILURE DATA ERROR RATE TOO HIGH DATA CHANNEL IMPENDING FAILURE SEEK ERROR RATE TOO HIGH DATA CHANNEL IMPENDING FAILURE TOO MANY BLOCK REASSIGNS DATA CHANNEL IMPENDING FAILURE ACCESS TIMES TOO HIGH Storwize V7000 Unified: Problem Determination Guide Version...
  • Page 103 Table 23. SMART ASC/ASCQ error codes and messages (continued) ASCQ Description DATA CHANNEL IMPENDING FAILURE START UNIT TIMES TOO HIGH DATA CHANNEL IMPENDING FAILURE CHANNEL PARAMETRICS DATA CHANNEL IMPENDING FAILURE CONTROLLER DETECTED DATA CHANNEL IMPENDING FAILURE THROUGHPUT PERFORMANCE DATA CHANNEL IMPENDING FAILURE SEEK TIME PERFORMANCE DATA CHANNEL IMPENDING FAILURE SPIN-UP RETRY COUNT DATA CHANNEL IMPENDING FAILURE DRIVE CALIBRATION...
  • Page 104: Monitoring Memory Usage On A File Module

    SA CREATION PARAMETER NOT SUPPORTED; AUTHENTICATION FAILED; LOGICAL UNIT ACCESS NOT AUTHORIZED; SECURITY CONFLICT IN TRANSLATED DEVICE. Monitoring memory usage on a file module: Use this procedure to monitor memory usage on a file module.
  • Page 105: Errors And Messages

    Understanding error codes The Storwize V7000 Unified error codes convey specific information in an alphanumeric sequence. Tip: Search for error codes or event IDs by using EFS on the front. For 66012FC, for example, search on EFS66012FC.
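As a minimal sketch of the search tip above, the following Python helper builds the search term by putting EFS on the front of an error code or event ID. The function name `to_search_term` is hypothetical and is not part of any Storwize V7000 Unified tool.

```python
def to_search_term(code: str) -> str:
    """Return the string to search for, given an error code or event ID."""
    code = code.strip()
    # Avoid doubling the prefix if the caller already supplied it.
    return code if code.startswith("EFS") else "EFS" + code


print(to_search_term("66012FC"))  # EFS66012FC
```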
  • Page 106: Originating Role Information

    Optional Ethernet port 7 (Dual Port 10G card); Fibre channel adapter 1 (both ports) – Storage node only; Fibre channel adapter 2 (both ports) – Storage node only; Bonded device (data0 mgmt0); System x internal hard disk drives
  • Page 107: Originating File Module Specific Software Code - Code

    Table 27. Originating file module specific software code – Code 1, 3, 5. Listing devices for variable C in the specific software code sequence of ABBCDDDD. C = Originating specific software code in sequence ABBCDDDD Code Device Red Hat Linux GPFS CIFS server CTDB...
  • Page 108: Understanding Event Ids

    Unique error code Severity of the error Understanding event IDs The Storwize V7000 Unified messages follow a specific format, which is detailed here. About this task Tip: Search for error codes or event IDs by using EFS on the front. For 66012FC, for example, search on EFS66012FC.
  • Page 109: File Module Hardware Problems

    I for Asynchronous Replication; J for SCM; L for HSM; AK for NDMP. The element nnnn is a 4-digit message number. The element x indicates the severity of the error. The value x can be: A for Action: GUI error messages. The user must perform a specific action. C for Critical: A critical error occurred which must be corrected by the user or system administrator.
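The message format described above can be sketched as a small parser. This is an illustration only: the function name and the regular expression are assumptions, the letters between EFS and the 4-digit number are treated as one opaque component field, and only the two severity letters quoted in this excerpt (A and C) are mapped.

```python
import re

# Severity letters named in the excerpt above; the full list is not shown here.
SEVERITIES = {"A": "Action", "C": "Critical"}

# Assumption: "EFS", then component letters, then a 4-digit number, then one severity letter.
MESSAGE_ID = re.compile(r"^EFS([A-Z]+)(\d{4})([A-Z])$")


def parse_message_id(message_id: str) -> dict:
    """Split a message ID such as EFSSG0654I into its elements."""
    match = MESSAGE_ID.match(message_id.strip())
    if match is None:
        raise ValueError(f"not a recognised message ID: {message_id!r}")
    component, number, severity = match.groups()
    return {
        "component": component,
        "number": number,
        "severity_code": severity,
        # Letters outside the excerpt are reported as unknown rather than guessed.
        "severity": SEVERITIES.get(severity, "unknown"),
    }


print(parse_message_id("EFSSG0654I"))
```

Message IDs that appear elsewhere in this guide, such as EFSSG0654I and EFSSG1000I, match this pattern; their severity letter is reported as unknown because its meaning is not given in the excerpt.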
  • Page 110: Components Identified As Customer Replaceable Units (Crus) And Field Replaceable Units (Frus)

    “Removing the fan bracket” on page 100 these. “Installing the fan bracket” on page 102 “Removing the IBM virtual media key” on page 103 “Installing the IBM virtual media key” on page 104 “Removing a PCI riser-card assembly” on page 105 “Installing a PCI riser-card assembly”...
  • Page 111 The following procedure is for a Tier 1 customer replaceable unit (CRU). Replacement of Tier 1 CRUs is your responsibility. If IBM installs a Tier 1 CRU at your request, you will be charged for the installation. Service agreements can be purchased so that you can ask IBM to replace these units.
  • Page 112 The following procedure is for a Tier 1 customer replaceable unit (CRU). Replacement of Tier 1 CRUs is your responsibility. If IBM installs a Tier 1 CRU at your request, you will be charged for the installation. Service agreements can be purchased so that you can ask IBM to replace these units.
  • Page 113 The following procedure is for a Tier 1 customer replaceable unit (CRU). Replacement of Tier 1 CRUs is your responsibility. If IBM installs a Tier 1 CRU at your request, you will be charged for the installation. Service agreements can be purchased so that you can ask IBM to replace these units.
  • Page 114 To remove the battery, complete the following procedure. Procedure 1. To help you work safely with Storwize V7000 Unified file module, read the safety information in “Safety” on page xi, “Safety statements” on page xiii, and “Installation guidelines” on page 58.
  • Page 115 9. Remove the battery: a. If there is a rubber cover on the battery holder, use your fingers to lift the battery cover from the battery connector. b. Use one finger to push the battery horizontally away from the PCI riser card in slot 2 and out of its housing.
  • Page 116 In the United States, IBM has established a return process for reuse, recycling, or proper disposal of used IBM sealed lead acid, nickel cadmium, nickel metal hydride, and other battery packs from IBM Equipment. For information on proper disposal of these batteries, contact IBM at 1-800-426-4333.
  • Page 117 The following procedure is for a Tier 1 customer replaceable unit (CRU). Replacement of Tier 1 CRUs is your responsibility. If IBM installs a Tier 1 CRU at your request, you will be charged for the installation. Service agreements can be purchased so that you can ask IBM to replace these units.
  • Page 118 To install the replacement battery, complete the following steps: Procedure 1. To help you work safely with Storwize V7000 Unified file module, read the safety information in “Safety” on page xi, “Safety statements” on page xiii, and “Installation guidelines” on page 58.
  • Page 119 The following procedure is for a Tier 1 customer replaceable unit (CRU). Replacement of Tier 1 CRUs is your responsibility. If IBM installs a Tier 1 CRU at your request, you will be charged for the installation. Service agreements can be purchased so that you can ask IBM to replace these units.
  • Page 120 Installing the microprocessor 2 air baffle The following procedure is for a Tier 1 customer replaceable unit (CRU). Replacement of Tier 1 CRUs is your responsibility. If IBM installs a Tier 1 CRU at...
  • Page 121 To install the microprocessor air baffle, complete the following steps. Procedure 1. To help you work safely with Storwize V7000 Unified file module, read the safety information in “Safety” on page xi, “Safety statements” on page xiii, and “Installation guidelines” on page 58.
  • Page 122 The following procedure is for a Tier 1 customer replaceable unit (CRU). Replacement of Tier 1 CRUs is your responsibility. If IBM installs a Tier 1 CRU at your request, you will be charged for the installation. Service agreements can be purchased so that you can ask IBM to replace these units.
  • Page 123 The following procedure is for a Tier 1 customer replaceable unit (CRU). Replacement of Tier 1 CRUs is your responsibility. If IBM installs a Tier 1 CRU at your request, you will be charged for the installation. Service agreements can be purchased so that you can ask IBM to replace these units.
  • Page 124 Removing the fan bracket The following procedure is for a Tier 1 customer replaceable unit (CRU). Replacement of Tier 1 CRUs is your responsibility. If IBM installs a Tier 1 CRU at...
  • Page 125 To remove the fan bracket, complete the following steps. Procedure 1. To help you work safely with Storwize V7000 Unified file module, read the safety information in “Safety” on page xi, “Safety statements” on page xiii, and “Installation guidelines” on page 58.
  • Page 126 The following procedure is for a Tier 1 customer replaceable unit (CRU). Replacement of Tier 1 CRUs is your responsibility. If IBM installs a Tier 1 CRU at your request, you will be charged for the installation. Service agreements can be purchased so that you can ask IBM to replace these units.
  • Page 127 The following procedure is for a Tier 1 customer replaceable unit (CRU). Replacement of Tier 1 CRUs is your responsibility. If IBM installs a Tier 1 CRU at your request, you will be charged for the installation. Service agreements can be purchased so that you can ask IBM to replace these units.
  • Page 128 The following procedure is for a Tier 1 customer replaceable unit (CRU). Replacement of Tier 1 CRUs is your responsibility. If IBM installs a Tier 1 CRU at your request, you will be charged for the installation. Service agreements can be purchased so that you can ask IBM to replace these units.
  • Page 129 The following procedure is for a Tier 1 customer replaceable unit (CRU). Replacement of Tier 1 CRUs is your responsibility. If IBM installs a Tier 1 CRU at your request, you will be charged for the installation. Service agreements can be purchased so that you can ask IBM to replace these units.
  • Page 130 The following procedure is for a Tier 1 customer replaceable unit (CRU). Replacement of Tier 1 CRUs is your responsibility. If IBM installs a Tier 1 CRU at your request, you will be charged for the installation. Service agreements can be purchased so that you can ask IBM to replace these units.
  • Page 131 Procedure 1. To help you work safely with Storwize V7000 Unified file module, read the safety information in “Safety” on page xi, “Safety statements” on page xiii, and “Installation guidelines” on page 58. 2. Reinstall any adapters you removed in other procedures.
  • Page 132 The following procedure is for a Tier 1 customer replaceable unit (CRU). Replacement of Tier 1 CRUs is your responsibility. If IBM installs a Tier 1 CRU at your request, you will be charged for the installation. Service agreements can be purchased so that you can ask IBM to replace these units.
  • Page 133 The following procedure is for a Tier 1 customer replaceable unit (CRU). Replacement of Tier 1 CRUs is your responsibility. If IBM installs a Tier 1 CRU at your request, you will be charged for the installation. Service agreements can be purchased so that you can ask IBM to replace these units.
  • Page 134 Procedure 1. To help you work safely with Storwize V7000 Unified file module, read the safety information in “Safety” on page xi, “Safety statements” on page xiii, and “Installation guidelines” on page 58. 2. Install the adapter in the expansion slot.
  • Page 135 If you replace a fibre channel adapter within a storage node, the WWPNs change. For 2851-DR1/DE1 attached storage the WWPN updates are automatic. If the attached storage unit is a gateway configuration (consisting of IBM XIV Storage System, V7000, or SAN Volume Controller), the WWPN update is not automatic.
  • Page 136 These installation instructions show the slot location for the 10-Gbps Ethernet PCI adapter. About this task The 10-Gbps Ethernet adapter must go in PCI slot 4. The following illustration shows the locations of the adapter expansion slots from the rear of the file module.
  • Page 137 The following procedure is for a Tier 1 customer replaceable unit (CRU). Replacement of Tier 1 CRUs is your responsibility. If IBM installs a Tier 1 CRU at your request, you will be charged for the installation. Service agreements can be purchased so that you can ask IBM to replace these units.
  • Page 138 The following procedure is for a Tier 1 customer replaceable unit (CRU). Replacement of Tier 1 CRUs is your responsibility. If IBM installs a Tier 1 CRU at your request, you will be charged for the installation. Service agreements can be purchased so that you can ask IBM to replace these units.
  • Page 139: Location Of The Ethernet Adapter Filler Panel On The Chassis

    Ethernet adapter filler panel Standoff Rubber stopper Figure 14. Location of the Ethernet adapter filler panel on the chassis 6. Install the two standoffs on the system board. 7. Insert the bottom tabs of the metal clip into the port openings from outside the chassis.
  • Page 140: Aligning The Ethernet Adapter Port Connectors

    Attention: Make sure the port connectors on the adapter are aligned properly with the chassis on the rear of the server. An incorrectly seated adapter might cause damage to the system board or the adapter. Figure 18. Port connector alignment
  • Page 141 The following procedure is for a Tier 1 customer replaceable unit (CRU). Replacement of Tier 1 CRUs is your responsibility. If IBM installs a Tier 1 CRU at your request, you will be charged for the installation. Service agreements can be purchased so that you can ask IBM to replace these units.
  • Page 142: Server Model

    To remove the SAS riser-card and controller assembly from the server, complete the following procedure. Procedure 1. To help you work safely with Storwize V7000 Unified file module, read the safety information in “Safety” on page xi, “Safety statements” on page xiii, and “Installation guidelines” on page 58.
  • Page 143: Tape-Enabled Server Model

    The following procedure is for a Tier 1 customer replaceable unit (CRU). Replacement of Tier 1 CRUs is your responsibility. If IBM installs a Tier 1 CRU at your request, you will be charged for the installation. Service agreements can be purchased so that you can ask IBM to replace these units.
  • Page 144: Sas Riser-Card And Controller Assembly On The 16-Drive-Capable Server Model

    SAS controller. See Figure 22. Figure 22. Controller retention brackets on 16-drive-capable server model 1) Remove the SAS controller front retention bracket from the server. See Figure 23 on page 121.
  • Page 145: Sas Controller Front Retention Brackets

    SAS expander card front retention bracket Figure 23. SAS controller front retention brackets 2) Remove the rear controller retention bracket located in the battery bay above the power supplies by pulling up the release tab 1 and sliding the bracket outward 2 . See Figure 24. Figure 24.
  • Page 146: Installing The Controller Retention Bracket

    3. To install the SAS riser-card and controller assembly for a tape-enabled server model, complete the following steps. Figure 27 on page 123 shows the SAS riser card in the tape-enabled server model.
  • Page 147: Sas Riser-Card Assembly On Tape-Enabled

    The following procedure is for a Tier 1 customer replaceable unit (CRU). Replacement of Tier 1 CRUs is your responsibility. If IBM installs a Tier 1 CRU at your request, you will be charged for the installation. Service agreements can be purchased so that you can ask IBM to replace these units.
  • Page 148 The following procedure is for a Tier 1 customer replaceable unit (CRU). Replacement of Tier 1 CRUs is your responsibility. If IBM installs a Tier 1 CRU at your request, you will be charged for the installation. Service agreements can be purchased so that you can ask IBM to replace these units.
  • Page 149 Procedure 1. To help you work safely with Storwize V7000 Unified file module, read the safety information in “Safety” on page xi, “Safety statements” on page xiii, and “Installation guidelines” on page 58. 2. Touch the static-protective package that contains the new ServeRAID SAS controller to any unpainted metal surface on the file module.
  • Page 150 The following procedure is for a Tier 1 customer replaceable unit (CRU). Replacement of Tier 1 CRUs is your responsibility. If IBM installs a Tier 1 CRU at your request, you will be charged for the installation. Service agreements can be purchased so that you can ask IBM to replace these units.
  • Page 151 The following procedure is for a Tier 1 customer replaceable unit (CRU). Replacement of Tier 1 CRUs is your responsibility. If IBM installs a Tier 1 CRU at your request, you will be charged for the installation. Service agreements can be purchased so that you can ask IBM to replace these units.
  • Page 152 The following procedure is for a Tier 1 customer replaceable unit (CRU). Replacement of Tier 1 CRUs is your responsibility. If IBM installs a Tier 1 CRU at your request, you will be charged for the installation. Service agreements can be purchased so that you can ask IBM to replace these units.
  • Page 153: Serveraid M1000 Advanced Feature Key And

    Figure 28. ServeRAID M1000 advanced feature key and M1015 adapter (callouts: ServeRAID M1000 advanced feature key; ServeRAID-M1015 adapter)
  • Page 154: Serveraid M5000 Advanced Feature Key And

    The following procedure is for a Tier 1 customer replaceable unit (CRU). Replacement of Tier 1 CRUs is your responsibility. If IBM installs a Tier 1 CRU at your request, you will be charged for the installation. Service agreements can be purchased so that you can ask IBM to replace these units.
  • Page 155: Serveraid M1000 Advanced Feature Key And

    Figure 30. ServeRAID M1000 advanced feature key and M1015 adapter (callouts: ServeRAID M1000 advanced feature key; ServeRAID-M1015 adapter)
  • Page 156: Serveraid M5000 Advanced Feature Key And

    The following procedure is for a Tier 1 customer replaceable unit (CRU). Replacement of Tier 1 CRUs is your responsibility. If IBM installs a Tier 1 CRU at your request, you will be charged for the installation. Service agreements can be purchased so that you can ask IBM to replace these units.
  • Page 157: Releasing The Battery Retention Clip

    3. Press down on the left and right side latches and pull the server out of the rack enclosure until both slide rails lock. 4. Remove the cover, as described in “Removing the cover” on page 87. 5. Locate the remote battery tray in the server and remove the battery that you want to replace.
  • Page 158: Removing The Battery From The Battery Carrier

    The following procedure is for a Tier 1 customer replaceable unit (CRU). Replacement of Tier 1 CRUs is your responsibility. If IBM installs a Tier 1 CRU at your request, you will be charged for the installation. Service agreements can be purchased so that you can ask IBM to replace these units.
  • Page 159: Connecting The Remote Battery Cable

    The following procedure is for a Tier 1 customer replaceable unit (CRU). Replacement of Tier 1 CRUs is your responsibility. If IBM installs a Tier 1 CRU at your request, you will be charged for the installation. Service agreements can be purchased so that you can ask IBM to replace these units.
  • Page 160 To remove the CD-RW/DVD drive, complete the following procedure. Procedure 1. To help you work safely with Storwize V7000 Unified file module, read the safety information in “Safety” on page xi, “Safety statements” on page xiii, and “Installation guidelines” on page 58.
  • Page 161 The following procedure is for a Tier 1 customer replaceable unit (CRU). Replacement of Tier 1 CRUs is your responsibility. If IBM installs a Tier 1 CRU at your request, you will be charged for the installation. Service agreements can be purchased so that you can ask IBM to replace these units.
  • Page 162 The following procedure is for a Tier 1 customer replaceable unit (CRU). Replacement of Tier 1 CRUs is your responsibility. If IBM installs a Tier 1 CRU at your request, you will be charged for the installation. Service agreements can be purchased so that you can ask IBM to replace these units.
  • Page 163: Dimm Locations For The Storwize V7000 Unified System X3650 M2 Server

    The following procedure is for a Tier 1 customer replaceable unit (CRU). Replacement of Tier 1 CRUs is your responsibility. If IBM installs a Tier 1 CRU at your request, you will be charged for the installation. Service agreements can be purchased so that you can ask IBM to replace these units.
  • Page 164 To install a DIMM, complete the following procedure. Procedure 1. To help you work safely with Storwize V7000 Unified file module, read the safety information in “Safety” on page xi, “Safety statements” on page xiii, and “Installation guidelines” on page 58.
  • Page 165 The following procedure is for a Tier 1 customer replaceable unit (CRU). Replacement of Tier 1 CRUs is your responsibility. If IBM installs a Tier 1 CRU at your request, you will be charged for the installation. Service agreements can be purchased so that you can ask IBM to replace these units.
  • Page 166 The following procedure is for a Tier 1 customer replaceable unit (CRU). Replacement of Tier 1 CRUs is your responsibility. If IBM installs a Tier 1 CRU at your request, you will be charged for the installation. Service agreements can be purchased so that you can ask IBM to replace these units.
  • Page 167: System Board Fan Locations

    The following procedure is for a Tier 1 customer replaceable unit (CRU). Replacement of Tier 1 CRUs is your responsibility. If IBM installs a Tier 1 CRU at your request, you will be charged for the installation. Service agreements can be purchased so that you can ask IBM to replace these units.
  • Page 168 The following procedure is for a Tier 1 customer replaceable unit (CRU). Replacement of Tier 1 CRUs is your responsibility. If IBM installs a Tier 1 CRU at your request, you will be charged for the installation. Service agreements can be purchased so that you can ask IBM to replace these units.
  • Page 169: System Status With 460-Watt Power Supplies

    Make sure that the devices that you are installing are supported. For a list of supported devices for the server, see “Parts listing for file modules” in the IBM Storwize V7000 Unified Information Center. Before you install an additional power supply or replace a power supply with one of a different wattage, you can use the IBM Power Configurator utility to determine current system power consumption.
  • Page 170 To install an ac power supply, complete the following steps: Procedure 1. To help you work safely with Storwize V7000 Unified file module, read the safety information in “Safety” on page xi, “Safety statements” on page xiii, and “Installation guidelines” on page 58.
  • Page 171 The following procedure is for a Tier 1 customer replaceable unit (CRU). Replacement of Tier 1 CRUs is your responsibility. If IBM installs a Tier 1 CRU at your request, you will be charged for the installation. Service agreements can be purchased so that you can ask IBM to replace these units.
  • Page 172 The following procedure is for a Tier 1 customer replaceable unit (CRU). Replacement of Tier 1 CRUs is your responsibility. If IBM installs a Tier 1 CRU at your request, you will be charged for the installation. Service agreements can be purchased so that you can ask IBM to replace these units.
  • Page 173: Heat-Sink Release Lever

    To remove a microprocessor and heat sink, complete the following steps: Procedure 1. To help you work safely with Storwize V7000 Unified file module, read the safety information in “Safety” on page xi, “Safety statements” on page xiii, and “Installation guidelines” on page 58.
  • Page 174: Microprocessor Release Latch

    About this task Read the documentation that comes with the microprocessor to determine whether you must update the IBM System x Server Firmware. To download the most current level of server firmware, complete the following steps: 1. Go to http://www.ibm.com/systems/support/.
  • Page 175 Note: For simplicity, certain components are not shown in this illustration. Procedure 1. To help you work safely with Storwize V7000 Unified file module, read the safety information in “Safety” on page xi, “Safety statements” on page xiii, and “Installation guidelines” on page 58.
  • Page 176 8. Twist the handle of the installation tool clockwise to secure the microprocessor in the tool. Note: You can pick up or release the microprocessor by twisting the microprocessor installation tool handle.
  • Page 177 Handle Installation tool Microprocessor 9. Carefully align the microprocessor installation tool over the microprocessor socket. Twist the handle of the microprocessor tool counterclockwise to insert the microprocessor into the socket. Attention: The microprocessor fits only one way on the socket. You must place a microprocessor straight down on the socket to avoid damaging the pins on the socket.
  • Page 178: Bottom Surface Of The Heat Sink

    21. Install the cover, as described in “Installing the cover” on page 88. 22. Slide the server into the rack. 23. Reconnect the external cables; then, reconnect the power cords and turn on the peripheral devices and the file module.
  • Page 179 Note: You must wait approximately 2.5 minutes after you connect the power cord of the file module to an electrical outlet before the power-control button becomes active. Removing and replacing the thermal grease The following procedure is for a field replaceable unit (FRU). FRUs must be installed only by trained service technicians.
  • Page 180 Procedure 1. To help you work safely with Storwize V7000 Unified file module, read the safety information in “Safety” on page xi, “Safety statements” on page xiii, and “Installation guidelines” on page 58. 2. Turn off the file module and peripheral devices, then label and disconnect both power cords and all external cables.
  • Page 181 8. If you are instructed to return the heat-sink retention module, follow all packaging instructions, and use any packaging materials for shipping that are supplied to you. Installing a heat-sink retention module The following procedure is for a field replaceable unit (FRU). FRUs must be installed only by trained service technicians.
  • Page 182 To remove the system board, complete the following steps. Procedure 1. To help you work safely with Storwize V7000 Unified file module, read the safety information in “Safety” on page xi, “Safety statements” on page xiii, and “Installation guidelines” on page 58.
  • Page 183 9. If an Ethernet daughter card is installed in the server, remove it. 10. If a virtual media key is installed in the server, remove it, as described in “Removing the IBM virtual media key” on page 103. 11. Remove the DIMM air baffle, as described in “Removing the DIMM air baffle”...
  • Page 184 Note: You must wait approximately 2.5 minutes after you connect the power cord of the file module to an electrical outlet before the power-control button becomes active.
  • Page 185 [root@PFESONAS1.mgmt001st001 ~]# asu show BootOrder.BootOrder IBM Advanced Settings Utility version 3.62.71B Licensed Materials - Property of IBM (C) Copyright IBM Corp. 2007-2010 All Rights Reserved Successfully discovered the IMM via SLP. Discovered IMM at IP address 169.254.95.118 Connected to IMM at IP address 169.254.95.118 BootOrder.BootOrder=Legacy Only=CD/DVD Rom=Floppy Disk=PXE Network=Hard Disk 0...
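The last line of the asu output above packs the setting name and its ordered values into one string separated by equal signs. A minimal sketch for splitting such a line follows; the function name `parse_asu_setting` is hypothetical, and the sample line stops where the excerpt is truncated, so a real boot order may list more entries.

```python
def parse_asu_setting(line: str):
    """Split 'Name=value1=value2=...' into the setting name and its ordered values."""
    name, _, rest = line.strip().partition("=")
    # Values are separated by '='; drop empty pieces left by a trailing separator.
    values = [value for value in rest.split("=") if value]
    return name, values


sample = "BootOrder.BootOrder=Legacy Only=CD/DVD Rom=Floppy Disk=PXE Network=Hard Disk 0"
name, order = parse_asu_setting(sample)
print(name)
for position, device in enumerate(order, start=1):
    print(position, device)
```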
  • Page 186: Va Safety Cover

    To remove the 240 VA safety cover, perform the following steps: Procedure 1. To help you work safely with Storwize V7000 Unified file module, read the safety information in “Safety” on page xi, “Safety statements” on page xiii, and “Installation guidelines” on page 58.
  • Page 187: Va Safety Cover

    About this task The ASU package is part of the Storwize V7000 Unified code. ASU is available to authorized service personnel from the command-line interface (CLI) on the file module. Use ASU to modify selected settings in the integrated-management- module (IMM)-based Storwize V7000 Unified file modules.
  • Page 188: How To Reset/Reboot Server Imm Interface

    2. Issue the following command to view the current settings for the machine type and model: asu show SYSTEM_PROD_DATA.SysInfoProdName 3. Issue the ASU command on the Storwize V7000 Unified file module to set the machine type and model: asu set SYSTEM_PROD_DATA.SysInfoProdName 2073-700 4.
  • Page 189: File Module Software Problems

    About this task Logical devices and physical port locations Use this table to help identify logical devices, file module roles used, and physical locations. Table 33. Storwize V7000 Unified logical devices and physical port locations Logical Ethernet device name Device description...
  • Page 190: Hostname And Service Ip Reference

    About this task If both file modules are operating correctly with regard to management services, perform the following procedure to fail over the active management node to the passive management node.
  • Page 191 If you see the following error message when running the command, wait until the initialization has completed before running setcluster again: IBM SONAS management service is starting up EFSSG0654I The Management Service is starting up. After you run the startmgtsrv command, the system displays information that is similar to the following example: [yourlogon@yourmachine.mgmt002st001 ~]# startmgtsrv...
  • Page 192 7. Run the CLI command startmgtsrv. This starts the management services on the passive node. 8. Once command execution is complete: a. Verify that the management service is running by again executing the CLI command lsnode.
  • Page 193: Checking Ctdb Health

    Use this information for checking system health with the clustered trivial database (CTDB). About this task CTDB checks the health status of the Storwize V7000 Unified file modules, scanning elements such as storage access, General Parallel File System (GPFS), networking, Common Internet File System (CIFS), and Network File System (NFS).
  • Page 194: Management Gui Showing Ctdb Status For

    “Checking the GPFS file system mount on each file module” on page 171. Refer to the information in “Troubleshooting the System x3650” in the IBM Storwize V7000 Unified Information Center to determine if any additional hardware problems might be causing the “unhealthy”...
  • Page 195: Checking The Gpfs File System Mount On Each File Module

    System (GPFS) file system mounts on IBM Storwize V7000 Unified file modules. About this task A GPFS file system that is not mounted on a Storwize V7000 Unified file module can cause the clustered trivial database (CTDB) status to be 'UNHEALTHY'. The...
  • Page 196: Resolving Problems With Missing Mounted File Systems

    2. To identify the currently created file systems on each Storwize V7000 Unified file module, log in as the root user on the active management node, then enter the onnode -n mgmt001st001 df | grep ibm command from the CLI, as shown...
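The df output returned by the command above can also be filtered programmatically. The sketch below is an illustration under stated assumptions: the function name `ibm_mounts` is hypothetical, the sample text is invented, and the filter keeps only mount points under /ibm/, which is stricter than grep ibm because grep matches the string anywhere on the line.

```python
def ibm_mounts(df_output: str) -> list:
    """Return the mount points under /ibm/ found in df output."""
    mounts = []
    for line in df_output.splitlines():
        fields = line.split()
        # In df output the mount point is the last column.
        if fields and fields[-1].startswith("/ibm/"):
            mounts.append(fields[-1])
    return mounts


# Hypothetical sample; device names and sizes are made up for illustration.
sample = """Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sda1             20158332   5532916  13601416  29% /
/dev/gpfs1          1073741824 104857600 968884224  10% /ibm/gpfs1
"""
print(ibm_mounts(sample))
```

Comparing the result for each file module shows which modules are missing a mount.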
  • Page 197: Resolving Stale Nfs File Systems

    If file systems remain unmounted, contact IBM support. Resolving stale NFS file systems You can resolve problems with stale NFS file systems on Storwize V7000 Unified file modules. A file module might have the file system mounted, but the file system remains inaccessible due to a stale NFS file handle.
  • Page 198: If "Netgroup" Functionality With Nis Or Ldap Is Not Working

    Command_Output_Data Home_Directory Template_Shell FETCH USER INFO SUCCEED 12004360 12000513 /var/opt/IBM/sofs/scproot /usr/bin/rssh EFSSG1000I The command completed successfully. When the system is unable to authenticate against an external authentication server, you must ensure that it can obtain user information from the authentication server.
  • Page 199: Trouble Accessing Exports When Server And Client Configurations Are Correct

    This can cause some clients to have access while others do not. Procedure 1. To obtain the IP addresses of your Storwize V7000 Unified cluster, issue the nslookup command; this non-disruptive command requires “root” access and your domain name. .
  • Page 200: Checking Network Interface Availability

    You are running this procedure on a file module. You are logged in to the file module, which is the active management node, as root. See “Accessing a file module as root” on page 273.
  • Page 201 4. Issue the chkfs file_system_name -v | tee /ftdc/chkfs_fs_name.log1 command to capture the output to a file. Review the output file for errors and save it for IBM support to investigate any problems. If the file contains a TSM ERROR message, perform the following steps: a.
  • Page 202: Resolving An Ans1267E Error

    Resolving issues reported by lshealth Use this information to resolve lshealth reported issues, specifically for “MGMTNODE_REPL_STATE ERROR DATABASE_REPLICATION_FAILED” and “The mount state of the file system /ibm/Filesystem_Name changed to error level” errors. About this task These errors might be transient and can clear automatically at any time.
  • Page 203: Resolving Network Errors

    4. Issue the command lshealth -i gpfs_fs -r. The command output should display The mount state of the file system /ibm/gpfs1 was set back to normal level. 5. If the error persists, refer to the GPFS documentation to debug or repair the error.
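The check in step 4 can be sketched in Python as a simple text match against captured `lshealth` output. The function name and the sample strings are assumptions made for illustration; only the message text itself comes from the step above.

```python
def mount_state_recovered(lshealth_output: str, fs_path: str) -> bool:
    """Return True if the output reports the file system back at normal level."""
    expected = (
        f"The mount state of the file system {fs_path} "
        "was set back to normal level"
    )
    return expected in lshealth_output


# Hypothetical captured output, for illustration only.
ok_output = "The mount state of the file system /ibm/gpfs1 was set back to normal level."
bad_output = "The mount state of the file system /ibm/gpfs1 changed to error level."

print(mount_state_recovered(ok_output, "/ibm/gpfs1"))
print(mount_state_recovered(bad_output, "/ibm/gpfs1"))
```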
  • Page 204: Resolving Full Condition For Gpfs File System

    2. If there is no space in fragments or if the mmdefragfs command does not free up space, add disks (NSDs) to the file system to create space. a. Add disks to the file system.
  • Page 205: Analyzing Gpfs Logs

    Kerberos tickets, for example, can expire and then no one can access the cluster. For the Storwize V7000 Unified file module, the ntpq -p command shows you which server is used for synchronization and any peers and a set of data about their status.
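In the peer listing that `ntpq` prints, the server currently selected for synchronization is marked with a leading `*`. The following Python sketch extracts that entry from captured output. The host names and values in the sample are hypothetical.

```python
def selected_sync_peer(ntpq_output: str):
    """Return the remote name of the peer marked with '*', or None."""
    for line in ntpq_output.splitlines():
        # A leading '*' is the tally code for the selected system peer.
        if line.startswith("*"):
            return line[1:].split()[0]
    return None


# Hypothetical peer listing, for illustration only.
sample = """\
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*ntp1.example.com 192.0.2.10      2 u   45   64  377    0.512    0.103   0.021
+ntp2.example.com 192.0.2.11      2 u   12   64  377    0.634   -0.250   0.045
"""

print(selected_sync_peer(sample))
```

If the function returns `None`, no peer is currently selected, which suggests that the module is not synchronized.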
  • Page 206 [root@domain.node ~]# service ntpd start Starting ntpd: [ OK ] [root@domain.node ~]# After the time on all of the servers is synchronized, you can verify that the logs apply to your troubleshooting situation.
  • Page 207: Chapter 5. Control Enclosure

    You cannot manage a system by using the 10 Gbps Ethernet ports. You can perform almost all of the configuration, troubleshooting, recovery, and maintenance of the storage system from within the Storwize V7000 Unified management GUI or the CLI commands that are running on the Storwize V7000 file modules.
  • Page 208 When you cannot access the system from the management GUI and you cannot access the Storwize V7000 Unified system to run the recommended actions v When the recommended action directs you to use the service assistant. The storage system management GUI operates only when there is an online system.
  • Page 209: Storage System Command-Line Interface

    Accessing the storage system CLI Follow the steps that are described in the “Command-line interface” topic in the “Reference” section of the Storwize V7000 Unified Information Center to initialize and use a CLI session.
  • Page 210: Service Command-Line Interface

    Accessing the service CLI Follow the steps that are described in the “Command-line interface” topic in the “Reference” section of the Storwize V7000 Unified Information Center to initialize and use a CLI session. USB flash drive and Initialization tool interface Use a USB flash drive to initialize a system and also to help service the node canisters in a control enclosure.
  • Page 211 USB flash drive, you can download the application from the support website (search for initialization tool): www.ibm.com/storage/support/storwize/v7000/unified If you download the initialization tool, you must copy the file onto the USB flash drive that you are going to use.
  • Page 212 Use the chsystemip CLI command to change the management gateway IP address setting on the control enclosure. (This must be done before you change the management gateway IP address setting on the file modules): [kd52v6h.ibm]$ chsystemip -gw 9.71.16.1 -port 1
  • Page 213 You should be able to access the management GUI or CLI from a computer, which is on a different subnet or different Ethernet switch to the Storwize V7000 Unified system. The link to the management GUI from the InitTool.exe panel should now work.
  • Page 214 Use this command when you are unable to log on to the system because you have forgotten the superuser password and want to reset it. Attention: Run this command only when instructed by IBM support. Running this command directly on a Storwize V7000 can affect your I/O operations on the file modules.
  • Page 215 Use this command to collect diagnostic information from the node canister and to write the output to a USB flash drive. Attention: Run this command only when instructed by IBM support. Running this command directly on a Storwize V7000 can affect your I/O operations on the file modules.
  • Page 216 Note: The reference to cluster is not the same as the file system cluster on the Storwize V7000 file modules. Attention: Run this command only when instructed by IBM support. Running this command directly on a Storwize V7000 can affect your I/O operations on the file modules.
  • Page 217: Event Reporting

    If any service activity is required, a notification is sent. Event reporting process The following methods are used to notify you and the IBM Support Center of a new event: v If you enabled Simple Network Management Protocol (SNMP), an SNMP trap is sent to an SNMP manager that is configured by the customer.
  • Page 218: Description Of Data Fields For The Event Log

    Resolve the root event first. Sense data Additional data that gives the details of the condition that caused the event to be logged.
  • Page 219: Event Notifications

    Critical notifications can be configured to be sent as a Call Home email to the IBM Support Center. Warning A warning notification is sent to indicate a problem or unexpected condition with the Storwize V7000 Unified.
  • Page 220: Understanding Events

    You can view information about collecting CIM log files or you can view examples of a configuration dump, error log, or featurization log. To do this, click Reference in the left pane of the Storwize V7000 Unified Information Center and then expand the Logs and traces section.
  • Page 221 There are two power supply units in the control enclosure. Each one contains an integrated battery. Both power supply units and batteries provide power to both control canisters. Each battery has a sufficient charge to power both node canisters for the duration of saving critical data to the local drive. In a fully redundant system with two batteries and two canisters, there is enough charge in the batteries to support saving critical data from both canisters to a local drive twice.
  • Page 222: Maintenance Discharge Cycles

    Important: Although Storwize V7000 Unified is resilient to power failures and brown outs, always install Storwize V7000 Unified in an environment where there is reliable and consistent ac power that meets the Storwize V7000 Unified requirements.
  • Page 223: Understanding The Medium Errors And Bad Blocks

    Understanding the medium errors and bad blocks A storage system returns a medium error response to a host when it is unable to successfully read a block. The Storwize V7000 Unified response to a host read follows this behavior. The volume virtualization that is provided extends the time when a medium error is returned to a host.
  • Page 224: Resolving A Problem

    The Start here: Use the management GUI recommended actions topic gives the starting point for any service action. The situations covered in this section are the...
  • Page 225: Start Here: Use The Management Gui Recommended Actions

    The management GUI provides extensive facilities to help you troubleshoot and correct problems on your system. You can connect to and manage a Storwize V7000 Unified system using the management GUI as soon as you have created a clustered system. If you cannot create a clustered system, see the problem that contains information about what to do if you cannot create one.
  • Page 226: Problem: Another System May Be Using The System Ip Address

    Update the file module's record of the control enclosure system IP: To find the file module's current record of the control enclosure system IP address, use the Storwize V7000 Unified management CLI to issue the lsstoragesystem command. Here is an example: >ssh admin@<management_IP>...
  • Page 227: Problem: Unable To Change The System Ip Address Because You Cannot Access The Cli

    >[kd01ghf.ibm]$ chstoragesystem --ip1 9.71.18.136 --ip2 9.71.18.136 EFSSG1000I The command completed successfully. Verify that communication from the file module to the control enclosure is now possible by running the lssystem command on the Storwize V7000 Unified management CLI: >ssh admin@<management IP address>...
  • Page 228: Problem: Management Ip Address Unknown

    Updating file module's record of the control enclosure system IP: To find the file module's current record of the control enclosure system IP address, use the Storwize V7000 Unified management CLI to issue the lsstoragesystem command. Here is an example: >ssh admin@<management_IP>...
  • Page 229: Problem: Unable To Log On To The Management Gui

    of both node canisters is candidate, then there is not a clustered system to connect to. If the node state is service, go to “Procedure: Fixing node errors” on page 220. v Ensure that you are using the correct system IP address. If you know the service address of a node canister, go to “Procedure: Getting node canister and system information using the service assistant”...
  • Page 230: Problem: Node Canister Service Ip Address Unknown

    1. Point your browser at the /service directory of the management IP address of the system. If your management IP address is 11.22.33.44, point your web browser to 11.22.33.44/service. 2. Log into the service assistant.
  • Page 231: Problem: Cannot Connect To The Service Assistant

    3. The service assistant home page lists the node canister that can communicate with the node. 4. If the service address of the node canister that you are looking for is listed in the Change Node window, make the node the current node. Its service address is listed under the Access tab of the node details.
  • Page 232: Problem: Management Gui Or Service Assistant Does Not Display Correctly

    Problem: SAS cabling not valid This topic provides information to be aware of if you receive errors that indicate the SAS cabling is not valid. Check the following items:
  • Page 233: Problem: New Expansion Enclosure Not Detected

    v No more than five expansion enclosures can be chained to port 1 (below the control enclosure). The connecting sequence from port 1 of the node canister is called chain 1. v No more than four expansion enclosures can be chained to port 2 (above the control enclosure).
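The two chain limits above can be expressed as a small validation sketch in Python. The data layout (a mapping from node canister SAS port number to the count of chained expansion enclosures) is an assumption made for illustration; it is not a format that the system produces.

```python
from __future__ import annotations

# Limits taken from the text above: port 1 (chain 1) allows up to five
# expansion enclosures, port 2 allows up to four.
MAX_ENCLOSURES = {1: 5, 2: 4}


def chain_errors(chains: dict[int, int]) -> list[str]:
    """Return a message for each chain that breaks the limits above."""
    errors = []
    for port, count in sorted(chains.items()):
        limit = MAX_ENCLOSURES.get(port)
        if limit is None:
            errors.append(f"port {port}: not a known SAS chain port")
        elif count > limit:
            errors.append(
                f"port {port}: {count} enclosures exceeds the limit of {limit}"
            )
    return errors


print(chain_errors({1: 5, 2: 4}))
print(chain_errors({1: 6, 2: 4}))
```

The sketch covers only the enclosure counts. It does not check the other cabling rules in this topic.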
  • Page 234: Problem: Command File Not Processed From Usb Flash Drive

    About this task If you are having problems attaching to the FCoE hosts, your problem might be related to the network, the Storwize V7000 Unified system, or the host. Procedure 1. If you are seeing error code 705 on the node, this means Fibre Channel I/O port is inactive.
  • Page 235: Procedure: Resetting Superuser Password

    Verify that Storwize V7000 Unified and host get an fcid on FCF. If not, check the VLAN configuration. b. Verify that Storwize V7000 Unified and host port are part of a zone and that zone is currently in force.
  • Page 236: Procedure: Checking The Status Of Your System

    The Node tab shows general information about the node canister that includes the node state and whether it is a configuration node.
  • Page 237: Procedure: Getting Node Canister And System Information Using A Usb Flash Drive

    v The Hardware tab shows information about the hardware. v The Access tab shows the management IP addresses and the service addresses for this node. v The Location tab identifies the enclosure in which the node canister is located. v The Ports tab shows information about the I/O ports. Procedure: Getting node canister and system information using a USB flash drive This procedure explains how to view information about the node canister and...
  • Page 238: Leds On The Power Supply Units Of The Control Enclosure

    LEDs on the power supply unit for the 2076-112 or 2076-124. The LEDs on the power supply units for the 2076-312 and 2076-324 are similar, but they are not shown here. Figure 47. LEDs on the power supply units of the control enclosure
  • Page 239 Table 38. Power-supply unit LEDs Power supply ac failure dc failure failure Status Action Communication Replace the power failure between supply unit. If failure is the power still present, replace the supply unit and enclosure chassis. the enclosure chassis No ac power to Turn on power.
  • Page 240: Leds On The Node Canisters

    If the power LEDs show green, reseat the node canister. See “Procedure: Reseating a node canister” on page 222. If the LED status does not change, see “Replacing a node canister” on page 224.
  • Page 241 Table 40. System status and fault LEDs (continued) System status Fault LED Status Action Code is not Follow the hardware replacement active. The BIOS procedures for the node canister. or the service processor has detected a hardware fault. Code is active. No action.
  • Page 242: Procedure: Finding The Status Of The Ethernet Connections

    1. Verify that each end of the cable is securely connected. 2. Verify that the port on the Ethernet switch or hub is configured correctly. 3. Connect the cable to a different port on your Ethernet network.
  • Page 243: Procedure: Removing System Data From A Node Canister

    4. If the status is obtained using the USB flash drive, review all the node errors that are reported. 5. Replace the Ethernet cable. Procedure: Removing system data from a node canister This procedure guides you through the process to remove system information from a node canister.
  • Page 244: Procedure: Fixing Node Errors

    You can set an IPv4 address, an IPv6 address, or both, as the service address of a node. Enter the required address correctly. If you set the address to 0.0.0.0 or 0000:0000:0000:0000:0000:0000:0000:0000, you disable the access to the port on that protocol.
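Before you enter a service address, you can check whether a candidate value is the all-zero address that disables access on that protocol. The following is a minimal sketch that uses the Python standard `ipaddress` module; the function name is an assumption made for illustration. An address string that is not valid raises `ValueError`, which is also a useful check.

```python
import ipaddress


def disables_service_access(address: str) -> bool:
    """Return True if the address is the all-zero (unspecified) address."""
    # ip_address() accepts both IPv4 and IPv6 text forms and raises
    # ValueError if the text is not a valid address.
    return ipaddress.ip_address(address).is_unspecified


print(disables_service_access("0.0.0.0"))
print(disables_service_access("::"))
print(disables_service_access("192.168.70.121"))
```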
  • Page 245: Procedure: Accessing A Canister Using A Directly Attached Ethernet Cable

    Procedure Change the service IP address. v Use the control enclosure management GUI when the system is operating and the system is able to connect to the node with the service IP address that you want to change. 1. Select Settings > Network from the navigation. 2.
  • Page 246: Procedure: Reseating A Node Canister

    Results Procedure: Powering off your system Use this procedure to power off your Storwize V7000 Unified system when it must be serviced or to permit other maintenance actions in your data center. To turn off the Storwize V7000 Unified system, see “Turning off the system” in the Storwize V7000 Unified information center.
  • Page 247: Procedure: Collecting Information For Support

    About this task Procedure: Collecting information for support IBM support might ask you to collect trace files and dump files from your system to help them resolve a problem. Typically, you perform this task from the Storwize V7000 Unified management GUI. You can also collect information from the Storwize V7000 control enclosure itself.
  • Page 248: Removing And Replacing Parts

    Before you remove and replace parts, you must be aware of all safety issues. Before you begin First, read the safety precautions in the IBM Systems Safety Notices. These guidelines help you safely work with the Storwize V7000 Unified. Replacing a node canister This topic describes how to replace a node canister.
  • Page 249: Rear Of Node Canisters That Shows The

    v If the system status is off, it is acceptable to remove a node canister. However, do not remove a node canister unless directed to do so by a service procedure. v If the power LED is flashing or off, it is safe to remove a node canister. However, do not remove a node canister unless directed to do so by a service procedure.
  • Page 250: Replacing An Expansion Canister

    Be careful when you are replacing the hardware components that are located in the back of the system that you do not inadvertently disturb or remove any cables that you are not instructed to remove.
  • Page 251: Rear Of Expansion Canisters That Shows The

    Be aware of the following canister LED states: v If the power LED is on, do not remove an expansion canister unless directed to do so by a service procedure. v If the power LED is flashing or off, it is safe to remove an expansion canister. However, do not remove an expansion canister unless directed to do so by a service procedure.
  • Page 252: Replacing An Sfp Transceiver

    Be careful when you are replacing the hardware components that are located in the back of the system that you do not inadvertently disturb or remove any cables that you are not instructed to remove.
  • Page 253 CAUTION: Some laser products contain an embedded Class 3A or Class 3B laser diode. Note the following information: laser radiation when open. Do not stare into the beam, do not view directly with optical instruments, and avoid direct exposure to the beam. (C030) About this task Perform the following steps to remove and then replace an SFP transceiver: Procedure...
  • Page 254: Replacing A Power Supply Unit For A Control Enclosure

    Electrical voltage and current from power, telephone, and communication cables are hazardous. To avoid a shock hazard: v Connect power to this unit only with the IBM provided power cord. Do not use the IBM provided power cord for any other product.
  • Page 255 Attention: If your system is powered on and performing I/O operations, go to the management GUI and follow the fix procedures. Performing the replacement actions without the assistance of the fix procedures can result in loss of data or access to data. Attention: A powered-on enclosure must not have a power supply removed for more than five minutes because the cooling does not function correctly with an empty slot.
  • Page 256: Directions For Lifting The Handle On The Power

    6. Insert the replacement power supply unit into the enclosure with the handle pointing towards the center of the enclosure. Insert the unit in the same orientation as the one that you removed.
  • Page 257: Replacing A Power Supply Unit For An Expansion Enclosure

    7. Push the power supply unit back into the enclosure until the handle starts to move. 8. Finish inserting the power supply unit into the enclosure by closing the handle until the locking catch clicks into place. 9. Reattach the power cable and cable retention bracket. 10.
  • Page 258 Electrical voltage and current from power, telephone, and communication cables are hazardous. To avoid a shock hazard: v Connect power to this unit only with the IBM provided power cord. Do not use the IBM provided power cord for any other product.
  • Page 259 Attention: A powered-on enclosure must not have a power supply removed for more than five minutes because the cooling does not function correctly with an empty slot. Ensure that you have read and understood all these instructions and have the replacement available, and unpacked, before you remove the existing power supply.
  • Page 260: Directions For Lifting The Handle On The Power

    6. Insert the replacement power supply unit into the enclosure with the handle pointing towards the center of the enclosure. Insert the unit in the same orientation as the one that you removed.
  • Page 261: Replacing A Battery In A Power Supply Unit

    7. Push the power supply unit back into the enclosure until the handle starts to move. 8. Finish inserting the power supply unit in the enclosure by closing the handle until the locking catch clicks into place. 9. Reattach the power cable and cable retention bracket. 10.
  • Page 262 Electrical voltage and current from power, telephone, and communication cables are hazardous. To avoid a shock hazard: v Connect power to this unit only with the IBM provided power cord. Do not use the IBM provided power cord for any other product.
  • Page 263 The battery is a lithium ion battery. To avoid possible explosion, do not burn. Exchange only with the IBM-approved part. Recycle or discard the battery as instructed by local regulations. In the United States, IBM has a process for the collection of this battery. For information, call 1-800-426-4333. Have the IBM part number for the battery unit available when you call.
  • Page 264: Removing The Battery From The Control

    Remove the battery from the packaging. b. Remove the end caps. c. Attach the end caps to both ends of the battery that you removed and place the battery in the original packaging.
  • Page 265: Releasing The Cable Retention Bracket

    d. Place the replacement battery in the opening on top of the power supply in its proper orientation. e. Press the battery to seat the connector. f. Place the handle in its downward location 5. Push the power supply unit back into the enclosure until the handle starts to move.
  • Page 266: Unlocking The 3.5" Drive

    224 refers. 2. Unlock the assembly by squeezing together the tabs on the side. Figure 59. Unlocking the 3.5" drive 3. Open the handle to the full extension. Figure 60. Removing the 3.5" drive
  • Page 267: Replacing A 2.5" Drive Assembly Or Blank Carrier

    4. Pull out the drive. 5. Push the new drive back into the slot until the handle starts to move. 6. Finish inserting the drive by closing the handle until the locking catch clicks into place. Replacing a 2.5" drive assembly or blank carrier This topic describes how to remove a 2.5"...
  • Page 268: Replacing Enclosure End Caps

    4. Fit the slot that is on the top of the end cap over the tab on the top of the chassis flange. 5. Rotate the end cap down until it snaps into place. Make sure that the inside surface of the end cap is flush with the chassis.
  • Page 269: Replacing A Sas Cable

    Attention: The left end cap is printed with information that helps identify the enclosure. v machine type and model v enclosure serial number v its machine part number The information on the end cap should always match the information printed on the rear of the enclosure, and it should also match the information that is stored on the enclosure midplane.
  • Page 270: Replacing A Control Enclosure Chassis

    The procedures for replacing a control enclosure chassis are different from those procedures for replacing an expansion enclosure chassis. For information about replacing an expansion enclosure chassis, see “Replacing an expansion enclosure chassis” on page 251.
  • Page 271 Electrical voltage and current from power, telephone, and communication cables are hazardous. To avoid a shock hazard: v Connect power to this unit only with the IBM provided power cord. Do not use the IBM provided power cord for any other product.
  • Page 272 Attention: Perform this procedure only if instructed to do so by a service action or the IBM support center. If you have a single control enclosure, this procedure requires that you shut down your system to replace the control enclosure. If you...
  • Page 273 b. Use the following CLI command to find the volumes that depend on this enclosure: lsdependentvdisks -enclosure <enclosure_id> Dependent volume names that start with IFS are file volumes that are used by the file modules to provide file systems. Turn off these file modules.
  • Page 274 “Procedure: Fixing node errors” on page 220. To restart a node from the service assistant, perform the following steps: 1) Log on to the service assistant. 2) From the home page, select the node that you want to restart from the Changed Node List.
  • Page 275: Replacing An Expansion Enclosure Chassis

    3) Select Actions > Restart. d. The system starts and can handle I/O requests from the host systems. Note: The configuration changes that are described in the following steps must be performed to ensure that the system is operating correctly. If you do not perform these steps, the system is unable to report certain errors.
  • Page 276 Electrical voltage and current from power, telephone, and communication cables are hazardous. To avoid a shock hazard: v Connect power to this unit only with the IBM provided power cord. Do not use the IBM provided power cord for any other product.
  • Page 277 Attention: If your system is powered on and performing I/O operations, go to the management GUI and follow the fix procedures. Performing the replacement actions without the assistance of the fix procedures can result in loss of data or access to data. Even though many of these procedures are hot-swappable, these procedures are intended to be used only when your system is not up and running and performing I/O operations.
  • Page 278: Replacing The Support Rails

    2. Record the location of the rail assembly in the rack cabinet. 3. Working from the back of the rack cabinet, remove the clamping screw 1 from the rail assembly on both sides of the rack cabinet.
  • Page 279: General Storage System Procedures

    Figure 64. Removing a rail assembly from a rack cabinet 4. Working from the front of the rack cabinet, remove the clamping screw from the rail assembly on both sides of the rack cabinet. 5. From one side of the rack cabinet, grip the rail and slide the rail pieces together to shorten the rail.
  • Page 280: San Problem Determination

    1. Ensure that the Fibre Channel cable is securely connected at each end. 2. Replace the Fibre Channel cable. 3. Replace the SFP transceiver for the failing port on the Storwize V7000 Unified node. Note: Storwize V7000 Unified nodes are supported with both longwave SFP transceivers and shortwave SFP transceivers.
  • Page 281: Ethernet Iscsi Host-Link Problems

    Ethernet iSCSI host-link problems If you are having problems attaching to the Ethernet hosts, your problem might be related to the network, the Storwize V7000 Unified system, or the host. Before you begin For network problems, you can attempt any of the following actions: v Test your connectivity between the host and Storwize V7000 Unified ports.
  • Page 282: When To Run The Recover System Procedure

    Attention: If you experience failures at any time while you are running the recover system procedure, call the IBM Support Center. Do not attempt to do further recovery actions because these actions might prevent IBM Support from restoring the system to an operational status.
  • Page 283 Certain conditions must be met before you run the recovery procedure. Use the following items to help you determine when to run the recovery procedure: v Check to see if any node in the system has a node status of active. This status means that the system is still available.
  • Page 284: Fix Hardware Errors

    Note: If after resolving all these scenarios, half or greater than half of the nodes are reporting node error 578, it is appropriate to run the recovery procedure. You can also call IBM Support for further assistance. – For any nodes that are reporting a node error 550, ensure that all the missing hardware that is identified by these errors is powered on and connected without faults.
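The counting condition in the note above (half or more of the nodes report node error 578) can be sketched in Python as follows. The node names and the mapping from node to reported error codes are hypothetical. The sketch encodes only this one condition and does not replace the other checks in this procedure or the advice of IBM Support.

```python
from __future__ import annotations


def half_or_more_report(node_errors: dict[str, list[int]], code: int = 578) -> bool:
    """Return True if at least half of the nodes report the given node error."""
    if not node_errors:
        return False
    reporting = sum(1 for errors in node_errors.values() if code in errors)
    # Compare with integers to avoid rounding: reporting >= total / 2.
    return 2 * reporting >= len(node_errors)


# Hypothetical node states, for illustration only.
print(half_or_more_report({"node1": [578], "node2": []}))
print(half_or_more_report({"node1": [578], "node2": [], "node3": [550]}))
```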
  • Page 285: Performing System Recovery Using The Service Assistant

    Attention: This service action has serious implications if not performed properly. If at any time an error is encountered not covered by this procedure, stop and call IBM Support. Note: The web browser must not block pop-up windows, otherwise progress windows cannot open.
  • Page 286 “Recovering from offline VDisks using the CLI” on page 263 for details. T3 failed Call IBM Support. Do not attempt any further action. Run the recovery from any node canisters in the system; the node canisters must not have participated in any other system.
  • Page 287: Recovering From Offline Vdisks Using The Cli

    Perform the following steps to recover an offline volume after the recovery procedure has completed: 1. Delete all IBM FlashCopy function mappings and Metro Mirror or Global Mirror relationships that use the offline volumes. 2. Run the recovervdisk or recovervdiskbysystem command.
  • Page 288: Backing Up And Restoring The System Configuration

    Before using the file volumes that are used by GPFS on the file modules to provide Network Attached Storage (NAS), perform the following task: v Contact IBM support for assistance with recovering the GPFS quorum state so that access to files as NAS can be restored.
  • Page 289 Contact the IBM support center to help you prepare the Storwize V7000 Unified system to do the restoring of the system configuration on the control enclosure.
  • Page 290: Backing Up The System Configuration Using The Cli

    Typically the restoration should be performed via canister 1. Before you begin, hardware recovery must be complete. The following hardware must be operational: hosts, Storwize V7000 Unified, drives, the Ethernet network, and the SAN fabric. Backing up the system configuration using the CLI You can back up your configuration data using the command-line interface (CLI).
  • Page 291 data. This can be attempted via the <Recover System Procedure> also known as a Tier 3 (T3) procedure. Restoring the system configuration without attempting to recover the application data is performed via the <Restoring the System Configuration> procedure also known as a Tier 4 (T4) recovery. Both of these procedures require a recent backup of the configuration data.
  • Page 292: Deleting Backup Configuration Files Using The Cli

    2. Issue the following CLI command to erase all of the files that are stored in the /tmp directory: svconfig clear -all
  • Page 293: Chapter 6. Call Home And Remote Support

    6. Save the new configuration by clicking the OK button. Results Configuring the remote support system IBM Storwize V7000 Unified uses IBM Tivoli Assist On Site software to establish remote connections to IBM support representatives. Establishing an AOS connection Use this information to establish an AOS connection with IBM remote support for diagnosing and reviewing issues and problems on your system.
  • Page 294 Storwize V7000 Unified system. About this task Configure the system for a lights-out connection using the Enable IBM Tivoli Assist On-Site (AOS) task. After you configure the system, no other tasks are needed. The remote support contact might ask you for machine information, such as machine type and models, serial numbers, and your machine name.
  • Page 295 Enter the customer name, the case number (use the PMR number), and the geography. f. Talk to the IBM authorized servicer at the customer site to make sure that the servicer is ready to establish the link before you submit the form.
  • Page 297: Chapter 7. Recovery Procedures

    1. Check that the system is in good health by using the management GUI. Fix any hardware errors that do not require the root password. 2. Use the management GUI to identify the file module that is not the active management node and plug the KVM into that file module.
  • Page 298 8. From the KVM where you logged on as root, use the chrootpwd command to change the root password on both file modules. Results The chrootpwd program prompts you for the new root password.
  • Page 299: Resetting The Nas Ssh Key For Configuration Communications

    SCSI protocol. Before you begin During the USB initialization of the Storwize V7000 Unified system, one of the node canisters in the control enclosure creates a public/private key pair to use for ssh. The node canister stores the public key and writes the private key to the USB flash drive memory.
  • Page 300: Working With File Modules That Report A Stale Nfs File Handle

    - /sharename The ls command can return the following error: ls: .: Stale NFS file handle The Storwize V7000 Unified system hosting file module might display the following error: mgmt002st001 mountd[3055867]: refused mount request from hostname for sharename (/): not exported If one of these errors occurs, complete the following steps.
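On the client side, the "Stale NFS file handle" condition reported by ls corresponds to the operating-system error ESTALE. The following is a minimal sketch, not part of the IBM procedure, of how a client script might detect that condition before starting the recovery steps. The function names and the share path are hypothetical.

```python
import errno
import os


def is_stale_handle(exc: OSError) -> bool:
    """Return True when the OS error is ESTALE (stale NFS file handle)."""
    return exc.errno == errno.ESTALE


def check_share(path: str) -> str:
    """Try to list a mounted share and classify the outcome.

    Returns "ok", "stale" (ESTALE was raised), or "error" (any other OS error).
    """
    try:
        os.listdir(path)
    except OSError as exc:
        if is_stale_handle(exc):
            return "stale"
        return "error"
    return "ok"
```

A call such as check_share("/mnt/sharename") (hypothetical mount point) that returns "stale" suggests the client is in the state this section describes.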
  • Page 301: File Module-Related Issues

    This section covers the recovery procedures related to file module issues. Restoring System x firmware (BIOS) settings During critical repair actions such as the replacement of a system planar in an IBM Storwize V7000 Unified file module, you might have to reset the System x firmware.
  • Page 302: Recovering From File Systems That Are Offline After The Volumes Came Back Online

    13. Press ESC or click Exit Setup, and then press Enter. 14. When prompted, click Y to exit the setup menu. The system now reboots. During the reboot, the Storwize V7000 Unified code automatically modifies the configuration of the System x firmware (BIOS) to change the default settings to the required settings.
  • Page 303: Recovering From An Nfsd Service Error

    Use this procedure after completing the procedure in “Fibre Channel connectivity between file modules and control enclosure” on page 34. The Storwize V7000 Unified system can experience problems where the multipathd failures occur. If the paths are not automatically restored, a system reboot can recover the paths.
  • Page 304: Recovering From An Scm Error

    Use this procedure to recover from an httpd service error when the service is reported as unhealthy or off. About this task Procedure To fix the httpd error, perform the following steps: 1. Attempt to start the http service manually.
  • Page 305: Recovering From An Sshd_Data Service Error

    a. Log in as root. b. Issue the service http start command. 2. When you complete the service action, refer to “Health status and recovery” on page 23. Recovering from an sshd_data service error Use this procedure to recover from an sshd_data service error. About this task This recovery procedure starts the sshd_data when it is down.
  • Page 306: Control Enclosure-Related Issues

    About this task Procedure To run the fix procedures, perform the following steps: 1. Log in to the Storwize V7000 Unified management GUI. 2. Go to Monitoring > Events and click the Block tab. 3. Run any Next recommended action.
  • Page 307: Recovering From Offline Compressed Volumes

    Point in time block copies are a good candidate for deletion. Storwize V7000 Unified can virtualize external block storage controllers. If spare capacity is available on other block storage controllers then you can virtualize those and use that free local arrays.
  • Page 308: Recovering From A 1001 Error Code

    You can immediately remount any remaining unmounted file systems without waiting for IBM support to tell you that it is safe for you to re-enable the control enclosure CLI. Note: The management GUI can become very slow when the control enclosure CLI is restricted, so the following procedure shows how to use the management CLI to check if the file systems are mounted.
  • Page 309 -r -n <node name of the active mode> initnode -r 4. Log back on to the Storwize V7000 Unified CLI. Then wait for GPFS to be active on both file modules in the output of the CLI command:
  • Page 310: Restoring Data

    CLI is restricted. When you log on to the management GUI, it issues a warning that the Storwize V7000 CLI is restricted. The management GUI runs the fix procedure to direct you to send logs to IBM. The fix procedure directs you back to this procedure to make the file systems accessible again.
  • Page 311: Restoring Tivoli Storage Manager Data

    Site A by using the rmtask CLI command. Restoring Tivoli Storage Manager data The Storwize V7000 Unified system contains a Tivoli Storage Manager client that works with your Tivoli Storage Manager server system to perform high-speed data backup and recovery operations.
  • Page 312: Upgrade Recovery

    2. After each recommended fix, restart the upgrade by issuing the applysoftware command again. If the action fails, try the next recommended action. 3. If the recommended actions fail to resolve the issue, call the IBM Support Center. Table 43. Upgrade error codes from using the applysoftware command and recommended...
  • Page 313 Table 43. Upgrade error codes from using the applysoftware command and recommended actions (continued) The applysoftware Error Code command explanation Action EFSSG4102A The applysoftware command returned software package does not exist EFSSG4103 The software package is not The package might be valid.
  • Page 314: Upgrade Error Codes From Using The

    Error Code: EFSSG4160. Explanation: The system has insufficient file system space. Action: At least 3 GB of space is required. Remove unneeded files from the /var file system. Error Code: EFSSA0201C. Explanation: The license agreement has not been accepted.
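The EFSSG4160 condition comes down to a free-space comparison on /var. The following is a minimal sketch of that comparison, assuming "3 GB" means 3 GiB and assuming the check runs on a generic Linux host. It is an illustration only and is not a tool that the Storwize V7000 Unified system provides; the function names are hypothetical.

```python
import shutil

# Assumption: the "3 GB" requirement is interpreted here as 3 GiB.
REQUIRED_BYTES = 3 * 1024**3


def has_required_space(free_bytes: int, required: int = REQUIRED_BYTES) -> bool:
    """Return True when the free space meets or exceeds the requirement."""
    return free_bytes >= required


def path_has_space(path: str = "/var") -> bool:
    """Check the free space of the file system that contains `path`."""
    return has_required_space(shutil.disk_usage(path).free)
```

If path_has_space("/var") returns False, removing unneeded files from /var, as the table advises, is the next step.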
  • Page 315 2. After each recommended fix, restart the upgrade by issuing the applysoftware command again. If the action fails, try the next recommended action. 3. If the recommended actions fail to resolve the issue, call the IBM Support Center. Table 44. Upgrade error codes and recommended actions...
  • Page 316: Upgrade Error Codes And Recommended Actions

    If there is no obvious event that could have caused this error, refer to “Ethernet connectivity from file modules to the control enclosure” on page 29.
  • Page 317 Table 44. Upgrade error codes and recommended actions (continued) Error Code: 01B5. Explanation: Storwize V7000 multipaths are unhealthy. Action: Check the Fibre Channel connections to the system. Reseat Fibre Channel cables. For more information, see “Fibre Channel connectivity between file modules and control enclosure”...
  • Page 318 Unable to configure node. 1. Pull both power supply cables from subject node. Wait 10 seconds, then plug back in. After the system restarts, try again. 2. Contact your next level of support.
  • Page 319 Table 44. Upgrade error codes and recommended actions (continued) Error Code Explanation Action 01D0 Unable to disable call home. Contact your next level of support. 01D1 Unable to enable call home. Contact your next level of support. 01D2 Failed to stop GPFS. 1.
  • Page 321: Chapter 8. Troubleshooting Compressed File Systems

    Storage pool is full and the file system pool is offline: Increase capacity of the storage pool. Storage pool is full and the file system pool is offline, but no additional storage is available to add to the pool: Contact IBM Remote Technical Support or your service representative.
  • Page 322 2. Select the storage system to view a list of MDisks that are currently detected on the external storage system. If there are no MDisks that are displayed, click Detect MDisks. If the Storwize V7000 Unified system is attached to external storage systems, you can allocate additional LUNs.
  • Page 323: Recovery Scenario: Overestimated Compression Ratio

    3. Right-click an unmanaged MDisk and select Add to Pool. 4. On the Add to Pool dialog, select the pool and click Add to Pool. 5. Verify that the MDisk was added to the selected pool by expanding the pool and ensuring that the added MDisk is displayed.
  • Page 324 5. On the right side of the panel, under the Capacity heading, the real capacity for the compressed volume is displayed. The storage pool must have at least the real capacity of the volume to successfully migrate the data.
  • Page 325: Recovery Scenario: File System Offline

    To decrease the file system capacity, you can remove the disks (NSD) and the corresponding mapping to block volumes to force migration of the data to other NSDs, thus freeing up space on the file system. To remove an NSD, contact IBM Remote Technical Support.
  • Page 326 In the management GUI, select Files > File Systems. b. Right-click the compressed file system that is offline and select Mount. If the file system does not come back online you may need to restart all of the
  • Page 327: Monitoring File System Compression

    Step 1 and select Mark as... > Spare . e. Click OK. To add additional drives to the system, complete these steps: a. Acquire additional drives from IBM or vendor. b. Install drives into available drive slots on the enclosure. See Installing a hot-swap hard disk drive.
  • Page 328 Additionally you must also monitor file capacity utilization to ensure that the file system does not reach 100% utilization and run out of capacity. The capacity utilization of a file system issued physical capacity in the compressed pool. The
  • Page 329 system uses the same threshold and alerting system and suggests corrective actions when thresholds are reached. It is based on the original, uncompressed capacity that the system presents to users and applications of the file system. To free up capacity in a file system, you can either delete files from the file system or increase the current capacity of the storage pool, which can be used to expand the volumes that are related to the NSDs from the unused physical capacity.
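The threshold-based monitoring described above can be illustrated with a short calculation. This is a minimal sketch under stated assumptions: the 80% and 90% thresholds and the function names are hypothetical and are not taken from the product, which applies its own configured thresholds.

```python
def utilization_percent(used: float, total: float) -> float:
    """Return capacity utilization as a percentage of total capacity."""
    if total <= 0:
        raise ValueError("total capacity must be positive")
    return 100.0 * used / total


def classify(used: float, total: float,
             warning: float = 80.0, critical: float = 90.0) -> str:
    """Classify utilization against hypothetical warning and critical thresholds."""
    pct = utilization_percent(used, total)
    if pct >= critical:
        return "critical"
    if pct >= warning:
        return "warning"
    return "ok"
```

Because used and total must share a unit, the same functions apply whether the values are reported in GB or TB.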
  • Page 331: Appendix. Accessibility Features For Ibm Storwize V7000 Unified

    Accessibility features help a user who has a physical disability, such as restricted mobility or limited vision, to use software products successfully. Accessibility features These are the major accessibility features associated with the Storwize V7000 Unified Information Center: You can use screen-reader software and a digital speech synthesizer to hear what is displayed on the screen.
  • Page 333: Notices

    Web sites. The materials at those Web sites are not part of the materials for this IBM product and use of those Web sites is at your own risk.
  • Page 334 IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring any obligation to you. Licensees of this program who wish to have information about it for the purpose of enabling: (i) the exchange of information between independently created...
  • Page 335: Trademarks

    IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these programs. The sample programs are provided "AS IS", without warranty of any kind. IBM shall not be liable for any damages arising out of your use of the sample programs.
  • Page 336: Industry Canada Compliance Statement

    Properly shielded and grounded cables and connectors must be used in order to meet FCC emission limits. IBM is not responsible for any radio or television interference caused by using other than recommended cables and connectors, or by unauthorized changes or modifications to this equipment.
  • Page 337: Germany Electromagnetic Compatibility Directive

    Klasse A ein. Um dieses sicherzustellen, sind die Geräte wie in den Handbüchern beschrieben zu installieren und zu betreiben. Des Weiteren dürfen auch nur von der IBM empfohlene Kabel angeschlossen werden. IBM übernimmt keine Verantwortung für die Einhaltung der Schutzanforderungen, wenn das Produkt ohne Zustimmung der IBM verändert bzw.
  • Page 338: Japan Vcci Council Class A Statement

    Statement International Electrotechnical Commission (IEC) statement This product has been designed and built to comply with (IEC) Standard 950. Korean Communications Commission (KCC) Class A Statement Russia Electromagnetic Interference (EMI) Class A Statement
  • Page 339: Taiwan Class A Compliance Statement

    Fax: 0049 (0)711 785 1283 Email: tjahn@de.ibm.com Taiwan Contact Information This topic contains the product service contact information for Taiwan. IBM Taiwan Product Service Contact Information: IBM Taiwan Corporation 3F, No 7, Song Ren Rd., Taipei Taiwan Tel: 0800-016-888...
  • Page 341: Index

    185 disability establishing 269 CLI commands accessibility xix, 307 apply software command 191 configuration documentation authenticating installation problems 20 improvement xxiii trouble 173 client access drive characteristics pinging 175 best practices 5 drive, hot-swap, installing 127
  • Page 342 195 240 VA safety cover 163 events installing reporting 193 10-Gbps Ethernet PCI adapter 112 expansion enclosure DIMM 139 Germany electronic emission compliance detection error 209 Fibre Channel PCI adapter 112 statement 313
  • Page 343 38 mirrored volumes light path diagnostics 41 not identical 209 system status 213 multipath events legal notices outputs 279 IBM virtual media key Notices 309 removal 103 trademarks 311 replacing 104 light path diagnostics identifying LEDs 41...
  • Page 344 206 operator information panel accessing 307 service assistant assembly 147, 148 accessing 184, 223 parts interface 183 overview 224 supported browsers 208 preparing 224 query status command 192 when to use 183
  • Page 345 257 error codes 288 Storwize V7000 46 recovery 288 hardware indicators 46 USB flash drive Storwize V7000 Unified library detection error 210 related publications xx USB key superuser using 186 password when to use 186 resetting 211...
  • Page 348 Part Number: 00AR050 Printed in USA GA32-1057-07...
