HP ProLiant Server Troubleshooting Manual

HP ProLiant Servers

Troubleshooting Guide

June 2006 (Fifth Edition)
Part Number 375445-005

Summary of Contents for HP ProLiant Server

  • Page 1: Troubleshooting Guide

    HP ProLiant Servers Troubleshooting Guide June 2006 (Fifth Edition) Part Number 375445-005...
  • Page 2 © Copyright 2004-2006 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. The only warranties for HP products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. HP shall not be liable for technical or editorial errors or omissions contained herein.
  • Page 3: Table Of Contents

    Contents Introduction ... 10 What's new... 10 Revision history ... 10 375445-xx4 (May 2006)... 10 375445-xx3 (September 2005) ... 10 Getting started... 11 Pre-diagnostic steps ... 12 Important safety information... 12 Symptom information ... 14 Prepare the server for diagnosis ... 14 Common problem resolution ...
  • Page 4 HP Systems Insight Manager ... 59 Management Agents... 59 HP ProLiant Essentials Virtualization Management Software ... 59 HP ProLiant Essentials Server Migration Pack - Physical to ProLiant Edition... 60 HP BladeSystem Essentials Insight Control Data Center Edition ... 60 HP Control Tower ... 60 System Management homepage...
  • Page 5 Survey Utility... 62 Integrated Management Log ... 62 Array Diagnostic Utility ... 63 Remote support and analysis tools ... 63 HP Instant Support Enterprise Edition... 63 Web-Based Enterprise Service... 63 Open Services Event Manager ... 63 Keeping the system current ... 63 Drivers ...
  • Page 6 Teardown procedures, part numbers, specifications ... 72 Technical topics... 72 Error messages ... 73 ADU error messages... 73 Introduction to ADU error messages ... 73 Accelerator Board not Detected... 73 Accelerator Error Log ... 73 Accelerator Parity Read Errors: X ... 73 Accelerator Parity Write Errors: X ...
  • Page 7 Drive Time-Out Occurred on Physical Drive Bay X... 80 Drive X Indicates Position Y... 80 Duplicate Write Memory Error ... 80 Error Occurred Reading RIS Copy from SCSI Port X Drive ID ... 80 FYI: Drive (Bay) X is Third-Party Supplied ... 80 Identify Logical Drive Data did not Match with NVRAM...
  • Page 8 Swapped Cables or Configuration Error Detected. A Drive Rearrangement..88 Swapped Cables or Configuration Error Detected. An Unsupported Drive Arrangement Was Attempted... 88 Swapped cables or configuration error detected. The cables appear to be interchanged..88 Swapped cables or configuration error detected. The configuration information on the attached drives..89 Swapped Cables or Configuration Error Detected.
  • Page 9 MSG_CPU_RR_15 ... 136 MSG_CPU_RR_16 ... 136 MSG_CPU_RR_17 ... 137 Contacting HP ... 138 Contacting HP technical support or an authorized reseller ... 138 Customer self repair... 138 Server information you need ... 139 Operating system information you need ... 139 Microsoft operating systems ...
  • Page 10: Introduction

    What's new... 10 Revision history ... 10 What's new The fifth edition of the HP ProLiant Servers Troubleshooting Guide, part number 375445-xx5, includes the following additions: c-Class server blade power-on problems flowchart (on page 25) c-Class server blade POST problems flowchart (on page 28) c-Class server blade fault indications flowchart (on page 31) Windows®...
  • Page 11: Getting Started

    Use this section to locate a complete list of ADU error messages (on page 73), POST error messages and beep codes (on page 92), event list error messages (on page 124), HP BladeSystem infrastructure error codes (on page 127), and Port 85 codes and iLO messages (on page 131).
  • Page 12: Pre-Diagnostic Steps

    Pre-diagnostic steps WARNING: To avoid potential problems, ALWAYS read the warnings and cautionary information in the server documentation before removing, replacing, reseating, or modifying system components. IMPORTANT: This guide provides information for multiple servers. Some information may not apply to the server you are troubleshooting.
  • Page 13 Warnings and cautions WARNING: Only authorized technicians trained by HP should attempt to repair this equipment. All troubleshooting and repair procedures are detailed to allow only subassembly/module-level repair. Because of the complexity of the individual boards and subassemblies, no one should attempt to make repairs at the component level or to make modifications to any printed wiring board.
  • Page 14: Symptom Information

    NOTE: To verify the server configuration, connect to the System Management homepage (on page 61) and select Version Control Agent. The VCA gives you a list of names and versions of all installed HP drivers, Management Agents, and utilities, and whether they are up to date.
  • Page 15: Common Problem Resolution

    Updating firmware To update the system ROM or option firmware, use HP Smart Components. These components are available on the Firmware Maintenance CD and the HP website (http://www.hp.com/support). The most recent version of a particular server or option firmware is available on the following: HP Support website (http://www.hp.com/support)
  • Page 16: Hard Drive Guidelines

    Components for option firmware updates are also available from the HP Storage Products Software and Drivers website (http://www.hp.com/support/proliantstorage). Find the most recent version of the component that you require. Components for controller firmware updates are available in offline and online formats.
  • Page 17: Sas And Sata Hard Drive Led Combinations

    The drive is part of an array being selected by an array configuration utility Drive Identification has been selected in HP SIM The drive firmware is being updated The drive has been placed offline due to hard disk drive failure or subsystem communication failure.
  • Page 18 Online/activity LED Fault/UID LED (green) (amber/blue) Flashing irregularly Amber, flashing regularly (1 Hz) Flashing irregularly Off Steadily amber Amber, flashing regularly (1 Hz) Interpretation The drive is active, but a predictive failure alert has been received for this drive. Replace the drive as soon as possible. The drive is active, and it is operating normally.
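The two LED combinations whose interpretation is legible in the excerpt above can be expressed as a small lookup table. The following is a minimal Python sketch, not part of the HP guide: the names `LED_TABLE` and `interpret_leds` are hypothetical, and combinations that are cut off in the excerpt are deliberately left out rather than guessed.

```python
# Hypothetical lookup for SAS/SATA drive LED states. Only the two
# combinations whose interpretation is legible in the excerpt are listed.
_RAW_TABLE = {
    ("Flashing irregularly", "Amber, flashing regularly (1 Hz)"):
        "The drive is active, but a predictive failure alert has been "
        "received for this drive. Replace the drive as soon as possible.",
    ("Flashing irregularly", "Off"):
        "The drive is active, and it is operating normally.",
}


def _norm(text):
    """Lower-case and collapse whitespace so lookups are forgiving."""
    return " ".join(text.lower().split())


# Keys are (online/activity LED, fault/UID LED), both normalised.
LED_TABLE = {(_norm(a), _norm(b)): msg for (a, b), msg in _RAW_TABLE.items()}


def interpret_leds(online_activity, fault_uid):
    """Return the interpretation for an LED pair, or None if not listed."""
    return LED_TABLE.get((_norm(online_activity), _norm(fault_uid)))
```

A pair that is not in the table returns `None`, which signals that the server documentation should be consulted instead.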
  • Page 19: Diagnostic Flowcharts

    Troubleshooting flowcharts ... 19 Troubleshooting flowcharts To effectively troubleshoot a problem, HP recommends that you start with the first flowchart in this section, "Start diagnosis flowchart (on page 20)," and follow the appropriate diagnostic path. If the other flowcharts do not provide a troubleshooting solution, follow the diagnostic steps in "General diagnosis flowchart (on page 20)."...
  • Page 20: Start Diagnosis Flowchart

    Start diagnosis flowchart Use the following flowchart to start the diagnostic process. General diagnosis flowchart Diagnostic flowcharts 20...
  • Page 21 The General diagnosis flowchart provides a generic approach to troubleshooting. If you are unsure of the problem, or if the other flowcharts do not fix the problem, use the following flowchart...
  • Page 22: Power-On Problems Flowchart

    Power-on problems flowchart. Server power-on problems flowchart. Symptoms: The server does not power on. The system power LED is off or amber. The external health LED is red or amber. The internal health LED is red or amber. NOTE: For the location of server LEDs and information on their statuses, refer to the server documentation. Possible causes: Improperly seated or faulty power supply; loose or faulty power cord...
  • Page 23 p-Class server blade power-on problems flowchart. Symptoms: The server does not power on. The system power LED is off or amber. The health LED is red or amber. NOTE: For the location of server LEDs and information on their statuses, refer to the server documentation. Possible causes: Improperly seated or faulty power supply...
  • Page 24 Loose or faulty power cord; power source problem; power-on circuit problem; improperly seated component or interlock problem; faulty internal component...
  • Page 25 c-Class server blade power-on problems flowchart. Symptoms: The server does not power on. The system power LED is off or amber. The health LED is red or amber. NOTE: For the location of server LEDs and information on their statuses, refer to the server documentation. Possible causes: Improperly seated or faulty power supply; loose or faulty power cord...
  • Page 26: Post Problems Flowchart

    POST problems flowchart. Symptoms: Server does not complete POST (NOTE: The server has completed POST when the system attempts to access the boot device); server completes POST with errors. Possible problems: Improperly seated or faulty internal component; faulty KVM device; faulty video device...
  • Page 27 Server and p-Class server blade POST problems flowchart...
  • Page 28: Operating System Boot Problems Flowchart

    c-Class server blade POST problems flowchart. Operating system boot problems flowchart. Symptoms: Server does not boot a previously installed OS; server does not boot SmartStart. Possible causes: Corrupted OS; hard drive subsystem problem; incorrect boot order setting in RBSU. There are two ways to use SmartStart when diagnosing OS boot problems on a server blade:...
  • Page 29: Server Fault Indications Flowchart

    Use iLO to remotely attach virtual devices to mount the SmartStart CD onto the server blade. Use a local I/O cable and drive to connect to the server blade, and then restart the server blade. Server fault indications flowchart Symptoms: Server boots, but a fault event is reported by Insight Management Agents (on page 59) Server boots, but the internal health LED, external health LED, or component health LED is red or amber...
  • Page 30 NOTE: For the location of server LEDs and information on their statuses, refer to the server documentation. Possible causes: Improperly seated or faulty internal or external component; unsupported component installed; redundancy failure; system overtemperature condition. Server and p-Class server blade fault indications flowchart...
  • Page 31 c-Class server blade fault indications flowchart...
  • Page 32: Hardware Problems

    Hardware problems In this section Procedures for all ProLiant servers ... 32 Power problems ... 32 General hardware problems... 33 Internal system problems ... 35 System open circuits and short circuits ... 43 External device problems ... 43 Procedures for all ProLiant servers The procedures in this section are comprehensive and include steps about or references to hardware features that may not be supported by the server you are troubleshooting.
  • Page 33: Ups Problems

    UPS problems UPS is not working properly Action: Be sure the UPS batteries are charged to the proper level for operation. See the UPS documentation for details. Be sure the UPS power switch is in the On position. See the UPS documentation for the location of the switch.
  • Page 34: Unknown Problem

    After you check the settings in RBSU, save and exit the utility, and then restart the server. For more information on RBSU, refer to the HP ROM-Based Setup Utility User Guide on the Documentation CD or the HP website (http://www.hp.com/servers/smartstart).
  • Page 35: Third-Party Device Problems

    If the system boots and video is working, add each component back to the server one at a time, restarting the server after each component is added to determine if that component is the cause of the problem. When adding each component back to the server, be sure to disconnect power to the server and follow the guidelines and cautionary information in the server documentation.
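The add-back procedure described above is a linear isolation search: restore one component, restart, and check whether the fault returns. The following is a minimal Python sketch under stated assumptions. The function name `find_faulty_component` and the `boots_ok` callback are hypothetical; in practice the callback stands for a person powering down the server, installing the part, restarting, and observing the result.

```python
def find_faulty_component(components, boots_ok):
    """Add components back one at a time and return the first one after
    which the server no longer boots, or None if none causes a failure.

    components: parts in the order they will be reinstalled.
    boots_ok:   hypothetical callback that receives the list of currently
                installed parts and returns True if the server boots and
                video works.
    """
    installed = []
    for component in components:
        # In practice: disconnect power, install the part, restart.
        installed.append(component)
        if not boots_ok(list(installed)):
            return component
    return None
```

Because only one part changes between restarts, the first failing restart points at the part that was just added.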
  • Page 36: Diskette Drive Problems

    Be sure no loose connections (on page 15) exist. Be sure the media from which you are attempting to boot is not damaged and is a bootable CD. If attempting to boot from a USB CD-ROM drive: Refer to the operating system and server documentation to be sure both support booting from a USB CD-ROM drive.
  • Page 37: Tape Drive Problems

    If the issue is resolved, it is not necessary to complete the remaining actions. All actions may not apply to all tape drives. For detailed tape drive troubleshooting information, see the HP website (http://www.hp.com/support/proliantstorage). To download HP StorageWorks Library and Tape Tools, see the HP website (http://www.hp.com/support/tapetools). Stuck tape issue Action: Manually press the Eject button.
  • Page 38 Run the Acceptance Test in HP StorageWorks Library and Tape Tools. CAUTION: Running the Acceptance Test overwrites the tape. To avoid overwriting the tape, run the shorter Device Analysis Test instead. Run the Media Validation Test in HP StorageWorks Library and Tape Tools. Backup issue Action: Run the Acceptance Test in HP StorageWorks Library and Tape Tools.
  • Page 39: Hard Drive Problems

    Action: Check the LEDs on the hard drive to be sure they indicate normal function. Refer to the server documentation or the HP website (http://www.hp.com) for information on hard drive LEDs. Be sure no loose connections (on page 15) exist.
  • Page 40: Fan Problems

    Be sure the files are not corrupt. Run the repair utility for the operating system. Be sure no viruses exist on the server. Run a current version of a virus scan utility. Server response time is slower than usual Action: Be sure the hard drive is not full, and increase the amount of free space on the hard drive, if needed.
  • Page 41: Memory Problems

    DIMMs. Then, isolate the failed DIMM by switching each DIMM in a bank with a known working DIMM. Remove any third-party memory. Run HP Insight Diagnostics (on page 61) to test the memory. Server is out of memory Action: Be sure the memory is configured properly. Refer to the application documentation to determine the memory configuration requirements.
  • Page 42: Ppm Problems

    Be sure the memory is the correct type for the server and is installed according to the server requirements. Refer to the server documentation or HP website (http://www.hp.com). Be sure you have not exceeded the memory limits of the server or operating system. Refer to the server documentation.
  • Page 43: System Open Circuits And Short Circuits

    Replace the remaining processor with a known functional processor. If the problem is resolved after you restart the server, a fault exists with one or more of the original processors. Install each processor and its associated PPM (if applicable) one by one, restarting each time, to find the faulty processor or processors.
  • Page 44: Mouse And Keyboard Problems

    Be sure a video expansion board, such as a RILOE board, has not been added to replace onboard video, making it seem like the video is not working. Disconnect the video cable from the onboard video, and then reconnect it to the video jack on the expansion board. NOTE: All servers automatically bypass onboard video when a video expansion board is present.
  • Page 45: Audio Problems

    Action: Be sure the correct printer drivers are installed. Local I/O cable problems NOTE: The local I/O cable is used only with HP ProLiant p-Class server blades. Action: If the local I/O cable does not have hot-plug functionality, be sure you are not using a PS/2 keyboard or mouse.
  • Page 46 Data is displayed as garbled characters after the connection is established Action: Be sure both modems have the same settings, including speed, data, parity, and stop bits. Be sure the software is set for the correct terminal emulation. Reconfigure the software correctly. Restart the server.
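Garbled characters usually mean the two ends disagree on at least one line parameter, so it helps to compare the settings field by field. The following is a minimal Python sketch; the `SerialSettings` class and `mismatched_fields` function are hypothetical names for illustration and do not correspond to any HP or modem vendor tool.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class SerialSettings:
    """The four parameters the guide says must match on both modems."""
    speed: int       # bits per second, e.g. 9600
    data_bits: int   # e.g. 7 or 8
    parity: str      # e.g. "N", "E", "O"
    stop_bits: int   # e.g. 1 or 2


def mismatched_fields(local, remote):
    """Return the names of the settings that differ, in field order."""
    a, b = asdict(local), asdict(remote)
    return [name for name in a if a[name] != b[name]]
```

An empty result means the line parameters agree, in which case the terminal emulation setting is the next thing to check.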
  • Page 47: Network Controller Problems

    Action: Check the network controller LEDs to see if any statuses indicate the source of the problem. For LED information, refer to the network controller documentation. ("HP Insight Diagnostics" on page 61) and replace failed components as Hardware problems 47...
  • Page 48 Refer to the operating system documentation to be sure that the driver parameters match the configuration of the network controller. Problems are occurring with the network interconnect blades Action: Be sure the network interconnect blades are properly seated and connected. ("HP Insight Diagnostics" on page 61) and replace failed components as Hardware problems 48...
  • Page 49: Software Problems

    Other useful resources include HP Insight Diagnostics (on page 61) and HP SIM ("HP Systems Insight Manager" on page 59). Use either utility to gather critical system hardware and software information and to help with problem diagnosis.
  • Page 50: Operating System Updates

    Hyper-Threading Linux distributions Additional information and the appropriate solutions for the Linux distributions (if any) can be found at the HP website (http://h18004.www1.hp.com/products/servers/linux/processor-notes.html). Operating system updates Use care when applying operating system updates (Service Packs, hotfixes, and patches). Before updating the operating system, read the release notes for each update.
  • Page 51: Restoring To A Backed-Up Version

    Install the current drivers. If you apply the update and have problems, refer to the Software and Drivers Download website (http://h18007.www1.hp.com/support/files/server) to find files to correct the problems. Restoring to a backed-up version If you recently upgraded the operating system or software and cannot resolve the problem, you can try restoring a previously saved version of the system.
  • Page 52: Linux Operating Systems

    Linux—Refer to the operating system documentation for information. Linux operating systems For troubleshooting information specific to Linux operating systems, refer to the Linux for ProLiant website (http://h18000.www1.hp.com/products/servers/linux). Application software problems Software locks up Action: Check the application log and operating system log for entries indicating why the software failed.
  • Page 53: Command-Line Syntax Error

    If the target system is not listed in the supported servers list, an error message is displayed and the program exits. Only supported systems can be upgraded using the Remote ROM Flash utility. To see if the server is supported, refer to the Software and Drivers Download website (http://h18007.www1.hp.com/support/files/server). Software problems 53...
  • Page 54: Software Tools And Solutions

    Servers running Microsoft® operating systems require Internet Explorer 5.5 (with Service Pack 1) or later. For Linux servers, refer to the README.TXT file for additional browser and support information. For more information, refer to the HP Array Configuration Utility User Guide on the Documentation CD or the HP website (http://www.hp.com).
  • Page 55: Smartstart Scripting Toolkit

    Selecting the primary boot controller Configuring memory options Language selection For more information on RBSU, refer to the HP ROM-Based Setup Utility User Guide on the Documentation CD or the HP website (http://www.hp.com/servers/smartstart). Using RBSU The first time you power up the server, the system prompts you to enter RBSU and select a language.
  • Page 56: Configuring Online Spare Memory

    RBSU by pressing the F9 key when prompted. After the settings are selected, exit RBSU and allow the server to reboot automatically. For more information, refer to the HP ROM-Based Setup Utility User Guide on the Documentation CD or the HP website (http://www.hp.com/servers/smartstart).
  • Page 57: Option Rom Configuration For Arrays

    It enables you to perform imaging or scripting functions and maintain software images. For more information about the RDP, refer to the HP ProLiant Essentials Rapid Deployment Pack CD or refer to the HP website (http://www.hp.com/servers/rdp).
  • Page 58: Management Cd

    ASR increases server availability by restarting the server within a specified time after a system hang or shutdown. At the same time, the HP SIM console notifies you by sending a message to a designated pager number that ASR has restarted the system. You can disable ASR from the HP SIM console or through RBSU.
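The ASR behavior described above resembles a watchdog timer: if the system stops responding for a set time, it is restarted and a notification is sent. The following Python sketch only models that idea and is not HP's implementation. The class name `AsrWatchdog`, the callbacks, and the timeout value used in the example are all assumptions.

```python
class AsrWatchdog:
    """Toy model of a restart-on-hang timer (not HP's ASR code)."""

    def __init__(self, timeout_s, restart, notify):
        self.timeout_s = timeout_s      # assumed hang threshold in seconds
        self._restart = restart         # hypothetical restart callback
        self._notify = notify           # hypothetical notification callback
        self._last_heartbeat = 0.0

    def heartbeat(self, now):
        """Record that the system was responsive at time `now`."""
        self._last_heartbeat = now

    def poll(self, now):
        """Restart and notify if no heartbeat arrived within the timeout."""
        if now - self._last_heartbeat >= self.timeout_s:
            self._restart()
            self._notify("ASR has restarted the system")
            self._last_heartbeat = now  # start a fresh interval after restart
            return True
        return False
```

Passing the current time in explicitly keeps the sketch deterministic; a real timer would read a hardware or system clock.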
  • Page 59: Erase Utility

    HP SIM provides device management capabilities that consolidate and integrate management data from HP and third-party devices. IMPORTANT: You must install and use HP SIM to benefit from the Pre-Failure Warranty for processors, SAS and SCSI hard drives, and memory modules.
  • Page 60: Hp Proliant Essentials Server Migration Pack - Physical To Proliant Edition

    HP Control Tower is an all-in-one software package that provides management and deployment for HP BladeSystem and its ProLiant BL p-Class server blades. Built on Linux, it delivers an easy-to-use interface tailored to blades and optimized for Linux users. HP Control Tower enables operating system deployment using both standard installation and image-based technologies.
  • Page 61: System Management Homepage

    (https://localhost:2381). USB support HP provides both standard USB support and legacy USB support. Standard support is provided by the operating system through the appropriate USB device drivers. HP provides support for USB devices before the operating system loads through legacy USB support, which is enabled by default in the system ROM.
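Before opening the System Management homepage address quoted above (https://localhost:2381) in a browser, it can be useful to confirm that something is listening on that port. The following is a minimal Python sketch using only the standard library; the function name `port_is_open` is hypothetical, and the check only tests TCP reachability, not whether the service behind the port is healthy.

```python
import socket


def port_is_open(host="localhost", port=2381, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        # create_connection raises OSError on refusal or timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A `False` result suggests the management software is not installed or not running, or that a firewall is blocking the port.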
  • Page 62: Survey Utility

    If a significant change occurs between data-gathering intervals, the Survey Utility marks the previous information and overwrites the Survey text files to reflect the latest changes in the configuration. Survey Utility is installed with every SmartStart-assisted installation or can be installed through the HP PSP ("ProLiant Support Packs"...
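Detecting a change between data-gathering intervals amounts to comparing two configuration snapshots. The following Python sketch illustrates that comparison only; it is not the Survey Utility's actual logic or file format, and the function name `changed_keys` and the example keys are hypothetical.

```python
_MISSING = object()  # sentinel so a key that is absent differs from any value


def changed_keys(previous, current):
    """Return the sorted names of settings that were added, removed, or
    changed between two snapshots given as dictionaries."""
    keys = set(previous) | set(current)
    return sorted(
        k for k in keys
        if previous.get(k, _MISSING) != current.get(k, _MISSING)
    )
```

For example, comparing `{"rom": "A", "memory_mb": 2048}` with `{"rom": "B", "memory_mb": 2048, "nic": "added"}` reports `nic` and `rom` as changed.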
  • Page 63: Array Diagnostic Utility

    ISEE is a proactive remote monitoring and diagnostic tool to help manage your systems and devices, a feature of HP support. ISEE provides continuous hardware event monitoring and automated notification to identify and prevent potential critical problems. Through remote diagnostic scripts and vital system configuration information collected about your systems, ISEE enables fast restoration of your systems.
  • Page 64: Version Control

    Software and Drivers Download website (http://h18007.www1.hp.com/support/files/server). Change control and proactive notification HP offers Change Control and Proactive Notification to notify customers 30 to 60 days in advance of upcoming hardware and software changes on HP commercial products. For more information, refer to the HP website (http://h18023.www1.hp.com/solutions/pcsolutions/pcn.html).
  • Page 65: Care Pack

    Care Pack HP Care Pack Services offer upgraded service levels to extend and expand standard product warranty with easy-to-buy, easy-to-use support packages that help you make the most of your server investments. Refer to the Care Pack website (http://www.hp.com/hps/carepack/servers/cp_proliant.html). Firmware maintenance HP has developed technologies that ensure that HP servers provide maximum uptime with minimal maintenance.
  • Page 66: Methods For Updating Firmware

    For additional information, refer to the HP Online ROM Flash User Guide on the HP website (http://h18023.www1.hp.com/support/files/server/us/romflash.html). Option ROMs Smart Components for option ROMs provide for efficient administration of option ROM upgrades. Types of option ROMs include: Array controller ROMs...
  • Page 67: Current Firmware Versions

    NOTE: Option ROMPaqs have been retired as an upgrade delivery method for storage options. Firmware upgrades for storage options are now delivered using Smart Components and Smart Component deployment utilities. For additional information about the ROMPaq utility, refer to the server documentation or the HP website (http://www.hp.com/support/files). ROM Update Utility The ROM Update Utility is offline ROM flash technology.
  • Page 68 Verify the firmware update by checking the version of the current firmware. Software tools and solutions 68...
  • Page 69: Hp Resources For Troubleshooting

    Care Pack HP Care Pack Services offer upgraded service levels to extend and expand standard product warranty with easy-to-buy, easy-to-use support packages that help you make the most of your server investments. Refer to the Care Pack website (http://www.hp.com/hps/carepack/servers/cp_proliant.html).
  • Page 70: White Papers

    HP Technical Documentation website (http://www.docs.hp.com) Installation and configuration information for the server management system Refer to the HP Systems Insight Manager Installation and User Guide on the Management CD or the HP website (http://www.hp.com/go/hpsim). Installation and configuration information for the server setup software...
  • Page 71: Management Of The Server

    Operating system version support Refer to the operating system support matrix (http://www.hp.com/go/supportos). Overview of server features and installation instructions Refer to the server user guide on the Documentation CD or on the HP Business Support Center website (http://www.hp.com/go/bizsupport). Power capacity Refer to the power calculator on the HP Enterprise Configurator website (http://h30099.www3.hp.com/configurator/).
  • Page 72: Server And Option Specifications, Symbols, Installation Warnings, And Notices

    Refer to the server documentation and printed notices. Printed notices are available in the Reference Information pack. Server documentation is available in the following locations: Documentation CD that ships with the server HP Business Support Center website (http://www.hp.com/go/bizsupport) HP Technical Documentation website (http://www.docs.hp.com) Teardown procedures, part numbers, specifications...
  • Page 73: Adu Error Messages

    ADU error messages... 73 POST error messages and beep codes... 92 Event list error messages ... 124 HP BladeSystem infrastructure error codes... 127 Port 85 codes and iLO messages ... 131 Windows® Event Log processor error codes ... 134 Insight Diagnostics processor error codes ... 135...
  • Page 74: Accelerator Parity Write Errors: X

    Accelerator Parity Write Errors: X Description: Number of times that write memory parity errors were detected during transfers to memory on the array accelerator board. Action: If many parity errors occurred, you may need to replace the array accelerator board. Accelerator Status: Cache was Automatically Configured During Last Controller Reset Description: Cache board was replaced with one of a different size.
  • Page 75: Accelerator Status: Obsolete Data Detected

    Description: The number of cache lines experiencing excessive ECC errors has reached a preset limit. Therefore, the cache has been shut down. Action: Reseat the cache to the controller. If the problem persists, replace the cache. Accelerator Status: Obsolete Data Detected Description: During reset initialization, obsolete data was found in the cache due to the drives being moved and written to by another controller.
  • Page 76: Accelerator Status: Valid Data Found At Reset

    Accelerator Status: Valid Data Found at Reset Description: Valid data was found in posted-write memory at reinitialization. Data will be flushed to disk. Action: No error or data loss condition exists. No action is required. Accelerator Status: Warranty Alert Description: Catastrophic problem exists with array accelerator board. Refer to other messages on Diagnostics screen for exact meaning of this message.
  • Page 77: Cache Has Been Disabled; Likely Caused By A Loose Pin On One Of The Ram Chips

    Cache Has Been Disabled; Likely Caused By a Loose Pin on One of the RAM Chips Description: Cache has been disabled due to a large number of ECC errors detected while testing the cache during POST. This is probably caused by a loose pin on one of the RAM chips. Action: Try reseating the cache to the controller.
  • Page 78: Controller Reported Post Error. Error Code: X

    page 63) examines each physical drive and looks for drives that have been moved to a different drive bay. Action: Look for messages indicating which drives have been moved. If no messages are displayed and drive swapping did not occur, run ACU controller and run the server setup utility to configure NVRAM.
  • Page 79: Drive (Bay) X Is A Replacement Drive

    If the problem persists, power down the system and replace the cable. If the problem persists, power down the system and replace the drive. Drive (Bay) X is a Replacement Drive Description: This drive has been replaced. This message is displayed if a drive is replaced in a fault- tolerant logical volume.
  • Page 80: Drive Monitoring Features Are Unobtainable

    Description: An error occurred while ADU was reading the RIS from this drive. Action: HP stores the hard drive configuration information in the RIS. If multiple errors occur, the drive may need to be replaced. FYI: Drive (Bay) X is Third-Party Supplied Description: Third-party supplied the installed drive.
  • Page 81: Identify Logical Drive Data Did Not Match With Nvram

    Identify Logical Drive Data did not Match with NVRAM Description: The identify unit data from the array controller does not match with the information stored in NVRAM. This can occur if new, previously configured drives have been placed in a system that has also been previously configured.
  • Page 82: Logical Drive X Status = Interim Recovery (Volume Functional, But Not Fault Tolerant)

    Action: Check for drive failures, wrong drive replaced, or loose cable messages. If a drive failure occurred, replace the failed drive or drives, and then restore the data for this logical drive from the tape backup. Otherwise, follow the procedures for correcting problems when an incorrect drive is replaced or a loose cable is detected.
  • Page 83: Logical Drive X Status = Wrong Drive Replaced

    Logical Drive X Status = Wrong Drive Replaced Description: A physical drive in this logical drive has failed. The incorrect drive was replaced. Action: Power down the server. Replace the drive that was incorrectly replaced. Replace the original drive that failed with a new drive. CAUTION: Do not run the server setup utility and try to reconfigure, or data will be lost.
  • Page 84: Other Controller Indicates Different Firmware Version

    Other Controller Indicates Different Firmware Version Description: The other controller in the redundant controller configuration is using a different firmware version. Action: Be sure both controllers are using the same firmware revision. Other Controller Indicates Different Cache Size Description: The other controller in the redundant controller configuration has a different size array accelerator.
  • Page 85: Scsi Port X Drive Id Y Failed - Replace (Failure Message)

    If the error persists after completing steps 1 through 4, contact an HP authorized service provider. SCSI Port X Drive ID Y Failed - REPLACE (failure message) Description: ADU ("Array Diagnostic Action: Correct the condition that caused the error, if possible, or replace the drive.
  • Page 86: Performance Data

    Description: A predictive failure warning for this hard drive has been generated, indicating that a drive failure is imminent. Action: Replace this drive at the earliest opportunity. Refer to the server documentation for drive replacement information before performing this operation. SCSI Port X, Drive ID Y...S.M.A.R.T.
  • Page 87: Storage Enclosure On Scsi Bus X Indicated An Overheated Condition

    Description: A power supply in the external storage unit has failed. Action: Replace the power supply. Storage Enclosure on SCSI Bus X Indicated an Overheated Condition..SOLUTION: Make sure all cooling fans are operating properly. Also be sure the operating environment of storage enclosure is within temperature specifications.
  • Page 88: Swapped Cables Or Configuration Error Detected. A Configured Array Of Drives

    Swapped cables or configuration error detected. A configured array of drives... was moved from another controller that supported more drives than this controller supports. SOLUTION: Upgrade the firmware on this controller. If this doesn’t solve the problem, then power down the system and move the drives back to the original controller.
  • Page 89: Swapped Cables Or Configuration Error Detected. The Configuration Information On The Attached Drives

    If the problem persists, this might indicate a controller problem or a system board problem. CAUTION: Only authorized technicians trained by HP should attempt to remove the system board. If you believe the system board requires replacement, contact HP Technical Support before proceeding.
  • Page 90: This Controller Can See The Drives But The Other Controller Can't

    Unable to Retrieve Identify Controller Data. Controller May be Disabled or Failed... SOLUTION: Power down the system. Verify that the controller is fully seated. Then power the system on and look for helpful error messages displayed by the controller. If this doesn’t help, contact your HP service provider. Description: ADU ("Array Diagnostic...
  • Page 91: Unsupported Processor Configuration (Processor Required In Slot #1)

    Unsupported Processor Configuration (Processor Required in Slot #1) Description: Processor required in slot 1. Action: If you do not install a supported processor in slot 1, this message is displayed, and the system halts. Warning Bit Detected Description: A monitor and performance threshold violation may have occurred. The status of a logical drive may not be OK.
  • Page 92: Warning: Storage Enclosure On SCSI Bus X Indicated It Is Operating In Single Ended Mode

    A server generates only the codes that are applicable to its configuration and options. HP ProLiant p-Class server blades do not have speakers and thus do not support audio output. Disregard the audible beeps information if the server falls into this category.
  • Page 93: Non-Numeric Messages Or Beeps Only

    Non-numeric messages or beeps only Advanced Memory Protection mode: Advanced ECC Audible Beeps: None Possible Cause: Advanced ECC support is enabled. Action: None. Advanced Memory Protection mode: Advanced ECC with hot-add support Audible Beeps: None Possible Cause: Advanced ECC with Hot-Add support is enabled. Action: None.
  • Page 94 Audible Beeps: None Possible Cause: The system experienced a critical error that caused an NMI. Action: Run Insight Diagnostics ("HP Insight Diagnostics" on page 61) and replace failed components as indicated.
  • Page 95 Possible Cause: A processor has experienced an internal error. Action: Run Insight Diagnostics ("HP Insight Diagnostics" on page 61) and replace any failed components as indicated, including processors and PPMs.
  • Page 96 Invalid memory types were found on the same node. Please check DIMM compatibility. - Some DIMMs may not be used Description: Invalid or mixed memory types were detected during POST. Action: Use only supported DIMM pairs when populating memory sockets. Refer to the applicable server user guide for memory requirements.
  • Page 97 Possible Cause: A PCI device has generated a parity error on the PCI bus. Action: For plug-in PCI cards, remove the card. For embedded PCI devices, run Insight Diagnostics ("HP Insight Diagnostics" on page 61) and replace any failed components as indicated.
  • Page 98: Power Faults/Supply Solutions

    Action: Run ROMPaq Utility ("SoftPaqs" on page 64) to flash the system so that the primary and backup ROMs are valid. REDUNDANT ROM ERROR: Bootblock Invalid. - ... contact HP Representative. Audible Beeps: None Possible Cause: ROM bootblock is corrupt.
  • Page 99 Temperature violation detected - system Shutting Down in x seconds Audible Beeps: 1 long, 1 short Possible Cause: The system has reached a cautionary temperature level and is shutting down in X seconds. Action: Adjust the ambient temperature, install fans, or replace any failed fans. There must be a first DIMM in pair if second DIMM in pair is populated.
  • Page 100 The system will run in Full Performance mode. Audible Beeps: None Possible Cause: The system is configured for HP Static Low mode and the current processor cannot support this mode. Action: For more information about the Power Regulator for ProLiant option, refer to the HP ROM-Based Setup Utility User Guide on the Documentation CD or the HP website (http://www.hp.com/servers/smartstart).
  • Page 101: 100 Series

    Audible Beeps: None Possible Cause: 8237 DMA controllers, 8254 timers, and similar devices. CAUTION: Only authorized technicians trained by HP should attempt to remove the system board. If you believe the system board requires replacement, contact HP Technical Support before proceeding.
  • Page 102 Audible Beeps: None Possible Cause: 8237 DMA controllers, 8254 timers, and similar devices. CAUTION: Only authorized technicians trained by HP should attempt to remove the system board. If you believe the system board requires replacement, contact HP Technical Support before proceeding.
  • Page 103: 200 Series

    Audible Beeps: 1 long, 1 short Possible Cause: Installed memory module is an unsupported size. Action: Install a memory module of a supported size.
  • Page 104 207-Invalid Memory Configuration - Incomplete Bank Detected in Bank X Audible Beeps: 1 long, 1 short Possible Cause: Bank is missing one or more DIMMs. Action: Fully populate the memory bank. 207-Invalid Memory Configuration - Insufficient Timings on DIMM Audible Beeps: 1 long, 1 short Possible Cause: The installed memory module is not supported.
  • Page 105 207-Invalid Memory Configuration - Unsupported DIMM in Socket X Audible Beeps: 1 long, 1 short Possible Cause: Unregistered DIMMs or insufficient DIMM timings. Action: Install registered ECC DIMMs. 207-Memory Configuration Warning - DIMM In Socket x does not have Primary Width of 4 and only supports standard ECC Advanced ECC does not function when mixing DIMMs with Primary Widths of x4 and x8.
  • Page 106: 300 Series

    Power down the server, and then reconnect the keyboard. Be sure no keys are depressed or stuck. If the failure reoccurs, replace the keyboard.
  • Page 107: 400 Series

    Action: Be sure the keyboard and mouse are connected. CAUTION: Only authorized technicians trained by HP should attempt to remove the system board. If you believe the system board requires replacement, contact HP Technical Support before proceeding. Run Insight Diagnostics and replace failed components as indicated.
  • Page 108: 600 Series

    Action: Run the server setup utility to configure the diskette drive port address and manually resolve the conflict.
  • Page 109: 1100 Series

    Be sure the assembly is properly connected and each fan is properly seated. If the problem persists, replace the failed fans. If a known working replacement fan is not spinning, replace the assembly.
  • Page 110 1611-CPU Zone Fan Assembly Failure Detected. Single fan... failure. Assembly will provide adequate cooling. Audible Beeps: None Possible Cause: Required fan is not spinning. Action: Replace the failed fan to provide redundancy, if applicable. 1611-Fan Failure Detected Audible Beeps: 2 short Possible Cause: Required fan is not installed or spinning.
  • Page 111 1611-Fan x Not Present (Fan Zone I/O) Audible Beeps: 2 short Possible Cause: Required fan is not installed or spinning. Action: Check the fans to be sure they are working. Be sure each fan cable is properly connected, if applicable, and each fan is properly seated. If the problem persists, replace the failed fans.
  • Page 112: 1700 Series

    1615-Power Supply Configuration Error Audible Beeps: None Possible Cause: The server configuration requires an additional power supply. A moving bar is displayed, indicating that the system is waiting for another power supply to be installed. Action: Install the additional power supply. 1615-Power Supply Configuration Error - A working power supply must be installed in Bay 1 for proper cooling.
  • Page 113 Audible Beeps: None Possible Cause: Upgrade the Array Accelerator module to a larger size. Action: Migrate logical drives to RAID 0 or 1, reduce the number of drives in the array, or upgrade to a larger-size array accelerator module. 1713-Slot Z Drive Array Controller - Redundant ROM Reprogramming Failure... Replace the controller if this error persists after restarting system.
  • Page 114 1720-S.M.A.R.T. Hard Drive Detects Imminent Failure Audible Beeps: None Possible Cause: A hard drive SMART predictive failure condition is detected. It may fail at some time in the future. Action: If configured as a non-RAID 0 array, replace the failing or failed drive. Refer to the server documentation.
  • Page 115 1727-Slot X Drive Array - New Logical Drive(s) Attachment Detected... If more than 32 logical drives, this message will be followed by: “Auto-configuration failed: Too many logical drives.” Audible Beeps: None Possible Cause: The controller has detected an additional array of drives that was connected when the power was off.
  • Page 116 1770-Slot X Drive Array - SCSI Drive Firmware Update Recommended - ... Please upgrade firmware on the following drive(s) using ROM Flash Components (download from www.hp.com/support/proliantstorage): Model XYZ (minimum version = ####) Audible Beeps: None Possible Cause: Drive firmware update needed.
  • Page 117 1774-Slot X Drive Array - Obsolete Data Found in Array Accelerator Audible Beeps: None Possible Cause: Drives were used on another controller and reconnected to the original controller while data was in the original controller cache. Data found in the array accelerator is older than data found on the drives and has been automatically discarded.
  • Page 118 Update the integrated Smart Array option to the latest firmware version (page 65). CAUTION: Only authorized technicians trained by HP should attempt to remove the I/O board. If you believe the I/O board requires replacement, contact HP Technical Support before proceeding.
  • Page 119: 1779-Slot-X Drive Array Replacement Drive Detected

    1779-Slot X Drive Array - Replacement drive(s) detected OR previously failed drive(s) now operational:... Port Y: SCSI ID Z: Restore data from backup if replacement drive X has been installed. Audible Beeps: None Possible Cause: More drives failed (or were replaced) than the fault-tolerance level allows. Unable to rebuild array.
  • Page 120: 1785-Slot X Drive Array Not Configured

    Be sure all drives are fully seated. Replace defective cables, drive X, or both. 1785-Slot X Drive Array Not Configured... (followed by one of the following): ...(1) Run Array Configuration Utility (2) No drives detected (3) Drive positions appear to have changed – Run Drive Array Advanced Diagnostics if previous positions are unknown.
  • Page 121: 1786-Slot 1 Drive Array Recovery Needed

    1786-Slot 1 Drive Array Recovery Needed. Automatic Data Recovery Previously Aborted!... The following SCSI drive(s) need Automatic Data Recovery: SCSI Port Y: SCSI ID Z Select F1 to retry Automatic Data Recovery to drive. Select F2 to continue without starting Automatic Data Recovery. Audible Beeps: None Possible Cause: System is in Interim Data Recovery Mode and a failed or replacement drive has not yet been rebuilt.
  • Page 122 Repair the connection and press the F2 key. If the problem persists, run ADU. Be sure the cable is routed properly. 1789-Slot X Drive Array SCSI Drive(s) Not Responding... Check cables or replace the following SCSI drives: SCSI Port Y: SCSI ID Z Select F1 to continue –...
  • Page 123 1794-Drive Array - Array Accelerator Battery Charge Low... Array Accelerator is temporarily disabled. Array Accelerator will be re-enabled when battery reaches full charge. Audible Beeps: None Possible Cause: The battery charge is below 75 percent. Posted writes are disabled. Action: Replace the array accelerator board if the batteries do not recharge within 36 powered-on hours.
  • Page 124: Event List Error Messages

    ASR Lockup Detected: Cause Event Type: System lockup Action: Examine the IML to determine the cause of the lockup, and then refer to the HP ROM-Based Setup Utility User Guide, on the server Documentation CD or at the SmartStart website (http://www.hp.com/servers/smartstart), for more information.
  • Page 125: Automatic Operating System Shutdown Initiated Due To Fan Failure

    Automatic operating system shutdown initiated due to fan failure Event Type: Fan failure Action: Replace the fan. Automatic Operating System Shutdown Initiated Due to Overheat Condition... Fatal Exception (Number X, Cause) Event Type: Overheating condition Action: Check fans. Also, be sure the server is properly ventilated and the room temperature is set within the required range.
  • Page 126: Unrecoverable Host Bus Data Parity Error

    Real-Time Clock Battery Failing Event Type: System configuration battery low Action: Replace the system configuration battery. System AC Power Overload (Power Supply X) Event Type: Power supply overload Action: Switch the voltage from 110 V to 220 V or add an additional power supply (if applicable to the system).
  • Page 127: HP BladeSystem Infrastructure Error Codes

    Event Type: Host bus error CAUTION: Only authorized technicians trained by HP should attempt to remove the system board. If you believe the system board requires replacement, contact HP Technical Support before proceeding. Action: Replace the board on which the processor is installed.
  • Page 128 Action: Perform the following steps to resolve the problem. Stop when the problem is resolved. Press the server blade management module reset button. Replace the signal backplane. For more information, refer to the HP BladeSystem Maintenance and Service Guide on the HP website (http://www.hp.com/products/servers/proliant-bl/p-class/info). Server blade management module power backplane A error codes...
  • Page 129 Action: Perform the following steps to resolve the problem. Stop when the problem is resolved. Press the server blade management module reset button. Reseat the interconnect device. For more information, refer to the HP BladeSystem Maintenance and Service Guide on the HP website (http://www.hp.com/products/servers/proliant-bl/p-class/info). Replace the interconnect device.
  • Page 130: Power Management Module Error Codes

    For more information, refer to the HP BladeSystem Maintenance and Service Guide on the HP website (http://www.hp.com/products/servers/proliant-bl/p-class/info). Replace the interconnect module. For more information, refer to the HP BladeSystem Maintenance and Service Guide on the HP website (http://www.hp.com/products/servers/proliant-bl/p-class/info). Interconnect Module B (6-Connector) Error Code...
  • Page 131: Port 85 Codes And iLO Messages

    Power management module board error codes LED code: 7-1, 7-2, 7-3, 7-4, 7-5, 7-6, 7-7, 7-8, 7-9, 7-10, 7-11, 7-12, or 7-13 Location: Power management board Action: Perform the following steps to resolve the problem. Stop when the problem is resolved. Reseat the power management module.
  • Page 132: Memory-Related Port 85 Codes

    IMPORTANT: Reboot the server after completing each numbered step. If the error condition continues, proceed with the next step. To troubleshoot processor-related error codes: Bring the server to base configuration by removing all components that are not required by the server to complete POST.
  • Page 133: Expansion Board-Related Port 85 Codes

    Reseat the remaining memory boards, rebooting after each installation to isolate any failed memory boards, if applicable. Replace the DIMMs with a remaining bank of memory. Replace the memory board, if applicable. Replace the system board. IMPORTANT: If replacing the system board or clearing NVRAM, you must re-enter the server serial number through RBSU ("Re-entering the server serial number and product Expansion board-related port 85 codes...
  • Page 134: Message ID: 4137

    IMPORTANT: Processor socket 1 and PPM slot 1 must be populated at all times or the server does not function properly. PPMs, except the PPM installed in slot 1; DIMMs, except the first bank; hard drives; peripheral devices. Install each remaining system component, rebooting between each installation to isolate any failed components.
  • Page 135: Insight Diagnostics Processor Error Codes

    Description: The system encountered an NMI prior to this boot. The NMI source was: Uncorrectable cache memory error. Action: Replace the processor. Insight Diagnostics processor error codes MSG_CPU_RR_1 Event type: Unable to divide and multiply with zero and infinity. Action: Ensure proper ventilation and cooling for the server.
  • Page 136: MSG_CPU_RR_7

    MSG_CPU_RR_7 Event type: CPU speed is out of range. Action: Replace the processor. MSG_CPU_RR_8 Event type: Unable to update the CMOS time. Action: Replace the board that CMOS is on. MSG_CPU_RR_9 Event type: MMX hardware is not present. Action: Replace the processor. MSG_CPU_RR_10 Event type: MMX add instruction has failed.
  • Page 137: MSG_CPU_RR_17

    Action: Replace the processor. MSG_CPU_RR_17 Event type: Stress integer math test has failed. Action: Ensure proper ventilation and cooling for the server. Ensure the processor heatsinks are attached correctly (do not remove them). Check diagnostics and the Integrated Management Log for heat-related events. Upgrade to the latest versions of system BIOS and Insight Diagnostics.
  • Page 138: Contacting HP

    HP's customer self-repair program offers you the fastest service under either warranty or contract. It enables HP to ship replacement parts directly to you so that you can replace them. Using this program, you can replace parts at your own convenience.
  • Page 139: Server Information You Need

    Server information you need Before contacting HP technical support, collect the following information: Explanation of the issue, the first occurrence, and frequency Any changes in hardware or software configuration before the issue surfaced Third-party hardware information: Product name, model, and version...
  • Page 140: Linux Operating Systems

    An updated Emergency Repair Diskette If HP drivers are installed: Version of the PSP used List of drivers from the PSP The drive subsystem and file system information: Number and size of partitions and logical drives File system on each logical drive Current level of Microsoft®...
  • Page 141: Novell Netware Operating Systems

    A list of the drivers and NLM files used on the server, including the names, versions, dates, and sizes (can be taken directly from the CONFIG.TXT or SURVEY.TXT files) If HP drivers are installed: Version of the PSP used List of drivers from the PSP Printouts or electronic copies (to e-mail to a support technician) of: SYS:SYSTEM\SYS$LOG.ERR...
  • Page 142: IBM OS/2 Operating Systems

    Operating system version number Type of installation selected: Interactive, WebStart, or Custom JumpStart Which software group selected for installation: End User Support, Entire Distribution, Developer System Support, or Core System Support If HP drivers are installed with a DU:
  • Page 143 A list of all third-party hardware and software installed, with versions A detailed description of the problem and any associated error messages Printouts or electronic copies (to e-mail to a support technician) of: /usr/sbin/crash (accesses the crash dump image at /var/crash/$hostname) /var/adm/messages /etc/vfstab /usr/sbin/prtconf
  • Page 144: Acronyms And Abbreviations

    Acronyms and abbreviations ACPI: Advanced Configuration and Power Interface; ACU: Array Configuration Utility; ADG: Advanced Data Guarding (also known as RAID 6); ADU: Array Diagnostics Utility; CCITT: International Telegraph and Telephone Consultative Committee; CS: cable select; DMA: direct memory access; DU: driver update; EFS: Extended Feature Supplement; EULA: end user license agreement; FC: Fibre Channel...
  • Page 145 IDE: integrated device electronics; iLO: Integrated Lights-Out; IMD: Integrated Management Display; IML: Integrated Management Log; IP: Internet Protocol; ISEE: Instant Support Enterprise Edition; ISP: Internet service provider; KVM: keyboard, video, and mouse; LED: light-emitting diode; LVD: low-voltage differential; MMX: multimedia extensions; NMI: non-maskable interrupt; NVRAM: non-volatile memory; OBDR: One Button Disaster Recovery
  • Page 146 ORCA: Option ROM Configuration for Arrays; OS: operating system; POST: Power-On Self Test; PPM: processor power module; PSP: ProLiant Support Pack; RBSU: ROM-Based Setup Utility; RILOE: Remote Insight Lights-Out Edition; RILOE II: Remote Insight Lights-Out Edition II; RIS: reserve information sector; ROM: read-only memory; SAS: serial attached SCSI; SATA: serial ATA; SIM: Systems Insight Manager...
  • Page 147 SMART: self-monitoring analysis and reporting technology; SNMP: Simple Network Management Protocol; SSD: support software diskette; UPS: uninterruptible power system; USB: universal serial bus; VCA: Version Control Agent; VCRM: Version Control Repository Manager
  • Page 148: Index

    CSR (customer self repair) 138 customer self repair (CSR) 138 data loss 36 data recovery 36, 39 deployment software 57 diagnose tab, HP Insight Diagnostics 62 diagnosing problems 62 diagnostic tools 54, 57, 58, 61, 62 diagnostics utility 61 dial tone 45...
  • Page 149 HP BladeSystem infrastructure error codes 127 HP Enterprise Configurator 71 HP Insight Diagnostics 61, 62, 124 HP ProLiant Essentials Foundation Pack 59, 70, 71 HP ProLiant Essentials Rapid Deployment Pack 57 HP Systems Insight Manager, overview 59, 70, 71 HP troubleshooting resources 69...
  • Page 150 22 PPM (Processor Power Module) 42 PPM failure LEDs 16, 42 PPM problems 42, 106 PPM slots 42 printer problems 45 printers 45 processor error codes 134, 135 processor failure LEDs 42 processor problems 42, 96, 106...
  • Page 151 read/write errors 36, 37 read/write issue, tape drive 38 redundant ROM 65, 98, 113 registering the server 71 Remote Insight Lights-Out Edition II (RILOE II) 50, 58 remote ROM flash 52, 53 remote ROM flash problems 52 remote support and analysis tools 63 required information 139 Resource Paqs 64 resources 69...
  • Page 152 13, 72 Web-Based Enterprise Service 63 website, HP 69, 70 white papers 70, 72 Windows Event Log processor error codes 134
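
The 1794 entry on page 123 states two thresholds: posted writes are disabled while the battery charge is below 75 percent, and the array accelerator board should be replaced if the batteries do not recharge within 36 powered-on hours. A minimal sketch of that decision rule follows. The `BatteryStatus` type, the `recommend_action` function, and the returned strings are hypothetical names chosen for illustration and are not part of any HP tool; treating exactly 36 hours as "not recharged in time" is also an assumption.

```python
from dataclasses import dataclass

# Thresholds quoted from the 1794 entry of the guide.
LOW_CHARGE_PERCENT = 75   # posted writes are disabled below this charge
MAX_RECHARGE_HOURS = 36   # powered-on hours allowed for the batteries to recharge


@dataclass
class BatteryStatus:
    """Hypothetical snapshot of an array accelerator battery (not an HP API)."""
    charge_percent: float
    powered_on_hours: float


def recommend_action(status: BatteryStatus) -> str:
    """Suggest a next step for a 1794 'Battery Charge Low' message."""
    if status.charge_percent >= LOW_CHARGE_PERCENT:
        # Charge is not below the 75 percent threshold named in the entry.
        return "no 1794 condition"
    if status.powered_on_hours < MAX_RECHARGE_HOURS:
        # Still inside the recharge window; the accelerator stays disabled.
        return "wait for recharge"
    # Assumption: reaching 36 powered-on hours counts as failing to recharge.
    return "replace array accelerator board"
```

For example, `recommend_action(BatteryStatus(charge_percent=50, powered_on_hours=10))` suggests waiting, because the battery is low but still inside the 36-hour window.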