Bull Escala BL460 Problem Determination And Service Manual

Escala BL460
Problem Determination and
Service Guide
REFERENCE
86 A7 81FB 00

Summary of Contents for Bull Escala BL460

  • Page 1 Escala BL460 Problem Determination and Service Guide REFERENCE 86 A7 81FB 00...
  • Page 3 ESCALA Blade Escala BL460 Problem Determination and Service Guide Hardware October 2009 BULL CEDOC 357 AVENUE PATTON B.P.20845 49008 ANGERS CEDEX 01 FRANCE REFERENCE 86 A7 81FB 00...
  • Page 4 Quoting of brand and product names is for information purposes only and does not represent trademark misuse. The information in this document is subject to change without notice. Bull will not be liable for errors contained herein, or for incidental or consequential damages in connection with the use of this material.
  • Page 5: Table Of Contents

    Table of Contents
    List of Figures ........ iv
    List of Tables ........ v
    Safety ........ vii
      Safety statements ........ viii
      Guidelines for trained service technicians ........ xiv
      Inspecting for unsafe conditions ........ xiv
      Guidelines for servicing electrical equipment ........ xv
    Chapter 1. Introduction ........ 1
      Related documentation...
  • Page 6
    Verifying the system firmware levels ........ 187
    2.13.5 Committing the TEMP system firmware image ........ 188
    2.14 Solving shared Bull Blade Chassis – Enterprise resource problems ........ 188
    2.14.1 Solving shared keyboard problems ........ 189
    2.14.2 Solving shared media tray problems ........ 190
    2.14.3...
  • Page 7
    Handling static-sensitive devices ........ 202
    4.1.4 Returning a device or component ........ 203
    Removing the blade server from a Bull Blade Chassis - Enterprise ........ 203
    Installing the blade server in a Bull Blade Chassis - Enterprise ........ 204
    Removing and replacing Tier 1 CRUs ........ 206
    4.4.1...
  • Page 8: List Of Figures

    Light path diagnostic LEDs ........ 183
    Figure 3-1. Parts illustration ........ 197
    Figure 4-1. Removing the blade server from the Bull Blade Chassis - Enterprise ........ 203
    Figure 4-2. Installing the blade server in a Bull Blade Chassis - Enterprise ........ 204
    Figure 4-3.
  • Page 9 List of Tables
    Table 1-1. Memory module combinations ........ 5
    Table 1-2. Connectors description ........ 12
    Table 1-3. System-board LEDs locations ........ 14
    Table 2-1. Location code ........ 18
    Table 2-2. Nine-word system reference code in the management-module event log ........ 20
    Table 2-3.
  • Page 11: Safety

    Safety Preface...
  • Page 12: Safety Statements

    English-language caution or danger statement with translated versions of the caution or danger statement in the Bull Safety Attention document. For example, if a caution statement begins with a number 1, translations for that caution statement appear in the Bull Safety Attention document under statement 1.
  • Page 18: Guidelines For Trained Service Technicians

    Guidelines for trained service technicians Inspecting for unsafe conditions Escala BL460 - Problem Determination and Service Guide...
  • Page 19: Guidelines For Servicing Electrical Equipment

    Guidelines for servicing electrical equipment Preface...
  • Page 21: Chapter 1. Introduction

    • Tier 1 customer replaceable unit (CRU): Replacement of Tier 1 CRUs is your responsibility. If Bull installs a Tier 1 CRU at your request, you will be charged for the installation. • Tier 2 customer replaceable unit: You may install a Tier 2 CRU yourself or request Bull...
  • Page 22: Notices And Statements In This Documentation

    The blade server might have features that are not described in the documentation that comes with the blade server. Review the Planning Guide and the Installation Guide for your Bull Blade Chassis - Enterprise. The information can help you prepare for system installation and configuration.
  • Page 23: Features And Specifications

    Features and specifications Features and specifications of the Escala BL460 blade server are summarized in this overview. The Escala BL460 blade server is used in a Bull Blade Chassis - Enterprise. Notes • Power, cooling, removable-media drives, external ports, and advanced system management are provided by the Bull Blade Chassis - Enterprise.
  • Page 24 Manager and Virtual I/O Server Partition Migration Reliability and service features: • Dual alternating current power supply • Bull Blade Chassis - Enterprise redundant and hot plug power and cooling modules • Predictive Failure Analysis (PFA) alerts for the microprocessor and memory •...
  • Page 25: Supported Dimms

    Supported DIMMs The planar in the Escala BL460 blade server contains eight very low profile (VLP) memory connectors for registered dual inline memory modules (RDIMMs). The total memory capacity ranges from a minimum of 4 GB to a maximum of 64 GB for a BL460 blade server.
  • Page 26: Figure 1-1. Dimm Connectors

    Figure 1-1. DIMM connectors Escala BL460 - Problem Determination and Service Guide...
  • Page 27: Blade Server Control Panel Buttons And Leds

    Figure 1-2. Blade server control panel buttons and LEDs Keyboard/video select button: When you use an operating system that supports a local console and keyboard, press this button to associate the shared Bull Blade Chassis - Enterprise keyboard and video ports with the blade server. Notes The operating system in the blade server must provide USB support for the •...
  • Page 28 Media-tray select button: Press this button to associate the shared Bull Blade Chassis - Enterprise media tray (removable-media drives and front-panel USB ports) with the blade server. The LED on the button flashes while the request is being processed, then is lit when the ownership of the media tray has been transferred to the blade server.
  • Page 29 Lit continuously: The blade server has power and is turned on. Note: The enhanced service processor can take as long as three minutes to initialize after you install the Escala BL460 blade server, at which point the LED begins to flash slowly. Activity LED: When this green LED is lit, it indicates that there is activity on the hard disk drive or network.
  • Page 30: Turning On The Blade Server

    Turning on the blade server After you connect the blade server to power through the Bull Blade Chassis - Enterprise, you can start the blade server after the discovery and initialization process is complete. You can start the blade server in any of the following ways.
  • Page 31: Turning Off The Blade Server

    Turning off the blade server When you turn off the blade server, it is still connected to power through the Bull Blade Chassis - Enterprise. The blade server can respond to requests from the service processor, such as a remote request to turn on the blade server. To remove all power from the blade server, you must remove it from the Bull Blade Chassis - Enterprise.
  • Page 32: System-Board Layouts

    CFFv, or CIOv (1Xe) expansion card connector (Px-C12). SFFh, or CFFh (PCIe) high-speed expansion card connector (Px-C11). DIMM 5-8 connectors (see Figure 1-5 for individual connectors). Management card connector (P1-C9). 3V lithium battery connector (P1-E1) Table 1-2. Connectors description Escala BL460 - Problem Determination and Service Guide...
  • Page 33: Figure 1-4. Dimm Connectors

    Figure 1-4 shows individual DIMM connectors. Figure 1-4. DIMM connectors Chapter 1. Introduction...
  • Page 34: System-Board Leds

    Use the illustration of the LEDs on the system board to identify a light emitting diode (LED). Remove the blade server from the Bull Blade Chassis - Enterprise, open the cover to see any error LEDs that were turned on during error processing, and use the following figure to identify the failing component.
  • Page 35: Chapter 2. Diagnostics

    Chapter 2. Diagnostics Use the available diagnostic tools to help solve any problems that might occur in the blade server. The first and most crucial component of a solid serviceability strategy is the ability to accurately and effectively detect errors when they occur. While not all errors are a threat to system availability, those that go undetected are dangerous because the system does not have the opportunity to evaluate and act if necessary.
  • Page 36: Diagnostic Tools

    Use the light path diagnostic LEDs on the system board to identify failing hardware. If the system error LED on the system LED panel on the front or rear of the Bull Blade Chassis - Enterprise is lit, one or more error LEDs on the Bull Blade Chassis - Enterprise components also might be lit.
  • Page 37: Collecting Dump Data

    If you power off the blade through the management module while the service processor is performing a dump, platform dump data is lost. You might be asked to retrieve a dump to send it to Bull Support for analysis. The location of the dump data varies per operating system platform.
  • Page 38: Location Codes

    See “System-board connectors” on page 12 for component locations. Notes Location codes do not indicate the location of the blade server within the Bull Blade • Chassis - Enterprise. The codes identify components of the blade server only.
  • Page 39: Reference Codes

    Viewing the codes The Escala BL460 blade server does not display checkpoints or error codes on the remote console. The shared Bull Blade Chassis - Enterprise video also does not display the codes.
  • Page 40: System Reference Codes (Srcs)

    Select Blade Service Data → blade_name in the management module to see a list of the 32 most recent SRCs.
    Table 2-3. Management module reference code listing
    Unique ID   System Reference Code   Timestamp
    00040001    D1513901                2005-11-13 19:30:20
    00000016    D1513801                2005-11-13 19:30:16
  • Page 41 Any message with more detail is highlighted as a link in the System Reference Code column. Click the message to cause the management module to present the additional message detail: D1513901 Created 2007-11-13 19:30:20 Version: 0x02 Words 2-5: 020110F0 52298910 C1472000 200000FF SRC formats...
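The rows of the management-module reference code listing have three whitespace-separated fields: a unique ID, the SRC, and a timestamp. The following Python sketch shows one way such rows might be processed once they have been copied out as plain text. It is illustrative only: it assumes the rows use exactly the layout shown in Table 2-3, and the names `SrcEntry` and `parse_src_row` are hypothetical, not part of any Bull tool.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class SrcEntry:
    """One row of the reference code listing (hypothetical helper type)."""
    unique_id: str
    src: str
    timestamp: datetime


def parse_src_row(row: str) -> SrcEntry:
    """Parse a row laid out as: unique ID, SRC, date, time."""
    unique_id, src, date, time = row.split()
    stamp = datetime.strptime(f"{date} {time}", "%Y-%m-%d %H:%M:%S")
    return SrcEntry(unique_id, src, stamp)


# Sample rows taken from Table 2-3.
rows = [
    "00040001 D1513901 2005-11-13 19:30:20",
    "00000016 D1513801 2005-11-13 19:30:16",
]
entries = [parse_src_row(r) for r in rows]

# The most recent SRC is usually the first one to investigate.
newest = max(entries, key=lambda e: e.timestamp)
print(newest.src)
```

Sorting or filtering on the parsed timestamp makes it easier to line up SRCs with the time a problem was observed.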
  • Page 42: 1Xxxyyyy Srcs

    as described in “Replacing the Tier 2 system-board and chassis assembly” on page 229. The DTRCARD Symbolic CRU isolation procedure is in “Service processor problems”, on page 169.
  • Page 43 • Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solves the problem, then you can stop performing the remaining actions. • See Chapter 3, “Parts listing,” on page 197 to determine which components are CRUs and which components are FRUs.
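The rule stated in this note (work through the Action column in order and stop as soon as one action solves the problem) can be pictured as a simple loop. The Python sketch below is only an illustration of that ordering: the function name `first_successful_action` and the three sample steps are hypothetical placeholders, and each step's outcome is hard-coded rather than reflecting real hardware behaviour.

```python
from typing import Callable, List, Optional, Tuple

Action = Tuple[str, Callable[[], bool]]


def first_successful_action(actions: List[Action]) -> Optional[str]:
    """Run actions in listed order; return the first one that solves the problem."""
    for description, attempt in actions:
        if attempt():
            return description  # stop: remaining actions are not performed
    return None  # no listed action solved the problem


tried = []


def make_step(name: str, solves: bool) -> Callable[[], bool]:
    """Build a placeholder step that records that it ran (illustration only)."""
    def step() -> bool:
        tried.append(name)
        return solves
    return step


steps = [
    ("Go to the checkout procedure", make_step("checkout", False)),
    ("Update the firmware", make_step("firmware", True)),
    ("Replace the system-board and chassis assembly", make_step("replace", False)),
]
solved_by = first_successful_action(steps)
print(solved_by, tried)
```

Because the loop returns at the first success, the later and usually more invasive actions, such as replacing an assembly, are never reached unless the earlier ones fail.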
  • Page 44 ” on page 229. 1. Check the management-module event log for entries that were made around the time that the Escala BL460 blade server shut down. The Bull Blade Chassis - 2. Resolve any problems that are found. Enterprise encountered a...
  • Page 45 • Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solves the problem, then you can stop performing the remaining actions. • See Chapter 3, “Parts listing,” on page 197 to determine which components are CRUs and which components are FRUs.
  • Page 46: 6Xxxyyyy Srcs

    Refer to the hosting partition for problem analysis. 632CC302 Media or device error occurred. Refer to the hosting partition for problem analysis. 632CC303 Media has an unknown format. No corrective action is required. Escala BL460 - Problem Determination and Service Guide...
  • Page 47: A1Xxyyyy Service Processor Srcs

    • Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solves the problem, then you can stop performing the remaining actions. • See Chapter 3, “Parts listing,” on page 197 to determine which components are CRUs and which components are FRUs.
  • Page 48: A700Yyyy Licensed Internal Code Srcs

    Go to “Checkout procedure” on page 148. b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 229.
  • Page 49 • Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solves the problem, then you can stop performing the remaining actions. • See Chapter 3, “Parts listing,” on page 197 to determine which components are CRUs and which...
  • Page 50: B181Xxxx Service Processor Early Termination Srcs

    7209 processor to attempt a boot from the other firmware image in the service processor flash memory Power-off reset occurred. FipsDump 720A should be analyzed: Possible software problem Escala BL460 - Problem Determination and Service Guide...
  • Page 51: Table 2-11. B200Xxxx Logical Partition Srcs

    2.4.1.8 B200xxxx Logical partition SRCs A B200xxxx system reference code (SRC) is an error code that is related to logical partitioning. Table 2-11 describes error codes that might be displayed if POST detects a problem. The description also includes suggested actions to correct the problem. Note For problems persisting after completing the suggested actions, see “Checkout procedure”...
  • Page 52 Main Storage Dump. A mainstore dump 1280 startup did not complete due to a configuration mismatch. A partition memory error occurred. The Restart the partition. 1281 failed memory will no longer be used. Escala BL460 - Problem Determination and Service Guide...
  • Page 53 • Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solves the problem, then you can stop performing the remaining actions. • See Chapter 3, “Parts listing,” on page 197 to determine which components are CRUs and which components are FRUs.
  • Page 54 185. 5109: A problem occurred during the startup of a partition. There was a partition main storage dump problem. The startup will not continue. Action: Go to “Firmware problem isolation” on page 185.
  • Page 55 • Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solves the problem, then you can stop performing the remaining actions. • See Chapter 3, “Parts listing,” on page 197 to determine which components are CRUs and which components are FRUs.
  • Page 56 8113: During the startup of a partition, an error occurred while mapping memory for the partition startup. Action: Check for server firmware updates; then, install the updates if available.
  • Page 57 8152: No active system processors. Action: Verify that processor resources are assigned to the partition. 8160: A problem occurred during the migration of a partition. Action: Contact Bull support, as described in Appendix A, “Getting Help and Technical Assistance”, on page 239.
  • Page 58: B700Xxxx Licensed Internal Code Srcs

    Table 2-12 describes the error codes that may be displayed if POST detects a problem. Suggested actions to correct the problem are also described. Note For problems persisting after completing the suggested actions, see “Checkout procedure” on page 148 and “Solving undetermined problems” on page 194. Escala BL460 - Problem Determination and Service Guide...
  • Page 59: Table 2-12. B700Xxxx Licensed Internal Code Srcs

    0200: System firmware has experienced a low storage condition. Action: Continue running the system normally. At the earliest convenient time or service window, work with Bull Support to collect a platform dump and restart the system; then, go to “Firmware problem isolation” on page 185.
  • Page 60 Look for and correct B1xxxxxx errors. If there are no serviceable B1xxxxxx errors, or if correcting the errors does not correct the problem, contact Bull support to reset the server firmware settings. Attention: Resetting the server firmware settings results in the loss of all of the partition data that is stored on the System firmware failure.
  • Page 61 Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solves the problem, then you can stop performing the remaining actions. See Chapter 3, “Parts listing,” on page 197 to determine which components are CRUs and which components are FRUs.
  • Page 62 B700xxxx error code 5302: An unsupported Preferred Operating System was detected. The Preferred Operating System specified is not supported. The IPL will not continue. Action: Work with Bull support to select a supported Preferred Operating System; then, re-IPL the system.
  • Page 63 Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solves the problem, then you can stop performing the remaining actions. See Chapter 3, “Parts listing,” on page 197 to determine which components are CRUs and which components are FRUs.
  • Page 64: Ba000010 To Ba400002 Partition Firmware Srcs

    POST detects a problem. The description also includes suggested actions to correct the problem. Note For problems persisting after completing the suggested actions, see “Checkout procedure” on page 148 and “Solving undetermined problems” on page 194. Escala BL460 - Problem Determination and Service Guide...
  • Page 65: Table 2-13. Ba000010 To Ba400002 Partition Firmware Srcs

    Table 2-13. BA000010 to BA400002 Partition firmware SRCs • Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solves the problem, then you can stop performing the remaining actions. •...
  • Page 66 BA00E840: Failure when initializing PCI hot-plug. Action: 1. Go to “Checkout procedure” on page 148. 2. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 229.
  • Page 67 • Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solves the problem, then you can stop performing the remaining actions. • See Chapter 3, “Parts listing,” on page 197 to determine which components are CRUs and which components are FRUs.
  • Page 68 AIX or Linux. Verify that all of the iSCSI BA01000A maximum length allowed. configuration arguments on the operating system comply with the configuration for the iSCSI Host Bus Adapter (HBA), which is the iSCSI initiator. Escala BL460 - Problem Determination and Service Guide...
  • Page 69 • Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solves the problem, then you can stop performing the remaining actions. • See Chapter 3, “Parts listing,” on page 197 to determine which components are CRUs and which components are FRUs.
  • Page 70 2. If the problem persists: a. Go to “Checkout procedure” on page 148. b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 229.
  • Page 71 • Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solves the problem, then you can stop performing the remaining actions. • See Chapter 3, “Parts listing,” on page 197 to determine which components are CRUs and which components are FRUs.
  • Page 72 B, and vice-versa. This situation can also occur when the pool of addresses is not truly divided. Set the DHCP server configuration file to “authoritative”. Verify that the DHCP server is functioning properly. Escala BL460 - Problem Determination and Service Guide...
  • Page 73 • Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solves the problem, then you can stop performing the remaining actions. • See Chapter 3, “Parts listing,” on page 197 to determine which components are CRUs and which components are FRUs.
  • Page 74 Go to “Checkout procedure” on page 148. b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 229. Escala BL460 - Problem Determination and Service Guide...
  • Page 75 • Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solves the problem, then you can stop performing the remaining actions. • See Chapter 3, “Parts listing,” on page 197 to determine which components are CRUs and which components are FRUs.
  • Page 76 The media is write-protected a. Go to “Checkout procedure” on page 148. b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 229. Escala BL460 - Problem Determination and Service Guide...
  • Page 77 • Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solves the problem, then you can stop performing the remaining actions. • See Chapter 3, “Parts listing,” on page 197 to determine which components are CRUs and which components are FRUs.
  • Page 78 5. If the problem persists: period a. Go to “Checkout procedure” on page 148. b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 229. Escala BL460 - Problem Determination and Service Guide...
  • Page 79 • Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solves the problem, then you can stop performing the remaining actions. • See Chapter 3, “Parts listing,” on page 197 to determine which components are CRUs and which components are FRUs.
  • Page 80 10/100 Mbps Ethernet card failure described in “Replacing the Tier 2 system-board and chassis assembly” on page 229. BA153002: Gigabit Ethernet adapter failure. Action: Verify that the MAC address programmed in the FLASH/EEPROM is correct.
  • Page 81 • Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solves the problem, then you can stop performing the remaining actions. • See Chapter 3, “Parts listing,” on page 197 to determine which components are CRUs and which components are FRUs.
  • Page 82 BA180008: PCI device Fcode evaluation error. Action: 1. Go to “Checkout procedure” on page 148. 2. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 229.
  • Page 83 Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 229. BA180100: The FDDI adapter Fcode driver is not supported on this server. Action: Bull may produce a compatible driver in the future, but does not guarantee one.
  • Page 84 An open firmware stack-depth assert a. Go to “Checkout procedure” on page 148. BA210004 failed. b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 229. Escala BL460 - Problem Determination and Service Guide...
  • Page 85 • Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solves the problem, then you can stop performing the remaining actions. • See Chapter 3, “Parts listing,” on page 197 to determine which components are CRUs and which components are FRUs.
  • Page 86 Go to “Checkout procedure” on page 148. hardware error that was reported. b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 229. Escala BL460 - Problem Determination and Service Guide...
  • Page 87 • Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solves the problem, then you can stop performing the remaining actions. • See Chapter 3, “Parts listing,” on page 197 to determine which components are CRUs and which components are FRUs.
  • Page 88 Go to “Checkout procedure” on page 148. BA340005 location code mapping table was b. Replace the system-board and chassis assembly, as corrupted. described in “Replacing the Tier 2 system-board and chassis assembly” on page 229. Escala BL460 - Problem Determination and Service Guide...
  • Page 89 • Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solves the problem, then you can stop performing the remaining actions. • See Chapter 3, “Parts listing,” on page 197 to determine which components are CRUs and which components are FRUs.
  • Page 90 Go to “Checkout procedure” on page 148. BA400002 size mismatch. b. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 229. Escala BL460 - Problem Determination and Service Guide...
  • Page 91: Post Progress Codes (Checkpoints)

    2.4.2 POST progress codes (checkpoints) When you turn on the blade server, the power-on self-test (POST) performs a series of tests to check the operation of the blade server components. Use the management module to view progress codes that offer information about the stages involved in powering on and performing an initial program load (IPL).
  • Page 92: C1001F00 To C1645300 Checkpoints

    2. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 229. Progress code C1009x02: Hardware object manager (HOM): erase HOM IPL step in progress.
  • Page 93 • If the system hangs on a progress code, follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solves the problem, you can stop performing the remaining actions.
  • Page 94 C1009x5C: Processor interface alignment procedure in progress. Action: 1. Go to “Checkout procedure” on page 148. 2. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 229.
  • Page 95 • If the system hangs on a progress code, follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solves the problem, you can stop performing the remaining actions.
  • Page 96 C1009xB0: ASIC I/O initialization step in progress. Action: 1. Go to “Checkout procedure” on page 148. 2. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 229.
  • Page 97 • If the system hangs on a progress code, follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solves the problem, you can stop performing the remaining actions.
  • Page 98 2. Replace the system-board and chassis assembly, C1645300 processor and the secondary service as described in “Replacing the Tier 2 system- processor. board and chassis assembly” on page 229. Escala BL460 - Problem Determination and Service Guide...
  • Page 99: C2001000 To C20082Ff Checkpoints

    2.4.2.2 C2001000 to C20082FF Virtual service processor checkpoints The C2xx progress codes indicate the progress of a partition IPL that is controlled by the virtual service processor. The virtual service processor progress codes end after the environment setup completes and the specific operating system code continues the IPL. The virtual service processor can start a variety of operating systems.
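The reference-code families named in this guide's section headings can be told apart by their leading characters or by the code range given in the heading. The Python sketch below is a rough illustration of that lookup, not an official decoder: the function name `classify` is hypothetical, the family labels are copied from the headings listed in this summary, and any code outside those headings is simply reported as unknown.

```python
import re

# Families whose headings give a fixed prefix (for example "B200xxxx").
PREFIX_FAMILIES = {
    "A700": "Licensed Internal Code SRC",
    "B181": "Service processor early termination SRC",
    "B200": "Logical partition SRC",
    "B700": "Licensed internal code SRC",
    "C700": "Server firmware IPL status checkpoint",
    "A1": "Service processor SRC",
    "1": "1xxxyyyy SRC",
    "6": "6xxxyyyy SRC",
}

# Families whose headings give an explicit first and last code.
RANGE_FAMILIES = [
    (0xBA000010, 0xBA400002, "Partition firmware SRC"),
    (0xC1001F00, 0xC1645300, "C1001F00 to C1645300 checkpoint"),
    (0xC2001000, 0xC20082FF, "Virtual service processor checkpoint"),
    (0xCA000000, 0xCA2799FF, "CA000000 to CA2799FF checkpoint"),
]


def classify(code: str) -> str:
    """Return the heading family for an eight-digit hexadecimal code."""
    code = code.strip().upper()
    if not re.fullmatch(r"[0-9A-F]{8}", code):
        raise ValueError(f"not an eight-digit hexadecimal code: {code!r}")
    value = int(code, 16)
    for low, high, name in RANGE_FAMILIES:
        if low <= value <= high:
            return name
    # Check longer prefixes first so that "A700" is tested before "A1".
    for prefix in sorted(PREFIX_FAMILIES, key=len, reverse=True):
        if code.startswith(prefix):
            return PREFIX_FAMILIES[prefix]
    return "unknown"


for sample in ("BA180008", "C2002400", "632CC302", "D1513901"):
    print(sample, "->", classify(sample))
```

Codes written with a placeholder, such as C1009x02, are rejected by the hexadecimal check; substitute the actual digit shown by the management module before looking the code up.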
  • Page 100 C2002400: Begin powering on slots. Action: 1. Go to “Recovering the system firmware” on page 186. 2. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 229.
  • Page 101 • If the system hangs on a progress code, follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solves the problem, you can stop performing the remaining actions.
  • Page 102 C2006060: Waiting for LID load to complete. Action: 1. Go to “Recovering the system firmware” on page 186. 2. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 229.
  • Page 103 • If the system hangs on a progress code, follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solves the problem, you can stop performing the remaining actions.
  • Page 104: C700Xxxx Server Firmware Ipl Status Checkpoints

    Action column until the problem is solved. If an action solves the problem, you can stop performing the remaining actions. • See Chapter 3, “Parts listing,” on page 197 to determine which components are CRUs and which components are FRUs.
  • Page 105: Ca000000 To Ca2799Ff Checkpoints

    Table 2-17. C700xxxx Server firmware IPL status checkpoints Progress code Description Action 1. Shutdown and restart the blade server from the permanent-side image. 2. Check for updates to the system firmware. 3. Update the firmware. A problem has occurred with the system C700xxxx Checkout procedure 4.
  • Page 106 2. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 229. Progress code CA00D001: PCI probe process completed, create PCI bridge interrupt routing properties.
  • Page 107 • If the system hangs on a progress code, follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solves the problem, you can stop performing the remaining actions.
  • Page 108 ” on page 148. 2. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 229. Progress code CA00E136: Create BSR node.
  • Page 109 • If the system hangs on a progress code, follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solves the problem, you can stop performing the remaining actions.
  • Page 110 ” on page 148. 2. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 229. Progress code CA00E155: Probing PCI bridge secondary bus.
  • Page 111 • If the system hangs on a progress code, follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solves the problem, you can stop performing the remaining actions.
  • Page 112 Checkout procedure a. Go to “ ” on page 148. b. Replace the system-board and chassis Replacing the assembly, as described in “ Tier 2 system-board and chassis assembly ” on page 229. Escala BL460 - Problem Determination and Service Guide...
  • Page 113 If the system hangs on a progress code, follow the suggested actions in the order in which they are listed • in the Action column until the problem is solved. If an action solves the problem, you can stop performing theremaining actions.
  • Page 114 ” on page 148. 2. Replace the system-board and chassis assembly, CA00E1B2 XOFF received, waiting for XON Replacing the Tier 2 system- as described in “ board and chassis assembly ” on page 229. Escala BL460 - Problem Determination and Service Guide...
  • Page 115 If the system hangs on a progress code, follow the suggested actions in the order in which they are listed • in the Action column until the problem is solved. If an action solves the problem, you can stop performing theremaining actions.
  • Page 116 ” on page 148. Validate NVRAM, initialize partitions as 2. Replace the system-board and chassis assembly, CA00E440 needed Replacing the Tier 2 system- as described in “ board and chassis assembly ” on page 229. Escala BL460 - Problem Determination and Service Guide...
  • Page 117 If the system hangs on a progress code, follow the suggested actions in the order in which they are listed • in the Action column until the problem is solved. If an action solves the problem, you can stop performing theremaining actions.
  • Page 118 ” on page 148. 2. Replace the system-board and chassis assembly, CA00E890 Starting to initialize open firmware Replacing the Tier 2 system- as described in “ board and chassis assembly ” on page 229. Escala BL460 - Problem Determination and Service Guide...
  • Page 119 If the system hangs on a progress code, follow the suggested actions in the order in which they are listed • in the Action column until the problem is solved. If an action solves the problem, you can stop performing theremaining actions.
  • Page 120 CA2799FD and CA2799FF are not alternating and you must perform the following procedure: 1. Shut down the blade server. 2. Restart it using the permanent boot image. 3. Reject the temporary image. Escala BL460 - Problem Determination and Service Guide...
  • Page 121: D1001Xxx To D1Xx3Fff Dump Codes

    2.4.2.6 D1001xxx to D1xx3FFF Service processor dump codes
    D1xx service processor dump status codes indicate the cage or node ID that the dump component is processing, the node from which the hardware data is collected, and a counter that increments each time that the dump processor stores 4K of dump data. Service processor dump status codes use the format D1yy1xxx, where yy and xxx can be any number or letter.
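The D1yy1xxx format described above can be checked mechanically when scanning an event log. The following is a minimal sketch, not part of the guide: the function name `parse_dump_code` and the returned keys are hypothetical, and the sketch deliberately does not say which field holds the node ID and which holds the counter, because the text does not state that mapping.

```python
import re

# Assumed layout taken from the format string D1yy1xxx in the text:
# literal "D1", two characters (yy), literal "1", three characters (xxx).
# "Any number or letter" is modelled here as [0-9A-Z].
DUMP_CODE = re.compile(r"^D1([0-9A-Z]{2})1([0-9A-Z]{3})$")


def parse_dump_code(code):
    """Split a dump status code into its yy and xxx fields.

    Returns a dict with keys "yy" and "xxx", or None if the string
    does not follow the D1yy1xxx layout.
    """
    match = DUMP_CODE.match(code.strip().upper())
    if match is None:
        return None
    return {"yy": match.group(1), "xxx": match.group(2)}
```

A code such as D1171ABC splits into yy = 17 and xxx = ABC, while a checkpoint from another range, such as CA00E136, is rejected.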
  • Page 122 Progress code D1171xxx (Dump registry –l command): ...” on page 148. 2. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 229.
  • Page 123 If the system hangs on a progress code, follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solves the problem, you can stop performing the remaining actions.
  • Page 124 Progress code D1391xxx (Check for valid dump sequence): ...” on page 148. 2. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 229.
  • Page 125: D1Xx3Y01 To D1Xx3Yf2 Checkpoints

    If the system hangs on a progress code, follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solves the problem, you can stop performing the remaining actions.
  • Page 126 Progress code D1xx3yF0 (Memory collection set-up): ...” on page 148. 2. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 229.
  • Page 127: D1Xx900C To D1Xxc003 Checkpoints

    If the system hangs on a progress code, follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solves the problem, you can stop performing the remaining actions.
  • Page 128 Progress code D1xxC003 (Hypervisor handshaking is complete): ...” on page 148. 2. Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 229.
  • Page 129: Service Request Numbers (Srns)

    2.4.3 Service request numbers (SRNs)
    Service request numbers (SRNs) are error codes that the operating system generates. The codes have three digits, a hyphen, and three or four digits after the hyphen. SRNs can be viewed using the AIX diagnostics or the Linux service aid “diagela” if it is installed.
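When SRNs are copied out of diagnostic output by hand, a quick format check can catch transcription errors before a table lookup. The sketch below is illustrative only. The name `split_srn` is hypothetical, and the pattern rests on two assumptions that go beyond the sentence above: the characters are treated as hexadecimal, and the part before the hyphen may be three or four characters long, because the SRN tables in this guide list codes such as 651-65B, FFC-725, and 2506-4010.

```python
import re

# Assumption: hexadecimal characters, 3 or 4 on each side of the hyphen.
# This is wider than "three digits, a hyphen, three or four digits" so that
# codes listed in the tables (651-65B, FFC-725, 2506-4010) are accepted.
SRN_PATTERN = re.compile(r"^([0-9A-F]{3,4})-([0-9A-F]{3,4})$")


def split_srn(text):
    """Return (prefix, suffix) for a well-formed SRN, or None otherwise."""
    match = SRN_PATTERN.match(text.strip().upper())
    if match is None:
        return None
    return match.group(1), match.group(2)
```

Placeholder forms used in the tables, such as A01-00x, are intentionally rejected: the x stands for a character that must be read from the actual SRN.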
  • Page 130: 101-711 Through Ffc-725 Srns

    A machine check occurred. Go to “Performing the checkout procedure” on page 149. 111-108 An encoded SRN was displayed. Go to “Performing the checkout procedure” on page 149. 111-121 There is a display problem. Go to “Performing the checkout procedure” on page 149.
  • Page 131 2. There is unrestricted air flow around the system. 3. All system covers are closed. 4. Verify that all fans in the Bull Blade Chassis - Enterprise are operating correctly. 651-159 Sensor indicates a FRU has failed. Use the failing function codes and the physical location code(s) from the diagnostic problem report screen to determine the FRUs.
  • Page 132 ECC correctable error. Go to “Performing the checkout procedure” on page 149. 651-65B ECC correctable error. Go to “Performing the checkout procedure” on page 149. 651-664 Correctable error threshold exceeded. Go to “Performing the checkout procedure” on page 149.
  • Page 133 Description and Action 651-665 Correctable error threshold exceeded. Go to “Performing the checkout procedure” on page 149. 651-666 Correctable error threshold exceeded. Go to “Performing the checkout procedure” on page 149. 651-669 Correctable error threshold exceeded. Go to “Performing the checkout procedure” on page 149.
  • Page 134 651-810 1. Shut the system down. 2. Visually inspect the power cables and reseat the connectors. 3. Run the following command: diag -Avd sysplanar0. When the Resource Repair Action menu displays, select sysplanar0.
  • Page 135 Description and Action 651-811 Under voltage condition was detected. Do the following procedure before replacing any FRUs: 1. Shut the system down. 2. Visually inspect the power cables and reseat the connectors. 3. Run the following command: diag -Avd sysplanar0. When the Resource Repair Action menu displays, select sysplanar0.
  • Page 136 3. If the 8-digit error and location codes were NOT reported, then run diagnostics in problem determination mode and record and report the 8-digit error and location codes for this SRN. 814-112 The NVRAM test failed. Go to “Performing the checkout procedure” on page 149.
  • Page 137 Description and Action 814-113 The VPD test failed. Go to “Performing the checkout procedure” on page 149. 814-114 I/O Card NVRAM test failed. Go to “Performing the checkout procedure” on page 149. 815-100 The floating-point processor test failed. Go to “Performing the checkout procedure” on page 149.
  • Page 138 2506-4010 Configuration error, incorrect connection between cascaded enclosures. Go to “Performing the checkout procedure” on page 149. 2506-4020 Configuration error, connections exceed IOA design limits. Go to “Performing the checkout procedure” on page 149.
  • Page 139 Description and Action 2506-4030 Configuration error, incorrect multipath connection. Go to “Performing the checkout procedure” on page 149. 2506-4040 Configuration error, incomplete multipath connection between controller and enclosure detected. Go to “Performing the checkout procedure” on page 149. 2506-4041 Configuration error, incomplete multipath connection between enclosure and device detected.
  • Page 140 Device problem. Perform diagnostics on the device and retry the operation. 2506-FFF6 Device detected recoverable error. Retry the operation. 2506-FFFA Temporary device bus error. Retry the operation. 2506-FFFE Temporary device bus error. Retry the operation.
  • Page 141 Description and Action 252B-101 252B Adapter configuration error. 1. Check the management-module event log. If an error was recorded by the system, see “POST progress codes (checkpoints)” on page 71. 2. Replace any parts reported by the diagnostic program. 3. Replace the system-board and chassis assembly. Permanent adapter failure.
  • Page 142 256D-201 256D 221 1. Check the management-module event log. If an error was recorded by the system, see “POST progress codes (checkpoints)” on page 71. 2. Replace any parts reported by the diagnostic program. 3. Replace the system-board and chassis assembly.
  • Page 143 Description and Action 256D-601 256D Error log analysis indicates adapter. 1. Check the management-module event log. If an error was recorded by the system, see “POST progress codes (checkpoints)” on page 71. 2. Replace any parts reported by the diagnostic program. 3.
  • Page 144 2607-110 2607 Enhanced Error Handling Failure on the Fibre Channel adapter card. Replace the 8Gb PCIe Fibre Channel Expansion Card. 2607-201 2607 221 Configuration Register Test Failure for the Fibre Channel adapter card. Go to “Performing the checkout procedure” on page 149.
  • Page 145 Description and Action 2607-203 2607 PCI Wrap Test Failure for the Fibre Channel adapter card. Replace the 8Gb PCIe Fibre Channel Expansion Card. 2607-204 2607 221 DMA Test Failure for the Fibre Channel adapter card. Go to “Performing the checkout procedure”...
  • Page 146 FFC-725 1. Check the management-module event log. If an error was recorded by the system, see “POST progress codes (checkpoints)” on page 71. 2. Replace any parts reported by the diagnostic program. 3. Go to “Performing the checkout procedure” on page 149.
  • Page 147: Meaning Of The Last Character (X) After The Hyphen

    2.4.3.3 A00-FF0 through A24-xxx SRNs
    AIX might generate service request numbers (SRNs) from A00-FF0 to A24-xxx.
    Note: Some SRNs in this sequence might have 4 rather than 3 digits after the dash (–). Table 2-23 shows the meaning of an x in any of the following SRNs, such as A01-00x. Table 2-23....
  • Page 148 A02-12x I/O Host Bridge time-out error. 1. Check the management-module event log; if an error was recorded by the system, see “POST progress codes (checkpoints)” on page 71. 2. If no entry is found, replace the system-board and chassis assembly.
  • Page 149 Description FRU/action A02-13x I/O Host Bridge address/data parity error. 1. Check the management-module event log; if an error was recorded by the system, see “POST progress codes (checkpoints)” on page 71. 2. If no entry is found, replace the system-board and chassis assembly.
  • Page 150 A05-10x System shutdown due to FRU that has failed. ...“POST progress codes (checkpoints)” on page 71. 2. If no entry is found, replace the system-board and chassis assembly.
  • Page 151 Description FRU/action A05-14x System shutdown due to power fault with an unspecified cause. 1. Check the management-module event log; if an error was recorded by the system, see “POST progress codes (checkpoints)” on page 71. 2. If no entry is found, replace the system-board and chassis assembly.
  • Page 152 A10-200 ...platform. The system is operating in degraded mode. ...“POST progress codes (checkpoints)” on page 71. 2. If no entry is found, replace the system-board and chassis assembly.
  • Page 153 Description FRU/action A10-210 The processor has been deconfigured. The system is operating in degraded mode. 1. Check the management-module event log; if an error was recorded by the system, see “POST progress codes (checkpoints)” on page 71. 2. If no entry is found, replace the system-board and chassis assembly.
  • Page 154 A12-12x A non-critical error has been detected, an I/O host bridge time-out error. ...“POST progress codes (checkpoints)” on page 71. 2. If no entry is found, replace the system-board and chassis assembly.
  • Page 155 Description FRU/action A12-13x A non-critical error has been detected, an I/O host bridge address/data parity error. 1. Check the management-module event log; if an error was recorded by the system, see “POST progress codes (checkpoints)” on page 71. 2. If no entry is found, replace the system-board and chassis assembly.
  • Page 156 2. Check the management-module event log; if an error was recorded by the system, see “POST progress codes (checkpoints)” on page 71. 3. If no entry is found, replace the system-board and chassis assembly.
  • Page 157 Description FRU/action A15-07x Sensor indicates a power supply has failed. 1. Check the management-module event log; if an error was recorded by the system, see “POST progress codes (checkpoints)” on page 71. 2. If no entry is found, replace the system-board and chassis assembly.
  • Page 158 A1D-12x A non-critical error has been detected, a service processor error accessing fan sensor. ...“POST progress codes (checkpoints)” on page 71. 2. If no entry is found, replace the system-board and chassis assembly.
  • Page 159 Description FRU/action A1D-13x A non-critical error has been detected, a service processor error accessing a thermal sensor. 1. Check the management-module event log; if an error was recorded by the system, see “POST progress codes (checkpoints)” on page 71. 2. If no entry is found, replace the system-board and chassis assembly.
  • Page 160 A24-xxx Spurious interrupts have exceeded threshold. ...(checkpoints)” on page 71. 2. Replace part numbers reported by the diagnostic program. 3. If no entry is found, replace the system-board and chassis assembly.
  • Page 161: Table 2-25. Ssss-102 Through Ssss-640 Srns

    2.4.3.4 SCSD devices SRNs (ssss-102 through ssss-640)
    These service request numbers (SRNs) identify a SCSD (Self-Configuring SCSI Device) problem. Use Table 2-25 to identify an SRN when you suspect a SAS hard disk or Solid State Disk (SSD) device problem. Replace the parts in the order that the failing function codes (FFCs) are listed.
  • Page 162 2. Replace any parts reported by the diagnostic program. 3. Replace the system board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 229.
  • Page 163 Description and action ssss-126 ssss 252B A software error was caused by a hardware failure. 1. Check the management-module event log. If an error was recorded by the system, see “POST progress codes (checkpoints)” on page 71. 2. Replace any parts reported by the diagnostic program. 3.
  • Page 164: Failing Function Codes 151 Through 2D02

    Check management-module event log for a blower or fan fault. See the documentation that comes with the Bull Blade Chassis - Enterprise. System-board and chassis assembly System-board and chassis assembly (cache problem)
  • Page 165 Description and notes System-board and chassis assembly System-board and chassis assembly Common Memory Logic problem for memory DIMMs. Note: If more than one pair of memory DIMMs are reported missing: 1. Replace the system-board and chassis assembly 2. Replace the memory DIMM at the physical location code that is reported System-board and chassis assembly System-board and chassis assembly System-board and chassis assembly...
  • Page 166 2D02 System-board and chassis assembly (generic USB reference to controller/adapter) 2E12 QLogic 8Gb Fibre Channel Expansion Card (CFFh/PCIe) 2E13 QLogic 4Gb Fibre Channel 1Xe PCI-Express Expansion Card (CIOv) 2E14 QLogic 8Gb Fibre Channel 1Xe PCI-Express Expansion Card (CIOv)
  • Page 167: Error Logs

    (PHYP), and the service processor write errors to the Bull Blade Chassis - Enterprise management module event log. Select the Monitors → Event Log option in the management module Web interface to view entries that are currently stored in the management-module event log. This log includes entries for events that are detected by the blade servers.
  • Page 168: Checkout Procedure

    • If the blade server front panel shows no LEDs, verify the blade server status and errors in the management module Web interface; also see “Solving undetermined problems” on page 194. • If device errors occur, see “Troubleshooting tables” on page 158.
  • Page 169: Performing The Checkout Procedure

    2.6.2 Performing the checkout procedure
    Follow this procedure to perform the checkout. Step 001 Perform the following steps: Update the firmware to the current level, as described in “Updating the firmware” on page 231. You might also have to update the management module firmware. If you did not update the firmware for some reason, power off the blade server for 45 seconds before powering it back on.
  • Page 170 152 or “Starting stand-alone diagnostics from a NIM server” on page 153. If you have replaced the failing component, perform system verification for the component. See “Using the diagnostics program” on page 155. This ends the AIX procedure.
  • Page 171 Step 007 Perform the following steps: Use the management-module Web interface to make sure that the device from which you load the stand-alone diagnostics is set as the first device in the blade server boot sequence. Turn off the blade server and wait 45 seconds before proceeding. Turn on the blade server and establish a SOL session.
  • Page 172: Verifying The Partition Configuration

    Stop all programs; then, shut down the operating system and shut down the blade server. Refer to the documentation that comes with your operating system for information about shutting down the operating system.
  • Page 173: Starting Stand-Alone Diagnostics From A Nim Server

    Press the CD button on the front of the blade server to give it ownership of the Bull Blade Chassis – Enterprise media tray. Using the management module Web interface, make sure that: The blade server firmware is at the latest version.
  • Page 174 In this case, press Esc and the number in the screen menus. For example, instead of F3 you can press Esc and 3. When testing is complete, press F3 until the Diagnostic Operating Instructions screen is displayed; then press F3 again to exit the diagnostic program.
  • Page 175: Using The Diagnostics Program

    2.8.4 Using the diagnostics program
    Follow the basic procedures for running the diagnostics program. Start the diagnostics from the AIX operating system, from a CD, or from a management server. See “Starting AIX concurrent diagnostics” on page 152, “Starting stand-alone diagnostics from a CD”...
  • Page 176: Boot Problem Resolution

    If you are attempting to boot from a hard disk drive, go to Step 004. If you are attempting to boot from the network: Make sure that the network cabling to the Bull Blade Chassis – Enterprise network switch is correct.
  • Page 177 Turn the blade server power off; then, turn it on and retry the boot operation. If the boot fails, try a known-good bootable CD. If possible, try to boot another blade server in the Bull Blade Chassis - Enterprise to verify that the CD or DVD drive is functional.
  • Page 178: Troubleshooting Tables

    If these symptoms relate to shared Bull Blade Chassis - Enterprise resources, see “Solving shared Bull Blade Chassis – Enterprise resource problems” on page 188. If you cannot find the problem in these tables, see “Running the diagnostics program” on page 152 for information about testing the blade server.
  • Page 179: Intermittent Problems

    The blade server and the monitor are turned on. 2. Replace the keyboard. 3. Replace the management module on the Bull Blade Chassis - Enterprise. See the Hardware Maintenance Manual and Troubleshooting Guide or Problem Determination and Service Guide for your Bull Blade Chassis - Enterprise.
  • Page 180: Management Module Service Processor Problems

    Symptom: Service processor in the management module reports a general monitor failure. Action: Disconnect the Bull Blade Chassis - Enterprise from all electrical sources, wait for 30 seconds, reconnect it to the electrical sources, and restart the blade server.
  • Page 181: Memory Problems

    2.10.6 Memory problems
    Identify memory problem symptoms and what corrective actions to take. • Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. • See Chapter 3, “Parts listing,” on page 197 to determine which components are CRUs and which components are FRUs.
  • Page 182: Monitor Or Video Problems

    Maintenance Manual and Troubleshooting Guide or Problem Determination and Service Guide for your Bull Blade Chassis - Enterprise. Symptom: Only the cursor appears. Action: Make sure that the keyboard/video ownership on the Bull Blade Chassis - Enterprise has not been switched to another blade server.
  • Page 183: Network Connection Problems

    Blade bays and are configured and operating correctly. See the network. Hardware Maintenance Manual and Troubleshooting Guide or Problem Determination and Service Guide for your Bull Blade Chassis - Enterprise for details. − The settings in the I/O module are appropriate for the blade server (settings in the I/O module are blade-specific).
  • Page 184: Pci Expansion Card (Piocard) Problem Isolation Procedure

    3. Use the hexadecimal value of the DSA to determine the location code of the failing CRU. − If the value is 05120010, the location code is P1-C11. − If the value is xxxx 0100, the location code is P1-C12.
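The two DSA rules in step 3 amount to a small lookup. The sketch below is one way to express them, under stated assumptions: the function name `dsa_to_location` is hypothetical, the DSA is taken to be an 8-character value that may be written with a space between its two halves, and "xxxx 0100" is read as "any four characters followed by 0100". Values that match neither rule return None rather than a guess.

```python
def dsa_to_location(dsa):
    """Map a DSA value to a location code using the two rules in step 3.

    Returns "P1-C11", "P1-C12", or None when neither rule applies.
    """
    # Accept both "05120010" and "0512 0010" style input (assumption).
    value = dsa.replace(" ", "").upper()
    if len(value) != 8:
        return None
    if value == "05120010":
        return "P1-C11"
    # "xxxx 0100": the first four characters are not checked.
    if value.endswith("0100"):
        return "P1-C12"
    return None
```

The exact-match rule is tested first, although the two rules cannot collide: 05120010 ends in 0010, not 0100.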
  • Page 185: Optional Device Problems

    Symptom: An optional device that was just installed does not work. Action: 1. Make sure that: − The option is designed for the blade server. Contact your Bull support representative. − You followed the installation instructions that came with the option.
  • Page 186 The blade server does not turn on. a. The power LED on the front of the Bull Blade Chassis - Enterprise is on. b. The LEDs on all the Chassis power modules are on. c. The blade server is in a blade bay that is supported by the power modules installed in the Bull Blade Chassis - Enterprise.
  • Page 187: Power Hypervisor (Phyp) Problems

    3. Check the bus and I/O adapter allocations for the partition. Verify that the partition has load source and console I/O resources. 4. Check the IPL mode of the system or failing partition. 5. For further assistance, contact Bull Support. Chapter 2. Diagnostics...
  • Page 188 Symptom: ...system reference code (SRC) identifies the location code of the failing component. Action: − If the value is 05120010, the location code is P1-C11. − If the value is xxxx 0100, the location code is P1-C12.
  • Page 189: Service Processor Problems

    2.10.14 Service processor problems
    The baseboard management controller (BMC) is a flexible service processor that provides error diagnostics with associated error codes, and fault isolation procedures for troubleshooting. Note: Resetting the service processor causes a POWER6 reset/reload, which generates a dump. The dump is recorded in the management module event log.
  • Page 190 Description: ...indicates that the Blade encountered a problem, and the blade server was automatically shut down as a result. Action: ...Escala BL460 blade server shut down. 2. Resolve any problems. 3. Remove the blade from the Bull Blade Chassis - Enterprise and then reinsert the blade server.
  • Page 191 2. If the problem persists, replace each memory DIMM, by following the action for symbolic FRU MEMDIMM. 3. Install the blade server into the Bull Blade Chassis - Enterprise after each DIMM replacement and restart the blade to verify if the problem is solved.
  • Page 192 [-minute MM] [-timezone TZ] chdate mmddHHMM[YYyy|yy] [-timezone TZ] 2. If the problem persists, replace the battery, as described in “Removing the battery” on page and “Installing the battery” on page 225.
  • Page 193 • Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. • See Chapter 3, “Parts listing,” on page 197 to determine which components are CRUs and which components are FRUs.
  • Page 194 Collect the dump for support and power off and power on the blade server. 5. If an A1xx SRC has not remained more than 40 minutes, call Bull Support. FSPSP16 Save any error log Contact Bull Support.
  • Page 195 11. degraded. Array bit steering may be able 2. Remove the blade server from the Bull Blade Chassis - Enterprise and to correct this problem reinsert the blade server into the Bull Blade Chassis - Enterprise. without replacing Turning on the blade 3.
  • Page 196 1. Set the enclosure feature code using SMS, which automatically resets the service processor. 2. If the problem persists, call Bull Support. If you do not see your reason code listed, call Bull Support.
  • Page 197 Action Procedure Code FSPSP34 Description: The memory cards are plugged in an invalid configuration and cannot be used by the... Action: Install a DIMM for each of the dual processors on the ESCALA BL460 blade server. Install the first pair in DIMM connectors 2 and 4. Look for the following error codes in order. Follow the procedure for the first code you find.
  • Page 198 Description: ...occurred between the Service Processor and the network switch. Action: Replace the system board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 229.
  • Page 199 Description: ...replacing any parts. Action: 2. If the temperature is within the acceptable range, check the front and rear of the Bull Blade Chassis - Enterprise to verify that each is free of obstructions that would impede the airflow. If there are obstructions, you must clear the obstructions.
  • Page 200 Description: ...reporting that 12V dc is not present on the Bull Blade Chassis - Enterprise midplane. Action: 2. Resolve any problems. 3. Remove the blade from the Bull Blade Chassis - Enterprise and then reinsert the blade server.
  • Page 201: Software Problems

    2.10.15 Software problems
    Use this information to recognize software problem symptoms and to take corrective actions. • Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. • See Chapter 3, “Parts listing,” on page 197 to determine which components are CRUs and which components are FRUs.
  • Page 202: Light Path Diagnostics

    Press and hold the light path diagnostics switch to relight the LEDs that were lit before you removed the blade server from the Bull Blade Chassis - Enterprise. The LEDs will remain lit for as long as you press the switch, to a maximum of 25 seconds.
  • Page 203: Figure 2-1. Light Path Diagnostic Leds

    Figure 2-1 shows the locations of error LEDs on the system board: Figure 2-1. Light path diagnostic LEDs Callout System-board LEDs Light path power LED. System board LED (Px). SAS hard disk drive LED or SAS solid-state drive LED. DIMM 1-4 LEDs. 1Xe connector LED.
  • Page 204: Light Path Diagnostics Leds

    A system board error occurred. 1. Replace the blade server cover, reinsert the blade server error in the Bull Blade Chassis - Enterprise, and then restart the P1-C9 MGMT CRD blade server. 2. Check the management-module event log for information about the error.
  • Page 205: Firmware Problem Isolation

    P1 SYS BRD: A system board and chassis assembly error has occurred. A microprocessor failure shows up as... 1. Replace the blade server cover, reinsert the blade server in the Bull Blade Chassis - Enterprise, and then restart the blade server.
  • Page 206: Recovering The System Firmware

    If your system hangs, access the management module and select Blade Tasks → Configuration → Boot Mode to show the Escala BL460 blade server in the list of blade servers in the Bull Blade Chassis - Enterprise. Click the appropriate blade server and select Permanent to force the system to start from the PERM image.
  • Page 207: Recovering The Temp Image From The Perm Image

    Verify that the system starts from the TEMP image, as described in “Verifying the system firmware levels.”
    2.13.3 Recovering the TEMP image from the PERM image
    To recover the TEMP image from the PERM image, you must perform the reject function. The reject function copies the PERM image into the TEMP image.
  • Page 208: Committing The Temp System Firmware Image

    Solving shared Bull Blade Chassis – Enterprise resource problems Problems with Bull Blade Chassis – Enterprise shared resources might appear to be in the blade server, but might actually be a problem in a Bull Blade Chassis - Enterprise component.
  • Page 209: Solving Shared Keyboard Problems

    Solving shared keyboard problems Problems with shared resources might appear to be in the blade server, but might actually be a problem in a Bull Blade Chassis - Enterprise keyboard component. To check the general function of shared keyboard resources, perform the following procedure.
  • Page 210: Solving Shared Media Tray Problems

    Problems with shared resources might appear to be in the blade server, but might actually be a problem in a Bull Blade Chassis - Enterprise media tray component. To check the general function of shared Bull Blade Chassis – Enterprise media tray resources, perform the following procedure.
  • Page 211 Verify that the management module is operating correctly. See the Problem Determination and Service Guide or the Hardware Maintenance Manual and Troubleshooting Guide for your Bull Blade Chassis - Enterprise. Some Bull Blade Chassis - Enterprise types have several management-module components that you might test or replace.
  • Page 212: Solving Shared Network Connection Problems

    Problems with shared resources might appear to be in the blade server, but might actually be a problem in a Bull Blade Chassis - Enterprise network connection resource. To check the general function of shared Bull Blade Chassis – Enterprise network connection resources, perform the following procedure.
  • Page 213: Solving Shared Power Problems

    Verify that the LEDs on all the Blade Chassis power modules are lit. Verify that power is being supplied to the Bull Blade Chassis - Enterprise. Verify that the installation of the blade server type is supported by the Bull Blade Chassis - Enterprise.
  • Page 214: Solving Undetermined Problems

    Check the LEDs on all the power supplies of the Bull Blade Chassis - Enterprise where the blade server is installed. If the LEDs indicate that the power supplies are working correctly, and reseating the blade server does not correct the problem, complete the following steps: Make sure that the control panel connector is correctly seated on the system board.
  • Page 215 Management Module User’s Guide for more information. Turn off the blade server. Remove the blade server from the Bull Blade Chassis - Enterprise and remove the cover. Remove or disconnect the following devices, one at a time, until you find the failure.
  • Page 216: Calling Bull For Service

    2.16 Calling Bull for service Before calling Bull for service, collect as much as possible of the following available information: • Machine type and model...
  • Page 217: Chapter 3. Parts Listing

    • Tier 1 customer replaceable unit (CRU): Replacement of Tier 1 CRUs is your responsibility. If Bull installs a Tier 1 CRU at your request, you will be charged for the installation. • Tier 2 customer replaceable unit: You may install a Tier 2 CRU yourself or request Bull...
  • Page 218: Parts Table

    Tray, SAS hard disk drive 31R2239; Hard disk drive, 73 GB 10K RPM SFF SAS HDD and screws 26K5779 2553; Hard disk drive, 146 GB 10K RPM SFF SAS HDD and screws (4) (option) 42D0422 2553
  • Page 219 Hard disk drive, 300 GB 10K RPM SFF SAS HDD and screws (4) (option) 42D0628 2553; Solid State Drive (SSD) 69 GB and screws (4) (option) 44V6825 2553; Disk drive filler 40K5928; Label, FRU list 44V7312; Label, OEM FRU list 44V7313; Label, System service 44V6772...
  • Page 221: Chapter 4. Removing And Replacing Blade Server Components

    Replaceable components are of three types: • Tier 1 customer replaceable unit (CRU): Replacement of Tier 1 CRUs is your responsibility. If Bull installs a Tier 1 CRU at your request, you will be charged for the installation. • Tier 2 customer replaceable unit: You may install a Tier 2 CRU yourself or request Bull to install it, at no additional charge, under the type of warranty service that is designated for your blade server.
  • Page 222: System Reliability Guidelines

    • While the device is still in its static-protective package, touch it for at least 2 seconds to an unpainted metal part of the Bull Blade Chassis - Enterprise or to any unpainted metal surface on any other grounded rack component in the rack in which you are installing the device.
  • Page 223: Returning A Device Or Component

    Removing the blade server from a Bull Blade Chassis - Enterprise Remove the blade server from the Bull Blade Chassis - Enterprise to access options, connectors, and system-board indicators. Figure 4-1. Removing the blade server from the Bull Blade Chassis - Enterprise...
  • Page 224: Installing The Blade Server In A Bull Blade Chassis - Enterprise

    Installing the blade server in a Bull Blade Chassis - Enterprise Install the blade server in a Bull Blade Chassis - Enterprise to use the blade server. Figure 4-2. Installing the blade server in a Bull Blade Chassis - Enterprise Perform the following procedure to install a blade server in a Bull Blade Chassis - Enterprise.
  • Page 225 11. Optional: Write identifying information on one of the user labels that come with the blade servers and place the label on the Bull Blade Chassis - Enterprise bezel. Important: Do not place the label on the blade server or in any way block the ventilation holes on the blade server.
  • Page 226: Removing And Replacing Tier 1 CRUs

    Removing and replacing Tier 1 CRUs Replacement of Tier 1 customer-replaceable units (CRUs) is your responsibility. If Bull installs a Tier 1 CRU at your request, you will be charged for the installation. The illustrations in this documentation might differ slightly from your hardware.
  • Page 227 Shut down the operating system, turn off the blade server, and remove the blade server from the Bull Blade Chassis - Enterprise. See "Removing the blade server from a Bull Blade Chassis - Enterprise" on page 203. Carefully lay the blade server on a flat, static-protective surface, with the cover side...
  • Page 228: Installing And Closing The Blade Server Cover

    Installing and closing the blade server cover Install and close the cover of the blade server before you insert the blade server into the Bull Blade Chassis - Enterprise. Do not attempt to override this important protection. Figure 4-4. Installing the cover...
  • Page 229: Removing The Bezel Assembly

    Install the blade server into the Bull Blade Chassis - Enterprise. See “Installing the blade server in a Bull Blade Chassis - Enterprise” on page 204. 4.4.3 Removing the bezel assembly Remove the bezel assembly. Figure 4-5. Removing the bezel assembly Read “Safety”...
  • Page 230: Installing The Bezel Assembly

    Hazardous energy is present when the blade server is connected to the power source. Always replace the blade server cover before installing the blade server. Install the blade server into the Bull Blade Chassis - Enterprise. See “Installing the blade server in a Bull Blade Chassis - Enterprise” on page 204.
  • Page 231: Removing A Drive

    Read “Safety” on page vii and the “Installation guidelines” on page 201. Shut down the operating system, turn off the blade server, and remove the blade server from the Bull Blade Chassis - Enterprise. See “Removing the blade server from a Bull Blade Chassis - Enterprise” on page 203.
  • Page 232: Installing A Drive

    Read “Safety” on page vii and the “Installation guidelines” on page 201. Shut down the operating system, turn off the blade server, and remove the blade server from the Bull Blade Chassis - Enterprise. See “Removing the blade server from a Bull Blade Chassis - Enterprise” on page 203.
  • Page 233: Removing A Memory Module

    Hazardous energy is present when the blade server is connected to the power source. Always replace the blade server cover before installing the blade server. Install the blade server into the Bull Blade Chassis - Enterprise. See "Installing the blade server in a Bull Blade Chassis - Enterprise" on page 204.
  • Page 234: Installing A Memory Module

    Figure 4-10. DIMM connectors Perform the following procedure to install a DIMM. Read “Safety” on page vii and the “Installation guidelines” on page 201. Read the documentation that comes with the DIMMs.
  • Page 235 Hazardous energy is present when the blade server is connected to the power source. Always replace the blade server cover before installing the blade server. 12. Install the blade server into the Bull Blade Chassis - Enterprise. See “Installing the blade server in a Bull Blade Chassis - Enterprise” on page 204.
  • Page 236: Removing The Management Card

    Read “Safety” on page vii and the “Installation guidelines” on page 201. Shut down the operating system, turn off the blade server, and remove the blade server from the Bull Blade Chassis - Enterprise. See “Removing the blade server from a Bull Blade Chassis - Enterprise” on page 203.
  • Page 237: Installing The Management Card

    Touch the static-protective package that contains the management card to any unpainted metal surface on the Bull Blade Chassis - Enterprise or any unpainted metal surface on any other grounded rack component; then, remove the management card (as shown in the figure) from its package.
  • Page 238 Hazardous energy is present when the blade server is connected to the power source. Always replace the blade server cover before installing the blade server. Install the blade server into the Bull Blade Chassis - Enterprise. See “Installing the blade server in a Bull Blade Chassis - Enterprise” on page 204.
  • Page 239: Removing And Installing An I/O Expansion Card

    The blade server supports various types of I/O expansion cards, including Gigabit Ethernet, Fibre Channel, and Myrinet expansion cards. Verify that any expansion card that you are using is supported by the Escala BL460 blade server. For example, the following expansion cards are not supported by the Escala BL460...
  • Page 240: Figure 4-14. Installing A CIOV Form-Factor Expansion Card

    Read “Safety” on page vii and the “Installation guidelines” on page 201. Shut down the operating system, turn off the blade server, and remove the blade server from the Bull Blade Chassis - Enterprise. See "Removing the blade server from a Bull Blade Chassis - Enterprise" on page 203.
  • Page 241 Read “Safety” on page vii and the “Installation guidelines” on page 201. Shut down the operating system, turn off the blade server, and remove the blade server from the Bull Blade Chassis - Enterprise. See "Removing the blade server from a Bull Blade Chassis - Enterprise" on page 203.
  • Page 242: Figure 4-15. Removing A Combination-Form-Factor Expansion Card

    Read “Safety” on page vii and the “Installation guidelines” on page 201. Shut down the operating system, turn off the blade server, and remove the blade server from the Bull Blade Chassis - Enterprise. See "Removing the blade server from a Bull Blade Chassis - Enterprise" on page 203.
  • Page 243: Figure 4-16. Installing A Combination-Form-Factor Expansion Card

    Read “Safety” on page vii and the “Installation guidelines” on page 201. Shut down the operating system, turn off the blade server, and remove the blade server from the Bull Blade Chassis - Enterprise. See "Removing the blade server from a Bull Blade Chassis - Enterprise" on page 203.
  • Page 244: Removing The Battery

    Hazardous energy is present when the blade server is connected to the power source. Always replace the blade server cover before installing the blade server. Install the blade server into the Bull Blade Chassis - Enterprise unit. See "Installing the blade server in a Bull Blade Chassis - Enterprise" on page 204.
  • Page 245: Installing The Battery

    Use your finger to press down on one side of the battery; then, slide the battery out from its socket. The spring mechanism will push the battery out toward you as you slide it from the socket. Note: You might need to lift the battery clip slightly with your fingernail to make it easier to slide the battery.
  • Page 246 Hazardous energy is present when the blade server is connected to the power source. Always replace the blade server cover before installing the blade server. Install the blade server into the Bull Blade Chassis - Enterprise. See “Installing the blade server in a Bull Blade Chassis - Enterprise” on page 204.
  • Page 247: Removing The Disk Drive Tray

    Read “Safety” on page vii and the “Installation guidelines” on page 201. Shut down the operating system, turn off the blade server, and remove the blade server from the Bull Blade Chassis - Enterprise. See “Removing the blade server from a Bull Blade Chassis - Enterprise” on page 203.
  • Page 248: Installing The Hard Disk Drive Tray

    Hazardous energy is present when the blade server is connected to the power source. Always replace the blade server cover before installing the blade server. Install the blade server into the Bull Blade Chassis - Enterprise. See “Installing the blade server in a Bull Blade Chassis - Enterprise” on page 204.
  • Page 249: Replacing The Tier 2 System-Board And Chassis Assembly

    Read “Safety” on page vii and the “Installation guidelines” on page 201. Shut down the operating system, turn off the blade server, and remove the blade server from the Bull Blade Chassis - Enterprise. See “Removing the blade server from a Bull Blade Chassis - Enterprise” on page 203.
  • Page 250 12. Place the RID tag on the bottom of the blade server chassis. 13. Install the blade server into the Bull Blade Chassis - Enterprise. See “Installing the blade server in a Bull Blade Chassis - Enterprise” on page 204.
  • Page 251: Chapter 5. Configuring

    Update the firmware and use the management module and the system management services (SMS) to configure the Escala BL460 blade server. Updating the firmware Bull periodically makes firmware updates available for you to install on the blade server, the management module, or expansion cards in the blade server. Important:...
  • Page 252 You can also install an update permanently on either AIX or Linux, as described in: − Using AIX commands to install a firmware update permanently − Using Linux commands to install a firmware update permanently
  • Page 253: Configuring The Blade Server

    Configuring the blade server While the firmware is running POST and before the operating system starts, a POST menu with POST indicators is displayed. The POST indicators are the words Memory, Keyboard, Network, SCSI, and Speaker that are displayed as each component is tested. You can then select configuration utilities from the POST menu.
  • Page 254: Using The SMS Utility

    Note: If a device that you are trying to select (such as a USB CD drive in the Blade media tray) is not displayed in the Select Device Type menu, select List all Devices and select the device from that menu.
  • Page 255: Creating A CE Login

    Bull Blade Chassis - Enterprise, and the operating system that is installed. For example, each Ethernet controller on the Escala BL460 blade server system board is routed to a different I/O module in I/O module bay 1 or module bay 2 of the Bull Blade Chassis - Enterprise.
  • Page 256: Blade Server Ethernet Controller Enumeration

    I/O-module bays 3 and 4, if these bays are supported by your Bull Blade Chassis - Enterprise. You can verify which controller on the card is routed to which I/O-module bay by performing the same test and using a controller on the expansion card and a compatible switch module or pass-thru module in I/O-module bay 3 or 4.
  • Page 257: MAC Addresses For Host Ethernet Adapters

    I/O modules in the Bull Blade Chassis - Enterprise. The Escala BL460 blade server uses two physical HEA ports and 14 logical HEA ports to share the two integrated physical Ethernet adapters on the blade server. The 14 logical...
  • Page 258: Updating IBM System Director

    To install the IBM System Director updates and any other applicable updates and interim fixes, complete the following steps. Check for the latest version of IBM System Director. Install IBM System Director. Download and install any applicable updates or interim fixes for the blade server.
  • Page 259: Appendix A. Getting Help And Technical Assistance

    Bull provides a wide variety of sources to assist you. This appendix indicates where to go for additional information about Bull and Bull products, what to do if you experience a problem with your Bull Blade system, and who to call for service if necessary.
  • Page 261: Appendix B. Notices

    These products are offered and warranted solely by third parties. Bull makes no representations or warranties with respect to non-Bull products. Support (if any) for the non-Bull products is provided by the third party, not Bull.
  • Page 262: Product Recycling And Disposal

    Customer participation is important to minimize any potential effects of EEE on the environment and human health due to the potential presence of hazardous substances in EEE. For proper collection and treatment, contact your local Bull representative.
  • Page 263: Electronic Emission Notices

    Properly shielded and grounded cables and connectors must be used in order to meet FCC emission limits. Bull is not responsible for any radio or television interference caused by using other than recommended cables and connectors or by unauthorized changes or modifications to this equipment.
  • Page 264: European Union EMC Directive Conformance Statement

    This product is in conformity with the protection requirements of EU Council Directive 89/336/EEC on the approximation of the laws of the Member States relating to electromagnetic compatibility. Bull cannot accept responsibility for any failure to satisfy the protection requirements resulting from a non-recommended modification of the product, including the fitting of non-Bull option cards.
  • Page 266 BULL CEDOC 357 AVENUE PATTON B.P.20845 49008 ANGERS CEDEX 01 FRANCE REFERENCE 86 A7 81FB 00...
