Bull Escala EL260B Problem Determination And Service Manual

Escala Blade
Server EL260B
Problem Determination and
Service Guide
REFERENCE
86 A1 36FA 00

Summary of Contents for Bull Escala EL260B

  • Page 1 Escala Blade Server EL260B Problem Determination and Service Guide REFERENCE 86 A1 36FA 00...
  • Page 3 ESCALA BLADE SERVERS Escala Blade Server EL260B Problem Determination and Service Guide Hardware July 2008 BULL CEDOC 357 AVENUE PATTON B.P.20845 49008 ANGERS CEDEX 01 FRANCE REFERENCE 86 A1 36FA 00...
  • Page 4 Quoting of brand and product names is for information purposes only and does not represent trademark misuse. The information in this document is subject to change without notice. Bull will not be liable for errors contained herein, or for incidental or consequential damages in connection with the use of this material.
  • Page 5: Table Of Contents

    Table of Contents
    List of Figures (p. v)
    List of Tables (p. vi)
    Safety (p. vii)
    Safety statements (p. viii)
    Guidelines for trained service technicians (p. xiv)
    Inspecting for unsafe conditions (p. xiv)
    Guidelines for servicing electrical equipment (p. xv)
    Chapter 1. Introduction (p. 1)
    Related documentation...
  • Page 6
    Verifying the partition configuration (p. 144)
    Running the diagnostics program (p. 144)
    2.8.1 Starting AIX concurrent diagnostics (p. 144)
    2.8.2 Starting stand-alone diagnostics from a CD (p. 144)
    2.8.3 Starting stand-alone diagnostics from a NIM server (p. 146)
    2.8.4 Using the diagnostics program (p. 147)
    Boot problem resolution...
  • Page 7
    Handling static-sensitive devices (p. 192)
    4.1.3 Returning a device or component (p. 193)
    Removing the blade server from a Bull Blade Chassis (p. 193)
    Installing the blade server in a Bull Blade Chassis (p. 195)
    Removing and replacing Tier 1 CRUs (p. 197)
    4.4.1...
  • Page 8
    Appendix B. Notices (p. 239)
    Important Notes (p. 239)
    Product recycling and disposal (p. 240)
    Electronic emission notices (p. 241)
    Industry Canada Class A emission compliance statement (p. 241)
    Australia and New Zealand Class A statement (p. 241)
    United Kingdom telecommunications safety requirement (p. 241)
    European Union EMC Directive conformance statement (p. 242)
    Taiwanese Class A warning statement...
  • Page 9: List Of Figures

    Light path diagnostic LEDs (p. 176)
    Figure 3-1. Parts illustration (p. 189)
    Figure 4-1. Removing the blade server from the Bull Blade Chassis (p. 193)
    Figure 4-2. Installing the blade server in a Bull Blade Chassis (p. 195)
    Figure 4-3. Removing the cover (p. 197)
    Figure 4-4...
  • Page 10
    Light path diagnostic LED descriptions (p. 177)
    Table 3-1. Parts table (p. 190)
    Table 4-1. ESCALA EL260B vital product data (p. 210)
    Table 5-1. MAC addressing scheme for physical and logical host Ethernet adapters (p. 234)
    Escala Blade EL260B - Problem Determination and Service Guide
  • Page 11: Safety

    Safety Preface...
  • Page 12: Safety Statements

    English-language caution or danger statement with translated versions of the caution or danger statement in the Bull Safety Attention document. For example, if a caution statement begins with a number 1, translations for that caution statement appear in the Bull Safety Attention document under statement 1.
  • Page 18: Guidelines For Trained Service Technicians

    Guidelines for trained service technicians Inspecting for unsafe conditions...
  • Page 19: Guidelines For Servicing Electrical Equipment

    Guidelines for servicing electrical equipment...
  • Page 21: Chapter 1. Introduction

    • Tier 1 customer replaceable unit (CRU): Replacement of Tier 1 CRUs is your responsibility. If Bull installs a Tier 1 CRU at your request, you will be charged for the installation.
    • Tier 2 customer replaceable unit: You may install a Tier 2 CRU yourself or request Bull...
  • Page 22: Notices And Statements In This Documentation

    The blade server might have features that are not described in the documentation that comes with the blade server. Review the Planning Guide and the Installation Guide for your Bull Blade Chassis. The information can help you prepare for system installation and configuration.
  • Page 23: Features And Specifications

    Features and specifications Features and specifications of the Bull Escala EL260B blade server are summarized in this overview. The Escala EL260B blade server is used in a Bull Blade Chassis. Notes: • Power, cooling, removable-media drives, external ports, and advanced system management are provided by the Bull Blade Chassis.
  • Page 24
    • Support for local keyboard and video
    • Four Universal Serial Bus (USB) buses for communication with keyboard and removable-media drives
    • Transferable Anchor function (Renesas Technology HD651330 microcontroller) in the management card
    Storage:
    • Support for two internal small-form-factor (SFF) Serial Attached SCSI (SAS) drives...
  • Page 25: Supported Dimms

    Supported DIMMs The Escala EL260B blade server contains eight memory connectors for industry-standard registered dual inline memory modules (RDIMMs). The DIMMs are very low profile, which means that each DIMM has a height of 18.3 millimeters (mm). Total memory can range from a minimum of 2 gigabytes (GB) to a maximum of 64 GB. See Chapter 3, “Parts listing,”...
  • Page 26: Blade Server Control Panel Buttons And Leds

    Figure 1-1. Blade server control panel buttons and LEDs Keyboard/video select button: When you use an operating system that supports a local console and keyboard, press this button to associate the shared Bull Blade Chassis keyboard and video ports with the blade server. Notes: • The operating system in the blade server must provide USB support for the blade...
  • Page 27 Media-tray select button: Press this button to associate the shared Bull Blade Chassis media tray (removable-media drives and front-panel USB ports) with the blade server. The LED on the button flashes while the request is being processed, then is lit when the ownership of the media tray has been transferred to the blade server.
  • Page 28: Turning On The Blade Server

    Note: The enhanced service processor (BMC) can take as long as three minutes to initialize after you install the Escala EL260B blade server, at which point the LED begins to flash slowly. Activity LED: When this green LED is lit, it indicates that there is activity on the hard disk drive or network.
  • Page 29: Turning Off The Blade Server

    The enhanced service processor (BMC) can take as long as three minutes to initialize after you install the Escala EL260B blade server, at which point the LED begins to flash slowly. While the blade server is starting, the power-on LED on the front of the blade server is lit.
  • Page 30: System-Board Layouts

    • Use the management module to turn off the blade server. The power-control LED can remain on solidly for up to 1 minute after you initiate the power-off process. After you turn off the blade server, wait until the power-control LED is blinking slowly before you initiate the power-on process from the advanced management module to turn on the blade server again.
  • Page 31: System-Board Leds

    Use the illustration of the LEDs on the system board to identify a light emitting diode (LED). Remove the blade server from the Bull Blade Chassis, open the cover to see any error LEDs that were turned on during error processing, and use Figure 1-3 to identify the failing component.
  • Page 33: Chapter 2. Diagnostics

    Chapter 2. Diagnostics Use the available diagnostic tools to help solve any problems that might occur in the blade server. The first and most crucial component of a solid serviceability strategy is the ability to accurately and effectively detect errors when they occur. While not all errors are a threat to system availability, those that go undetected are dangerous because the system does not have the opportunity to evaluate and act if necessary.
  • Page 34: Diagnostic Tools

    Use the light path diagnostic LEDs on the system board to identify failing hardware. If the system error LED on the system LED panel on the front or rear of the Bull Blade Chassis is lit, one or more error LEDs on the Bull Blade Chassis components also might be lit.
  • Page 35: Collecting Dump Data

    If you power off the blade through the management module while the service processor is performing a dump, platform dump data is lost. You might be asked to retrieve a dump to send it to Bull Support for analysis. The location of the dump data varies per operating system platform.
  • Page 36: Location Codes

    See “System-board connectors” on page 10 for component locations. Notes: • Location codes do not indicate the location of the blade server within the Bull Blade Chassis. The codes identify components of the blade server only.
  • Page 37: Reference Codes

    Viewing the codes The Escala EL260B blade server does not display checkpoints or error codes on the remote console. The shared Bull Blade Chassis video also does not display the codes. If the POST detects a problem, a 9-word, 8-digit error code is logged in the Blade management-module event log.
  • Page 38: System Reference Codes (Srcs)

    The seventh word is the direct select address, which is 77777777 in the example.
    Table 2-2. Nine-word system reference code in the management-module event log
    Columns: Index, Source, Date/Time, Text
    Source: Blade_05
    Date/Time: 01/21/2008, 17:15:14
    Text: (ESCALA EL260B -BC1BLD5E) SYS F/W: Error. Replace UNKNOWN (5008FECF B7001111 22222222 33333333 44444444 55555555 66666666 77777777 88888888 99999999)
    Depending on your operating system and the utilities you have installed, error messages might also be stored in an operating system log.
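    As an illustration of how the nine words can be pulled out of an event log entry, here is a minimal Python sketch. It assumes (this is not stated in the guide) that the words are the last nine 8-digit hexadecimal tokens inside the parenthesized group, so that the leading 5008FECF token in the example is not one of them; that reading is consistent with 77777777 being the seventh word. The function name parse_src_words is hypothetical.

```python
import re

# Matches a parenthesized group that contains only 8-hex-digit tokens.
_SRC_GROUP = re.compile(r"\(([0-9A-Fa-f]{8}(?:\s+[0-9A-Fa-f]{8})+)\)")


def parse_src_words(event_text):
    """Return the nine SRC words from an event log entry, or None.

    Assumption: the nine words are the LAST nine 8-hex-digit tokens in the
    parenthesized group; any leading token is treated as a prefix.
    """
    match = _SRC_GROUP.search(event_text)
    if match is None:
        return None
    tokens = match.group(1).split()
    if len(tokens) < 9:
        return None
    return [token.upper() for token in tokens[-9:]]


entry = (
    "(ESCALA EL260B -BC1BLD5E) SYS F/W: Error. Replace UNKNOWN "
    "(5008FECF B7001111 22222222 33333333 44444444 55555555 "
    "66666666 77777777 88888888 99999999)"
)
words = parse_src_words(entry)
# Word 1 is the SRC itself; word 7 is the direct select address.
print(words[0], words[6])
```

    With the example entry from Table 2-2, the first word is B7001111 and the seventh word is 77777777.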
  • Page 39 Any message with more detail is highlighted as a link in the System Reference Code column. Click the message to cause the management module to present the additional message detail: D1513901 Created 2007-11-13 19:30:20 Version: 0x02 Words 2-5: 020110F0 52298910 C1472000 200000FF SRC formats...
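    The additional message detail shown above can be split into fields with a short Python sketch. The field layout (SRC, Created timestamp, Version, Words 2-5) is inferred only from the single sample on this page, so treat the pattern as an assumption; parse_src_detail is a hypothetical helper, not part of the management module.

```python
import re
from datetime import datetime

# Pattern inferred from the one sample detail line in this guide (assumption).
_DETAIL = re.compile(
    r"(?P<src>[0-9A-F]{8})\s+Created\s+"
    r"(?P<created>\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2})\s+"
    r"Version:\s+(?P<version>0x[0-9A-Fa-f]+)\s+"
    r"Words 2-5:\s+(?P<words>[0-9A-F]{8}(?:\s+[0-9A-F]{8}){3})"
)


def parse_src_detail(text):
    """Parse one message-detail string into a dict, or return None."""
    match = _DETAIL.search(text)
    if match is None:
        return None
    # Collapse any run of whitespace between date and time to one space.
    created = " ".join(match.group("created").split())
    return {
        "src": match.group("src"),
        "created": datetime.strptime(created, "%Y-%m-%d %H:%M:%S"),
        "version": int(match.group("version"), 16),
        "words_2_5": match.group("words").split(),
    }


detail = (
    "D1513901 Created 2007-11-13 19:30:20 Version: 0x02 "
    "Words 2-5: 020110F0 52298910 C1472000 200000FF"
)
parsed = parse_src_detail(detail)
print(parsed["src"], parsed["created"].isoformat(), parsed["words_2_5"])
```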
  • Page 40: 1Xxxyyyy Srcs

    2.4.1.1 1xxxyyyy SRCs The 1xxxyyyy system reference codes are system power control network (SPCN) reference codes. Look for the rightmost 4 characters (yyyy in 1xxxyyyy) in the error code; this is the reference code. Find the reference code in Table 4. Perform all actions before exchanging failing items.
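    The "rightmost 4 characters" rule can be sketched in Python as follows. The family names are taken from the section headings of this guide; the prefix table, the function name split_src, and the sample code 10001234 are hypothetical and only illustrate the lookup, they are not an official decoding scheme.

```python
# Prefix -> SRC family, using the section headings of this guide.
# Longer prefixes come first so that B181 and B200 are tested before B7/BA.
SRC_FAMILIES = (
    ("B181", "service processor early termination"),
    ("B200", "logical partition"),
    ("A1", "service processor"),
    ("A7", "licensed internal code (A7xx)"),
    ("B7", "licensed internal code (B7xx)"),
    ("BA", "partition firmware"),
    ("1", "system power control network (SPCN)"),
    ("6", "virtual optical"),
)

HEX_DIGITS = set("0123456789ABCDEF")


def split_src(src):
    """Return (family, reference_code) for an 8-character SRC.

    reference_code is the rightmost 4 characters (yyyy in 1xxxyyyy).
    Raises ValueError if the input is not 8 hexadecimal characters.
    """
    code = src.strip().upper()
    if len(code) != 8 or not set(code) <= HEX_DIGITS:
        raise ValueError("expected 8 hexadecimal characters: %r" % src)
    family = "unknown"
    for prefix, name in SRC_FAMILIES:
        if code.startswith(prefix):
            family = name
            break
    return family, code[-4:]


# 10001234 is a made-up SPCN-style code used only for illustration.
print(split_src("10001234"))
print(split_src("B7001111"))
```

    The reference code returned by the sketch is the value to look up in the corresponding SRC table.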
  • Page 41 • Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solves the problem, then you can stop performing the remaining actions. • See Chapter 3, “Parts listing,” on page 189 to determine which components are CRUs and which components are FRUs.
  • Page 42 ...” on page 226.
    Description: The Blade encountered a problem, and the blade...
    Action:
    1. Check the management-module event log for entries that were made around the time that the Escala EL260B blade server shut down.
    2. Resolve any problems that are found.
  • Page 44: 6Xxxyyyy Srcs

    2.4.1.2 6xxxyyyy SRCs The 6xxxyyyy system reference codes are virtual optical reference codes. Look for the rightmost 4 characters (yyyy in 6xxxyyyy) in the error code; this is the reference code. Find the reference code in Table 2-5. Table 2-5. 6xxxyyyy SRCs Follow the suggested actions in the order in which they are listed in the Action column until the problem is •...
  • Page 45: A1Xxyyyy Service Processor Srcs

    • Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solves the problem, then you can stop performing the remaining actions. • See Chapter 3, “Parts listing,” on page 189 to determine which components are CRUs and which components are FRUs.
  • Page 46: A700Yyyy Licensed Internal Code Srcs

    2.4.1.5 A700yyyy Licensed internal code SRCs An A7xx SRC is a licensed internal code SRC that is deprecated in favor of a corresponding B7xx SRC. B7xx SRCs are described in “B700xxxx Licensed internal code SRCs” on page 37. Table 2-8. A700yyyy Licensed internal code SRCs Attention code Description...
  • Page 48: B181Xxxx Service Processor Early Termination Srcs

    • Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solves the problem, then you can stop performing the remaining actions. • See Chapter 3, “Parts listing,” on page 189 to determine which components are CRUs and which...
  • Page 49: B200Xxxx Logical Partition Srcs

    2.4.1.8 B200xxxx Logical partition SRCs A B200xxxx system reference code (SRC) is an error code that is related to logical partitioning. Table 2-11 describes error codes that might be displayed if POST detects a problem. The description also includes suggested actions to correct the problem. Note: For problems persisting after completing the suggested actions, see “Checkout procedure”...
  • Page 57: B700Xxxx Licensed Internal Code Srcs

    Code: 0200
    Description: System firmware has experienced a low storage condition.
    Action: Continue running the system normally. At the earliest convenient time or service window, work with Bull Support to collect a platform dump and restart the system; then, go to “Isolating firmware problems” on page 178.
  • Page 59 Error codes Look for and correct B1xxxxxx errors. If there are no serviceable B1xxxxxx errors, or if correcting the errors does not correct the problem, contact Bull support to reset the server firmware settings. Attention: Resetting the server firmware settings results in the loss of all of the partition data that is stored on the System firmware failure.
  • Page 60
    5301: ...detected a problem with the partition configuration. Action: ...system resources.
    5302: An unsupported Preferred Operating System was detected. The Preferred Operating System specified is not supported. The IPL will not continue. Action: Work with Bull support to select a supported Preferred Operating System; then re-IPL the system.
  • Page 63: Ba000010 To Ba400002 Partition Firmware Srcs

    2.4.1.10 BA000010 to BA400002 Partition firmware SRCs The power-on self-test (POST) might display an error code that the partition firmware detects. Try to correct the problem with the suggested action. Table 2-13 describes error codes that might be displayed if POST detects a problem. The description also includes suggested actions to correct the problem.
  • Page 82 Replace the system-board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly” on page 226.
    BA180100: The FDDI adapter Fcode driver is not supported on this server. Action: Bull may produce a compatible driver in the future, but does not guarantee one.
  • Page 85
    BA278007: ...firmware flash update. Action: Restart the blade server.
    BA278009: The operating system’s server firmware update management tools are incompatible with this system. Action: Go to the Bull site at www.bull.com/support to download the latest version of the service aids package for Linux.
  • Page 89: Post Progress Codes (Checkpoints)

    2.4.2 POST progress codes (checkpoints) When you turn on the blade server, the power-on self-test (POST) performs a series of tests to check the operation of the blade server components. Use the management module to view progress codes that offer information about the stages involved in powering on and performing an initial program load (IPL).
  • Page 90: C1001F00 To C1645300 Checkpoints

    2.4.2.1 C1001F00 to C1645300 Service processor checkpoints The C1xx progress codes, or checkpoints, offer information about the initialization of both the service processor and the server. Service processor checkpoints are typical reference codes that occur during the initial program load (IPL) of the server. Table 15 lists the progress codes that might be displayed during the power-on self-test (POST), along with suggested actions to take if the system hangs on the progress code.
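    A progress code can be matched against the checkpoint ranges named in this guide's section headings with a small Python sketch. The range table is built from those headings; treating C700xxxx as C7000000 through C700FFFF is an assumption, the CA range is listed without a family name because none is given in this summary, and checkpoint_family is a hypothetical helper.

```python
# (low, high, label) ranges taken from the checkpoint section headings.
CHECKPOINT_RANGES = (
    (0xC1001F00, 0xC1645300, "service processor checkpoint"),
    (0xC2001000, 0xC20082FF, "virtual service processor checkpoint"),
    # Assumption: "C700xxxx" covers every value of the last four digits.
    (0xC7000000, 0xC700FFFF, "server firmware IPL status checkpoint"),
    (0xCA000000, 0xCA2799FF, "CA000000 to CA2799FF checkpoint"),
)


def checkpoint_family(code):
    """Return the label of the range containing the code, or None."""
    try:
        value = int(code.strip(), 16)
    except ValueError:
        return None
    for low, high, label in CHECKPOINT_RANGES:
        if low <= value <= high:
            return label
    return None


for sample in ("C1001F00", "C20082FF", "C7001234", "D1513901"):
    print(sample, "->", checkpoint_family(sample))
```

    A result of None means the code falls outside the checkpoint ranges listed here, for example because it is an error SRC instead of a progress code.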
  • Page 91 • If the system hangs on a progress code, follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solves the problem, you can stop performing the remaining actions.
  • Page 97: C2001000 To C20082Ff Checkpoints

    2.4.2.2 C2001000 to C20082FF Virtual service processor checkpoints The C2xx progress codes indicate the progress of a partition IPL that is controlled by the virtual service processor. The virtual service processor progress codes end after the environment setup completes and the specific operating system code continues the IPL. The virtual service processor can start a variety of operating systems.
  • Page 102: C700Xxxx Server Firmware Ipl Status Checkpoints

    • If the system hangs on a progress code, follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solves the problem, you can stop performing the remaining actions.
  • Page 103: Ca000000 To Ca2799Ff Checkpoints

    Table 2-17. C700xxxx Server firmware IPL status checkpoints
    Progress code: C700xxxx
    Description: A problem has occurred with the system...
    Action:
    1. Shut down and restart the blade server from the permanent-side image.
    2. Check for updates to the system firmware.
    3. Update the firmware.
    4. ...Checkout procedure...
  • Page 119: D1001Xxx To D1Xx3Fff Dump Codes

    2.4.2.6 D1001xxx to D1xx3FFF Service processor dump codes D1xx service processor dump status codes indicate the cage or node ID that the dump component is processing, the node from which the hardware data is collected, and a counter that increments each time that the dump processor stores 4K of dump data. Service processor dump status codes use the format, D1yy1xxx, where yy and xxx can be any number or letter.
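Because the counter advances once for every 4K of dump data stored, a displayed code gives a rough idea of how much data has been written. The sketch below assumes that the xxx field of D1yy1xxx is that counter, that it is hexadecimal, and that 4K means 4096 bytes; none of these details is confirmed by the text above, and the function name is hypothetical.

```python
import re
from typing import Optional

# D1yy1xxx: yy is treated as an opaque two-character field,
# xxx is assumed to be a hexadecimal counter of 4K blocks.
_DUMP_STATUS = re.compile(r"^D1([0-9A-Z]{2})1([0-9A-F]{3})$")


def dump_bytes_stored(code: str) -> Optional[int]:
    """Estimate the bytes stored so far, or None if the code is not D1yy1xxx."""
    match = _DUMP_STATUS.match(code.strip().upper())
    if match is None:
        return None
    counter = int(match.group(2), 16)
    return counter * 4096  # assumption: one counter step per 4096 bytes
```

Two readings taken a few seconds apart should show a larger value the second time if the dump is still running.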
  • Page 124: D1Xx3Y01 To D1Xx3Yf2 Checkpoints

    2.4.2.7 D1xx3y01 to D1xx3yF2 Service processor dump codes: These D1xx3yxx service processor dump codes use the format: D1xx3yzz, where xx indicates the cage or node ID that the dump component is processing, y increments from 0 to F to indicate that the system is not hung, and zz indicates the command being processed.
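The D1xx3yzz layout described above can be split into its three fields mechanically. The following sketch does that and compares two samples to see whether the code is still changing. The names `DumpCheckpoint`, `parse_dump_checkpoint`, and `still_progressing` are hypothetical, and treating xx as an opaque two-character field is an assumption.

```python
import re
from typing import NamedTuple, Optional


class DumpCheckpoint(NamedTuple):
    node_id: str   # xx: cage or node ID being processed
    activity: int  # y: cycles from 0 to F while the system is not hung
    command: int   # zz: command being processed


_CODE = re.compile(r"^D1([0-9A-Z]{2})3([0-9A-F])([0-9A-F]{2})$")


def parse_dump_checkpoint(code: str) -> Optional[DumpCheckpoint]:
    """Split a D1xx3yzz code into its fields, or return None if it does not fit."""
    match = _CODE.match(code.strip().upper())
    if match is None:
        return None
    command = int(match.group(3), 16)
    if not 0x01 <= command <= 0xF2:  # range taken from the section title
        return None
    return DumpCheckpoint(match.group(1), int(match.group(2), 16), command)


def still_progressing(earlier: str, later: str) -> bool:
    """True when any field changed between two samples of the display."""
    first, second = parse_dump_checkpoint(earlier), parse_dump_checkpoint(later)
    if first is None or second is None:
        raise ValueError("both samples must be D1xx3yzz codes")
    return first != second
```

If two samples taken some time apart are identical, the y digit has stopped advancing, which is the condition the guide treats as a possible hang.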
  • Page 126: D1Xx900C To D1Xxc003 Checkpoints

    2.4.2.8 D1xx900C to D1xxC003 Service processor power-off checkpoints These D1xx service processor power-off status codes offer information about the status of the service processor during a power-off operation. The accompanying table lists the progress codes that might be displayed during the power-on self-test (POST), along with suggested actions to take if the system hangs on the progress code.
  • Page 127: Service Request Numbers (Srns)

    2.4.3 Service request numbers (SRNs) Service request numbers (SRNs) are error codes that the operating system generates. The codes have three digits, a hyphen, and three or four digits after the hyphen. SRNs can be viewed using the AIX diagnostics or the Linux service aid “diagela” if it is installed.
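When SRNs are copied from a diagnostics screen into a problem report, it helps to split them at the hyphen before looking them up. The sketch below is deliberately more permissive than the description above: SRNs listed elsewhere in this guide, such as 252B-719 and FFC-725, contain letters and can have four characters before the hyphen, so it accepts three or four alphanumeric characters on each side. The function name `split_srn` is hypothetical.

```python
import re
from typing import Optional, Tuple

# Three or four alphanumeric characters, a hyphen, then three or four more.
# A trailing "x" (as in A01-07x) is kept as written because the guide uses it
# as a placeholder for any character.
_SRN = re.compile(r"^([0-9A-Za-z]{3,4})-([0-9A-Za-z]{3,4})$")


def split_srn(text: str) -> Optional[Tuple[str, str]]:
    """Return (prefix, suffix) for an SRN-shaped string, otherwise None."""
    match = _SRN.match(text.strip())
    if match is None:
        return None
    return match.group(1), match.group(2)
```

The prefix is what selects the SRN table to consult; the sketch does not validate that a given SRN actually exists.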
  • Page 128: 101-711 Through Ffc-725 Srns

    2.4.3.2 101-711 through FFC-725 SRNs AIX might generate service request numbers (SRNs) from 101-711 to FFC-725. Replace any parts in the order that the codes are listed in Table 2-22. Note: An x in the following SRNs represents a digit or character that might have any value. Table 2-22.
  • Page 129 Description and Action PCI adapter I/O bus problem. Go to “Performing the checkout procedure” on page 141. 111-78C Perform “Solving undetermined problems” on page 187. System does not perform a soft reset. Go to “Performing the checkout procedure” on page 111-999 Adapter configuration error.
  • Page 130 Description and Action 252B-719 252B Device bus termination power lost or not detected. 1. Check the Blade management-module event log. If an error was recorded by the system, see “POST progress codes (checkpoints)” on page 69. 2. Replace any parts reported by the diagnostic program. 3.
  • Page 131 2. There is unrestricted air flow around the system. 3. All system covers are closed. 4. Verify that all fans in the Bull Blade Chassis are operating correctly. Sensor indicates a FRU has failed. Use the failing function codes, use the physical location 651-159 code(s) from the diagnostic problem report screen to determine the FRUs.
  • Page 132 1. The room ambient temperature is within the system operating environment. 2. There is unrestricted air flow around the system. 651-162 3. There are no fan or blower failures in the Bull Blade Chassis. If the problem remains, check the management-module event log for possible causes of overheating.
  • Page 133 Description and Action 651-632 Internal device error. Go to “Performing the checkout procedure” on page 141. 651-639 Error log analysis indicates an error detected by the I/O. Use the problem determination procedure, failing function codes, and the physical location codes from the diagnostic problem report to determine the FRUs.
  • Page 134 Description and Action 651-722 System bus parity error. Go to “Performing the checkout procedure” on page 141. 651-723 System bus protocol/transfer error. Go to “Performing the checkout procedure” on page 141. 651-724 I/O host bridge time-out error. Go to “Performing the checkout procedure” on page 141. I/O host bridge address/data parity error.
  • Page 135 Description and Action 651-781 2C7 214 Uncorrectable memory error. Go to “Performing the checkout procedure” on page 141. 651-784 302 214 Uncorrectable memory error. Go to “Performing the checkout procedure” on page 141. 651-785 303 214 Uncorrectable memory error. Go to “Performing the checkout procedure” on page 141. 651-786 304 214 Uncorrectable memory error.
  • Page 136 Description and Action Platform-specific error. Call your support center. 651-90x A non-critical error has been detected: uncorrectable memory or unsupported memory. Schedule deferred maintenance. Examine the memory modules and determine if they are 652-600 supported types. If the modules are supported, then replace the appropriate memory modules.
  • Page 137 Description and Action 652-772 2D2 292 A non-critical error has been detected: intermediate or system bus time-out error. Schedule deferred maintenance. Go to “Performing the checkout procedure” on page 141. 652-773 A non-critical error has been detected: intermediate or system bus data parity error. Schedule deferred maintenance.
  • Page 138 Description and Action 887-110 External loopback fairness test failed. Go to “Performing the checkout procedure” on page 141. 887-111 External loopback fairness and parity tests failed. Go to “Performing the checkout procedure” on page 141. 887-112 External loopback (twisted pair) test failed. Go to “Performing the checkout procedure” on page 141.
  • Page 139 Description and Action 254E-604 Error log analysis indicates a permanent adapter failure. Go to “Performing the checkout procedure” on page 141. 254E-605 Error log analysis indicates permanent adapter failure is reported on the other port of this adapter. Go to “Performing the checkout procedure” on page 141. Error log analysis indicates adapter failure.
  • Page 140: Meaning Of The Last Character (X) After The Hyphen

    2.4.3.3 A00-FF0 through A24-xxx SRNs AIX might generate service request numbers (SRNs) from A00-FF0 to A24-xxx. Note: Some SRNs in this sequence might have 4 rather than 3 digits after the dash (–). Table 2-23 shows the meaning of an x in any of the following SRNs, such as A01-00x. Table 2-23.
  • Page 141 Description FRU/action A01-07x System bus parity error. 1. Check the Blade management-module event log; if an error was recorded by the system, see “POST progress codes (checkpoints)” on page 69. 2. If no entry is found, replace the system-board and chassis assembly.
  • Page 142 Description FRU/action A02-13x I/O Host Bridge address/data parity error. 1. Check the Blade management-module event log; if an error was recorded by the system, see “POST progress codes (checkpoints)” on page 69. 2. If no entry is found, replace the system-board and chassis assembly.
  • Page 143 Description FRU/action A05-00x Error log analysis indicates an environmental and power warning, but the failure could not be isolated. 1. Check the Blade management-module event log; if an error was recorded by the system, see “POST progress codes (checkpoints)” on page 69.
  • Page 144 Description FRU/action A05-14x System shutdown due to power fault with an unspecified cause. 1. Check the Blade management-module event log; if an error was recorded by the system, see “POST progress codes (checkpoints)” on page 69. 2. If no entry is found, replace the system-board and chassis assembly.
  • Page 145 Description FRU/action A0D-19x Service Processor error accessing Real Time Clock/Time-of-Day Clock. 1. Check the Blade management-module event log; if an error was recorded by the system, see “POST progress codes (checkpoints)” on page 69. 2. If no entry is found, replace the system-board and chassis assembly.
  • Page 146 Description FRU/action A10-210 The processor has been deconfigured. The system is operating in degraded mode. 1. Check the Blade management-module event log; if an error was recorded by the system, see “POST progress codes (checkpoints)” on page 69. 2. If no entry is found, replace the system-board and chassis assembly.
  • Page 147 Description FRU/action A12-01x A non-critical error has been detected, an uncorrectable memory error. 1. Check the Blade management-module event log; if an error was recorded by the system, see “POST progress codes (checkpoints)” on page 69. 2. If no entry is found, replace the system-board and chassis assembly.
  • Page 148 Description FRU/action A12-13x A non-critical error has been detected, an I/O host bridge address/data parity error. 1. Check the Blade management-module event log; if an error was recorded by the system, see “POST progress codes (checkpoints)” on page 69. 2. If no entry is found, replace the system-board and chassis assembly.
  • Page 149 Description FRU/action A13-09x A non-critical error has been detected, a system bus data parity error. 1. Check the Blade management-module event log; if an error was recorded by the system, see “POST progress codes (checkpoints)” on page 69. 2. If no entry is found, replace the system-board and chassis assembly.
  • Page 150 Description FRU/action A15-07x Sensor indicates a power supply has failed. 1. Check the Blade management-module event log; if an error was recorded by the system, see “POST progress codes (checkpoints)” on page 69. 2. If no entry is found, replace the system-board and chassis assembly.
  • Page 151 Description FRU/action A15-24x Power Fault specifically due to internal battery failure. 1. Check the Blade management-module event log; if an error was recorded by the system, see “POST progress codes (checkpoints)” on page 69. 2. If no entry is found, replace the system-board and chassis assembly.
  • Page 152 Description FRU/action A1D-13x A non-critical error has been detected, a service processor error accessing a thermal sensor. 1. Check the Blade management-module event log; if an error was recorded by the system, see “POST progress codes (checkpoints)” on page 69. 2.
  • Page 153 Description FRU/action A1D-35x A non-critical error has been detected: Mainstore or Cache IPL Diagnostic Error. 1. Check the Blade management-module event log; if an error was recorded by the system, see “POST progress codes (checkpoints)” on page 69. 2. If no entry is found, replace the system-board and chassis assembly.
  • Page 154: Ssss-102 Through Ssss-640 Srns

    2.4.3.4 ssss-102 through ssss-640 SRNs for SCSI devices These service request numbers (SRNs) identify a SCSI device problem. Use Table 2-25 to identify an SRN when you suspect a SCSI device problem. Replace the parts in the order that the failing function codes (FFCs) are listed. Notes: •...
  • Page 155 ssss-112 ssss The diagnostic test failed. 1. Check the Blade management-module event log. If an error was recorded by the system, see “POST progress codes (checkpoints)” on page 69. 2. Replace any parts reported by the diagnostic program. 3. Replace the system board and chassis assembly, as described in “Replacing the Tier 2 system-board and chassis assembly”...
  • Page 156 ssss-128 252B ssss software The error log analysis indicates a hardware failure. 1. Check the Blade management-module event log. If an error was recorded by the system, see “POST progress codes (checkpoints)” on page 69. 2. Replace any parts reported by the diagnostic program. 3.
  • Page 157: Failing Function Codes 151 Through 2D02

    The management-module event log is not reporting any system environmental warnings. 2. If the problem remains, call Bull support. Error log analysis indicates poor signal quality. 1. Check the Blade management-module event log. If an error was recorded by the 199 252B system, see “POST progress codes (checkpoints)”...
  • Page 158 Description and notes System-board and chassis assembly System-board and chassis assembly System-board and chassis assembly Ethernet network problem System-board and chassis assembly System-board and chassis assembly (Host – PCI bridge problem) System-board and chassis assembly (PCI – PCI bridge problem) System-board and chassis assembly (MPIC interrupt controller problem) PCI device or adapter problem.
  • Page 159: Error Logs

    The seventh word is the direct select address, which is 77777777 in the example. Table 2-27. Nine-word system reference code in the management-module event log. Columns: Index, Source, Date/Time, Text. Example entry: Source: Blade_05. Date/Time: 01/21/2008, 17:15:14. Text: (ESCALA EL260B -BC1BLD5E) SYS F/W: Error. Replace UNKNOWN (5008FECF B7001111 22222222 33333333 44444444 55555555 66666666 77777777 88888888 99999999)
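The direct select address can be pulled out of an event-log entry programmatically. The sketch below assumes the layout of the example entry: the last parenthesized group holds 8-digit hexadecimal values, and the final nine of them are the SRC words, so that the value in front of them (5008FECF in the example) is not part of the SRC. That reading is an inference from the example, and the function names are hypothetical.

```python
import re
from typing import List, Optional

# A parenthesized run of two or more 8-digit hexadecimal values.
_HEX_GROUP = re.compile(r"\(([0-9A-Fa-f]{8}(?:\s+[0-9A-Fa-f]{8})+)\)")


def src_words(event_text: str) -> Optional[List[str]]:
    """Return the nine SRC words from an event-log text, or None if absent."""
    groups = _HEX_GROUP.findall(event_text)
    if not groups:
        return None
    tokens = groups[-1].split()
    if len(tokens) < 9:
        return None
    return tokens[-9:]  # assumption: the SRC is the last nine values


def direct_select_address(event_text: str) -> Optional[str]:
    """Return word 7 of the SRC, which the guide calls the direct select address."""
    words = src_words(event_text)
    return words[6] if words else None
```

Applied to the example entry above, `direct_select_address` returns `77777777`, and the first SRC word is `B7001111`.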
  • Page 160: Checkout Procedure

    Depending on your operating system and the utilities you have installed, error messages might also be stored in an operating system log. See the documentation that comes with the operating system for more information. See the Blade Management Module User’s Guide for more information about the event log. Checkout procedure The checkout procedure is the sequence of tasks that you should follow to diagnose a problem in the blade server.
  • Page 161: Performing The Checkout Procedure

    2.6.2 Performing the checkout procedure Follow this procedure to perform the checkout. Step 001 Perform the following steps: Update the firmware to the current level, as described in “Updating the firmware” on page 229. You might also have to update the management module firmware. If you did not update the firmware for some reason, power off the blade server for 45 seconds before powering it back on.
  • Page 162 Step 004 Is the operating system Linux? Yes Record any information or messages that may be in the management module event log; then go to Step 007. If you cannot load the stand-alone Diagnostics CD, answer this question No. No Go to “Solving undetermined problems” on page 187. Step 005 Perform the following steps: Note:...
  • Page 163 This ends the AIX procedure. Step 007 Perform the following steps: Use the management-module Web interface to make sure that the device from which you load the stand-alone diagnostics is set as the first device in the blade server boot sequence.
  • Page 164: Verifying The Partition Configuration

    Verifying the partition configuration Perform this procedure if there is a configuration problem with the system or a logical partition. Check the processor and memory allocations of the system or the partition. Processor or memory resources that fail during system startup could cause the startup problem in the partition.
  • Page 165 Press the CD button on the front of the blade server to give it ownership of the Blade media tray. Using the management module Web interface, make sure that: − The blade server firmware is at the latest version. − SOL (Serial over LAN) is enabled for the blade server.
  • Page 166: Starting Stand-Alone Diagnostics From A Nim Server

    2.8.3 Starting stand-alone diagnostics from a NIM server Perform this procedure to start the stand-alone diagnostics from a network installation management (NIM) server. Note: Refer to the Network Installation Management Guide and Reference for information about configuring the blade server as a NIM server client. Verify with the system administrator and systems users that the blade server can be shut down.
  • Page 167: Using The Diagnostics Program

    The Function Selection screen will display. See “Using the diagnostics program” on page 147 for more information about running the diagnostics program. Note: If the Define Terminal screen is displayed, type the terminal type and press Enter. The use of “vs100” as the terminal type is recommended; however, the function keys (F#) may not work.
  • Page 168: Boot Problem Resolution

    Make sure that your boot list is correct. From the Blade management-module Web interface, display the boot sequences for the blade servers in your Bull Blade Chassis: Blade Tasks → Configuration → Boot Sequence. Find your blade server on the list that is displayed and make sure that the device from which you are attempting to boot is the first device in the boot sequence.
  • Page 169 Turn the blade server power off; then, turn it on and retry the boot operation. If the boot fails, try a known-good bootable CD. If possible, try to boot another blade server in the Bull Blade Chassis to verify that the CD or DVD drive is functional.
  • Page 170: Troubleshooting Tables

    Use the troubleshooting tables to find solutions to problems that have identifiable symptoms. If these symptoms relate to shared Bull Blade Chassis resources, see “Solving shared Blade resource problems” on page 181. If you cannot find the problem in these tables, see “Running the diagnostics program”...
  • Page 171: Diskette Drive Problems

    2.10.2 Diskette drive problems Identify diskette drive problem symptoms and what corrective actions to take. • Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. • See Chapter 3, “Parts listing,” on page 189 to determine which components are CRUs and which components are FRUs.
  • Page 172: Hard Disk Drive Problems

    2.10.4 Hard disk drive problems Identify hard disk problem symptoms and what corrective actions to take. • Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. • See Chapter 3, “Parts listing,” on page 189 to determine which components are CRUs and which components are FRUs.
  • Page 173: Keyboard Problems

    management module reports a general monitor failure. Disconnect the Bull Blade Chassis from all electrical sources, wait for 30 seconds, reconnect the Bull Blade Chassis to the electrical sources, and restart the blade server. If the problem remains, see “Solving undetermined problems” on page 187. Also view the Hardware Maintenance Manual and Troubleshooting Guide or Problem Determination and Service Guide for your Bull Blade Chassis.
  • Page 174: Memory Problems

    2.10.8 Memory problems Identify memory problem symptoms and what corrective actions to take. • Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. • See Chapter 3, “Parts listing,” on page 189 to determine which components are CRUs and which components are FRUs.
  • Page 175: Monitor Or Video Problems

    Manual and Troubleshooting Guide or Problem Determination and Service Guide for your Bull Blade Chassis. Only the cursor appears. Make sure that the keyboard/video ownership on the Bull Blade Chassis has not been switched to another blade server. If the problem remains, see “Solving undetermined problems” on page 187.
  • Page 176: Network Connection Problems

    Blade bays and are configured and operating correctly. See the network. Hardware Maintenance Manual and Troubleshooting Guide or Problem Determination and Service Guide for your Bull Blade Chassis for details. The settings in the I/O module are appropriate for the blade server (settings −...
  • Page 177: Pci Expansion Card (Piocard) Problem Isolation Procedure

    Nine-word system reference code in the management-module event log. Columns: Index, Source, Date/Time, Text. Example entry: Source: Blade_05. Date/Time: 01/21/2008, 17:15:14. Text: (ESCALA EL260B -BC1BLD5E) SYS F/W: Error. Replace UNKNOWN (5008FECF B7001111 22222222 33333333 44444444 55555555 66666666 77777777 88888888 99999999). Depending on your operating system and the utilities you have installed, error messages might also be stored in an operating system log.
  • Page 178: Optional Device Problems

    An optional device that was just installed does not work. 1. Make sure that: − The option is designed for the blade server. Contact your Bull support representative. − You followed the installation instructions that came with the option. − The option is installed correctly.
  • Page 179 The blade server does not turn on. a. The power LED on the front of the Bull Blade Chassis is on. b. The LEDs on all the Blade power modules are on. c. The blade server is in a blade bay that is supported by the power modules installed in the Bull Blade Chassis.
  • Page 180: Power Hypervisor (Phyp) Problems

    3. Check the bus and I/O adapter allocations for the partition. Verify that the partition has load source and console I/O resources. 4. Check the IPL mode of the system or failing partition. 5. For further assistance, contact Bull Support. Escala Blade EL260B - Problem Determination and Service Guide...
  • Page 181 ” on page 205. See “Supported DIMMs” on page 5 for more information. NEXTLVL Symbolic Contact Bull Support. 1. Collect the error log information. PIOCARD The hardware that controls PCI adapters Symbolic CRU 2. Get the DSA, which is word 7 of the associated B700xxxx SRC.
  • Page 182: Service Processor Problems

    2.10.16 Service processor problems The baseboard management controller (BMC) is a flexible service processor that provides error diagnostics with associated error codes, and fault isolation procedures for troubleshooting. When the advanced POWER6 service processor error analysis determines a specific fault, the service processor logs an error code to identify the failing component.
  • Page 183 Escala EL260B blade server shut down. indicates that the Blade encountered a 2. Resolve any problems. problem, and the 3. Remove the blade from the Bull Blade Chassis and then reinsert the blade blade server was server. automatically shut down as a result.
  • Page 184 2. If the problem persists, replace each memory DIMM, by following the action for symbolic FRU MEMDIMM. 3. Install the blade server into the Bull Blade Chassis after each DIMM replacement and restart the blade to verify if the problem is solved.
  • Page 185 “ assembly ” on page 226. FSPSP06 The service processor reported a suspected intermittent problem. Contact Bull Support. FSPSP07 The time of day has 1. Use the chdate command to set the VIOS date and time, using one of the...
  • Page 186 • Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. • See Chapter 3, “Parts listing,” on page 189 to determine which components are CRUs and which components are FRUs.
  • Page 187 Collect the dump for support and power off and power on the blade server. 5. If an A1xx SRC has not remained more than 40 minutes, call Bull Support. FSPSP16 Save any error log Contact Bull Support.
  • Page 188 Array bit server ” on page 8. steering may be able 2. Remove the blade server from the Bull Blade Chassis and reinsert the to correct this problem blade server into the Bull Blade Chassis. without replacing Blade server control 3.
  • Page 189 Reason code A45F • 1. Set the enclosure feature code using SMS, which automatically resets the service processor. 2. If the problem persists, call Bull Support. If you do not see your reason code listed, call Bull Support. Chapter 2. Diagnostics...
  • Page 190 Action Procedure Code FSPSP34 The memory cards are Install a DIMM for each of the dual processors on the ESCALA EL260B plugged in an invalid blade server. Install the first pair in DIMM connectors 2 and 4. configuration and Look for the following error codes in order. Follow the procedure for the first cannot be used by the code you find.
  • Page 191 226. FSPSP48 A diagnostics function If the CRUs called out before this procedure do not fix the problem, Contact detects an external Bull Support. processor interface problem. FSPSP49 A diagnostic function If the CRUs called out before this procedure do not fix the problem, Contact detects an internal Bull Support.
  • Page 193 2. Resolve any problems. reporting that 12V dc 3. Remove the blade from the Bull Blade Chassis and then reinsert the blade is not present on the server. Blade midplane. 4. Power on the blade server.
  • Page 194: Software Problems

    2.10.17 Software problems Use this information to recognize software problem symptoms and to take corrective actions. • Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. • See Chapter 3, “Parts listing,” on page 189 to determine which components are CRUs and which components are FRUs.
  • Page 195: Universal Serial Bus (Usb) Port Problems

    2.10.18 Universal Serial Bus (USB) port problems This topic describes USB port problem symptoms and corrective actions. • Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. • See Chapter 3, “Parts listing,”...
  • Page 196: Figure 2-1. Light Path Diagnostic Leds

    Press and hold the light path diagnostics switch to relight the LEDs that were lit before you removed the blade server from the Bull Blade Chassis. The LEDs will remain lit for as long as you press the switch, to a maximum of 25 seconds.
  • Page 197: Light Path Diagnostics Leds

    Management card error P1-C9 MGMT CRD A system board error occurred. 1. Replace the blade server cover, reinsert the blade server in the Bull Blade Chassis, and then restart the blade server. 2. Check the management-module event log for information about the error.
  • Page 198: Isolating Firmware Problems

    System board error A system board and chassis 1. Replace the blade server cover, reinsert the blade server assembly error has occurred. A in the Bull Blade Chassis, and then restart the blade P1 SYS BRD microprocessor failure shows up as server.
  • Page 199: Recovering The System Firmware

    If your system hangs, access the management module and select Blade Tasks → Configuration → Boot Mode to show the Escala EL260B blade server in the list of blade servers in the Bull Blade Chassis. Click the appropriate blade server and select Permanent to force the system to start from the PERM image.
  • Page 200: Recovering The Temp Image From The Perm Image

    2.13.3 Recovering the TEMP image from the PERM image To recover the TEMP image from the PERM image, you must perform the reject function. The reject function copies the PERM image into the TEMP image. To perform the reject function, complete the following procedure. If you have not started the system from the PERM image, do so now.
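The relationship between the two firmware images can be pictured with a small model. This is only an illustration of the rule stated above, that reject copies the PERM image into the TEMP image and is performed after starting from PERM. It is not the interface of any real firmware tool, and the class, method, and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FirmwareImages:
    temp: str                    # level stored in the TEMP image
    perm: str                    # level stored in the PERM image
    started_from: str = "TEMP"   # image the system was last started from

    def start_from(self, side: str) -> str:
        """Record which image the system starts from and return its level."""
        if side not in ("TEMP", "PERM"):
            raise ValueError("side must be 'TEMP' or 'PERM'")
        self.started_from = side
        return self.temp if side == "TEMP" else self.perm

    def reject(self) -> None:
        """Copy the PERM image into the TEMP image."""
        if self.started_from != "PERM":
            # The procedure asks you to start from the PERM image first.
            raise RuntimeError("start the system from the PERM image first")
        self.temp = self.perm
```

In this model, a blade with a damaged TEMP level becomes usable again after `start_from("PERM")` followed by `reject()`, at which point both images hold the PERM level.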
  • Page 201: Committing The Temp System Firmware Image

    To check the general function of shared Blade resources, complete the following operations. Verify that the Bull Blade Chassis has the required power modules installed and is connected to a working power source. Verify that power management is set correctly for your Bull Blade Chassis configuration.
  • Page 202: Solving Shared Keyboard Problems

    Solving shared keyboard problems Problems with Blade shared resources might appear to be in the blade server, but might actually be a problem in a Bull Blade Chassis keyboard component. To check the general function of shared Blade keyboard resources, complete the following steps: Verify that the keyboard/video select button LED on the front of the blade server is lit.
  • Page 203: Solving Shared Media Tray Problems

    Problems with Blade shared resources might appear to be in the blade server, but might actually be a problem in a Bull Blade Chassis media tray component. To check the general function of shared Blade media tray resources, perform the following procedure.
  • Page 204 See the Problem Determination and Service Guide or the Hardware Maintenance Manual and Troubleshooting Guide for your Bull Blade Chassis. Some Bull Blade Chassis types have several management-module components that you might test or replace. See the Installation Guide for your management module for more information.
  • Page 205: Solving Shared Network Connection Problems

    Verify that the network cables are securely connected to the I/O module. Verify that the power configuration of the Bull Blade Chassis supports the I/O module configuration. Verify that the installation of the I/O-module type is supported by the Bull Blade Chassis and blade server hardware.
  • Page 206: Solving Shared Power Problems

    Verify that the LEDs on all the Blade power modules are lit. Verify that power is being supplied to the Bull Blade Chassis. Verify that the installation of the blade server type is supported by the Bull Blade Chassis. Verify that the power configuration of the Bull Blade Chassis supports the blade bay where your blade server is installed.
  • Page 207: Solving Undetermined Problems

    Bull Blade Chassis. • If all of the blade servers have the same symptom, it is probably a Bull Blade Chassis problem; for more information, see the Hardware Maintenance Manual and Troubleshooting Guide or Problem Determination and Service Guide for your Bull Blade Chassis.
  • Page 208 Module User’s Guide for more information. Turn off the blade server. Remove the blade server from the Bull Blade Chassis and remove the cover. Remove or disconnect the following devices, one at a time, until you find the failure. Reinstall, turn on, and reconfigure the blade server each time.
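The one-at-a-time isolation procedure can be sketched as a loop. This is a rough illustration under stated assumptions: the device names are hypothetical placeholders (the manual's own device list is not reproduced here), and `still_fails` stands in for the manual step of reinstalling, turning on, and checking the blade server.

```python
def isolate_failure(devices, still_fails):
    """Remove devices one at a time; return the first device whose removal
    makes the symptom disappear, or None if the symptom never clears."""
    installed = list(devices)
    for device in devices:
        installed.remove(device)        # remove or disconnect one device
        if not still_fails(installed):  # reinstall the blade server and retest
            return device
    return None


# Hypothetical scenario: the symptom is present whenever the expansion card is installed.
devices = ["hard disk drive", "expansion card", "memory module"]
suspect = isolate_failure(devices, lambda installed: "expansion card" in installed)
print(suspect)
```

If the function returns None, every listed device was removed and the symptom persisted, which points away from those devices.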
  • Page 209: Chapter 3. Parts Listing

    • Tier 1 customer replaceable unit (CRU): Replacement of Tier 1 CRUs is your responsibility. If Bull installs a Tier 1 CRU at your request, you will be charged for the installation. • Tier 2 customer replaceable unit: You may install a Tier 2 CRU yourself or request Bull...
  • Page 210: Parts Table

    42D8060 iSCSI TOE SFF expansion card 32R1926 QLogic GbE/4Gb Fibre Channel CFFh expansion card 39Y9304 Ethernet expansion card (CFFv) for Bull Blade (option) 39Y9308 QLogic 4Gb Fibre Channel CFFv expansion card 41Y8526 Emulex 4Gb Fibre Channel CFFv expansion card 43W6862...
  • Page 211: Chapter 4. Removing And Replacing Blade Server Components

    Replaceable components are of three types: • Tier 1 customer replaceable unit (CRU): Replacement of Tier 1 CRUs is your responsibility. If Bull installs a Tier 1 CRU at your request, you will be charged for the installation. • Tier 2 customer replaceable unit: You may install a Tier 2 CRU yourself or request Bull to install it, at no additional charge, under the type of warranty service that is designated for your blade server.
  • Page 212: System Reliability Guidelines

    Verify that you are maintaining proper system cooling in the unit. • Do not operate the Bull Blade Chassis without a blade server, expansion unit, or filler blade installed in each blade bay. See the documentation for your Bull Blade Chassis for additional information.
  • Page 213: Returning A Device Or Component

    • While the device is still in its static-protective package, touch it for at least 2 seconds to an unpainted metal part of the Bull Blade Chassis or to any unpainted metal surface on any other grounded component in the rack where you are installing the device. This drains static electricity from the package and from your body.
  • Page 214 To remove the blade server, complete the following steps: Read “Safety” on page vii and the “Installation guidelines” on page 191. If the blade server is operating, shut down the operating system. Press the power-control button (behind the control-panel door) to turn off the blade server.
  • Page 215: Installing The Blade Server In A Bull Blade Chassis

    Installing the blade server in a Bull Blade Chassis Install the blade server in a Bull Blade Chassis to use the blade server. Figure 4-2. Installing the blade server in a Bull Blade Chassis Statement 21 CAUTION: Hazardous energy is present when the blade server is connected to the power source.
  • Page 216 Bull Blade Chassis bezel. Important: Do not place the label on the blade server or in any way block the ventilation holes on the blade server. See the documentation that comes with your Bull Blade Chassis for information about label placement.
  • Page 217: Removing And Replacing Tier 1 Crus

    Removing and replacing Tier 1 CRUs Replacement of Tier 1 customer-replaceable units (CRUs) is your responsibility. If Bull installs a Tier 1 CRU at your request, you will be charged for the installation. The illustrations in this documentation might differ slightly from your hardware.
  • Page 218: Installing And Closing The Blade Server Cover

    Installing and closing the blade server cover Install and close the cover of the blade server before you insert the blade server into the Bull Blade Chassis. Do not attempt to override this important protection. Figure 4-4. Installing the cover...
  • Page 219: Removing The Bezel Assembly

    Install the blade server into the Bull Blade Chassis. See "Installing the blade server in a Bull Blade Chassis", on page 195. 4.4.3 Removing the bezel assembly Remove the bezel assembly. Figure 4-5. Removing the bezel assembly Read “Safety” on page vii and the “Installation guidelines” on page 191.
  • Page 220: Installing The Bezel Assembly

    Hazardous energy is present when the blade server is connected to the power source. Always replace the blade server cover before installing the blade server. Install the blade server into the Bull Blade Chassis. See “Installing the blade server in a Bull Blade Chassis” on page 195.
  • Page 221: Removing A Sas Hard Disk Drive

    Read “Safety” on page vii and the “Installation guidelines” on page 191. Shut down the operating system, turn off the blade server, and remove the blade server from the Bull Blade Chassis. See “Removing the blade server from a Bull Blade Chassis” on page 193.
  • Page 222: Installing A Sas Hard Disk Drive

    SAS hard disk drive. These two SAS hard disk drives can be used to implement and manage a redundant array of independent disks (RAID) level-1 array. See "Configuring a SAS RAID array" in the Escala EL260B Installation and User's Guide for information about SAS RAID configuration.
  • Page 223 Read “Safety” on page vii and the “Installation guidelines” on page 191. Shut down the operating system, turn off the blade server, and remove the blade server from the Bull Blade Chassis. See “Removing the blade server from a Bull Blade Chassis” on page 193.
  • Page 224: Removing A Memory Module

    Hazardous energy is present when the blade server is connected to the power source. Always replace the blade server cover before installing the blade server. 8. Install the blade server into the Bull Blade Chassis. See "Installing the blade server in a Bull Blade Chassis", on page 195.
  • Page 225: Installing A Memory Module

    Read the documentation that comes with the DIMMs. Shut down the operating system, turn off the blade server, and remove the blade server from the Bull Blade Chassis. See “Removing the blade server from a Bull Blade Chassis” on page 193.
  • Page 226 Hazardous energy is present when the blade server is connected to the power source. Always replace the blade server cover before installing the blade server. 12. Install the blade server into the Bull Blade Chassis. See “Installing the blade server in a Bull Blade Chassis” on page 195.
  • Page 227: Removing The Management Card

    Read “Safety” on page vii and the “Installation guidelines” on page 191. Shut down the operating system, turn off the blade server, and remove the blade server from the Bull Blade Chassis. See “Removing the blade server from a Bull Blade Chassis” on page 193.
  • Page 228: Installing The Management Card

    10 for the location. Touch the static-protective package that contains the management card to any unpainted metal surface on the Bull Blade Chassis or any unpainted metal surface on any other grounded rack component; then, remove the management card from its package.
  • Page 229: Entering Vital Product Data

    The management card contains the vital product data (VPD) for the service processor. Bull sets the correct VPD values for a new Escala EL260B blade server. If you order a replacement management card, the replacement part is not configured. If you install the...
  • Page 230: Escala El260B Vital Product Data

    Refer to the VPD information that you recorded when you installed the Escala EL260B blade server, as described in the introduction of the Installation and User’s Guide. To determine the values for your Escala EL260B blade server, use the management module and the lsvpd command.
  • Page 231: Removing And Installing An I/O Expansion Card

    After you enter the VPD values, the blade server powers down the first partition and reboots the service processor. Start the Escala EL260B blade server to continue using the blade server with the new management card. 4.4.12 Removing and installing an I/O expansion card Add an I/O expansion card to the blade server to provide additional connections for communicating on a network.
  • Page 232: Figure 4-13. Removing A Small Form Factor (Sff) Expansion Card

    Read “Safety” on page vii and the “Installation guidelines” on page 191. Shut down the operating system, turn off the blade server, and remove the blade server from the Bull Blade Chassis. See “Removing the blade server from a Bull Blade Chassis” on page 193.
  • Page 233: Figure 4-14. Installing A Small-Form-Factor Expansion Card

    Read “Safety” on page vii and the “Installation guidelines” on page 191. Shut down the operating system, turn off the blade server, and remove the blade server from the Bull Blade Chassis. See “Removing the blade server from a Bull Blade Chassis” on page 193.
  • Page 234: Figure 4-15. Removing A Standard-Form-Factor Expansion Card

    Install the blade server into the Bull Blade Chassis. See "Installing the blade server in a Bull Blade Chassis", on page 195. Use the documentation that comes with the expansion card to install device drivers and to perform any configuration that the expansion card requires.
  • Page 235: Figure 4-16. Installing A Standard-Form-Factor Expansion Card

    Read “Safety” on page vii and the “Installation guidelines” on page 191. Shut down the operating system, turn off the blade server, and remove the blade server from the Bull Blade Chassis. See “Removing the blade server from a Bull Blade Chassis” on page 193.
  • Page 236: Figure 4-17. Removing A Combination-Form-Factor Expansion Card

    Hazardous energy is present when the blade server is connected to the power source. Always replace the blade server cover before installing the blade server. 10. Install the blade server into the Bull Blade Chassis. See “Installing the blade server in a Bull Blade Chassis” on page 195.
  • Page 237: Figure 4-18. Installing A Combination-Form-Factor Expansion Card

    Read “Safety” on page vii and the “Installation guidelines” on page 191. Shut down the operating system, turn off the blade server, and remove the blade server from the Bull Blade Chassis. See “Removing the blade server from a Bull Blade Chassis” on page 193.
  • Page 238 Hazardous energy is present when the blade server is connected to the power source. Always replace the blade server cover before installing the blade server. Install the blade server into the Bull Blade Chassis. See “Installing the blade server in a Bull Blade Chassis” on page 195.
  • Page 239: Removing The Battery

    Read “Safety” on page vii and the “Installation guidelines” on page 191. Shut down the operating system, turn off the blade server, and remove the blade server from the Bull Blade Chassis. See “Removing the blade server from a Bull Blade Chassis” on page 193.
  • Page 240: Installing The Battery

    4.4.14 Installing the battery You can install the battery. Figure 4-20. Installing the battery The following notes describe information that you must consider when replacing the battery in the blade server. When replacing the battery, you must replace it with a lithium battery of the same type •...
  • Page 241 Hazardous energy is present when the blade server is connected to the power source. Always replace the blade server cover before installing the blade server. Install the blade server into the Bull Blade Chassis. See “Installing the blade server in a Bull Blade Chassis” on page 195.
  • Page 242: Removing The Hard Disk Drive Tray

    Read “Safety” on page vii and the “Installation guidelines” on page 191. Shut down the operating system, turn off the blade server, and remove the blade server from the Bull Blade Chassis. See “Removing the blade server from a Bull Blade Chassis” on page 193.
  • Page 243: Installing The Hard Disk Drive Tray

    4.4.16 Installing the hard disk drive tray You can install the hard disk drive tray. Figure 4-22. Installing the hard disk drive tray To install the hard disk drive tray, complete the following steps: Place the drive tray into position on the system board and install the four screws to secure it.
  • Page 244: Removing The Expansion Bracket

    Install the blade server into the Bull Blade Chassis. See “Installing the blade server in a Bull Blade Chassis” on page 195. 4.4.17 Removing the expansion bracket You can remove the expansion bracket. Figure 4-23. Removing the expansion bracket To remove the expansion bracket, complete the following steps: Read “Safety”...
  • Page 245: Installing The Expansion Bracket

    Hazardous energy is present when the blade server is connected to the power source. Always replace the blade server cover before installing the blade server. Install the blade server into the Bull Blade Chassis. See “Installing the blade server in a Bull Blade Chassis” on page 195.
  • Page 246: Replacing The Tier 2 System-Board And Chassis Assembly

    Read “Safety” on page vii and the “Installation guidelines” on page 191. Shut down the operating system, turn off the blade server, and remove the blade server from the Bull Blade Chassis. See “Removing the blade server from a Bull Blade Chassis” on page 193.
  • Page 247 Completing the information on the RID tag ensures future entitlement for service. 12. Place the RID tag on the bottom of the blade server chassis. 13. Install the blade server into the Bull Blade Chassis. See “Installing the blade server in a Bull Blade Chassis” on page 195.
  • Page 248 “ ” on page 209 Refer to the information that you recorded when you installed the Escala EL260B blade server, as described in the introduction of the Installation and User’s Guide. 16. Reset the system date and time through the operating system that you installed.
  • Page 249: Chapter 5. Configuring

    To avoid problems and to maintain proper system performance, always verify that the blade server BIOS, service processor, and diagnostic firmware levels are consistent for all blade servers within the Bull Blade Chassis. See “Verifying the system firmware levels” on page 180 for more information.
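A consistency check of this kind can be sketched in a few lines. This is a minimal illustration, assuming you have already collected one firmware level string per blade bay by hand; the bay names and level strings below are hypothetical, and the "most common level" is only a heuristic for spotting outliers, not a statement of which level is correct.

```python
from collections import Counter


def inconsistent_blades(levels):
    """Return the blades whose firmware level differs from the most common one.

    levels: dict mapping a blade name to its firmware level string.
    """
    if not levels:
        return {}
    most_common_level, _ = Counter(levels.values()).most_common(1)[0]
    return {blade: level for blade, level in levels.items()
            if level != most_common_level}


# Hypothetical inventory gathered from three blade bays.
inventory = {
    "bay1": "01EA3aa_bbb_ccc",
    "bay2": "01EA3aa_bbb_ccc",
    "bay3": "01EA3dd_eee_fff",
}
outliers = inconsistent_blades(inventory)
print(outliers)
```

An empty result means every recorded level matches; any entry returned is a blade to review against the firmware level you intend to run.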
  • Page 250: Configuring The Blade Server

    Type ls /tmp/fwupdate to identify the name of the firmware. The command lists any firmware updates that you downloaded to the directory, for example: 01EA3xx_yyy_zzz Install the firmware update with one of the following methods: Install the firmware with the in-band diagnostics of your AIX system, as described −...
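The directory listing step can also be scripted. The sketch below is an assumption-laden illustration: the regular expression is modelled only on the 01EA3xx_yyy_zzz example, the `.img` extension and file names are hypothetical, and the demo uses a temporary directory rather than /tmp/fwupdate.

```python
import re
import tempfile
from pathlib import Path

# Assumed name shape, modelled on the 01EA3xx_yyy_zzz example in the text.
FIRMWARE_NAME = re.compile(r"^01EA3\w{2}_\w{3}_\w{3}")


def list_firmware_updates(directory):
    """Return the sorted names of files in directory that look like firmware updates."""
    return sorted(
        entry.name
        for entry in Path(directory).iterdir()
        if entry.is_file() and FIRMWARE_NAME.match(entry.name)
    )


# Demo with hypothetical file names in a throwaway directory.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("01EA3aa_bbb_ccc.img", "notes.txt"):
        Path(demo_dir, name).touch()
    found = list_firmware_updates(demo_dir)
print(found)
```

On a real system you would pass the download directory, for example `list_firmware_updates("/tmp/fwupdate")`.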
  • Page 251: Using The Sms Utility

    Using the SMS utility Use the System Management Services (SMS) utility to perform a variety of configuration tasks on the Escala EL260B blade server. 5.3.1 Starting the SMS utility Start the SMS utility to configure the blade server.
  • Page 252: Creating A Ce Login

    Ethernet local area network (LAN). The routing from an Ethernet controller to an I/O-module bay varies according to the blade server type, the Bull Blade Chassis, and the operating system that is installed. For example, each Ethernet controller on the...
  • Page 253: Blade Server Ethernet Controller Enumeration

    The Ethernet controllers in your blade server support failover, which provides automatic redundancy for the Ethernet controllers. Failover capabilities vary per Bull Blade Chassis. Without failover, only one Ethernet controller can be connected from each server to each virtual LAN or subnet. With failover, you can configure more than one Ethernet controller from each server to attach to the same virtual LAN or subnet.
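The failover rule can be expressed as a simple check on a planned configuration. This is a sketch under stated assumptions: the controller and subnet names are hypothetical, and the function only models the rule for a single server (at most one controller per virtual LAN or subnet unless failover is available).

```python
from collections import defaultdict


def subnets_needing_failover(assignments, failover_supported):
    """Return the subnets that have more than one controller attached
    when failover is not available; an empty list means the plan is valid.

    assignments: list of (controller, subnet) pairs for one server.
    """
    controllers_per_subnet = defaultdict(list)
    for controller, subnet in assignments:
        controllers_per_subnet[subnet].append(controller)
    if failover_supported:
        return []
    return sorted(subnet for subnet, controllers in controllers_per_subnet.items()
                  if len(controllers) > 1)


# Hypothetical plan: two controllers on the same virtual LAN.
plan = [("ent0", "vlan10"), ("ent1", "vlan10")]
without_failover = subnets_needing_failover(plan, failover_supported=False)
with_failover = subnets_needing_failover(plan, failover_supported=True)
print(without_failover, with_failover)
```

Whether failover is actually available depends on the Bull Blade Chassis, so the `failover_supported` flag has to come from the chassis documentation.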
  • Page 254: Mac Addresses For Host Ethernet Adapters

    I/O-module bays 3 and 4, if these bays are supported by your Bull Blade Chassis. You can verify which controller on the card is routed to which I/O-module bay by performing the same test and using a controller on the expansion card and a compatible switch module or pass-thru module in I/O-module bay 3 or 4.
  • Page 255 Logical HEA port | MAC +17, same as last MAC address on the label | 00:1A:64:44:0e:c8 to 00:1A:64:44:0e:d5. For more information about planning, deploying, and managing the use of host Ethernet adapters, see the Concepts for virtual networking section of the VIOS chapter in the System p Advanced POWER Virtualization Operations Guide.
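The "MAC + offset" notation in the table is plain integer arithmetic on the 48-bit address. The sketch below illustrates it, with two assumptions: the example value ending in 0ed5 is read as the octets 0e:d5, and that address is taken to be the MAC +17 entry (the last MAC address on the label). The helper names are hypothetical.

```python
def mac_to_int(mac):
    """Convert a colon-separated MAC address to an integer."""
    return int(mac.replace(":", ""), 16)


def int_to_mac(value):
    """Convert an integer back to a lowercase colon-separated MAC address."""
    raw = f"{value:012x}"
    return ":".join(raw[i:i + 2] for i in range(0, 12, 2))


def mac_offset(mac, offset):
    """Return the MAC address that lies offset addresses away from mac."""
    return int_to_mac(mac_to_int(mac) + offset)


last_mac = "00:1A:64:44:0e:d5"          # assumed to be MAC +17 (last MAC on the label)
first_mac = mac_offset(last_mac, -17)   # derive the first MAC on the label
range_start_offset = mac_to_int("00:1A:64:44:0e:c8") - mac_to_int(first_mac)
print(first_mac, range_start_offset)
```

Under these assumptions the first address on the label works out to 00:1a:64:44:0e:c4, and the start of the example range sits at offset 4 from it.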
  • Page 257: Appendix A. Getting Help And Technical Assistance

    Bull provides a wide variety of sources to assist you. This appendix indicates where to go for additional information about Bull and Bull products, what to do if you experience a problem with your Bull Blade system, and who to call for service if necessary.
  • Page 259: Appendix B. Notices

    These products are offered and warranted solely by third parties. Bull makes no representations or warranties with respect to non-Bull products. Support (if any) for the non-Bull products is provided by the third party, not Bull.
  • Page 260: Product Recycling And Disposal

    WEEE. Customer participation is important to minimize any potential effects of EEE on the environment and human health due to the potential presence of hazardous substances in EEE. For proper collection and treatment, contact your local Bull representative.
  • Page 261: Electronic Emission Notices

    Properly shielded and grounded cables and connectors must be used in order to meet FCC emission limits. Bull is not responsible for any radio or television interference caused by using other than recommended cables and connectors or by unauthorized changes or modifications to this equipment.
  • Page 262: European Union Emc Directive Conformance Statement

    This product is in conformity with the protection requirements of EU Council Directive 89/336/EEC on the approximation of the laws of the Member States relating to electromagnetic compatibility. Bull cannot accept responsibility for any failure to satisfy the protection requirements resulting from a non-recommended modification of the product, including the fitting of non-Bull option cards.
