Sun Fire X4640 Server Diagnostics Guide
Part No: 821–0472
December 2010, Rev A

Summary of Contents for Oracle SUN Fire X4640

  • Page 1 Sun Fire X4640 Server Diagnostics Guide Part No: 821–0472 December 2010, Rev A...
  • Page 2 from third parties. Oracle Corporation and its affiliates disclaim all liability and any express warranty regarding third-party content, products, or services. In no event shall Oracle Corporation and its affiliates be liable for any losses, costs, or damages incurred due to access to...
  • Page 3: Table Of Contents

    Contents
    Using This Documentation, 5
      Product Downloads, 5
      About This Documentation (PDF and HTML), 6
      We Welcome Your Comments, 6
      Change History, 6
    Overview of the Diagnostics Guide, 7
    Introduction to System Diagnostics, 9
      Troubleshooting Options, 9
      Diagnostic Tools, 10
    Troubleshooting the Server...
  • Page 4 Resetting the SP, 41; How to Reset the ILOM SP Using the Web Interface, 41; How to Reset the ILOM SP Using the Command-Line Interface, 42; Index, 43
  • Page 5: Using This Documentation

    In the Patch Search box, click Product or Family (Advanced Search). In the Product field, type a full or partial product name (for example, Sun Fire X4640) until a list of matches is displayed, then select the product of interest.
  • Page 6: About This Documentation (PDF and HTML)

    In the Patches Search box, click Search. A list of product downloads (listed as patches) appears. Select the Patch name of interest, for example, 12980209, for the Sun Fire X4640 1.3.1 Firmware. In the right-side pane that appears, click Download.
  • Page 7: Overview Of The Diagnostics Guide

    Overview of the Diagnostics Guide
    The following topics are covered in this document.
    Description | Link
    Learn about troubleshooting procedures and diagnostics tools available for the server. | “Introduction to System Diagnostics” on page 9
    Troubleshoot system problems. | “Troubleshooting the Server” on page 11
    Troubleshoot DIMM problems.
  • Page 9: Introduction To System Diagnostics

    Introduction to System Diagnostics
    This section contains an introduction to Oracle's Sun Fire X4640 server diagnostics and covers the following topics:
    ■ “Troubleshooting Options” on page 9
    ■ “Diagnostic Tools” on page 10
    Troubleshooting Options
    The following table lists the suggested order of troubleshooting procedures when you have an issue with the server.
  • Page 10: Diagnostic Tools

    Diagnostic Tools
    The following diagnostic tools are available for the Sun Fire X4640 server.
    BIOS/POST
    BIOS code runs from the moment the host subsystem is powered on and begins executing code. The sequence that the BIOS goes through, from the first executed code to the point where the operating system begins booting, is referred to as POST (power-on self-test).
  • Page 11: Troubleshooting The Server

    Troubleshooting the Server
    This section covers the following procedures:
    ■ “How to Gather Service Visit Information” on page 11
    ■ “How to Troubleshoot Power Problems” on page 11
    ■ “How to Inspect the Outside of the Server” on page 12
    ■ “How to Inspect the Inside of the Server”...
  • Page 12: How To Inspect The Outside Of The Server

    Power button for four seconds to force main power off and enter standby power mode. When main power is off, the Power/OK LED on the front panel will begin flashing, indicating that the server is in standby power mode.
  • Page 13 Remove the server cover, as required. For instructions on removing the server cover, refer to the Sun Fire X4640 Server Service Manual. Inspect the internal status indicator LEDs, which can indicate component malfunction.
  • Page 14 If the problem with the server is not evident, you can try viewing the power-on self-test (POST) messages and BIOS event logs during system startup. Refer to the Sun Fire X4640 Server Service Manual for more information on POST and BIOS event logs.
  • Page 15: Troubleshooting DIMM Problems

    DIMM Fault LEDs
    In the Sun Fire X4640 server, each removable CPU module has eight DIMM slots. The DIMM fault LEDs in the DIMM slot ejector levers indicate which DIMM pair has failed. These DIMM fault LEDs can be lit for up to one minute by a capacitor on the CPU module, even after...
  • Page 16 To light the fault LED from the capacitor, push the small button on the CPU module labelled “FAULT REMIND BUTTON.” The DIMM ejector levers contain LEDs that can indicate a faulty DIMM.
  • Page 17: DIMM Population Rules

    ■ DIMM fault LED is on (amber): At least one of the DIMMs in this DIMM pair is faulty and should be replaced.
    DIMM Population Rules
    See the Sun Fire X4640 Server Service Manual for the DIMM population rules.
    ▼ How to Isolate and Correct DIMM ECC Errors
    If the ILOM reports an ECC error or a problem with a DIMM, first complete the steps in the following procedure.
  • Page 18 DIMM. Instead, it might be caused by CPU0 or by the DIMM slot. Continue with the rest of the procedure. Shut down the server again and disconnect the AC power cords.
  • Page 19: Identifying Correctable DIMM Errors (CEs)

    Remove the CPU module that has the DIMM problem, and remove another CPU module that does not indicate a DIMM problem. Refer to the Sun Fire X4640 Server Service Manual. Remove both DIMMs of the pair and install them into paired slots on the second CPU module that did not indicate a DIMM problem.
  • Page 20 Run the AMD Machine Check Analysis Tool (MCAT) using the saved log to find the likely location of a faulty DIMM. Note – The MCAT utility is available as part of the Windows supplemental software from the Tools and Drivers CD/DVD for your server.
  • Page 21: Identifying BIOS DIMM Error Messages

    Identifying BIOS DIMM Error Messages
    The system BIOS displays and logs four types of DIMM error messages on the system screen and in the ILOM's IPMI SEL. The ILOM SEL format is as follows:
    Event# | Date | Time | Memory #0x(error type) | Configuration Error | CPU Y DIMM Z
    Here, Y represents the processor socket that the DIMM is associated with and Z is the DIMM socket that displays the error.
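    The SEL record layout described above lends itself to simple field splitting. The following Python sketch is not part of the guide; it assumes each record arrives as a single line of text with the six fields separated by "|", and the sample line uses made-up values purely for illustration. The names DimmEvent and parse_sel_line are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class DimmEvent(NamedTuple):
    event_id: str
    date: str
    time: str
    error_type: str   # hex digits that follow "Memory #0x"
    description: str
    cpu: str          # Y: processor socket the DIMM is associated with
    dimm: str         # Z: DIMM socket that displays the error


_SENSOR = re.compile(r"Memory\s+#0x([0-9A-Fa-f]+)")
_LOCATION = re.compile(r"CPU\s+(\S+)\s+DIMM\s+(\S+)")


def parse_sel_line(line: str) -> Optional[DimmEvent]:
    """Split one SEL line into its six fields; return None if it does not fit."""
    fields = [field.strip() for field in line.split("|")]
    if len(fields) != 6:
        return None
    sensor = _SENSOR.fullmatch(fields[3])
    location = _LOCATION.fullmatch(fields[5])
    if sensor is None or location is None:
        return None
    return DimmEvent(
        event_id=fields[0],
        date=fields[1],
        time=fields[2],
        error_type=sensor.group(1).lower(),
        description=fields[4],
        cpu=location.group(1),
        dimm=location.group(2),
    )


# Hypothetical sample line; the values are invented, not taken from a real log.
sample = "8 | 12/01/2010 | 10:15:30 | Memory #0x01 | Configuration Error | CPU 2 DIMM 5"
event = parse_sel_line(sample)
if event is not None:
    print(f"Check CPU {event.cpu}, DIMM {event.dimm} (error type 0x{event.error_type})")
```

    Lines that do not match the six-field layout return None, so other kinds of SEL entries can be skipped rather than misread.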
  • Page 23: Using the ILOM to Monitor the Host

    Readings from the ambient temperature sensors and the core temperature sensors on the CPU boards are fed to the IPMI stack to adjust fan speed. See the Sun ILOM 3.0 Supplement for the Sun Fire X4640 Server for more information about the sensors. This section contains the following procedures: “How to Use the ILOM Web Interface to View the Sensor Readings”...
  • Page 24 Note – If the server is powered off, many components will have no readings. In the Sensor Readings page, do the following: a. Locate the name of the sensor you want to view.
  • Page 25 For specific details about the type of discrete sensor targets you can access, as well as the paths to access them, see the Sun ILOM 3.0 Supplement for the Sun Fire X4640 Server. If the problem with the server is not evident after viewing sensor readings information, continue with “Using SunVTS Diagnostics Software” on page...
  • Page 26: Viewing the ILOM System Event Log

    Log in to the SP as Administrator or Operator to reach the ILOM web interface: a. Type the IP address of the server’s SP into your web browser. The Sun Integrated Lights Out Manager Login screen appears.
  • Page 27 b. Type your user name and password. When you first try to access the ILOM SP, you are prompted to type the default user name and password. Default user name: root. Default password: changeme. From the System Monitoring tab, select Event Logs.
  • Page 28 Before You Begin: Establish a local serial console connection or SSH connection to the server SP. See the Sun ILOM 3.0 Supplement for the Sun Fire X4640 Server for more information.
  • Page 29 Type the following command to set the working directory:
    -> cd /SP/logs/event
    Type the following command to display the event log list:
    -> show list
    The contents of the event log appear. For example:
    -> show list
    /SP/logs/event/list
    Targets:
    Properties:...
  • Page 30: Clearing The Faults From The System Event Log

    How to Clear Faults From the System Event Log Using the ILOM Web Interface
    Navigate to the Event Log from the ILOM System Management tab. Click the Clear Event Log button at the bottom of the Event Log page.
  • Page 31: Interpreting Event Log Time Stamps

    A confirmation dialog appears. Click OK to clear the entries.
    ▼ How to Clear Faults From the System Event Log Using the ILOM Command-Line Interface
    Type the following commands:
    cd /SP/logs/event/
    set clear=true
    A confirmation message appears. Type one of the following:
    ■ To clear the entries, type: y...
  • Page 32 BIOS or user. NTP servers provide UTC time. Therefore, if NTP is enabled on the SP, the SP clock is in UTC. Through the CLI, ILOM web interface, and IPMI ■
  • Page 33: Using SunVTS Diagnostics Software

    Using SunVTS Diagnostics Software SunVTS is the Sun Validation Test Suite, which provides a comprehensive diagnostic tool that tests and validates Sun hardware by verifying the connectivity and functionality of most hardware controllers and devices on Sun platforms. This section contains the following procedures: “Introduction to SunVTS Diagnostic Test Suite”...
  • Page 34: SunVTS Documentation

    In the SunVTS GUI, press Enter or click the Start button when you are prompted to start the tests. The test suite runs until it encounters an error or the test is completed.
  • Page 35 Note – The CD takes approximately nine minutes to boot. When the test is completed, review the log files generated during the test. SunVTS software provides access to four different log files: ■ SunVTS test error log: contains time-stamped SunVTS test error messages. The log file...
  • Page 36 When you use the Bootable Diagnostics CD, the server boots from the CD. Therefore, the test log files are not on the server's hard disk drive and they will be deleted when you power cycle the server.
  • Page 37: Creating A Data Collector Snapshot

    Creating a Data Collector Snapshot The purpose of the ILOM Service Snapshot utility is to collect data for use by Sun Services personnel to diagnose system problems. Customers should not run this utility unless requested to do so by Sun Services. This section contains the following procedures: “How To Create a Snapshot With the ILOM Web Interface”...
  • Page 38 (Optional) Check the Enabled check box to collect only log files from the data set. (Optional) Check the Enabled check box to encrypt the output file. Select one of the following methods to transfer the output file: ■ Browser ■ SFTP ■ ...
  • Page 39: How to Create a Snapshot With the ILOM Command-Line Interface

    Click Run. A Save As dialog box appears. In the dialog box, specify the file name and the directory in which to save the file. Click OK. The file is saved to the specified directory. ▼...
  • Page 40 URI as follows: ftp://joe:mypasswd@host_ip_address/data The directory data is relative to the user's login, so the directory would probably be /home/joe/data.
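    To make the parts of that URI explicit, the following Python sketch (not part of the guide) splits the example into user, password, host, and directory using the standard urllib.parse module. The function name describe_target and the "/home" prefix are assumptions for illustration; the guide itself only says the directory would "probably" resolve under the user's login directory.

```python
import posixpath
from urllib.parse import urlsplit


def describe_target(uri: str, home_root: str = "/home") -> dict:
    """Break a transfer URI into its parts and guess the absolute directory.

    home_root is an assumption: where login directories live depends on
    how the receiving server is configured.
    """
    parts = urlsplit(uri)
    relative_dir = parts.path.lstrip("/")
    return {
        "scheme": parts.scheme,
        "user": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "remote_dir": relative_dir,
        "likely_absolute_dir": posixpath.join(
            home_root, parts.username or "", relative_dir
        ),
    }


target = describe_target("ftp://joe:mypasswd@host_ip_address/data")
for key, value in target.items():
    print(f"{key}: {value}")
```

    Note that the password travels in clear text inside the URI, so it is worth checking who can see command history or logs before using this form.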
  • Page 41: Resetting The Sp

    Resetting the SP If you need to reset your ILOM service processor (SP), you can do so without affecting the host OS. However, resetting an SP disconnects your current ILOM session and renders the SP unmanageable during reset. This section contains the following procedures: “How to Reset the ILOM SP Using the Web Interface”...
  • Page 42: How to Reset the ILOM SP Using the Command-Line Interface

    ■ After updating the ILOM/BIOS firmware, you must reset the ILOM SP.
    Log in to the ILOM CLI. Type the following command:
    -> reset /SP
    The ILOM reboots. The command-line interface is unavailable while the ILOM reboots.
  • Page 43: Index

    BIOS DIMM errors, 21
    BIOS/POST, 10
    clearing faults
      with the ILOM command-line interface, 31
      with the ILOM web interface, 30–31
    correctable DIMM errors, 19
    correcting DIMM errors, 17–19
    externally inspecting the server, 12
    fan sensor readings, 23–32
    finding your product on My Oracle Support (support.oracle.com), 5–6
    ...
  • Page 44 ILOM web interface, 41 sensor readings, 23–32 using the ILOM command-line interface, 25–26 using the ILOM web interface, 23–25 Service Processor ILOM, description, 10 service visit information, gathering, 11
