Sun Fire X4640 Server Diagnostics Guide Part No: 821–0472 December 2010, Rev A...
originating from third parties. Oracle Corporation and its affiliates disclaim all liability and any express warranty regarding third-party content, products, or services. In no event shall Oracle Corporation and its affiliates be held liable for any losses incurred, costs occasioned, or damages caused by access to...
Contents
Using This Documentation .... 5
  Product Downloads .... 5
  About This Documentation (PDF and HTML) .... 6
  We Welcome Your Comments .... 6
  Change History .... 6
Overview of the Diagnostics Guide .... 7
Introduction to System Diagnostics .... 9
  Troubleshooting Options .... 9
  Diagnostic Tools .... 10
Troubleshooting the Server ...
Resetting the SP .... 41
  How to Reset the ILOM SP Using the Web Interface .... 41
  How to Reset the ILOM SP Using the Command-Line Interface .... 42
Index .... 43
Sun Fire X4640 Server Diagnostics Guide • December 2010, Rev A...
In the Patch Search box, click Product or Family (Advanced Search). In the Product field, type a full or partial product name, for example, Sun Fire X4640, until a list of matches is displayed, then select the product of interest.
In the Patches Search box, click Search. A list of product downloads (listed as patches) appears. Select the Patch name of interest, for example, 12980209, for the Sun Fire X4640 1.3.1 Firmware. In the right-side pane that appears, click Download.
Overview of the Diagnostics Guide
The following topics are covered in this document.

Description                                                      Link
Learn about troubleshooting procedures and diagnostics tools     “Introduction to System Diagnostics” on page 9
available for the server.
Troubleshoot system problems.                                    “Troubleshooting the Server” on page 11
Troubleshoot DIMM problems.
Introduction to System Diagnostics
This section contains an introduction to Oracle's Sun Fire X4640 server diagnostics and covers the following topics:
■ “Troubleshooting Options” on page 9
■ “Diagnostic Tools” on page 10
Troubleshooting Options
The following table lists the suggested order of troubleshooting procedures when you have an issue with the server.
Diagnostic Tools
The following diagnostic tools are available for the Sun Fire X4640 server.
BIOS/POST
From the point that the host subsystem is powered on and begins executing code, BIOS code is executed. The sequence that BIOS goes through, from the first point where code is executed to the point that the operating system booting begins, is referred to as POST (power-on self-test).
Troubleshooting the Server
This section covers the following procedures:
■ “How to Gather Service Visit Information” on page 11
■ “How to Troubleshoot Power Problems” on page 11
■ “How to Inspect the Outside of the Server” on page 12
■ “How to Inspect the Inside of the Server”...
Power button for four seconds to force main power off and enter standby power mode. When main power is off, the Power/OK LED on the front panel will begin flashing, indicating that the server is in standby power mode.
Remove the server cover, as required. For instructions on removing the server cover, refer to the Sun Fire X4640 Server Service Manual. Inspect the internal status indicator LEDs, which can indicate component malfunction.
If the problem with the server is not evident, you can try viewing the power-on self-test (POST) messages and BIOS event logs during system startup. Refer to the Sun Fire X4640 Server Service Manual for more information on POST and BIOS event logs.
DIMM Fault LEDs
In the Sun Fire X4640 servers, eight DIMM slots are on each removable CPU module. The DIMM fault LEDs in the DIMM slot ejector levers indicate which DIMM pair has failed. These DIMM fault LEDs can be lit for up to one minute by a capacitor on the CPU module, even after...
To light the fault LED from the capacitor, push the small button on the CPU module labeled “FAULT REMIND BUTTON.” The DIMM ejector levers contain LEDs that can indicate a faulty DIMM.
■ DIMM fault LED is on (amber): At least one of the DIMMs in this DIMM pair is faulty and should be replaced.
DIMM Population Rules
See the Sun Fire X4640 Server Service Manual for the DIMM population rules.
▼ How to Isolate and Correct DIMM ECC Errors
If the ILOM reports an ECC error or a problem with a DIMM, first complete the steps in the following procedure.
DIMM. Instead, it might be caused by CPU0 or by the DIMM slot. Continue with the rest of the procedure. Shut down the server again and disconnect the AC power cords.
Remove the CPU module that has the DIMM problem, and remove another CPU module that does not indicate a DIMM problem. Refer to the Sun Fire X4640 Server Service Manual. Remove both DIMMs of the pair and install them into paired slots on the second CPU module that did not indicate a DIMM problem.
Run the AMD Machine Check Analysis Tool (MCAT) on the saved log to find the likely location of a faulty DIMM.
Note – The MCAT utility is available as part of the Windows supplemental software from the Tools and Drivers CD/DVD for your server.
Identifying BIOS DIMM Error Messages
The system BIOS displays and logs four types of DIMM error messages on the system screen and in ILOM's IPMI SEL. The ILOM SEL format is as follows:
Event# | Date | Time | Memory #0x(error type) | Configuration Error | CPU Y DIMM Z
where Y represents the processor socket that the DIMM is associated with and Z is the DIMM socket that displays the error.
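As a minimal sketch of how a line in the SEL format above might be split into its fields, the following Python function assumes that fields are separated by the | character exactly as shown and that the last field has the form CPU Y DIMM Z. The function name parse_sel_dimm_line, the dictionary keys, and the sample line with all of its values are hypothetical and are not taken from real ILOM output.

```python
import re

# Hypothetical helper: splits one SEL line of the documented form
# Event# | Date | Time | Memory #0x(error type) | Configuration Error | CPU Y DIMM Z
def parse_sel_dimm_line(line):
    fields = [field.strip() for field in line.split("|")]
    if len(fields) != 6:
        raise ValueError("expected 6 pipe-separated fields, got %d" % len(fields))
    event, date, time, sensor, description, location = fields

    # The error type is the hexadecimal value after "#0x" in the fourth field.
    type_match = re.search(r"#0x([0-9A-Fa-f]+)", sensor)
    # Y is the processor socket, Z is the DIMM socket.
    loc_match = re.fullmatch(r"CPU\s+(\d+)\s+DIMM\s+(\d+)", location)
    if type_match is None or loc_match is None:
        raise ValueError("line does not match the documented DIMM error format")

    return {
        "event": event,
        "date": date,
        "time": time,
        "error_type": int(type_match.group(1), 16),
        "description": description,
        "cpu": int(loc_match.group(1)),
        "dimm": int(loc_match.group(2)),
    }

# Made-up sample line, for illustration only.
sample = "8 | 09/25/2007 | 03:22:03 | Memory #0x02 | Configuration Error | CPU 2 DIMM 2"
result = parse_sel_dimm_line(sample)
print(result["cpu"], result["dimm"], hex(result["error_type"]))
```

Rejecting lines that do not match the expected layout, instead of guessing, keeps unrelated SEL entries from being misread as DIMM errors.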
Readings from the ambient temperature sensors and the core temperature sensors on the CPU boards are fed to the IPMI stack to adjust fan speed. See the Sun ILOM 3.0 Supplement for the Sun Fire X4640 Server for more information about the sensors.
This section contains the following procedures:
■ “How to Use the ILOM Web Interface to View the Sensor Readings”...
Note – If the server is powered off, many components will have no readings.
In the Sensor Readings page, do the following:
a. Locate the name of the sensor you want to view.
For specific details about the type of discrete sensor targets you can access, as well as the paths to access them, see Sun ILOM 3.0 Supplement for the Sun Fire X4640 Server. If the problem with the server is not evident after viewing sensor readings information, continue with “Using SunVTS Diagnostics Software” on page...
Log in to the SP as Administrator or Operator to reach the ILOM web interface: a. Type the IP address of the server’s SP into your web browser. The Sun Integrated Lights Out Manager Login screen appears.
Viewing the ILOM System Event Log
b. Type your user name and password.
When you first try to access the ILOM SP, you are prompted to type the default user name and password:
Default user name: root
Default password: changeme
From the System Monitoring tab, select Event Logs.
Before You Begin
Establish a local serial console connection or SSH connection to the server SP. See the Sun ILOM 3.0 Supplement for the Sun Fire X4640 Server for more information.
Type the following command to set the working directory:
-> cd /SP/logs/event
Type the following command to display the event log list:
-> show list
The contents of the event log appear. For example:
-> show list
/SP/logs/event/list
Targets:
Properties:...
How to Clear Faults From the System Event Log Using the ILOM Web Interface
Navigate to the Event Log from the ILOM System Management tab.
Click the Clear Event Log button on the bottom of the Event Log page.
Interpreting Event Log Time Stamps
A confirmation dialog appears. Click OK to clear the entries.
▼ How to Clear Faults From the System Event Log Using the ILOM Command-Line Interface
Type the following command: cd /SP/logs/event/ set clear=true
A confirmation message appears. Type one of the following:
■ To clear the entries, type: y...
BIOS or user. NTP servers provide UTC time. Therefore, if NTP is enabled on the SP, the SP clock is in UTC. Through the CLI, ILOM web interface, and IPMI ■
Using SunVTS Diagnostics Software
SunVTS is the Sun Validation Test Suite, a comprehensive diagnostic tool that tests and validates Sun hardware by verifying the connectivity and functionality of most hardware controllers and devices on Sun platforms.
This section contains the following procedures:
■ “Introduction to SunVTS Diagnostic Test Suite”...
In the SunVTS GUI, press Enter or click the Start button when you are prompted to start the tests. The test suite runs until it encounters an error or the test is completed.
SunVTS Documentation
Note – The CD takes approximately nine minutes to boot.
When the test is completed, review the log files generated during the test. SunVTS software provides access to four different log files:
■ SunVTS test error log: contains time-stamped SunVTS test error messages. The log file...
When you use the Bootable Diagnostics CD, the server boots from the CD. Therefore, the test log files are not on the server's hard disk drive and they will be deleted when you power cycle the server.
Creating a Data Collector Snapshot
The purpose of the ILOM Service Snapshot utility is to collect data for use by Sun Services personnel to diagnose system problems. Customers should not run this utility unless requested to do so by Sun Services.
This section contains the following procedures:
■ “How To Create a Snapshot With the ILOM Web Interface”...
(Optional) Check the Enabled check box to collect only log files from the data set.
(Optional) Check the Enabled check box to encrypt the output file.
Select one of the following methods to transfer the output file:
■ Browser
■ SFTP
Click Run.
A Save As dialog box appears.
In the dialog box, specify the directory in which to save the file and the file name.
Click OK.
The file is saved to the specified directory.
▼...
URI as follows: ftp://joe:mypasswd@host_ip_address/data The directory data is relative to the user's login, so the directory would probably be /home/joe/data.
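To make the parts of such a URI explicit, the sketch below splits the example above with Python's standard urllib.parse module. The function name split_transfer_uri and the dictionary keys are hypothetical. The code only illustrates how the user name, password, host, and directory are laid out in the URI; it does not contact the SP or perform any transfer.

```python
from urllib.parse import urlsplit

# Hypothetical helper: breaks a transfer URI of the form
# protocol://username:password@host/directory into its parts.
def split_transfer_uri(uri):
    parts = urlsplit(uri)
    return {
        "protocol": parts.scheme,
        "user": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        # Drop the leading slash: the directory is relative to the
        # user's login directory, as described above.
        "directory": parts.path.lstrip("/"),
    }

# The example URI from the text; host_ip_address is a placeholder.
info = split_transfer_uri("ftp://joe:mypasswd@host_ip_address/data")
for key, value in info.items():
    print(key, "=", value)
```

Because the password is embedded in the URI in clear text, any script that builds or logs such a URI should treat the whole string as sensitive.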
Resetting the SP
If you need to reset your ILOM service processor (SP), you can do so without affecting the host OS. However, resetting an SP disconnects your current ILOM session and renders the SP unmanageable during reset.
This section contains the following procedures:
■ “How to Reset the ILOM SP Using the Web Interface”...
■ After updating the ILOM/BIOS firmware, you must reset the ILOM SP.
Log in to the ILOM CLI.
Type the following command:
-> reset /SP
The ILOM reboots. The command-line interface is unavailable while the ILOM reboots.