Sun Fire T1000 Server ™ Service Manual Sun Microsystems, Inc. www.sun.com Part No. 819-3248-10 January 2006, Revision A Submit comments about this document at: http://www.sun.com/hwdocs/feedback...
Copyright 2006 Sun Microsystems, Inc., 4150 Network Circle, Santa Clara, California 95054, USA. All rights reserved. Sun Microsystems, Inc. has intellectual property rights relating to the technology described in this document. In particular, and without limitation, these property rights...
Chassis Identification
Additional Service Related Information
Sun Fire T1000 Server Diagnostics
Overview of Sun Fire T1000 Server Diagnostics
Using LEDs to Identify the State of Devices
Front and Rear Panel LEDs
Power Supply LEDs
...
Running ALOM Service-Related Commands
Connecting to ALOM
Switching Between the System Console and ALOM
Service-Related ALOM Commands
▼ To Run the showfaults Command
▼ To Run the showenvironment Command
▼ To Run the showfru Command
Running POST
Controlling How POST Runs
▼...
For further information, refer to the documents that accompany the SunVTS software
Removing and Replacing FRUs
Safety Information
Safety Symbols
Electrostatic Discharge Safety
Use an Antistatic Wrist Strap
Use an Antistatic Mat
Common Procedures for Parts Replacement
Required Tools
▼...
▼ To Replace the Clock Battery on the Motherboard
Common Procedures for Finishing Up
▼ To Replace the Top Cover
▼ To Reinstall the Server Chassis in the Rack
▼ To Apply Power to the Server
Field-Replaceable Units (FRUs)
Preface The Sun Fire T1000 Service Manual provides information to aid in troubleshooting problems with and replacing components within the Sun Fire™ T1000 server. This manual is written for technicians, service personnel, and system administrators who service and repair computer systems. The person qualified to use this manual: Can open a system chassis, identify, and replace internal components.
Solaris Handbook for Sun Peripherals ■ AnswerBook2 ■ Other software documentation that you received with your system ■ viii Sun Fire T1000 Server Service Manual • January 2006 ™ online documentation for the Solaris ® commands and ™ operating environment...
C shell C shell superuser Bourne shell and Korn shell Bourne shell and Korn shell superuser Sun Fire T1000 Server Documentation You can view and print the following documents from the Sun documentation web Examples Edit your .login file. Use ls -a to list all files.
Sun does not endorse and is not responsible or liable for any content, advertising, products, or other materials that are available on or through such sites x Sun Fire T1000 Server Service Manual • January 2006 Description Site planning information for the...
Sun is interested in improving its documentation and welcomes your comments and suggestions. You can submit your comments by going to: http://www.sun.com/hwdocs/feedback Please include the title and part number of your document with your feedback: Sun Fire T1000 Server Service Manual, part number 819-3248-10 Preface...
C H A P T E R Sun Fire T1000 Server Overview This chapter provides an overview of the features of the Sun Fire T1000 server. The following topics are covered: “Sun Fire T1000 Server Features” on page 1 ■...
FIGURE 1-2 Sun Fire T1000 Server Components
Performance Enhancements
The Sun Fire T1000 server introduces several new technologies with its sun4v architecture and multicore, multithreaded UltraSPARC T1 processor.
Other software Java™ Enterprise System with a 90-day trial license
For additional information on the Sun Fire T1000 server features, refer to the Sun Fire T1000 Server Product Overview.
Remote Manageability With ALOM
The Sun Advanced Lights Out Manager (ALOM) feature is a system controller (SC) that enables you to remotely manage and administer the Sun Fire T1000 server.
Serviceability relates to the time it takes to restore a system to service following a system failure. Together, reliability, availability, and serviceability features provide for near continuous system operation. To deliver high levels of reliability, availability, and serviceability, the Sun Fire T1000 server offers the following features: Environmental monitoring ■...
■ PSH automated run-time diagnosis capability that takes faulty components offline.
For more information about using RAS features, refer to the Sun Fire T1000 Server System Administration Guide.
Environmental Monitoring
The Sun Fire T1000 server features an environmental monitoring subsystem...
Predictive Self-Healing The Sun Fire T1000 server features the latest fault management technologies. With the Solaris 10 Operating System (OS), Sun is introducing a new architecture for building and deploying systems and services capable of predictive self-healing. Self- healing technology enables Sun systems to accurately predict component failures and mitigate many serious problems before they actually occur.
Additional Service Related Information In addition to this document, the following resources are available to help you keep your server running optimally: Product Notes – The Sun Fire T1000 Server Product Notes (819- ■ breaking information about the system including required software patches, updated hardware and compatibility information, and solutions to known issues.
Sun Fire T1000 Server Diagnostics This chapter describes the diagnostics that are available for monitoring and troubleshooting the Sun Fire T1000 server. This chapter does not provide detailed troubleshooting procedures, but instead describes the Sun Fire T1000 server diagnostics facilities and how to use them.
The flowchart assumes that you have already performed some rudimentary troubleshooting such as verification of proper installation, visual inspection of cables and power, and possibly a reset of the server. (For details, refer to the Sun Fire T1000 Server Installation Guide and Sun Fire T1000 Server Administration Guide.)
[Figure: diagnostic flowchart. Recoverable box labels: "Find cause of overtemp cond.", "Connect power cord or replace faulty power supply."]
Numbers in this flowchart correspond to the Action numbers in Table 2-1.
The showenvironment command reports over temperature conditions when the ambient room showenvironment temperature exceeds the upper limit. command. Sun Fire T1000 Server Service Manual • January 2006 For more information, see these sections “To Remove the Power Supply” on page 61 “To Replace the Power...
Supply” on page 61 “To Replace the Power Supply” on page 62 “Collecting Information From Solaris OS Files and Commands” on page 39 “Running POST” on page 27 “Exercising the System with SunVTS” on page 43 Chapter 2 Sun Fire T1000 Server Diagnostics...
Sun for support. Using LEDs to Identify the State of Devices The Sun Fire T1000 server provides the following groups of LEDs: Front and rear panel LEDs ( ■...
FIGURE 2-2 Sun Fire T1000 Server Front Panel
FIGURE 2-3 Sun Fire T1000 Server Rear Panel LEDs
Figure callouts: Power OK LED/power on/off button, Fault LED, DC OK, AC OK, Locator, Service required, Link, Activity, Power OK LED.
Power On/Off button Ethernet Activity LEDs Sun Fire T1000 Server Service Manual • January 2006 ). The LEDs are also provided on the rear panel. Color Description White Enables you to identify a particular server. The LED is controlled using one of the following methods: •...
The Sun Advanced Lights Out Manager (ALOM) is a system controller on the Sun Fire T1000 server motherboard that enables you to remotely manage and administer your server. ) are located on the back of the power supply. Chapter 2 Sun Fire T1000 Server Diagnostics...
Therefore, ALOM firmware and software continue to function when the server operating system goes offline or when the server is powered off. Note – For comprehensive ALOM information, refer to the Sun Fire T1000 Server Advanced Lights Out Manager (ALOM) guide.
■ Connect an external modem to the network management port and dial in to the modem.
Note – Refer to the Sun Fire T1000 Server Advanced Lights Out Manager (ALOM) Guide for instructions on configuring and connecting to ALOM.
ALOM commands for servicing a Sun Fire T1000 TABLE 2-4 server. For descriptions of all ALOM commands, issue the help command or refer to the Sun Fire T1000 Server Advanced Lights Out Management (ALOM) Guide. Service-Related ALOM Commands TABLE 2-4...
Displays the history of all events logged in the ALOM event buffer. Displays information about the host system’s hardware configuration, and whether the hardware is providing service. TABLE 2-7 Chapter 2 Sun Fire T1000 Server Diagnostics “To Run the “To Run the showfaults Command” “To Run the showfru Command” on...
System Temperatures (Temperatures in Celsius): -------------------------------------------------------------------------------- Sensor Status -------------------------------------------------------------------------------- MB/T_AMB MB/CMP0/T_TCORE MB/CMP0/T_BCORE MB/IOB/T_CORE -------------------------------------------------------- System Indicator Status: -------------------------------------------------------- SYS/LOCATE SYS/SERVICE Sun Fire T1000 Server Service Manual • January 2006 MB/CMP0/CH0/R1/D0 Host detected fault, UUID: a26d5379-24b8-4a46-bcbf-d9e1ff75a1bc Temp LowHard LowSoft LowWarn HighWarn HighSoft HighHard SYS/ACT Fault...
The showfru command displays information about the FRUs in the server. Use this command to see information about an individual FRU, or for all the FRUs. Note – You do not need user permissions to use this command.
/SPD/Vendor: Infineon (formerly Siemens) /SPD/Vendor Part No: TUE OCT 18 21:17:55 2005 ASSY,Sun-Fire-T1000,Motherboard Sriracha,Chonburi,Thailand 5017302 002989 Celestica T1000_MB 885-0505-04 SUN JUL 31 19:45:13 2005 PSU,300W,AC_INPUT,A207 Matamoros, Tamps, Mexico 3001799 G00001 Tyco Electronics 885-0407-02 72T256220HR3.7A 72T256220HR3.7A Chapter 2 Sun Fire T1000 Server Diagnostics...
/SPD/Timestamp: MON OCT 03 12:00:00 2005 /SPD/Description: DDR2 SDRAM, 2048 MB /SPD/Manufacture Location: /SPD/Vendor: Infineon (formerly Siemens) /SPD/Vendor Part No: /SPD/Vendor Serial No: d03ec27 FRU_PROM at MB/CMP0/CH3/R1/D1/SEEPROM Sun Fire T1000 Server Service Manual • January 2006 72T256220HR3.7A 72T256220HR3.7A 72T256220HR3.7A 72T256220HR3.7A 72T256220HR3.7A...
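The showfru output above is a series of path-like field names, each followed by a colon and a value. The following is a minimal Python sketch for collecting such fields into a dictionary. It assumes that each field arrives on its own line in the form /SPD/Name: value (this copy of the manual shows the output flattened, so the line layout is an assumption), and the function names are hypothetical, not part of ALOM.

```python
from datetime import datetime

# Three-letter month abbreviations as they appear in showfru timestamps.
_MONTHS = {
    name: number
    for number, name in enumerate(
        ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
         "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"],
        start=1,
    )
}


def parse_showfru_fields(text):
    """Collect '/Path/Name: value' lines into a dict (assumed line layout)."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("/"):
            continue  # skip headings and anything that is not a field line
        name, separator, value = line.partition(":")
        if separator:
            fields[name.strip()] = value.strip()
    return fields


def parse_fru_timestamp(value):
    """Convert a timestamp such as 'MON OCT 03 12:00:00 2005' to a datetime."""
    _weekday, month, day, clock, year = value.split()
    hour, minute, second = (int(part) for part in clock.split(":"))
    return datetime(int(year), _MONTHS[month.upper()], int(day),
                    hour, minute, second)


if __name__ == "__main__":
    sample = (
        "/SPD/Timestamp: MON OCT 03 12:00:00 2005\n"
        "/SPD/Description: DDR2 SDRAM, 2048 MB\n"
        "/SPD/Vendor: Infineon (formerly Siemens)\n"
    )
    fields = parse_showfru_fields(sample)
    print(fields["/SPD/Description"])
    print(parse_fru_timestamp(fields["/SPD/Timestamp"]).isoformat())
```

Only the first colon on a line is treated as the separator, so the colons inside the time of day stay in the value.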
The server can be configured for normal, extensive, or no POST execution. You can also control the level of tests that run, the amount of POST output that is displayed, and which reset events trigger POST by using ALOM variables.
* All of these parameters are set using the ALOM setsc command except for the setkeyswitch command. Sun Fire T1000 Server Service Manual • January 2006 Values Description The system can power on and run POST (based normal on the other parameter settings).
Flowchart of ALOM Variable for POST Configuration FIGURE 2-5 Chapter 2 Sun Fire T1000 Server Diagnostics...
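As a minimal sketch of the rule stated for these variables (POST parameters are set with the ALOM setsc command, while the virtual keyswitch is set with its own setkeyswitch command), the following Python helper builds the text to type at the sc> prompt. The function name alom_post_command is hypothetical and not part of ALOM, and the only parameter names used are those that appear in this manual (diag_mode and setkeyswitch). The helper only formats text; it does not communicate with the system controller.

```python
def alom_post_command(parameter: str, value: str) -> str:
    """Return the ALOM command line that sets one POST-related parameter.

    Hypothetical helper: it formats the command text only.
    """
    parameter = parameter.strip()
    value = value.strip()
    if not parameter or not value or " " in parameter or " " in value:
        raise ValueError("parameter and value must be single non-empty words")
    # The virtual keyswitch is set by its own command, not through setsc.
    if parameter == "setkeyswitch":
        return f"setkeyswitch {value}"
    # Every other POST parameter is set with setsc.
    return f"setsc {parameter} {value}"


if __name__ == "__main__":
    print("sc> " + alom_post_command("diag_mode", "normal"))
    print("sc> " + alom_post_command("setkeyswitch", "diag"))
```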
The setkeyswitch parameter is a command that sets the virtual keyswitch, so it does not use the setsc command. Example: sc> setkeyswitch diag Sun Fire T1000 Server Service Manual • January 2006 No POST Diagnostic Execution...
#.
sc> setsc diag_mode normal
2. Set the virtual keyswitch to diag so that POST will run in service mode.
sc> setkeyswitch diag
Use is subject to license terms.
0:0>VBSC selecting POST IO Testing.
0:0>VBSC enabling threads: 1
0:0>VBSC setting verbosity level 3
0:0>Start Selftest...
0:0>Init CPU
0:0>Master CPU Tests Basic...
0:0>CPU =: 0
Note: Some output omitted.
POST error messages use the following syntax:
c:s > ERROR: TEST = failing-test
c:s > H/W under test = FRU
c:s > Repair Instructions: Replace items in order listed by H/W
Note: Some output omitted.
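The message syntax above lends itself to simple line-by-line parsing. The following Python sketch, offered only as an illustration, extracts the failing test and the hardware under test from captured POST output. It treats the c:s prefix as two integers without assuming what they mean, tolerates optional spaces around the > character, and uses a hypothetical function name. The test name in the sample ("Data Bitwalk") is invented for the example and does not come from this manual.

```python
import re

# "c:s > text", with optional whitespace around the ">" separator.
_LINE_RE = re.compile(r"^\s*(\d+):(\d+)\s*>\s*(.*)$")


def parse_post_errors(text):
    """Return one dict per 'ERROR: TEST = ...' message found in POST output."""
    errors = []
    current = None
    for raw_line in text.splitlines():
        match = _LINE_RE.match(raw_line)
        if match is None:
            continue  # not a POST line in the c:s > form
        c, s, body = int(match.group(1)), int(match.group(2)), match.group(3).strip()
        if body.startswith("ERROR:"):
            _, _, test = body.partition("=")
            current = {"c": c, "s": s, "test": test.strip() or None,
                       "hw_under_test": None}
            errors.append(current)
        elif body.startswith("H/W under test") and current is not None:
            _, _, fru = body.partition("=")
            current["hw_under_test"] = fru.strip() or None
    return errors


if __name__ == "__main__":
    sample = (
        "0:0>Init CPU\n"
        "0:0>ERROR: TEST = Data Bitwalk\n"          # hypothetical test name
        "0:0>H/W under test = MB/CMP0/CH0/R1/D0\n"
        "0:0>Repair Instructions: Replace items in order listed\n"
    )
    for error in parse_post_errors(sample):
        print(error)
```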
Run the showfaults command to obtain additional fault information. The fault is captured by ALOM, where the fault is logged, the Service required LED is lit, and the faulty component is disabled.
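This manual's sample output shows a fault entry of the form "MB/CMP0/CH0/R1/D0 Host detected fault, UUID: a26d5379-24b8-4a46-bcbf-d9e1ff75a1bc", and it identifies a host-detected fault as one that displays a UUID. A minimal Python sketch for pulling the FRU name and UUID out of such a line follows. The exact column layout of showfaults output is not reproduced in this copy, so the sketch searches the whole line rather than relying on column positions; the function name is hypothetical.

```python
import re

# A UUID in the usual 8-4-4-4-12 hexadecimal form, preceded by "UUID:".
_UUID_RE = re.compile(
    r"UUID:\s*([0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})", re.IGNORECASE
)
# A FRU name that starts at the motherboard, such as MB/CMP0/CH0/R1/D0.
_FRU_RE = re.compile(r"\bMB(?:/[A-Za-z0-9_]+)*")


def parse_fault_line(line):
    """Extract the FRU name and, if present, the UUID from one fault line."""
    fru = _FRU_RE.search(line)
    uuid = _UUID_RE.search(line)
    return {
        "fru": fru.group(0) if fru else None,
        # Per this manual, a host-detected fault is one that displays a UUID.
        "host_detected": uuid is not None,
        "uuid": uuid.group(1).lower() if uuid else None,
    }


if __name__ == "__main__":
    line = ("MB/CMP0/CH0/R1/D0 Host detected fault, "
            "UUID: a26d5379-24b8-4a46-bcbf-d9e1ff75a1bc")
    print(parse_fault_line(line))
```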
Using the Solaris Predictive Self-Healing Feature The Solaris OS predictive self-healing technology enables the Sun Fire T1000 server to diagnose problems while the Solaris OS is running, and mitigate many serious problems before they occur. The Solaris OS uses the fault manager daemon, fmd(1M), which starts at boot time and runs in the background to monitor the system.
Details ■ If the Solaris OS PSH facility has detected a faulty component, use the fmdump command to identify the fault. Note – Additional predictive self-healing information is available at: http://www.sun.com/msg.
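When the fault UUID is already known (for example, from showfaults output), the fmdump invocation can be assembled programmatically. The sketch below, in Python, validates the UUID format and returns the argument list for fmdump with the -v and -u options, the same options that appear combined as "fmdump -vu <event-id>" in this manual's sample message. The function name is hypothetical, and the sketch only builds the argument list; running it on a Solaris host (for example, with subprocess.run) is left to the reader.

```python
import re

# The usual 8-4-4-4-12 hexadecimal UUID form.
_UUID_RE = re.compile(r"[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}", re.IGNORECASE)


def fmdump_args(uuid):
    """Return the argument list for a verbose fmdump query on one UUID."""
    uuid = uuid.strip()
    if _UUID_RE.fullmatch(uuid) is None:
        raise ValueError(f"not a UUID: {uuid!r}")
    # -v requests verbose output, -u selects the diagnosis with this UUID.
    return ["fmdump", "-v", "-u", uuid]


if __name__ == "__main__":
    print(" ".join(fmdump_args("a26d5379-24b8-4a46-bcbf-d9e1ff75a1bc")))
```

Rejecting malformed input before building the command avoids passing a truncated or mistyped identifier to the tool.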
In this example, the message ID SUN4U-8000-2S returns the following information for corrective action: Step 2 to obtain more information UUID Oct 21 10:32 EDT 2004) ) that can be used to obtain additional SUNW4U-8000-2S R1/D0(J0701) Chapter 2 Sun Fire T1000 Server Diagnostics SUNW-MSG-ID a26d5379- that in...
A service case should be opened and time scheduled to replace the FRU, identified in the fmdump(1M) output, on which the suspect DIMM is located. Sun Fire T1000 Server Service Manual • January 2006 fmdump -vu <event-id> to view the results of diagnosis and the specific Field Replaceable Unit (FRU) identified for repair.
Follow the suggested actions to repair the fault. Collecting Information From Solaris OS Files and Commands With the Solaris OS running on the Sun Fire T1000 server, you have the full complement of Solaris OS files and commands available for collecting information and for troubleshooting.
ALOM, which would normally detect a FRU replacement and enable the FRU, does not do so. In this case, after the loose cable is reseated, the disabled component must be manually enabled. Sun Fire T1000 Server Service Manual • January 2006...
Removes a component from the asr-db blacklist, where asrkey is the component to enable. Adds a component to the asr-db blacklist, where asrkey is the component to disable. Removes all entries from the asr-db blacklist. Chapter 2 Sun Fire T1000 Server Diagnostics...
MB/CMP0/CH3/R1/D1 sc>SC Alert:MB/CMP0/CH3/R1/D1 disabled 2. After receiving confirmation that the disablecomponent command is complete, reset the server so that the ASR command takes effect. sc> reset Disabled Devices...
■ Checking Whether SunVTS Software Is Installed This procedure assumes that the Solaris OS is running on the Sun Fire T1000 server, and that you have access to the Solaris OS command line. Chapter 2 Sun Fire T1000 Server Diagnostics...
Before you begin, the Solaris OS must be running. You also need to ensure that SunVTS validation test software is installed on your system. See “Checking Whether SunVTS Software Is Installed.” Description SunVTS framework...
SunVTS software can be run in several modes. This procedure assumes that you are using the default mode. This procedure also assumes that the Sun Fire T1000 server is headless—that is, it is not equipped with a monitor capable of displaying bit mapped graphics. In this case, you access the SunVTS GUI by logging in remotely from a machine that has a graphics display.
If you have installed SunVTS software in a location other than the default /opt directory, alter the path in this command accordingly. The SunVTS GUI appears on the display system’s screen. Sun Fire T1000 Server Service Manual • January 2006...
The SunVTS GUI Screen FIGURE 2-6 Chapter 2 Sun Fire T1000 Server Diagnostics...
Tests are enabled when checked, and disabled when not checked. TABLE 2-8 lists tests that are especially useful to run on a Sun Fire T1000 server.
Useful SunVTS Tests to Run on a Sun Fire T1000 Server TABLE 2-8 SunVTS Tests pmemtest, vmemtest, ramtest serialtest hsclbtest 7. (Optional) Customize individual tests. You can customize individual tests by right-clicking on the name of the test. For example, in the illustration under bg0(nettest) brings up a menu that enables you to configure this Ethernet test.
Safety Information This section describes important safety information you need to know prior to removing or installing parts in the Sun Fire T1000 server. For your protection, observe the following safety precautions when setting up your equipment: Follow all Sun standard cautions, warnings, and instructions marked on the ■...
Sun systems. This document is located in the packing carton of your server. The Sun Fire T1000 server complies with regulatory requirements for safety and EMI. Documentation about compliance is available online at: http://www.sun.com/documentation...
Common Procedures for Parts Replacement Before you can remove and replace parts that are inside the Sun Fire T1000 server, you must perform the following procedures: “To Shut the System Down” on page 53 ■...
Note – You can also use the Power On/Off button on the front of the server to initiate a graceful system shutdown. Refer to the Sun Fire T1000 Server Administration Guide for more information about the ALOM poweroff command. Sun Fire T1000 Server Service Manual • January 2006...
To Remove the Server From a Rack ▼ If the server is installed in a rack with the extendable slide rails that were supplied with the server, use this procedure to remove the server chassis from the rack. 1. (Optional) Issue the following command from the ALOM SC prompt to locate the system that requires maintenance: sc>...
Sun ESD mat, part number 250-1088 ■ Disposable ESD mat (shipped with some replacement parts or optional system ■ components) 2. Use an antistatic wrist strap. Sun Fire T1000 Server Service Manual • January 2006 ) to release the FIGURE 3-2...
To Remove the Top Cover ▼ Access to all customer replaceable units (CRUs) requires the removal of the top cover: Note – Never run the system with the top cover removed. The top cover must be in place for proper air flow. The cover interlock switch immediately shuts the system down when the cover is removed.
1. Perform the procedures described in Replacement” on page 2. Remove any cable(s) that are attached to the card. Sun Fire T1000 Server Service Manual • January 2006 “To Replace the Power Supply” “To Replace the Hard Drive” on “To Add or Replace DIMMs” on page 66 Appendix “Field-Replaceable Units (FRUs)”...
3. On the rear of the chassis, release the retention latch that secures the PCI Express card to the chassis.
FIGURE 3-4 Releasing the PCI Express Card Retention Latch
4. Gently work the PCI Express card out of the socket on the PCI Express riser board and the retention bracket.
Replacement” on page 2. Disconnect the fan power cable from the motherboard. 3. Release the tabs ( Sun Fire T1000 Server Service Manual • January 2006 “Common Procedures for Finishing Up” on “Common Procedures for Parts ) on both sides of the fan assembly.
Fan tray assembly Removing the Fan Tray Assembly FIGURE 3-6 4. Remove the fan assembly from the sheet metal mounting brackets. ▼ To Replace the Fan Tray Assembly 1. Unpackage the replacement fan tray assembly and place it on an antistatic mat. 2.
LED is not lit. 7. At the sc> prompt, issue the showenvironment command to verify the status of the power supply. Sun Fire T1000 Server Service Manual • January 2006 Power supply ) to lock the power supply into place in the chassis.
Fastener Replacing the Power Supply FIGURE 3-8 To Remove the Hard Drive ▼ 1. Perform the procedures described in Replacement” on page 2. Disconnect the cable from the hard drive. 3. Unsnap the catches on the latches ( remove the drive and tray assembly from the chassis. Latches Hard drive Figure showing how to remove the hard disk drive.
You might need to partition the drive, create file systems, load data from backups, or have it updated from a RAID configuration. Example: cfgadm -c configure c0t0d0s0C Sun Fire T1000 Server Service Manual • January 2006 FIGURE 3-10 “Common Procedures for Finishing Up” on...
To Remove DIMMs ▼ Caution – This procedure requires that you handle components that are sensitive to static discharges that can cause the component to fail. To avoid this problem, ensure that you follow antistatic practices as described in Discharge (ESD) Prevention Measures” on page 1.
■ DIMMs must be added four at a time. ■ Rank 0 memory must be fully populated for the Sun Fire T1000 to function ■ 1. Unpackage the replacement DIMMs and place them on an antistatic mat. 2. Ensure that the socket ejector tabs are in the open position.
6. Perform the following steps to clear the memory fault.
a. Gain access to the ALOM sc> prompt. Refer to the Sun Fire T1000 Server Advanced Lights Out Management (ALOM) Guide for instructions.
b. Run the showfaults -v command to determine how to clear the fault:
■ If the fault is a host-detected fault (displays a UUID), such as the following:...
FRUs and associated cables from your chassis and install them in the new chassis. The FRUs to remove and replace and the procedures to remove and replace them are: Sun Fire T1000 Server Service Manual • January 2006 “Diagnostic Flow Chart” on page 11 for an...
1. Remove the PCI Express card. “To Remove the Optional PCI Express Card” on page 2. Remove the fan tray assembly and cable. “To Remove the Fan Tray Assembly” on page 3. Remove the power supply and cable. “To Remove the Power Supply” on page 61 4.
2. Using a small flat head screwdriver, carefully pry the battery ( motherboard. Removing the Clock Battery from the Motherboard FIGURE 3-12 Sun Fire T1000 Server Service Manual • January 2006 Appendix “Common Procedures for Finishing Up” on “Common Procedures for Parts “Field-Replaceable Units...
4. Use the ALOM setdate command to set the day and time. Use the setdate command before you power-on the host system. For details about this command, refer to the Sun Fire T1000 Server Advanced Lights Out Management (ALOM) Guide. ) with the + facing upward.
Set the cover down so that the cover hangs over the rear of the server by about an inch (2.5 cm). 2. Slide the cover forward until it latches into place.
▼ To Reinstall the Server Chassis in the Rack
Refer to the Sun Fire T1000 System Installation Manual for installation instructions. After you have reinstalled the server chassis in the rack, reconnect all cables that you disconnected when you removed the chassis from the rack.
APPENDIX
Field-Replaceable Units (FRUs)
FIGURE A-1 shows the locations of the field-replaceable units (FRUs) in the Sun Fire T1000 server. TABLE A-1 lists the FRUs. TABLE A-2 lists the locations of the DIMMs. The Channel/Rank/DIMM locations...
Motherboard (1) Disk Field-Replaceable Units FIGURE A-1 Sun Fire T1000 Server Service Manual • January 2006...
Sun Fire T1000 Server FRU List TABLE A-1 Replacement Item No. Instructions Motherboard “To Remove the and chassis Motherboard and assembly Chassis” on page 68 DIMMs “To Remove DIMMs” on page 65 Fan assembly “To Remove the Fan Tray Assembly” on...
TABLE A-2 Location of DIMMs

Connector Number    Location
J0501               MB/CMP0/CH0/R0/D0
J0601               MB/CMP0/CH0/R0/D1
J0701               MB/CMP0/CH0/R1/D0
J0810               MB/CMP0/CH0/R1/D1
J1001               MB/CMP0/CH3/R0/D0
J1101               MB/CMP0/CH3/R0/D1
J1201               MB/CMP0/CH3/R1/D0
J1301               MB/CMP0/CH3/R1/D1
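The connector-to-location pairs in TABLE A-2 can be captured as a lookup table, which also makes it possible to check a planned DIMM configuration against the population rules given in the DIMM replacement procedure (rank 0 must be fully populated, and DIMMs are added four at a time). The Python sketch below is illustrative only. It pairs the connectors and locations in the order they are listed, which agrees with the R1/D0(J0701) reference in the predictive self-healing example, and it interprets "four at a time" as a total count that is a multiple of four, which is an assumption. All function names are hypothetical.

```python
import re

# Connector number -> DIMM location, in the order listed in TABLE A-2.
DIMM_LOCATIONS = {
    "J0501": "MB/CMP0/CH0/R0/D0",
    "J0601": "MB/CMP0/CH0/R0/D1",
    "J0701": "MB/CMP0/CH0/R1/D0",
    "J0810": "MB/CMP0/CH0/R1/D1",
    "J1001": "MB/CMP0/CH3/R0/D0",
    "J1101": "MB/CMP0/CH3/R0/D1",
    "J1201": "MB/CMP0/CH3/R1/D0",
    "J1301": "MB/CMP0/CH3/R1/D1",
}

_LOCATION_RE = re.compile(r"^MB/CMP(\d+)/CH(\d+)/R(\d+)/D(\d+)$")


def parse_location(location):
    """Split a location such as MB/CMP0/CH0/R1/D0 into its numeric parts."""
    match = _LOCATION_RE.match(location)
    if match is None:
        raise ValueError(f"unrecognized DIMM location: {location!r}")
    cmp_id, channel, rank, dimm = (int(group) for group in match.groups())
    return {"cmp": cmp_id, "channel": channel, "rank": rank, "dimm": dimm}


def rank0_connectors():
    """Return the connectors whose location is in rank 0."""
    return {
        connector
        for connector, location in DIMM_LOCATIONS.items()
        if parse_location(location)["rank"] == 0
    }


def population_problems(populated):
    """List rule violations for a set of populated connector numbers."""
    populated = set(populated)
    problems = []
    unknown = populated - set(DIMM_LOCATIONS)
    if unknown:
        problems.append("unknown connectors: " + ", ".join(sorted(unknown)))
    missing = rank0_connectors() - populated
    if missing:
        problems.append("rank 0 not fully populated, missing: "
                        + ", ".join(sorted(missing)))
    # Assumption: "four at a time" means the total is a multiple of four.
    if len(populated) % 4 != 0:
        problems.append("DIMM count is not a multiple of four")
    return problems


if __name__ == "__main__":
    print(population_problems({"J0501", "J0601", "J1001", "J1101"}))
    print(population_problems({"J0501", "J0601", "J0701", "J0810"}))
```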