System Reliability, Availability, And Serviceability - Sun Microsystems Sun Fire T2000 Service Manual

distributed or physically inaccessible machines. In addition, ALOM CMT enables
you to run diagnostics (such as POST) remotely that would otherwise require
physical proximity to the server's serial port.
You can configure ALOM CMT to send email alerts of hardware failures, hardware
warnings, and other events related to the server or to ALOM CMT. The ALOM CMT
circuitry runs independently of the server, using the server's standby power.
Therefore, ALOM CMT firmware and software continue to function when the server
operating system goes offline or when the server is powered off. ALOM CMT
monitors the following server components:
- CPU temperature conditions
- Hard drive status
- Enclosure thermal conditions
- Fan speed and status
- Power supply status
- Voltage levels
- Faults detected by POST (power-on self-test)
- Solaris Predictive Self-Healing (PSH) diagnostic facilities
For information about configuring and using the ALOM system controller, refer to
the latest Advanced Lights Out Manager (ALOM) CMT Guide.
2.1.4 System Reliability, Availability, and Serviceability

Reliability, availability, and serviceability (RAS) are aspects of a system's design that
affect its ability to operate continuously and to minimize the time necessary to
service the system. Reliability refers to a system's ability to operate continuously
without failures and to maintain data integrity. System availability refers to the
ability of a system to recover to an operational state after a failure, with minimal
impact. Serviceability relates to the time it takes to restore a system to service
following a system failure. Together, reliability, availability, and serviceability
features provide for near-continuous system operation.
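
The relationship among the three terms can be illustrated with the standard steady-state availability formula, MTBF / (MTBF + MTTR). The sketch below is not taken from this manual: the function names and the sample figures are hypothetical and chosen only to show how a shorter repair time (serviceability) raises availability when the failure rate (reliability) is unchanged.

```python
HOURS_PER_YEAR = 8760.0  # 365 days x 24 hours


def steady_state_availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Fraction of time a system is operational.

    mtbf_hours: mean time between failures (a reliability measure).
    mttr_hours: mean time to repair (a serviceability measure).
    """
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


def annual_downtime_hours(availability: float) -> float:
    """Expected hours of downtime per year for a given availability."""
    return (1.0 - availability) * HOURS_PER_YEAR


if __name__ == "__main__":
    # Hypothetical figures, not measurements of any real server.
    mtbf = 10_000.0
    for label, mttr in (("8-hour repair", 8.0), ("30-minute repair", 0.5)):
        a = steady_state_availability(mtbf, mttr)
        print(f"{label}: availability={a:.6f}, "
              f"downtime/year={annual_downtime_hours(a):.2f} h")
```

With the failure rate held constant, the shorter repair time yields the higher availability figure, which is why features that shorten service time count toward RAS.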
To deliver high levels of reliability, availability, and serviceability, the server offers
the following features:
- Hot-pluggable hard drives
- Redundant, hot-swappable power supplies (two)
- Redundant, hot-swappable fan units (three)
- Environmental monitoring
- Error detection and correction for improved data integrity
- Easy access for most component replacements
- Extensive POST tests that automatically delete faulty components from the configuration
Sun Fire T2000 Server Service Manual • July 2007