Error Correction And Parity Checking; Predictive Self-Healing - Sun Microsystems Sun Fire T2000 Service Manual

threshold or rises above a high-temperature threshold, the monitoring subsystem
software lights the amber Service Required LEDs on the front and back panel. If the
temperature condition persists and reaches a critical threshold, the system initiates a
graceful server shutdown.
All error and warning messages are sent to the system controller (SC) and the
console, and are logged in the ALOM CMT log file. Additionally, some FRUs, such as
power supplies, provide LEDs that indicate a failure within the FRU.
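
The threshold behavior described above can be illustrated with a minimal sketch. The threshold values, the Action names, and the classify_temperature function are assumptions made for illustration; they are not taken from this manual or from the ALOM CMT firmware. The sketch also ignores the requirement that the condition persist before a shutdown begins.

```python
from enum import Enum


class Action(Enum):
    NONE = "no action"
    SERVICE_REQUIRED_LED = "light amber Service Required LEDs"
    GRACEFUL_SHUTDOWN = "initiate graceful server shutdown"


# Hypothetical thresholds in degrees Celsius (illustrative values only).
LOW_WARN_C = 5.0
HIGH_WARN_C = 50.0
CRITICAL_C = 60.0


def classify_temperature(temp_c: float) -> Action:
    """Map one sensor reading to the response described in the text."""
    if temp_c >= CRITICAL_C:
        return Action.GRACEFUL_SHUTDOWN
    if temp_c < LOW_WARN_C or temp_c > HIGH_WARN_C:
        return Action.SERVICE_REQUIRED_LED
    return Action.NONE
```

With these assumed values, a reading of 55.0 maps to the Service Required LED response, and a reading of 65.0 maps to a graceful shutdown.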
2.1.4.5 Error Correction and Parity Checking

The UltraSPARC T1 multicore processor provides parity protection on its internal
cache memories, including tag parity and data parity on the D-cache and I-cache.
The internal 3 Mbyte L2 cache has parity protection on the tags and ECC protection
on the data.
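
As a general illustration of parity protection, the following sketch stores one even-parity bit with a data word and checks it on read. It is not a model of the UltraSPARC T1 cache logic; the function names and the 8-bit example word are assumptions.

```python
from typing import Tuple


def even_parity_bit(word: int) -> int:
    """Return the bit that makes the total count of 1 bits even."""
    return bin(word).count("1") % 2


def store(word: int) -> Tuple[int, int]:
    """Return the word together with its parity bit, as it would be stored."""
    return word, even_parity_bit(word)


def parity_ok(word: int, parity: int) -> bool:
    """Recompute parity on read and compare it with the stored bit."""
    return even_parity_bit(word) == parity


data, parity = store(0b10110010)
single_flip = data ^ 0b00000100   # one bit in error
double_flip = data ^ 0b00000110   # two bits in error
```

Here parity_ok(single_flip, parity) is False, so the single-bit error is detected. A single parity bit cannot locate or correct the error, and it misses the two-bit error in double_flip, which is why data that must be corrected is protected with ECC instead.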
Advanced ECC, also called chipkill, corrects up to 4 bits in error on nibble
boundaries, as long as the bits are all in the same DRAM. If a DRAM fails, the
DIMM continues to function.
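
The nibble-boundary condition can be expressed as a simple check. The sketch below assumes that each DRAM contributes one aligned 4-bit nibble to the memory word, and it tests only whether an error pattern meets the stated condition. It does not implement the ECC code itself, and the function name is hypothetical.

```python
NIBBLE_BITS = 4


def confined_to_one_nibble(error_mask: int) -> bool:
    """Return True if every set bit of error_mask lies in one aligned nibble.

    error_mask has a 1 in each bit position that is in error.
    """
    if error_mask == 0:
        return True  # no bits in error
    # Position of the lowest set bit, then the aligned nibble that holds it.
    lowest_bit = (error_mask & -error_mask).bit_length() - 1
    nibble_index = lowest_bit // NIBBLE_BITS
    nibble_mask = 0xF << (nibble_index * NIBBLE_BITS)
    # Any error bit outside that nibble violates the condition.
    return error_mask & ~nibble_mask == 0
```

For example, the mask 0b11110000 (four bits in the second nibble) satisfies the condition, while 0b00111100 does not, because its four bits straddle two nibbles and therefore, under the stated assumption, two DRAMs.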
2.1.5 Predictive Self-Healing

The server features the latest fault management technologies. The Solaris 10
Operating System (OS) introduces a new architecture for building and deploying
systems and services capable of Predictive Self-Healing. Self-healing technology
enables systems to accurately predict component failures and mitigate many serious
problems before they occur. This technology is incorporated into both the hardware
and software of the server.
At the heart of the Predictive Self-Healing capabilities is the Solaris Fault Manager, a
service that receives data relating to hardware and software errors, and
automatically and silently diagnoses the underlying problem. Once a problem is
diagnosed, a set of agents automatically responds by logging the event and, if
necessary, taking the faulty component offline. Because problems are diagnosed
automatically, business-critical applications and essential system services can
continue uninterrupted in the event of software failures or major hardware
component failures.
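
The receive, diagnose, and respond flow described above can be sketched as a toy model. This is not the Solaris Fault Manager or its API. The class name, the fixed error-count threshold used as the diagnosis rule, and the component names are all assumptions, and the sketch always takes a diagnosed component offline rather than deciding whether that is necessary.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Set

# Hypothetical rule: this many error reports lead to a fault diagnosis.
FAULT_THRESHOLD = 3


@dataclass
class ToyFaultManager:
    error_counts: Counter = field(default_factory=Counter)
    log: List[str] = field(default_factory=list)
    offline: Set[str] = field(default_factory=set)

    def report_error(self, component: str) -> None:
        """Receive one error report and diagnose once enough accumulate."""
        if component in self.offline:
            return  # already offline; nothing more to do
        self.error_counts[component] += 1
        if self.error_counts[component] >= FAULT_THRESHOLD:
            self._respond(component)

    def _respond(self, component: str) -> None:
        """Agent response: log the event and take the component offline."""
        self.log.append(f"fault diagnosed: {component}")
        self.offline.add(component)


fm = ToyFaultManager()
for _ in range(3):
    fm.report_error("dimm0")
```

After the third report, "dimm0" appears in fm.offline and one entry is recorded in fm.log, while components that reported no errors are unaffected.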
2-8
Sun Fire T2000 Server Service Manual • July 2007
