Memory Fault Handling Overview - Fujitsu SPARC ENTERPRISE T5140 Service Manual


"FB-DIMM Configuration Guidelines for SPARC Enterprise T5140 Servers" on
page 94
"FB-DIMM Configuration Guidelines for SPARC Enterprise T5240 Servers" on
page 98

Memory Fault Handling Overview

A variety of features play a role in how the memory subsystem is configured and
how memory faults are handled. Understanding the underlying features helps you
identify and repair memory problems. This section describes how the server deals
with memory faults.
The following server features manage memory faults:

POST – By default, POST runs when the server is powered on.
For correctable memory errors (CEs), POST forwards the error to the Solaris
Predictive Self-Healing (PSH) daemon for error handling. If POST detects an
uncorrectable memory fault, it displays the fault with the device name of the
faulty FB-DIMMs, logs the fault, and then disables the faulty FB-DIMMs.
Depending on the memory configuration and the location of the faulty FB-DIMM,
POST disables either half of the physical memory in the system, or half of the
physical memory and half of the processor threads. When this offlining occurs
during normal operation, replace the faulty FB-DIMMs identified in the fault
message, then enable the disabled FB-DIMMs with the ILOM command
set device component_state=enabled, where device is the name of the FB-DIMM
being enabled (for example,
set /SYS/MB/CMP0/BR0/CH0/D0 component_state=enabled).

Solaris Predictive Self-Healing (PSH) technology – PSH uses the Fault Manager
daemon (fmd) to watch for various kinds of faults. When a fault occurs, it is
assigned a unique fault ID (UUID) and logged. PSH reports the fault and
recommends replacing the FB-DIMMs associated with the fault.
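
The re-enable step described in the POST item can be sketched as a small helper that builds the ILOM command line for one FB-DIMM. This is only an illustration, not part of the manual's procedure. The function name enable_command is hypothetical, and the name check assumes that every FB-DIMM device name has the same shape as the single example given above (/SYS/MB/CMP0/BR0/CH0/D0). The helper only produces the command text; it does not connect to the service processor.

```python
import re

# Assumption: FB-DIMM device names follow the shape of the one example in
# this section (/SYS/MB/CMP0/BR0/CH0/D0). Any other form is rejected here.
FBDIMM_NAME = re.compile(r"/SYS/MB/CMP\d+/BR\d+/CH\d+/D\d+")


def enable_command(device: str) -> str:
    """Return the ILOM command line that re-enables one FB-DIMM.

    Hypothetical helper: it builds the text of the command only.
    """
    if not FBDIMM_NAME.fullmatch(device):
        raise ValueError(f"unexpected FB-DIMM device name: {device!r}")
    return f"set {device} component_state=enabled"


if __name__ == "__main__":
    # Builds the same command as the example in the text.
    print(enable_command("/SYS/MB/CMP0/BR0/CH0/D0"))
```

Rejecting names that do not match the expected shape keeps a mistyped device name from being turned into a command. If your server reports FB-DIMM names in another form, the pattern needs to be adjusted.
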
If you suspect that the server has a memory problem, run the ILOM show faulty
command. This command lists memory faults and identifies the FB-DIMMs
associated with each fault.
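
If you save the text that show faulty prints, a short script can pull out the FB-DIMM names so that none are missed when several faults are listed. The sketch below makes no assumption about the layout of the command output. It only searches the text for names shaped like the example in this section (/SYS/MB/CMP0/BR0/CH0/D0), which is itself an assumption about the naming scheme. The function name fbdimms_in_text and the sample string are hypothetical and are not captured from a real server.

```python
import re

# Assumption: FB-DIMM names look like /SYS/MB/CMP0/BR0/CH0/D0, the one
# example given in this section. Names in any other form are not found.
FBDIMM_NAME = re.compile(r"/SYS/MB/CMP\d+/BR\d+/CH\d+/D\d+")


def fbdimms_in_text(text: str) -> list[str]:
    """Return the distinct FB-DIMM names found in text, in first-seen order."""
    seen: list[str] = []
    for name in FBDIMM_NAME.findall(text):
        if name not in seen:
            seen.append(name)
    return seen


if __name__ == "__main__":
    # Hypothetical saved text, for illustration only.
    saved = (
        "fault against /SYS/MB/CMP0/BR0/CH0/D0\n"
        "fault against /SYS/MB/CMP0/BR0/CH0/D0\n"
        "fault against /SYS/MB/CMP1/BR1/CH0/D0\n"
    )
    for name in fbdimms_in_text(saved):
        print(name)
```

Each name is reported once even if it appears in several lines, so the result can be used directly as a list of modules to inspect.
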
Related Information
"POST Overview" on page 35
"Solaris PSH Feature Overview" on page 44
"PSH-Detected Fault Console Message" on page 45
"Identify Faulty FB-DIMMs Using the show faulty Command" on page 83
"Identify Faulty FB-DIMMs Using the FB-DIMM Fault Locator Button" on page 83
SPARC Enterprise T5140 and T5240 Servers Service Manual • July 2009
