SR870BH2 Machine Check Error Handling; Classification of Errors; Error Types - Dell PowerEdge 3250 Product Manual

Intel® Server Platform SR870BH2
5. SR870BH2 Machine Check Error Handling

This section gives an overview of the implementation of machine check error handling on the
Server Platform SR870BH2. For additional details about Itanium-based system error generation
and error handling, refer to the Itanium® Processor Family Error Handling Guide (document
number: 249278-002) and the Itanium® System Abstraction Layer Specification (document
number: 245359-005). Both documents can be downloaded from the web at
www.developer.intel.com.

The goal of the Machine Check Architecture (MCA) is to contain errors, correcting as many as
possible, before they propagate to the network or to permanent storage. If an error cannot be
corrected by the hardware or firmware, and the OS cannot handle it, the machine shall be reset.
MCA errors include ECC, BINIT, BERR, SERR, and PERR. The BIOS handles these conditions through
SAL 3.0-compatible services.

5.1 Classification of Errors

Error events are classified by the processor and platform into three basic groups. This section
provides a summary of the different error types and signaling methods defined by the Itanium
Machine Check Architecture (MCA) and implemented in the Server Platform SR870BH2.

5.2 Error Types

Fatal: An error in which state has been corrupted and the error may or may not be contained.
The platform signals a fatal error when the integrity of the platform or subsystem cannot be
determined. These errors cannot be corrected by hardware, firmware, or system software. A reset
of the system or subsystem is required.

Recoverable/Uncorrectable: An error has been detected that cannot be corrected by hardware or
firmware. However, the operating integrity of the platform hardware and the system state has
been maintained. Whether these errors can be recovered depends on the capabilities of the
system software.

Correctable: An error has been detected and corrected by hardware, or by processor/platform
firmware.
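
The three error types, together with the reset rule stated earlier in this section, can be summarized as a small decision function. The sketch below is illustrative only. It is written in Python for readability and is not platform firmware; the names `Severity`, `Action`, and `required_action` are hypothetical and do not correspond to any SAL or BIOS interface.

```python
from enum import Enum


class Severity(Enum):
    """The three error types described in this section."""
    CORRECTABLE = "correctable"
    RECOVERABLE = "recoverable/uncorrectable"
    FATAL = "fatal"


class Action(Enum):
    """Hypothetical outcome labels, chosen for illustration."""
    CONTINUE = "already corrected by hardware or firmware"
    OS_RECOVERY = "system software attempts recovery"
    RESET = "reset of the system or subsystem"


def required_action(severity: Severity, os_can_recover: bool = False) -> Action:
    """Map an error type to the outcome this section describes.

    os_can_recover is an assumed flag standing in for "system software
    capabilities"; how a real OS reports this is outside this sketch.
    """
    if severity is Severity.CORRECTABLE:
        # Hardware or processor/platform firmware has already fixed the error.
        return Action.CONTINUE
    if severity is Severity.RECOVERABLE and os_can_recover:
        # Platform integrity is intact, so system software may recover.
        return Action.OS_RECOVERY
    # Fatal errors, and uncorrected errors the OS cannot handle, require a reset.
    return Action.RESET
```

For example, `required_action(Severity.RECOVERABLE, os_can_recover=False)` yields `Action.RESET`, which matches the rule that the machine is reset when neither hardware, firmware, nor the OS can handle the error.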
Revision 1.1