Intel S2600CO Family Technical Product Specification page 39

Functional Architecture Overview
3.2.2.5.2.1 Correctable Memory ECC Error Handling
A "Correctable ECC Error" is one in which a single-bit error in memory contents is detected and
corrected by use of the ECC Hamming Code included in the memory data. For a correctable
error, data integrity is preserved, but it may be a warning sign of a true failure to come. Note that
some correctable errors are expected to occur.
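As an illustration of how a Hamming code locates and corrects a single flipped bit, the following Python sketch uses a toy Hamming(7,4) code. The code size is an assumption made for brevity: the actual ECC word width and code used by the memory controller are not described in this section, and the function names are hypothetical.

```python
def hamming74_encode(nibble):
    """Encode 4 data bits into a 7-bit codeword (bit positions 1..7)."""
    code = [0] * 8                      # index 0 unused so indices match positions
    code[3] = nibble & 1                # data bits live at positions 3, 5, 6, 7
    code[5] = (nibble >> 1) & 1
    code[6] = (nibble >> 2) & 1
    code[7] = (nibble >> 3) & 1
    code[1] = code[3] ^ code[5] ^ code[7]   # parity over positions 1, 3, 5, 7
    code[2] = code[3] ^ code[6] ^ code[7]   # parity over positions 2, 3, 6, 7
    code[4] = code[5] ^ code[6] ^ code[7]   # parity over positions 4, 5, 6, 7
    return code[1:]


def hamming74_decode(bits):
    """Return (nibble, syndrome). A nonzero syndrome is the corrected position."""
    code = [0] + list(bits)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos             # XOR of the positions of all set bits
    if syndrome:
        code[syndrome] ^= 1             # single-bit error: flip the bit back
    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return nibble, syndrome


if __name__ == "__main__":
    word = hamming74_encode(0b1011)
    word[4] ^= 1                        # simulate a single-bit memory error
    print(hamming74_decode(word))
```

The data returned after correction matches what was written, which is why data integrity is preserved for a correctable error.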
The system BIOS has logic to cope with the random factor in correctable ECC errors. Rather
than reporting every correctable error that occurs, the BIOS has a threshold and logs a
correctable error only when the threshold value is reached. Additional correctable errors that occur
after the threshold has been reached are disregarded. In addition, on the expectation that the server
system may have extremely long operational runs without being rebooted, there is a "Leaky
Bucket" algorithm incorporated into the correctable error counting and comparing mechanism.
The "Leaky Bucket" algorithm reduces the correctable error count as a function of time – as the
system remains running for a certain amount of time, the correctable error count will "leak out"
of the counting registers. This prevents correctable error counts from building up over an
extended runtime.
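The counting behavior described above can be sketched in Python as follows. This is a minimal model, not the BIOS implementation: the class name, the leak rate of one count per interval, and the choice to keep ignoring errors once the threshold has been logged are all assumptions, since this section does not specify them.

```python
class LeakyBucketCounter:
    """Hypothetical model of a per-DIMM correctable error counter."""

    def __init__(self, threshold, leak_interval_s):
        self.threshold = threshold              # e.g. 20, 10, or 5
        self.leak_interval_s = leak_interval_s  # assumed: one count leaks per interval
        self.count = 0
        self.last_leak = 0.0
        self.logged = False

    def _leak(self, now):
        # Remove one count for every full interval that has elapsed.
        leaked = int((now - self.last_leak) // self.leak_interval_s)
        if leaked > 0:
            self.count = max(0, self.count - leaked)
            self.last_leak += leaked * self.leak_interval_s

    def record_error(self, now):
        """Return True only for the error that makes the count reach the threshold."""
        self._leak(now)
        if self.logged:
            return False                        # later errors are disregarded
        self.count += 1
        if self.count >= self.threshold:
            self.logged = True
            return True
        return False


if __name__ == "__main__":
    counter = LeakyBucketCounter(threshold=5, leak_interval_s=3600)
    for t in (0, 1, 2, 3, 7200, 7201, 7202, 7203):
        print(t, counter.record_error(t))
```

In this model, four errors in quick succession followed by a two-hour quiet period leave only two counts in the bucket, so sparse random errors do not add up to a threshold crossing over a long uptime.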
The correctable memory error threshold value is a configurable option in the <F2> BIOS Setup
Utility, where you can configure it for 20/10/5/ALL/None.
Once a correctable memory error threshold is reached, the event is logged to the System Event
Log (SEL) and the appropriate memory slot fault LED is lit to indicate on which DIMM the
correctable error threshold crossing occurred.
3.2.2.5.2.2 Uncorrectable Memory ECC Error Handling
All multi-bit "detectable but not correctable" memory errors are classified as Uncorrectable
Memory ECC Errors. This is generally a fatal error.
However, before returning control to the OS drivers from the Machine Check Exception (MCE)
or Non-Maskable Interrupt (NMI), the Uncorrectable Memory ECC error is logged to the SEL,
the appropriate memory slot fault LED is lit, and the System Status LED state is changed to a
solid Amber.
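The distinction between correctable single-bit errors and "detectable but not correctable" multi-bit errors can be illustrated with an extended Hamming (SECDED) code. The Python sketch below uses a toy 8-bit codeword with 4 data bits. The code size and function names are assumptions for illustration only, and the sketch covers single-bit and double-bit errors, not larger error patterns.

```python
def secded_encode(nibble):
    """Encode 4 data bits as Hamming(7,4) plus an overall parity bit at index 0."""
    code = [0] * 8
    code[3] = nibble & 1
    code[5] = (nibble >> 1) & 1
    code[6] = (nibble >> 2) & 1
    code[7] = (nibble >> 3) & 1
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    code[0] = sum(code[1:]) & 1         # makes the total number of 1 bits even
    return code


def secded_classify(code):
    """Classify a stored codeword as 'no error', 'correctable', or 'uncorrectable'."""
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos
    overall_odd = sum(code) & 1
    if syndrome == 0 and not overall_odd:
        return "no error"
    if overall_odd:
        return "correctable"            # odd parity: one bit flipped
    return "uncorrectable"              # even parity, nonzero syndrome: two bits flipped


if __name__ == "__main__":
    word = secded_encode(0b0110)
    word[2] ^= 1
    print(secded_classify(word))        # one flipped bit
    word[7] ^= 1
    print(secded_classify(word))        # two flipped bits
```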
3.2.2.5.3 Demand Scrubbing for ECC Memory
Demand scrubbing is the ability to write corrected data back to the memory once a correctable
error is detected on a read transaction. This allows data in memory to be corrected at detection
and decreases the chance that a second error accumulates at the same address and causes a
multi-bit error (MBE) condition.
Demand Scrubbing is enabled/disabled (default is enabled) in the Memory Configuration screen
in Setup.
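A minimal Python sketch of the write-back behavior follows. It models memory as a dictionary of toy Hamming(7,4) codewords. The class, its method names, and the code size are hypothetical and chosen only to show why writing the corrected word back matters.

```python
def encode(nibble):
    """Toy Hamming(7,4) encoder; index 0 of the returned list is unused."""
    code = [0] * 8
    for shift, pos in enumerate((3, 5, 6, 7)):
        code[pos] = (nibble >> shift) & 1
    for p in (1, 2, 4):
        code[p] = sum(code[i] for i in range(1, 8) if i & p) & 1
    return code


def syndrome(code):
    """XOR of the positions of all set bits; zero for a clean codeword."""
    s = 0
    for pos in range(1, 8):
        if code[pos]:
            s ^= pos
    return s


class EccMemory:
    def __init__(self, demand_scrub=True):
        self.cells = {}
        self.demand_scrub = demand_scrub

    def write(self, addr, nibble):
        self.cells[addr] = encode(nibble)

    def read(self, addr):
        code = list(self.cells[addr])
        s = syndrome(code)
        if s:
            code[s] ^= 1                    # correct the data returned to the reader
            if self.demand_scrub:
                self.cells[addr] = code     # demand scrub: repair the stored copy too
        return code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)


if __name__ == "__main__":
    for scrub in (True, False):
        mem = EccMemory(demand_scrub=scrub)
        mem.write(0x10, 0b1011)
        mem.cells[0x10][5] ^= 1             # first bit error
        first = mem.read(0x10)
        mem.cells[0x10][6] ^= 1             # second bit error, same address
        print(scrub, bin(first), bin(mem.read(0x10)))
```

With scrubbing enabled, the second error hits a clean word and is corrected like the first. With scrubbing disabled, the two errors accumulate in the stored word and this toy code returns wrong data on the second read.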
3.2.2.5.4 Patrol Scrubbing for ECC Memory
Patrol scrubs are intended to ensure that data with a correctable error does not remain in DRAM
long enough to stand a significant chance of further corruption to an uncorrectable stage.
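The intent can be sketched with an abstract Python model in which each address simply tracks which of its bits are currently in error. The function name and data layout are hypothetical, and the model says nothing about how the hardware schedules or paces the patrol.

```python
def patrol_pass(flipped):
    """Run one patrol pass over every address.

    `flipped` maps address -> set of bit positions currently in error.
    Single-bit errors are repaired in place; multi-bit errors are only reported.
    """
    corrected, uncorrectable = [], []
    for addr in sorted(flipped):
        bad_bits = flipped[addr]
        if len(bad_bits) == 1:
            bad_bits.clear()                # write the corrected data back
            corrected.append(addr)
        elif len(bad_bits) > 1:
            uncorrectable.append(addr)      # too late: already a multi-bit error
    return corrected, uncorrectable


if __name__ == "__main__":
    state = {0x00: set(), 0x40: {3}, 0x80: {1, 4}}
    print(patrol_pass(state))
    state[0x40].add(6)                      # a later error lands on a clean word
    print(patrol_pass(state))
```

In this model, address 0x40 survives two separate bit errors because a patrol pass ran between them, while address 0x80 accumulated both errors before any pass reached it.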
Intel order number G42278-002
Intel® Server Board S2600CO Family TPS
Revision 1.0
