Thresholding; SEL Event Log Format For Machine Check Errors - Dell PowerEdge 3250 Product Manual


SR870BH2 Machine Check Error Handling
5.5 Thresholding

MCA errors are classified into one of three categories: corrected, recoverable, and fatal. In
general, corrected errors do not affect the operation of the system and therefore may occur
repeatedly, whereas fatal and most recoverable errors result in a system reset. In some cases,
such as a stuck bit in a memory DIMM, a corrected error may occur at a very high frequency. In
this scenario, the system may experience performance degradation because of the excessive
time spent in the error logging routines. In addition, the BMC SEL has a finite size and may be
quickly filled with duplicate errors. To help alleviate these problems, a thresholding algorithm
has been applied to the BMC SEL logging routines. If the threshold is crossed, a special "Event
Logging Disabled" SEL entry is created and the BMC SEL logging code does not attempt to send
further platform event message commands for that error type to the BMC.

This greatly reduces the time spent in the SEL logging routines and avoids overrunning the
BMC SEL storage. Thresholding in no way affects the ability of the OS to be notified of and to
service CPEIs or CMCIs, nor does it disable any error correction logic in the chipset. Any
disabled event reporting is re-enabled on the next reboot.

Corrected errors are grouped into four categories: Processor, Memory, PCI PERR, and Generic
Bus. History for each category is maintained separately. Thresholding applies only to
corrected errors, not to recoverable or fatal errors. On the SR870BH2, the maximum number of
errors that can occur in each category is 10 within one hour. If this threshold is crossed, a
special "Event Logging Disabled" SEL entry is logged.
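
The thresholding behavior described above can be sketched roughly as follows. This is an illustration only, not the SR870BH2 firmware: the class name `SelThrottle`, the method `record`, and the returned action strings are hypothetical. The sketch also assumes a sliding one-hour window and assumes the threshold is "crossed" by the eleventh error in that window; the manual does not specify either detail.

```python
from collections import deque
from enum import Enum


class Category(Enum):
    # The four corrected-error categories named in the text.
    PROCESSOR = "Processor"
    MEMORY = "Memory"
    PCI_PERR = "PCI PERR"
    GENERIC_BUS = "Generic Bus"


class SelThrottle:
    """Hypothetical per-category threshold tracker for corrected errors."""

    def __init__(self, max_errors=10, window_seconds=3600):
        self.max_errors = max_errors
        self.window_seconds = window_seconds
        # History is kept separately for each category.
        self.history = {c: deque() for c in Category}
        self.disabled = set()

    def record(self, category, now):
        """Return the action taken for one corrected error at time `now` (seconds)."""
        if category in self.disabled:
            # No further platform event messages are sent for this category.
            return "suppressed"
        times = self.history[category]
        # Assumption: sliding window; drop timestamps older than the window.
        while times and now - times[0] >= self.window_seconds:
            times.popleft()
        times.append(now)
        # Assumption: the threshold is crossed when the count exceeds the maximum.
        if len(times) > self.max_errors:
            self.disabled.add(category)
            return "event_logging_disabled"
        return "logged"


throttle = SelThrottle()
actions = [throttle.record(Category.MEMORY, second) for second in range(12)]
```

In this sketch, creating a new `SelThrottle` instance stands in for a reboot, which re-enables reporting for every category. Recoverable and fatal errors would bypass the tracker entirely, since thresholding applies only to corrected errors.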
5.6 SEL Event Log Format for Machine Check Errors

The following table shows the machine check errors that will be logged for the SR870BH2 and
the corresponding SEL Event Log format. For details on System Management BIOS (SMBIOS)
Type 4, Type 16, and Type 17, refer to the System Management BIOS Reference Specification
available at www.dmtf.org.
Intel® Server Platform SR870BH2
Revision 1.1
