
Server board technical product specification

Intel® Server Board SE7520JR2
6. Error Reporting and Handling

This section defines how errors are handled. Also discussed is the role of the BIOS in error
handling and the interaction between the BIOS, platform hardware, and server management
firmware with regard to error handling. In addition, error-logging techniques are described and
beep codes and POST messages are defined.
Note: The generic term "BMC" may be used throughout this section when a feature and/or
function being described is common to both the mBMC and the Sahalee BMC. If a described
feature or function is unique, the specific management controller will be referenced.
6.1 Fault Resilient Booting (FRB)

Fault Resilient Booting (FRB) is a set of BIOS and BMC algorithms and hardware support that
allow a multiprocessor system to boot in case of failure of the bootstrap processor (BSP) under
certain conditions. FRB functionality will differ depending on whether standard onboard
platform instrumentation is used (mBMC) or whether an Intel Management Module is used.
With on-board platform instrumentation, should a processor failure be detected during POST,
the mBMC does not have the ability to disable the failed or failing processor. Therefore the
system may or may not continue to boot. An FRB2 error will be logged to the System Event
Log (SEL) and an error will be displayed at POST. FRB2 is a BIOS-based algorithm that uses
the mBMC IPMI watchdog timer to protect against BIOS hangs during the POST process.
On systems that have an Intel Management Module installed, several different levels of FRB are
supported: FRB1, FRB2, FRB3, and OS Watchdog Timer. The FRB algorithms detect BSP
failures and take steps to disable that processor and reset the system so another processor will
run as the BSP.
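The difference between the two management configurations can be summarized as a small lookup table. The following Python sketch is illustrative only: the names Controller, FRB_SUPPORT, frb_levels, and can_disable_processor are hypothetical and do not correspond to any BIOS or BMC interface. It encodes only what the paragraphs above state.

```python
from enum import Enum


class Controller(Enum):
    """Management controller options named in this section."""
    MBMC = "on-board platform instrumentation (mBMC)"
    IMM = "Intel Management Module"


# Hypothetical summary table (not a real firmware structure).
# mBMC: only the BIOS-based FRB2 algorithm; a failed processor cannot be disabled.
# Intel Management Module: FRB1, FRB2, FRB3, and OS Watchdog Timer; a failed
# BSP can be disabled so that another processor runs as the BSP.
FRB_SUPPORT = {
    Controller.MBMC: {
        "levels": ("FRB2",),
        "can_disable_processor": False,
    },
    Controller.IMM: {
        "levels": ("FRB1", "FRB2", "FRB3", "OS Watchdog Timer"),
        "can_disable_processor": True,
    },
}


def frb_levels(controller):
    """Return the FRB levels described for the given controller."""
    return FRB_SUPPORT[controller]["levels"]


def can_disable_processor(controller):
    """Return True if a failed processor can be disabled in this configuration."""
    return FRB_SUPPORT[controller]["can_disable_processor"]
```

For example, frb_levels(Controller.MBMC) yields only "FRB2", which matches the statement that the mBMC cannot disable a failed or failing processor.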
6.1.1 FRB1 – BSP Self-Test Failures
The BIOS provides an FRB1 timer. Early in POST, the BIOS checks the Built-in Self Test
(BIST) results of the BSP. If the BSP fails BIST, the BIOS requests the Sahalee BMC to disable
the BSP. The Sahalee BMC disables the BSP, selects a new BSP and generates a system
reset. If there is no alternate processor available, the Sahalee BMC generates a beep code and
halts the system. If the Sahalee BMC is not installed, then BIOS can only notify the user that
the BIST failed; no processors will be disabled.
The BIST failure is displayed during POST and an error is logged to the SEL.
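The FRB1 decision flow described above can be sketched as a small function. This is a minimal model under stated assumptions, not firmware code: the function name frb1_actions, its parameters, and the action strings are all hypothetical, and the order of the returned actions is illustrative because the text does not specify the exact sequence. The sketch also assumes that, when no alternate processor is available, the Sahalee BMC only issues the beep code and halts.

```python
def frb1_actions(bsp_bist_passed, sahalee_bmc_installed, alternate_processor_available):
    """Return the actions this section describes for an FRB1 (BIST) check.

    All names and strings here are hypothetical labels for the steps in the text.
    """
    if bsp_bist_passed:
        # BSP passed its Built-in Self Test: FRB1 takes no action.
        return []

    # A BIST failure is displayed during POST and logged to the SEL.
    actions = ["display BIST failure during POST", "log error to SEL"]

    if not sahalee_bmc_installed:
        # Without the Sahalee BMC, the BIOS can only notify the user.
        actions.append("notify user; no processors disabled")
        return actions

    if alternate_processor_available:
        # BIOS asks the Sahalee BMC to disable the BSP; the BMC then
        # selects a new BSP and generates a system reset.
        actions += ["disable BSP", "select new BSP", "generate system reset"]
    else:
        # Assumption: with no alternate processor, only beep and halt occur.
        actions += ["generate beep code", "halt system"]
    return actions
```

Calling frb1_actions(False, True, True) returns the display and log steps followed by the disable, select, and reset steps, which corresponds to the multiprocessor recovery case.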
6.1.2 FRB2 – BSP POST Failures
A second timer (FRB2) is set to several minutes by BIOS and is designed to guarantee that the
system completes POST. The FRB2 timer is enabled just before the FRB3 timer is disabled to
prevent any "unprotected" window of time. Near the end of POST, the BIOS disables the FRB2
timer. If the system contains more than 1 GB of memory and the user chooses to test every
DWORD of memory, the watchdog timer is extended before the extended memory test starts,
because the memory test can exceed the timer duration. The BIOS will also disable the
watchdog timer before prompting the user for a boot password. If the system hangs during
POST, before the BIOS disables the FRB2 timer, the Sahalee BMC generates an asynchronous
Revision 1.0
C78844-002