FRB-1 - BSP Self-Test Failures; FRB Debug Methodology - IBM eServer xSeries x382 Hardware Maintenance Manual and Troubleshooting Guide

Type 8834

previously failing processor, logging the appropriate event into the SEL, and
displaying an appropriate error message to the user.

FRB-1 - BSP self-test failures.

In addition to the FRB-3 and FRB-2 timers, the BIOS provides FRB-1. Early in POST,
the BIOS checks the Built-in Self Test (BIST) results of the BSP. If the BSP fails
BIST, the BIOS requests the BMC to disable the BSP. The BMC disables the BSP,
selects a new BSP, and generates a system reset. If there is no alternate processor
available, the BMC beeps the system speaker and enters "final desperation
mode", a scheme whereby the system attempts to boot in spite of failed
processors.
The BIOS and BMC implement additional safeguards to detect and disable failing
application processors (APs) in a multiprocessor system. If an AP fails to complete
initialization within a certain time, it is assumed to be nonfunctional. If the BIOS
detects that an AP has failed BIST or is nonfunctional, it requests the BMC to
disable that processor. After the BMC disables the processor and generates a
system reset, the BIOS does not see the bad processor in the next boot cycle. The
failing AP is not listed in the ACPI APIC tables and is invisible to the OS.
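
The FRB-1 and AP handling described above can be read as a small decision procedure. The following Python sketch only illustrates that logic; it is not the BIOS or BMC firmware. The names Processor, frb1_check, and disable_failed_aps are hypothetical, and the rule for choosing the new BSP (lowest remaining slot) is an assumption, since the text does not say how the BMC selects it.

```python
from dataclasses import dataclass


@dataclass
class Processor:
    slot: int
    bist_ok: bool = True      # Built-in Self Test result
    init_ok: bool = True      # AP finished initialization within the time limit
    disabled: bool = False    # set when the (simulated) BMC disables the processor


def frb1_check(cpus, bsp_slot):
    """Simulate the FRB-1 decision for one boot attempt.

    Returns (bsp_slot, action), where action is "continue", "reset",
    or "final_desperation".
    """
    bsp = cpus[bsp_slot]
    if bsp.bist_ok:
        return bsp_slot, "continue"
    alternates = [p for p in cpus.values()
                  if p.slot != bsp_slot and not p.disabled]
    if not alternates:
        # No alternate processor: keep the original BSP and try to boot anyway.
        return bsp_slot, "final_desperation"
    bsp.disabled = True
    # Assumption: pick the lowest-numbered remaining slot; the manual does
    # not say how the BMC chooses the new BSP.
    new_bsp = min(p.slot for p in alternates)
    return new_bsp, "reset"


def disable_failed_aps(cpus, bsp_slot):
    """Disable APs that failed BIST or did not initialize; return usable AP slots."""
    for p in cpus.values():
        if p.slot != bsp_slot and not (p.bist_ok and p.init_ok):
            p.disabled = True
    return sorted(p.slot for p in cpus.values()
                  if p.slot != bsp_slot and not p.disabled)
```

In this sketch a "reset" result stands for the system reset generated by the BMC, after which the check would run again with the new BSP.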

FRB debug methodology:

All failures (FRB-3, FRB-2, FRB-1, and AP failures), together with the identity of
the failing processor, are recorded in the SEL. The FRB-3 failure is recorded
automatically by the BMC, while the FRB-2, FRB-1, and AP failures are logged to the
SEL by the BIOS. In the case of an FRB-2 failure, some systems log additional
information in the OEM data byte fields of the SEL entry. This additional data
indicates the last POST task that was executed before the FRB-2 timer expired,
which may be useful for failure analysis.
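
The division of logging duties above can be summarized in a few lines of Python. This is a hedged illustration only: FrbSelRecord, LOGGED_BY, and make_record are hypothetical names, and the record is a simplified model rather than the actual SEL entry layout, which this section does not specify.

```python
from dataclasses import dataclass
from typing import Optional

# Which component writes the SEL entry for each failure type (from the text).
LOGGED_BY = {"FRB-3": "BMC", "FRB-2": "BIOS", "FRB-1": "BIOS", "AP": "BIOS"}


@dataclass
class FrbSelRecord:
    failure: str                          # "FRB-3", "FRB-2", "FRB-1", or "AP"
    processor: int                        # failing processor (hypothetical numbering)
    logged_by: str                        # "BMC" or "BIOS"
    last_post_task: Optional[int] = None  # OEM data; FRB-2 on some systems only


def make_record(failure, processor, last_post_task=None):
    """Build a simplified record; only FRB-2 entries keep the last POST task."""
    if failure not in LOGGED_BY:
        raise ValueError(f"unknown failure type: {failure}")
    if failure != "FRB-2":
        last_post_task = None
    return FrbSelRecord(failure, processor, LOGGED_BY[failure], last_post_task)
```

When reading a log, the logged_by field in this model shows whether the BMC or the BIOS reported the failure, and last_post_task, when present, points at the POST step that was running when the FRB-2 timer expired.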
The BIOS and BMC maintain a failure history for each processor in nonvolatile
storage. This history serves as the processor's track record. Once a processor
is marked "failed," it remains "failed" until the user forces the system to retest the
processor by entering BIOS Setup and selecting the "Retest processors" option.
The BIOS reminds the user about a previous processor failure during each boot
cycle until all processors have been retested and have passed the FRB tests
or AP initialization.
It is possible for all the processors in the system to be marked bad. If all the
processors are bad, the system, in final desperation mode, does not alter the BSP
and attempts to boot from the original BSP. Again, error messages are displayed on
the console and errors are logged in the SEL against a failing or non-healthy
processor. The exception is the single-processor case: the error is still logged,
but if final desperation mode fails, there is no video display.
If the user replaces a processor that has been marked bad by the system, the
system must be informed about this change by running BIOS Setup and selecting
the processor retest option.
Selecting the processor retest option in BIOS Setup causes the BIOS and BMC to
clear the processor failure history from their respective nonvolatile storage.
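
The failure-history behavior described above (a sticky "failed" mark, a reminder at each boot, and clearing on retest) can be sketched as follows. This is a minimal model under stated assumptions: ProcessorFailureHistory and its methods are hypothetical names, the reminder text is invented for illustration, and a plain dictionary stands in for the nonvolatile storage kept by the BIOS and BMC.

```python
class ProcessorFailureHistory:
    """Simplified per-slot failure history; a dict stands in for nonvolatile storage."""

    def __init__(self, slots):
        self._failed = {slot: False for slot in slots}

    def mark_failed(self, slot):
        # Once set, the mark stays until retest_processors() is called.
        self._failed[slot] = True

    def failed_slots(self):
        return sorted(s for s, failed in self._failed.items() if failed)

    def boot_reminder(self):
        """Message shown at each boot while any failure is on record, else None."""
        failed = self.failed_slots()
        if not failed:
            return None
        slots = ", ".join(str(s) for s in failed)
        return f"Previous processor failure recorded for slot(s): {slots}"

    def retest_processors(self):
        """Model of the BIOS Setup retest option: clear the stored history."""
        for slot in self._failed:
            self._failed[slot] = False
```

The model shows why a replaced processor still appears as failed until the retest option is selected: nothing other than retest_processors clears the stored mark.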
There are three possible states for each processor slot:
- Microprocessor installed (status only; indicates processor has passed BIOS POST).
