To configure the watchdog timer value, perform the following steps:
1. On the PIC Main Menu Bar, click the Configure->Watchdog Timer Value menu selection.
2. Update the timer value.
3. Click <OK>.
ECC Memory
PIC reports memory status information, both memory arrays and memory devices, for systems that
support ECC memory. The ECC memory subsystems can detect and report both single-bit errors
(SBE) and multiple-bit errors (MBE).
Depending on the managed server hardware, memory devices are either SIMMs or DIMMs.
Memory device references use the appropriate device name in the PIC Console software.
Single-Bit Error Handling
If an SBE occurs, the system generates an SMI that allows the BIOS to log information about the
error in the System Event Log (SEL). This information identifies the exact SIMM or DIMM in
which the error occurred. Because this condition is recoverable, BIOS returns the system to normal
operation after logging the error.
This error is indicated in the health branch of the PIC Console software as a noncritical condition,
the requested event actions are carried out, and a noncritical error count is incremented on the
Sensor Settings tab page of the software.
Also, the following actions occur in the PIC Console software:
• The Device Error Type is set to SBE on the Sensor Information tab page for the Memory
  Device.
• The Last Error Update value is set to "During PIC Runtime," indicating the update occurred
  while the system was operational.
The BIOS stops logging noncritical SBEs when the SBE error count reaches nine. This prevents
the errors from filling the SEL. Upon system reboot, the OS uses the SEL records, along with the
results from its own memory test, to map out bad memory by reducing the usable size of a memory
bank to avoid using the bad memory element(s). This elimination of hard errors is a precaution
that prevents SBEs from becoming MBEs after the system has booted, and also prevents SBEs
from being detected and logged each time the failed location(s) are accessed. Upon reboot, the
SBE error count is set to zero in the SEL.
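The logging limit described above can be pictured with a small model. This is only an illustrative sketch, not the actual BIOS logic: the class name `SbeLogModel`, its methods, and the device label "DIMM 1" are hypothetical, and it assumes that nine SBEs are logged before logging stops.

```python
from dataclasses import dataclass, field

# From the text: logging stops when the SBE count reaches nine.
SBE_LOG_LIMIT = 9


@dataclass
class SbeLogModel:
    """Toy model of SBE logging; not the actual BIOS implementation."""
    sel: list = field(default_factory=list)  # stand-in for the System Event Log
    sbe_count: int = 0

    def on_sbe(self, device: str) -> bool:
        """Handle one SBE on `device`; return True if an SEL entry was written."""
        if self.sbe_count >= SBE_LOG_LIMIT:
            return False  # limit reached: skip logging so the SEL does not fill
        self.sbe_count += 1
        self.sel.append(("SBE", device))
        return True

    def reboot(self) -> None:
        """Reset the SBE count, as the text describes for a system reboot."""
        self.sbe_count = 0


model = SbeLogModel()
results = [model.on_sbe("DIMM 1") for _ in range(12)]
logged = sum(results)  # number of SBEs that produced an SEL entry
```

In this model, only the first nine of the twelve simulated errors produce log entries. After `reboot()` is called, logging resumes because the count starts again from zero.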
Multiple-Bit Error Handling
If an MBE occurs, the system generates an SMI that allows the BIOS to log information about the
error in the SEL, identifying the memory bank in which the error occurred. However, on some
systems, it is not possible to determine the exact SIMM/DIMM that caused an MBE.
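The difference in how precisely the two error types can be located can be sketched as a record with an optional device field. The record layout, the field names, and the `describe_location` helper are hypothetical and do not reflect the real SEL format. Storing a bank number for an SBE is also an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MemoryErrorRecord:
    """Hypothetical shape of a logged memory error; not the real SEL format."""
    kind: str                     # "SBE" or "MBE"
    bank: int                     # memory bank in which the error occurred
    device: Optional[str] = None  # exact SIMM/DIMM, when the hardware reports it


def describe_location(rec: MemoryErrorRecord) -> str:
    """Return the most specific location available for the error."""
    if rec.device is not None:
        return f"{rec.kind} in {rec.device} (bank {rec.bank})"
    return f"{rec.kind} in bank {rec.bank} (device not identified)"


# An SBE identifies the exact device; an MBE may identify only the bank.
sbe = MemoryErrorRecord(kind="SBE", bank=0, device="DIMM 2")
mbe = MemoryErrorRecord(kind="MBE", bank=1)
print(describe_location(sbe))
print(describe_location(mbe))
```

Making the device field optional lets one record type cover both cases. Anything reading the record can fall back to the bank when the device is unknown.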
Because an MBE is a critical condition, upon logging the error the BIOS generates an NMI that
halts the system. Upon rebooting the server, this error is indicated as a critical condition on the
Memory Array and Memory Device in the health branch of the PIC Console software. The