Service Processor Reboot/Restart Policy Controls; Processor Boot-Time Deconfiguration (Cpu Repeat Gard); Processor Run-Time Deconfiguration (Cpu-Gard) - IBM RS/6000 44P 270 Service Manual

Service Processor Reboot/Restart Policy Controls

The operating system's automatic restart policy (see operating system documentation)
defines the operating system's response to a system crash. The service processor can
be instructed to refer to that policy through the Use OS-Defined Restart Policy setting
in its setup menu.

Processor Boot-Time Deconfiguration (CPU Repeat Gard)

Processor boot-time deconfiguration allows for the removal of processors from the
system configuration at boot time. The objective is to minimize system failure or data
integrity exposure due to a faulty processor.

This function uses processor hardware built-in self-test (BIST) and firmware power-on
self-test (POST) to discover and isolate processor hardware failures during boot time. It
also uses the hardware error detection logic in the processor to capture run-time
recoverable and irrecoverable errors. The firmware uses the error signatures in the
hardware to analyze and isolate the error to a specific processor.

The processors that are deconfigured remain off-line for subsequent reboots until the
faulty processor hardware is replaced.

This function allows users to manually deconfigure or re-enable a previously
deconfigured processor through the service processor menu. The user can also enable
or disable this function through the service processor.
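
The boot-time behavior described above can be illustrated with a minimal sketch. This is
not firmware code: the class BootGard, its methods, and the self_test callback are
hypothetical names chosen for illustration, the set of deconfigured CPUs stands in for
whatever persistent storage the firmware actually uses, and the behavior when the
function is disabled (no processors are removed) is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class BootGard:
    """Hypothetical model of boot-time processor deconfiguration."""

    enabled: bool = True
    # Stands in for the persistent record of deconfigured processors.
    deconfigured: set = field(default_factory=set)

    def boot(self, cpu_ids, self_test):
        """Return the processors left in the configuration after this boot."""
        if not self.enabled:
            # Assumption: with the function disabled, nothing is removed.
            return list(cpu_ids)
        for cpu in cpu_ids:
            # self_test stands in for BIST plus POST; False means a failure.
            if cpu not in self.deconfigured and not self_test(cpu):
                self.deconfigured.add(cpu)
        return [cpu for cpu in cpu_ids if cpu not in self.deconfigured]

    def manual_deconfigure(self, cpu):
        """Model of deconfiguring a processor from the service processor menu."""
        self.deconfigured.add(cpu)

    def reenable(self, cpu):
        """Model of re-enabling a processor, for example after replacement."""
        self.deconfigured.discard(cpu)


gard = BootGard()
first_boot = gard.boot([0, 1, 2, 3], lambda cpu: cpu != 2)  # CPU 2 fails self-test
second_boot = gard.boot([0, 1, 2, 3], lambda cpu: True)     # CPU 2 stays off-line
gard.reenable(2)
third_boot = gard.boot([0, 1, 2, 3], lambda cpu: True)      # CPU 2 is back
```

In this sketch, a processor that fails once is skipped on every later boot, even if it
would now pass its self-test, until it is explicitly re-enabled.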

Processor Run-Time Deconfiguration (CPU-Gard)

Processor run-time deconfiguration allows for the dynamic removal of CPUs from the
system configuration. The objective is to minimize system failures or data integrity
exposures due to a faulty processor. The processor to be removed is the one that has
experienced repeated run-time recoverable internal errors (over a predefined threshold).

The function uses the hardware error detection logic in the processor to capture
run-time recoverable errors. The firmware uses the error signatures in the hardware to
analyze and isolate the error to a specific CPU. The firmware also maintains
error-threshold information.

When the number of internal recoverable errors for a processor reaches a predefined
threshold, the firmware notifies the AIX operating system. The AIX operating system
migrates all software processes and interrupts to another processor and puts the faulty
processor in stop state.

CPUs that are deconfigured at run time remain off-line for subsequent reboots through
the CPU Boot Time Deconfiguration function, until the faulty CPU hardware is replaced.

The user can also enable or disable this function via the AIX system management
function.
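
The threshold mechanism described above can be sketched as follows. This is an
illustration only: RuntimeGard, report_recoverable_error, and the threshold value of 3
are hypothetical (the manual says only that the threshold is predefined), and the rule
that the last remaining processor is never stopped is an assumption of this sketch, not
a statement from the manual.

```python
from collections import Counter

ERROR_THRESHOLD = 3  # hypothetical value; the manual only says "predefined"


class RuntimeGard:
    """Hypothetical model of run-time processor deconfiguration."""

    def __init__(self, cpu_ids, threshold=ERROR_THRESHOLD):
        self.online = list(cpu_ids)
        self.stopped = []
        self.errors = Counter()  # per-CPU count of recoverable errors
        self.threshold = threshold
        self.enabled = True      # models the enable/disable setting

    def report_recoverable_error(self, cpu):
        """Count one recoverable error; return True if the CPU was stopped."""
        if cpu not in self.online:
            return False
        self.errors[cpu] += 1
        reached = self.errors[cpu] >= self.threshold
        # Assumption: the last remaining CPU is never stopped.
        if self.enabled and reached and len(self.online) > 1:
            self._stop(cpu)
            return True
        return False

    def _stop(self, cpu):
        # Stands in for the operating system migrating processes and
        # interrupts to another processor, then stopping this one.
        self.online.remove(cpu)
        self.stopped.append(cpu)


gard = RuntimeGard([0, 1])
results = [gard.report_recoverable_error(1) for _ in range(3)]
```

Here the first two errors on CPU 1 are only counted; the third reaches the threshold,
so CPU 1 moves from the online list to the stopped list while CPU 0 keeps running.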
Chapter 7. Using the Service Processor


This manual is also suitable for:

7044-270
