Detecting - IBM BladeCenter PS700 Technical Overview And Introduction


4.4.1 Detecting

The first and most crucial component of a solid serviceability strategy is the ability to detect
errors accurately and effectively when they occur. Although not all errors are a guaranteed
threat to system availability, those that go undetected can cause problems because the
system does not have the opportunity to evaluate and act if necessary. POWER
processor-based systems employ IBM System z® server-inspired error detection
mechanisms that extend from processor cores and memory to power supplies and hard
drives.
Service processor
The service processor is a separate microprocessor from the main instruction processing
complex. The service processor provides the capabilities for the following elements:
– POWER Hypervisor (system firmware), Integrated Virtualization Manager (IVM), and
BladeCenter Advanced Management Module (AMM) coordination
– Remote power control options
– Reset and boot features
Environmental monitoring
The service processor monitors the server's built-in temperature sensors and sends this
information to the BladeCenter AMM. The AMM can send instructions to the BladeCenter
fans to increase rotational speed when the ambient temperature is beyond the normal
operating range. Using an architected operating system interface, the service processor
notifies the operating system of potential environmental problems so that the system
administrator can take appropriate corrective actions before a critical failure threshold is
reached.
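The monitoring flow described above can be sketched roughly as follows. This is only an illustrative model, not IBM firmware: the function name, the action strings, and the 35 °C limit are hypothetical, and the sketch assumes that "beyond the normal operating range" means above an upper limit and that the operating system is notified under the same condition.

```python
# Hypothetical upper limit of the normal operating range, in degrees Celsius.
# The source text gives no numeric thresholds; this value is for illustration only.
NORMAL_MAX_C = 35.0


def environmental_actions(ambient_c: float, normal_max_c: float = NORMAL_MAX_C) -> list:
    """Return the actions taken for one ambient temperature reading."""
    # The service processor always forwards sensor readings to the AMM.
    actions = ["service processor: report reading to AMM"]
    if ambient_c > normal_max_c:
        # The AMM can instruct the BladeCenter fans to spin faster.
        actions.append("AMM: increase fan speed")
        # Assumption: the OS notification is triggered by the same condition.
        actions.append("service processor: notify operating system")
    return actions
```

For a reading inside the assumed range, only the report to the AMM is produced; a reading above it adds the fan and notification steps.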
The service processor can also post a warning and initiate an orderly system shutdown in
the following circumstances:
– The operating temperature exceeds the critical level (for example, failure of air
conditioning or air circulation around the system).
– The system fan speed is out of operational specification (for example, because of
multiple fan failures).
– The server input voltages are out of operational specification.
The service processor can immediately shut down a system in the following
circumstances:
– The temperature exceeds the critical level or remains beyond the warning level for
too long.
– Internal component temperatures reach critical levels.
– A non-redundant fan fails.
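The two lists above amount to a mapping from detected conditions to a shutdown response. A minimal sketch of that mapping follows; the enum names and the "most severe response wins" rule are assumptions made for illustration, and the text does not say how "operating temperature" in the first list differs from "temperature" in the second, so both are kept as separate conditions exactly as listed.

```python
from enum import Enum, IntEnum
from typing import Iterable


class Response(IntEnum):
    NONE = 0
    ORDERLY_SHUTDOWN = 1    # a warning is posted, then an orderly shutdown starts
    IMMEDIATE_SHUTDOWN = 2


class Condition(Enum):
    OPERATING_TEMP_OVER_CRITICAL = "operating temperature exceeds the critical level"
    FAN_SPEED_OUT_OF_SPEC = "system fan speed is out of operational specification"
    INPUT_VOLTAGE_OUT_OF_SPEC = "server input voltages are out of operational specification"
    TEMP_OVER_CRITICAL = "temperature exceeds the critical level"
    TEMP_OVER_WARNING_TOO_LONG = "temperature remains beyond the warning level for too long"
    INTERNAL_COMPONENT_TEMP_CRITICAL = "internal component temperatures reach critical levels"
    NON_REDUNDANT_FAN_FAILED = "non-redundant fan fails"


# Transcription of the two lists in the text.
RESPONSE_FOR = {
    Condition.OPERATING_TEMP_OVER_CRITICAL: Response.ORDERLY_SHUTDOWN,
    Condition.FAN_SPEED_OUT_OF_SPEC: Response.ORDERLY_SHUTDOWN,
    Condition.INPUT_VOLTAGE_OUT_OF_SPEC: Response.ORDERLY_SHUTDOWN,
    Condition.TEMP_OVER_CRITICAL: Response.IMMEDIATE_SHUTDOWN,
    Condition.TEMP_OVER_WARNING_TOO_LONG: Response.IMMEDIATE_SHUTDOWN,
    Condition.INTERNAL_COMPONENT_TEMP_CRITICAL: Response.IMMEDIATE_SHUTDOWN,
    Condition.NON_REDUNDANT_FAN_FAILED: Response.IMMEDIATE_SHUTDOWN,
}


def select_response(active: Iterable[Condition]) -> Response:
    """Pick the most severe response among the active conditions (assumed rule)."""
    return max((RESPONSE_FOR[c] for c in active), default=Response.NONE)
```

With no active conditions the sketch returns `Response.NONE`; if a fan-speed fault and a non-redundant fan failure are active together, the immediate shutdown takes precedence.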
Mutual surveillance
The service processor monitors the operation of the POWER Hypervisor firmware during
the boot process and watches for loss of control during system operation. It also allows
the POWER Hypervisor to monitor service processor activity. The service processor can
take appropriate action, including calling for service, when it detects the POWER
Hypervisor firmware has lost control. Likewise, the POWER Hypervisor can request a
service processor repair action if necessary.
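One common way to implement this kind of two-way monitoring is a pair of heartbeat timers, each party watching the other. The text does not describe the actual mechanism, so the sketch below is an assumption: the `Heartbeat` class, the timeout values, and the action strings are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Heartbeat:
    """Tracks when a monitored party last signalled that it is alive."""
    timeout_s: float          # hypothetical timeout, not taken from the source
    last_seen_s: float = 0.0

    def beat(self, now_s: float) -> None:
        self.last_seen_s = now_s

    def is_silent(self, now_s: float) -> bool:
        return now_s - self.last_seen_s > self.timeout_s


def surveillance_actions(hypervisor: Heartbeat, service_processor: Heartbeat,
                         now_s: float) -> list:
    """Return the actions each side would take at time now_s."""
    actions = []
    if hypervisor.is_silent(now_s):
        # The service processor detects that the hypervisor has lost control.
        actions.append("service processor: call for service")
    if service_processor.is_silent(now_s):
        # The hypervisor detects that the service processor is unresponsive.
        actions.append("hypervisor: request service processor repair action")
    return actions
```

Usage: create one `Heartbeat` per party, call `beat()` whenever that party reports in, and poll `surveillance_actions()` periodically with the current time.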
Availability
The auto-restart (reboot) option, when enabled by the BladeCenter AMM, can reboot the
system automatically following an AC power failure.
IBM BladeCenter PS700, PS701, and PS702 Technical Overview and Introduction
