A+ Qcm Move Example - IBM Power Systems 775 Manual

For AIX and Linux HPC Solution

Report the problem to IBM and open a PMR.
5.4.7 Hot, warm, and cold policies
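The following policies are available for using the Fail in Place nodes in A+:
Hot swap policy: The node is fully in use and provides more productive or non-productive compute processing power.
Warm swap policy: The node is made available for the Power 775 cluster, but is not in use with any production workload.
Cold swap policy: The resource is powered off and must be brought online when it is needed.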
The IBM Power Systems 775 includes additional compute nodes. The specific amount of
resources is determined by IBM during the planning phase, and this hardware is available to
the customer at no extra charge.
The additional resources can be used as added compute nodes, test systems, and so on.
xCAT administrator: The xCAT administrator must determine how to use the resources
and how to enable, allocate, and apply them according to the three policies. The
allocation of Fail in Place resources to application workloads must be tracked manually;
the A+ functionality does not automate this tracking.
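Because the tracking is manual, some sites might keep a small ledger of which Fail in Place node sits under which policy. The following is a minimal sketch of such a ledger, not an xCAT feature: the class names, node names (fipnode01 and so on), and workload labels are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class SwapPolicy(Enum):
    """The three Fail in Place policies described in this section."""
    HOT = "hot"    # node fully in use
    WARM = "warm"  # node available to the cluster, no production workload
    COLD = "cold"  # resource powered off until needed


@dataclass
class FipAllocation:
    """One manually maintained ledger entry (hypothetical structure)."""
    node: str
    policy: SwapPolicy
    workload: str = ""


# Hypothetical node names; a real site would use its own xCAT node names.
ledger = [
    FipAllocation("fipnode01", SwapPolicy.HOT, "extra compute"),
    FipAllocation("fipnode02", SwapPolicy.WARM),
    FipAllocation("fipnode03", SwapPolicy.COLD),
]


def nodes_with_policy(entries, policy):
    """Return the names of all ledger entries that use the given policy."""
    return [e.node for e in entries if e.policy is policy]


cold_nodes = nodes_with_policy(ledger, SwapPolicy.COLD)
```

Keeping the policy as an enumeration rather than free text makes it harder for a typing mistake to hide a node from a later query.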

5.4.8 A+ QCM move example

A QCM move (as shown in Figure 5-2 on page 307) is the first approach to move resources
within the CEC drawer.
If a failure occurs in a non-compute QCM that is used for GPFS, its functions and (possibly) the
associated PCI slots must be moved to a fully functional compute QCM within the same
functional CEC.
The replacement QCM is the next QCM. For example, if QCM0 is failing, the next octant
QCM1 takes over the functions of QCM0.
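The "next QCM" rule can be sketched as a small helper. This is an illustration only: it assumes eight octants per CEC drawer (QCM0 through QCM7), and because the text does not say what happens when the last octant fails, the sketch raises an error in that case instead of guessing.

```python
OCTANTS_PER_DRAWER = 8  # assumption: QCM0..QCM7 in one CEC drawer


def next_qcm(failing: int) -> int:
    """Return the index of the QCM that takes over for the failing QCM."""
    if not 0 <= failing < OCTANTS_PER_DRAWER:
        raise ValueError(f"QCM index out of range: {failing}")
    replacement = failing + 1
    if replacement >= OCTANTS_PER_DRAWER:
        # The source text does not cover a failure of the last octant.
        raise ValueError("no next QCM in this drawer")
    return replacement


replacement_for_qcm0 = next_qcm(0)
```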
To perform this move, the swapnode command (xCAT) is used to drain the original octant and
move it to the next octant. The swapnode command performs the following tasks:
All of the location information in the databases between the two nodes is swapped,
including the attributes for ppc tables and nodepos tables.
When the swap occurs in the same CEC, the slot assignments between the two LPARs
also are swapped.
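The two tasks above can be modeled with plain dictionaries. This sketch is not the xCAT implementation and does not call xCAT; it only mimics the described bookkeeping. The node names, the field names inside the ppc and nodepos stand-ins, and the slot labels are illustrative assumptions, and slot assignments are keyed by (CEC, LPAR id) on the assumption that slots belong to the LPAR rather than to the node name.

```python
def swap_nodes(db, slots_by_lpar, current, spare):
    """Swap location data between two node entries, as described above."""
    cur, sp = db[current], db[spare]
    cur_key = (cur["ppc"]["parent"], cur["ppc"]["id"])
    sp_key = (sp["ppc"]["parent"], sp["ppc"]["id"])
    same_cec = cur_key[0] == sp_key[0]

    # Task 1: swap the ppc and nodepos attributes between the two nodes.
    cur["ppc"], sp["ppc"] = sp["ppc"], cur["ppc"]
    cur["nodepos"], sp["nodepos"] = sp["nodepos"], cur["nodepos"]

    # Task 2: within one CEC, also swap the slot assignments of the LPARs.
    if same_cec:
        slots_by_lpar[cur_key], slots_by_lpar[sp_key] = (
            slots_by_lpar[sp_key],
            slots_by_lpar[cur_key],
        )


# Hypothetical data: a GPFS node on LPAR 1 and a compute node on LPAR 2.
db = {
    "gpfs01": {"ppc": {"parent": "cec01", "id": 1}, "nodepos": {"rack": "A1"}},
    "comp02": {"ppc": {"parent": "cec01", "id": 2}, "nodepos": {"rack": "A1"}},
}
slots_by_lpar = {("cec01", 1): ["C1", "C2"], ("cec01", 2): []}

swap_nodes(db, slots_by_lpar, "gpfs01", "comp02")
```

After the call, the name gpfs01 points at LPAR 2, and the slots that were assigned to LPAR 1 are now assigned to LPAR 2, so the GPFS function keeps its adapters on the healthy octant.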
