IBM Power Systems 775 for AIX and Linux HPC Solution

5.4.9 A+ non-compute node overview
In an A+ scenario, recovering a non-compute node is more sensitive than recovering a compute
node with respect to the tasks that must be performed to make the node available again.
You use a target compute node and a spare compute node to accomplish the swap. The
target compute node becomes the new non-compute node after the swap is performed. The spare
compute node takes over the workload that the target compute node previously ran.
Non-compute nodes are defined in Table 5-7.
Table 5-7 Non-compute node configurations

| Non-compute node type | Partition and recovery information |
| --- | --- |
| Service node | A service node is critical to maintaining operation on the nodes that it services. A disk and an Ethernet adapter are assigned to the node. |
| GPFS storage node | The GPFS storage node features SAS adapters that are assigned to it. If the GPFS storage node is still operational, ensure that there is an operational backup node or nodes for the set of disks that it owns before proceeding. |
| Login node | The login node features an Ethernet adapter that is assigned to it. If the login node is still operational, ensure that there is another operational login node or nodes before proceeding. |
| Other node types | Other non-compute nodes often include adapters that are assigned to them. If this node provides a critical function to the cluster and it is still operational, you must confirm that a backup node is available to take over its function. |

When a compute node is available to swap the resources, determine which node you must
use to restore the non-compute node functions. Table 5-8 provides an overview of the tasks
that must be performed based on the state of the A+ compute node.

Table 5-8 Compute node actions

| Compute node location | Compute node state | Action to be performed on compute node |
| --- | --- | --- |
| In drawer with the failed node | Hot spare | Prevent new jobs from starting and drain jobs from the compute node |
| In drawer with the failed node | Warm spare | Prevent new jobs from starting and boot the partition |
| In drawer with the failed node | Cold spare | No action required |
| In drawer with the failed node | Workload resource | Prevent new jobs from starting and drain jobs from the compute node |
| In backup drawer | Hot spare | Prevent new jobs from starting and drain jobs from the compute node |
| In backup drawer | Warm spare | Prevent new jobs from starting and boot the partition |
| In backup drawer | Cold spare | No action required |
| In backup drawer | Workload resource | Prevent new jobs from starting and drain jobs from the compute node |
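
The state-to-action mapping in Table 5-8 can be captured as a small lookup, which might be useful in an operator checklist or runbook script. The following is a minimal sketch only: the names `SpareState`, `ACTIONS`, and `action_for` are hypothetical and are not part of any IBM tool. It only returns the action text from the table and does not run any cluster commands.

```python
from enum import Enum


class SpareState(Enum):
    """Compute node states listed in Table 5-8 (names are illustrative)."""
    HOT_SPARE = "hot spare"
    WARM_SPARE = "warm spare"
    COLD_SPARE = "cold spare"
    WORKLOAD_RESOURCE = "workload resource"


# Action text copied from Table 5-8. The table lists the same action for a
# given state in both drawer locations, so this lookup is keyed on state alone.
ACTIONS = {
    SpareState.HOT_SPARE:
        "Prevent new jobs from starting and drain jobs from the compute node",
    SpareState.WARM_SPARE:
        "Prevent new jobs from starting and boot the partition",
    SpareState.COLD_SPARE:
        "No action required",
    SpareState.WORKLOAD_RESOURCE:
        "Prevent new jobs from starting and drain jobs from the compute node",
}


def action_for(state: str) -> str:
    """Return the Table 5-8 action for a compute node state.

    The state name is matched case-insensitively; an unknown state raises
    ValueError rather than guessing an action.
    """
    return ACTIONS[SpareState(state.strip().lower())]


if __name__ == "__main__":
    for s in SpareState:
        print(f"{s.value}: {action_for(s.value)}")
```

Keying on the state alone reflects the table as printed. If a site needs different handling per drawer location, the key could be extended to a (location, state) pair.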
