IBM Power Systems 775 Manual page 95

For AIX and Linux HPC Solution

applications. When a disk fails, the lost data is rebuilt by using all of the operational disks in
the declustered array, whose combined bandwidth is greater than that of the fewer disks in a
conventional RAID group. If another disk fault occurs during a rebuild, the number of impacted
tracks that require repair is markedly smaller than for the previous failure and less than the
constant rebuild overhead of a conventional array.
With a declustered array, the rebuild impact and client overhead might be three to four times
lower than with a conventional RAID. Because GPFS stripes client data across all the
storage nodes of a cluster, file system performance becomes less dependent upon the speed
of any single rebuilding storage array.
Figure 1-59 Lower rebuild overhead in conventional RAID versus declustered RAID
When a single disk fails in the 1-fault-tolerant 1 + 1 conventional array on the left, the
redundant disk is read and copied onto the spare disk, which requires a throughput of seven
strip I/O operations. When a disk fails in the declustered array, all replica strips of the six
impacted tracks are read from the surviving six disks and then written to six spare strips, for a
throughput of two strip I/O operations. Figure 1-59 shows the disk read and write I/O
throughput during the rebuild operations.
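The arithmetic behind these two figures can be sketched in Python. This is a minimal illustration, not part of GPFS Native RAID: the function names are hypothetical, and the sketch reads "throughput" as the strip I/O load on one disk. It also assumes that the declustered rebuild work spreads evenly over the surviving disks.

```python
def conventional_rebuild_load(strips_per_disk: int) -> int:
    """Peak strip I/Os on any one disk when a 1 + 1 mirror rebuilds.

    The surviving mirror disk reads every strip and the spare disk
    writes every strip, so the busiest disk handles strips_per_disk I/Os.
    """
    return strips_per_disk


def declustered_rebuild_load(impacted_tracks: int, surviving_disks: int) -> float:
    """Average strip I/Os per surviving disk in a declustered rebuild.

    Each impacted track needs one read (its surviving replica) and one
    write (to a spare strip). Assumption: the work is spread evenly
    across the surviving disks.
    """
    total_ios = 2 * impacted_tracks
    return total_ios / surviving_disks


# Values taken from the text: seven strips per disk in the conventional
# case, six impacted tracks and six surviving disks in the declustered case.
conventional = conventional_rebuild_load(strips_per_disk=7)
declustered = declustered_rebuild_load(impacted_tracks=6, surviving_disks=6)
print(conventional, declustered, conventional / declustered)
```

Under these assumptions the per-disk load drops from seven strip I/O operations to two, a ratio of 3.5, which is consistent with the factor of three to four quoted earlier.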
Disk configurations
This section describes recovery group and declustered array configurations.
Recovery groups
GPFS Native RAID divides disks into recovery groups in which each disk is physically
connected to two servers: primary and backup. All accesses to any of the disks of a recovery
group are made through the active primary or backup server of the recovery group.
Building on the inherent NSD failover capabilities of GPFS, when a GPFS Native RAID server
stops operating because of a hardware fault, software fault, or normal shutdown, the backup
GPFS Native RAID server seamlessly assumes control of the associated disks of its recovery
groups.
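The primary and backup relationship can be pictured with a toy model in Python. This is only a sketch: the class, field, and server names are hypothetical and are not GPFS Native RAID interfaces. It also assumes that the primary server is preferred whenever it is up, which the text does not state.

```python
from dataclasses import dataclass


@dataclass
class RecoveryGroup:
    """Toy model: a set of disks reachable through two servers."""
    name: str
    primary: str
    backup: str
    primary_up: bool = True
    backup_up: bool = True

    def active_server(self) -> str:
        """Return the server through which all disk accesses are made."""
        if self.primary_up:  # assumption: primary is preferred when up
            return self.primary
        if self.backup_up:
            return self.backup
        raise RuntimeError(f"no server available for {self.name}")


rg = RecoveryGroup(name="rg1", primary="server-a", backup="server-b")
before = rg.active_server()
rg.primary_up = False  # hardware fault, software fault, or normal shutdown
after = rg.active_server()
print(before, after)
```

In this model, clients never address a disk directly. They ask the recovery group for its active server, so a takeover by the backup changes only the value that `active_server()` returns.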
Chapter 1. Understanding the IBM Power Systems 775 Cluster
