In a RAID-10 configuration, you will need to add three lines to the /etc/fstab file, one for each of
the RAID arrays. A mount point does not need to be specified for /dev/md0 or /dev/md1. If no
mount point is specified, you will see error messages during startup, but the RAID-10 array will still
initialize and mount correctly.
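As an illustration only, the /etc/fstab entries could look something like the following. This sketch
assumes that /dev/md0 and /dev/md1 are the two RAID-1 pairs, that /dev/md2 is the RAID-10 array
built on top of them, and that the array holds an ext3 filesystem mounted at /data; the mount point,
filesystem type, and mount options are assumptions and will differ on your system.

    # RAID-1 pairs; these entries do not need a real mount point
    /dev/md0    none     ext3    defaults    0 0
    /dev/md1    none     ext3    defaults    0 0
    # Top-level RAID-10 array carrying the data filesystem
    /dev/md2    /data    ext3    defaults    1 2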
Please note that these configuration files are only meant as examples, and your /etc/raidtab file
will differ based on your specific hard drive configuration.

Disk Failure and Recovery

Spare Disks and Disk Failure

Spare disks are disks that do not take part in the RAID configuration until one of the active disks fails.
At that point, the failed device is marked as "bad" and reconstruction of the RAID array begins
immediately on the first available spare disk.
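For illustration, a spare disk is typically declared when the array is defined. The sketch below
assumes a two-disk RAID-1 array /dev/md0 built from /dev/sda2 and /dev/sdb2, with /dev/sdc2 held
as the spare; the device names are examples only. With raidtools the spare is listed in /etc/raidtab
using a spare-disk entry, and with mdadm it is supplied through --spare-devices:

    # /etc/raidtab excerpt (raidtools)
    raiddev /dev/md0
        raid-level              1
        nr-raid-disks           2
        nr-spare-disks          1
        persistent-superblock   1
        chunk-size              4
        device                  /dev/sda2
        raid-disk               0
        device                  /dev/sdb2
        raid-disk               1
        device                  /dev/sdc2
        spare-disk              0

    # Equivalent array creation with mdadm
    mdadm --create /dev/md0 --level=1 --raid-devices=2 --spare-devices=1 \
        /dev/sda2 /dev/sdb2 /dev/sdc2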
Tip:
When reconstruction occurs, if multiple bad blocks have built up on the
active disks over time, the reconstruction process can sometimes trigger the
failure of one of the "good" disks, leading to failure of the entire array.
However, performing regular filesystem checks (fsck) of the entire RAID
filesystem should almost completely eliminate this risk.
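One way to keep such checks regular, assuming the RAID filesystem is ext3 on a device such as
/dev/md2 (the device name and filesystem type are assumptions), is to have the filesystem checked
automatically at boot after a fixed number of mounts or a fixed time interval:

    # Force a full fsck at boot every 20 mounts or every month, whichever comes first
    tune2fs -c 20 -i 1m /dev/md2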
Spare disks are not required for a RAID configuration, but they are highly recommended. While most
RAID levels can handle the failure of one physical disk, the failure of a second disk will cause the
entire array to fail, so it is recommended to start rebuilding the array as quickly as possible after a
disk failure. When a disk fails, the crashed disk is marked as "faulty." Faulty disks still look and
behave as members of the RAID array; they are simply treated as inactive parts of the filesystem.
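The same "faulty" marking can also be applied by hand, for example to retire a disk that is reporting
errors before it fails outright. A minimal sketch, using the example device names from this paper:

    # Mark /dev/sdc2 in /dev/md0 as faulty using raidtools
    raidsetfaulty /dev/md0 /dev/sdc2

    # Or with mdadm
    mdadm /dev/md0 -f /dev/sdc2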
When a disk fails, information regarding the failure will appear in the standard log and stat files.
Looking in /proc/mdstat will show information regarding the drives in the RAID array. RAID role
numbers show the role that each disk plays in the RAID configuration; for an array with n disks,
disks with RAID role numbers of n or greater are designated spare disks. A failed disk will be
marked with an "F" and will be replaced with the non-failed device that has the lowest role number
of n or greater.
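For example, the current state of every md array can be checked with:

    # Show all md arrays; a failed member is flagged with (F),
    # and each device is followed by its RAID role number in brackets
    cat /proc/mdstat

The exact layout of the output varies between kernel versions.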
Removing and replacing a failed disk can be done as follows:
1.  Remove the failed disk from the RAID array by running the command:
        raidhotremove /dev/md0 /dev/sdc2
    where /dev/md0 is the array containing the failed disk and /dev/sdc2 is the name of the
    failed drive. If you wish to use mdadm instead of raidtools, the command is:
        mdadm /dev/md0 -r /dev/sdc2
    Please note that raidhotremove cannot be used to pull a disk out of a running array, and
    should only be used for removing failed disks.
2.  After recovery ends, a new disk should be designated as /dev/sdc2, or whichever disk was
    the one that failed. This new disk now needs to be added to the same array:
        raidhotadd /dev/md0 /dev/sdc2
    Using mdadm, the command is:
        mdadm /dev/md0 -a /dev/sdc2
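After the disk has been added, the state of the array and the progress of any rebuild can be
checked, for example, with the following commands (device names as in the steps above):

    # Re-read /proc/mdstat every five seconds to watch the rebuild
    watch -n 5 cat /proc/mdstat

    # Or query the array directly for its state and rebuild progress
    mdadm --detail /dev/md0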