What To Check After Running The System Recovery - IBM Storwize V7000 Unified Problem Determination Manual

Example

Perform the following steps to recover an offline volume after the recovery procedure has completed:

1. Delete all IBM FlashCopy function mappings and Metro Mirror or Global Mirror relationships that use the offline volumes.

2. Run the recovervdisk or recovervdiskbysystem command. This only brings the volume back online so that you can attempt to deal with the data loss.

Contact IBM Remote Technical Support for help with recovering file volumes that have been corrupted by data lost from the write cache. They might ask you to refer to "Recovering a GPFS file system" on page 161 and help you interpret the results from the chkfs CLI command.

3. Refer to "What to check after running the system recovery" for what to do with volumes that have been corrupted by the loss of data from the write cache.

4. Re-create all FlashCopy mappings and Metro Mirror or Global Mirror relationships that use the volumes.
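
The first two steps above can be sketched as an ordered command plan. This is a minimal illustration, not a supported tool: it only builds command strings and does not contact a system. The OfflineVolume type, the recovery_plan helper, and all object names are hypothetical, and the rmfcmap and rmrcrelationship invocations are assumed from the standard CLI; verify the exact syntax against the CLI reference for your code level before running anything.

```python
from dataclasses import dataclass, field


@dataclass
class OfflineVolume:
    """Hypothetical record of one offline volume and the objects that use it."""
    name: str
    fc_mappings: list = field(default_factory=list)       # FlashCopy mapping names
    rc_relationships: list = field(default_factory=list)  # Metro/Global Mirror names


def recovery_plan(volume):
    """Return CLI command strings for steps 1 and 2, in order.

    Steps 3 and 4 (checking for corrupted data and re-creating the
    mappings and relationships) stay manual and are not generated here.
    """
    plan = []
    # Step 1: delete every mapping and relationship that uses the volume.
    plan += [f"rmfcmap {name}" for name in volume.fc_mappings]
    plan += [f"rmrcrelationship {name}" for name in volume.rc_relationships]
    # Step 2: bring the volume back online. This does not repair lost data.
    plan.append(f"recovervdisk {volume.name}")
    return plan


if __name__ == "__main__":
    example = OfflineVolume("vdisk7", ["fcmap0"], ["rcrel0"])
    for command in recovery_plan(example):
        print(command)
```

Generating the plan as text first makes it possible to review the order of operations before anything is deleted.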

What to check after running the system recovery

Several tasks must be performed before you use the system.

The recovery procedure re-creates the old system from the quorum data. However, some things cannot be restored, such as cached data or system data that manages in-flight I/O. The loss of this in-flight state affects RAID arrays that manage internal storage. The detailed map of where data is out of synchronization has been lost, so all parity information must be restored and mirrored pairs must be brought back into synchronization. Normally this results in either old or stale data being used, so only writes in flight are affected. However, if the array had lost redundancy (for example, it was syncing, or had a degraded or critical RAID status) before the error that required system recovery, the situation is more severe. In this situation, you need to check the internal storage:

• Parity arrays will likely be syncing to restore parity; they do not have redundancy while this operation proceeds.

• Because there is no redundancy in this process, bad blocks might have been created where data is not accessible.

• Parity arrays could be marked as corrupt. This indicates that the extent of lost data is wider than in-flight I/O, and to bring the array online, the data loss must be acknowledged.

• RAID-6 arrays that were actually degraded prior to the system recovery might require a full restore from backup. For this reason, it is important to have at least a capacity-match spare available.
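
One way to apply the checks above is to scan a saved array listing for arrays that are not fully redundant. The sketch below assumes a colon-delimited listing with mdisk_name and raid_status columns, similar to what lsarray -delim : produces. The sample data is invented, and the set of status labels (in particular "corrupt", which is taken from the wording above) is an assumption to adjust for your system.

```python
import csv
import io

# Assumed status labels that call for a closer look; adjust as needed.
NEEDS_ATTENTION = {"syncing", "degraded", "offline", "corrupt"}


def arrays_needing_attention(listing, delim=":"):
    """Return (mdisk_name, raid_status) pairs for arrays that are not fully redundant.

    `listing` is delimited text whose first line is a header row that
    contains at least the columns mdisk_name and raid_status.
    """
    reader = csv.DictReader(io.StringIO(listing), delimiter=delim)
    return [
        (row["mdisk_name"], row["raid_status"])
        for row in reader
        if row["raid_status"] in NEEDS_ATTENTION
    ]


# Invented sample listing, for illustration only.
SAMPLE = """mdisk_name:raid_status:raid_level
mdisk0:online:raid5
mdisk1:syncing:raid5
mdisk2:degraded:raid6
"""

if __name__ == "__main__":
    for name, status in arrays_needing_attention(SAMPLE):
        print(f"{name}: {status}")
```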

Be aware of these differences regarding the recovered configuration:

• FlashCopy mappings are restored as "idle_or_copied" with 0% progress. Both volumes must have been restored to their original I/O groups.

• The management ID is different. Any scripts or associated programs that refer to the system-management ID of the clustered system (system) must be changed.

• Any FlashCopy mappings that were not in the "idle_or_copied" state with 100% progress at the point of disaster have inconsistent data on their target disks. These mappings must be restarted.

• Intersystem remote copy partnerships and relationships are not restored and must be re-created manually.
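
The FlashCopy rule above can be expressed as a small filter. The sketch assumes that you kept a record of each mapping's status and progress from before the disaster; because recovered mappings all show 0% progress, the recovered system cannot supply this information itself. The record layout and the mappings_to_restart helper are hypothetical.

```python
def mappings_to_restart(pre_disaster_records):
    """Return the names of FlashCopy mappings whose targets hold inconsistent data.

    Each record is a dict with the keys 'name', 'status', and 'progress'
    (an integer percentage), captured before the disaster. A mapping is
    considered safe only if it was idle_or_copied with 100% progress.
    """
    return [
        record["name"]
        for record in pre_disaster_records
        if not (record["status"] == "idle_or_copied" and record["progress"] == 100)
    ]


if __name__ == "__main__":
    # Invented records, for illustration only.
    records = [
        {"name": "fcmap0", "status": "idle_or_copied", "progress": 100},
        {"name": "fcmap1", "status": "copying", "progress": 42},
    ]
    print(mappings_to_restart(records))
```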
