IBM Storwize V7000 Unified Problem Determination Manual page 201

- GPFS and CTDB must both be in a healthy state to run some of the commands
that follow.
For storage system recovery, see the procedure for recovering a storage system.
About this task
This procedure provides steps to recover a GPFS file system after a failure of the
block storage system. The file volumes were offline and are now back online after
a repair or recovery action. The disks referred to in this procedure are the volumes
that are provided by the block storage system.
Note: Because GPFS cannot perform I/O during the failure, these procedures
assume that the storage unit failure caused the GPFS file system to unmount.
After satisfying the prerequisites above, take the following steps:
Procedure
1. Verify that GPFS is running on both file modules by using the lsnode -r
command.
The GPFS status column shows active.
2. In the lsnode -r command output, verify that the CTDB status is also active.
If the CTDB status shows the value unhealthy, see "Checking CTDB health" on
page 169 for steps to resolve the CTDB status.
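As an illustration, the status checks in steps 1 and 2 can be scripted. The
three-column layout of the lsnode -r output below is an assumption for this
sketch, not the documented CLI format:

```shell
# Hypothetical lsnode -r output; the columns (node name, GPFS status,
# CTDB status) are assumed for this sketch.
lsnode_output='mgmt001st001 active active
mgmt002st001 active active'

# Flag any file module whose GPFS or CTDB status is not "active".
status=$(printf '%s\n' "$lsnode_output" | awk '
  $2 != "active" { print $1 ": GPFS not active"; bad = 1 }
  $3 != "active" { print $1 ": CTDB not active"; bad = 1 }
  END { if (!bad) print "both file modules healthy" }')
echo "$status"
```

If either column shows anything other than active, the script prints the
affected node instead of the healthy summary line.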
3. With GPFS functioning normally on both file modules, ensure that all disks in
the file system are available by running the lsdisk -r command. The
Availability column shows Up.
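The availability check in step 3 can be sketched the same way; the lsdisk -r
column positions below are again an assumption for illustration:

```shell
# Sample lsdisk -r output (column layout assumed: disk, file system,
# availability). Per step 3, every disk must show Up before continuing.
lsdisk_output='disk1 gpfs0 Up
disk2 gpfs0 Up'

# Count disks whose Availability column is not "Up".
not_up=$(printf '%s\n' "$lsdisk_output" | awk '$3 != "Up" { n++ } END { print n+0 }')
echo "disks not Up: $not_up"
```

A count of 0 means all disks are available and the file system check can
proceed.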
4. Issue the chkfs file_system_name -v | tee /ftdc/chkfs_fs_name.log1
command to capture the output to a file.
Review the output file for errors and save it for IBM support to investigate any
problems.
If the file contains a TSM ERROR message, perform the following steps:
a. Issue the stopbackup -d file_system_name command and the stoprestore -d
file_system_name command to stop any backup or restore operation.
b. Validate that no error occurred while stopping any Tivoli Storage Manager
service.
c. Issue the chkfs file_system_name -v | tee /ftdc/chkfs_fs_name.log2
command to recapture the output to a file.
d. Issue the startrestore command and the startbackup command to enable
Tivoli Storage Manager.
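The stop, recheck, and restart sequence in substeps a through d can be
sketched as a dry run. The run helper only records and echoes each command so
the order can be reviewed; gpfs0 is a placeholder file system name:

```shell
fs=gpfs0   # placeholder file system name
plan=""

# Dry-run helper: records each command instead of executing it.
# Replace the body with:  "$@"  to actually issue the CLI commands.
run() { plan="$plan$*;"; echo "+ $*"; }

run stopbackup -d "$fs"     # substep a: stop any backup operation
run stoprestore -d "$fs"    # substep a: stop any restore operation
run chkfs "$fs" -v          # substep c: recheck the file system
run startrestore            # substep d: re-enable Tivoli Storage Manager
run startbackup
```

Echoing the plan first is a simple safeguard: the backup and restore services
must be stopped before the recheck and restarted only afterward.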
If you receive an error message like the following at step 5 of the command's
internal execution steps (the number of mounted or in-use file modules does
not matter):
(5/9) Performing mmfsck call for the file system check stderr:
Cannot check. "gpfs0" is mounted on 1 node(s) and in use on 1 node(s).
mmfsck: Command failed.
Examine previous error messages to determine cause.
perform the following steps:
a. Monitor the lsmount -r command output until the mount status changes to
not mounted.
b. Issue the chkfs file_system_name command again.
Review the new output file for errors and save it for IBM support to investigate
any problems. It is expected that the file contains Lost blocks were found
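The wait in substep a is a polling loop. A minimal simulation of that loop
follows, in which still_mounted stands in for parsing the real lsmount -r
output and pretends the unmount completes on the third check:

```shell
# Simulated poll: checks_left stands in for repeated lsmount -r calls.
checks_left=3
still_mounted() {
  checks_left=$((checks_left - 1))
  [ "$checks_left" -gt 0 ]   # pretend the unmount finishes on the 3rd check
}

polls=0
while still_mounted; do
  polls=$((polls + 1))
  # sleep 10   # pause between real lsmount -r polls
done
echo "not mounted after $polls polls"   # then reissue chkfs file_system_name
```

In practice, replace still_mounted with a check of the actual lsmount -r
output and keep the sleep between polls.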
Chapter 4. File module
