Site Isolation; Handling Site Isolation - IBM TotalStorage NAS Gateway 500 Administrator's Manual

2. When you are ready to initiate site failback, start the cluster on the
   recovering site from the command line (alternatively, you can use SMIT or
   WebSM from the Cluster Management menu). Either start the cluster on one
   node at a time with Enable a Server in the Cluster (the clnasennode
   command), or Enable Cluster (the clnasencluster command).
3. Failback of the recovering site resources is handled automatically by
   HACMP/XD. For more information, see "Reintegrating the failed site."

Reintegrating the failed site

Depending on the application, the surviving copy of the data might continue to
change while the mirror copy is not available. The state map device is used to
recover from these failures. When a failed site recovers and reintegrates, the state
map is automatically processed to synchronize the data on the devices. The
reintegration process proceeds as follows:
1. When the first remote node sends the message that it is ready to rejoin the
   cluster, Remote Mirroring on the functioning local site suspends the regular
   clustering reintegration process until the synchronization of the GeoMirror
   devices is complete. The nodes at the local site continue to process data
   while Remote Mirroring is bringing the remote node up to date.
   Note: If the GeoMirror devices are already synchronized, this step does not
   apply.
2. When the remote node successfully rejoins the cluster, all the configured
   clustering applications become available immediately. After the first node
   is up, the site is functioning. The remaining nodes, those that participate
   in mirroring the devices that have already been synchronized once, rejoin at
   a faster rate. They do not have to wait for the synchronization process
   across the geography.
3. When nodes at a failed site reintegrate, the GeoMirror devices at both sites
   check the state map values for corresponding data regions to see if data
   needs to be transferred to the reintegrating device. The synchronization of
   the remote GeoMirror devices is a time-consuming process if extensive
   alterations to shared data occurred during the failure period.
   Synchronization must complete before the applications on the recovering site
   that use the GeoMirror device can be started. If the state maps on both sides
   of a device have cells marked stale, you must manually update the state map
   before the device can be started. See the section on data divergence to
   manually update the state map.
4. The HACMP Cluster Manager will process the config_too_long event as soon
   as the configuration time exceeds six minutes, which is highly likely for
   most instances of synchronization. This is not cause for alarm. The event
   causes the message to be displayed. The Cluster Manager continues the
   configuration process regardless of the messages.

Site isolation

Site isolation occurs when all HAGEO networks are not available.

Handling site isolation

The Cluster Manager might still be able to send heartbeats over a client network to
realize that the remote nodes are functioning. The Cluster Manager, using the
Remote Mirroring tools, recognizes that a global network_down event has occurred
and brings down the nondominant site to avoid data divergence as much as
possible. When the Remote Mirroring networks are functioning again, the
nondominant site can rejoin the cluster. The GeoMirror devices synchronize the
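
The state map check described in step 3 of "Reintegrating the failed site" can be illustrated with a small sketch. It is not the GeoMirror implementation: the function name, the exception name, and the representation of a state map as a list of true/false "stale" flags (one per data region) are all assumptions made for illustration. The sketch only models the two cases the text describes: regions marked on the surviving site are transferred to the reintegrating device, and stale cells on both sides require a manual state map update.

```python
class DataDivergenceError(Exception):
    """Hypothetical error: both sites hold stale cells, so an
    administrator must update the state map manually."""


def regions_to_transfer(surviving_map, recovering_map):
    """Return the indexes of data regions to copy to the reintegrating device.

    Each map is a list of booleans, one per data region, where True means
    the cell is marked stale (an assumed representation, not the real
    on-disk state map format).
    """
    if len(surviving_map) != len(recovering_map):
        raise ValueError("state maps must describe the same number of regions")

    # Stale cells on both sides: automatic synchronization cannot decide
    # which copy wins, so the device must not be started.
    if any(surviving_map) and any(recovering_map):
        raise DataDivergenceError("stale cells on both sites; update the state map manually")

    # The case where only the recovering site has stale cells is not
    # described in the text above and is deliberately not modelled here.
    return [index for index, stale in enumerate(surviving_map) if stale]


if __name__ == "__main__":
    # Regions 1 and 2 changed on the surviving site during the outage.
    print(regions_to_transfer([False, True, True, False], [False] * 4))  # [1, 2]
```

An empty result corresponds to the note in step 1: if the devices are already synchronized, nothing needs to be transferred.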
