1920 Error - IBM V7000 Introduction And Implementation Manual

Flex system storage node

9.4.1 1920 error

Let us focus first on the 1920 error. A 1920 error (event ID 050010) can have several triggers.
Official probable cause projections are:
Primary 2145 cluster or SAN fabric problem (10%)
Primary 2145 cluster or SAN fabric configuration (10%)
Secondary 2145 cluster or SAN fabric problem (15%)
Secondary 2145 cluster or SAN fabric configuration (25%)
Inter-cluster link problem (15%)
Inter-cluster link configuration (25%)
In practice, the cause that is most often overlooked is latency. Global Mirror has a
round-trip-time tolerance limit of 80 ms. A message sent from your source SAN Volume
Controller cluster to your target SAN Volume Controller cluster, together with the
accompanying acknowledgement, must complete within a total of 80 ms, or 40 ms each way
(for Version 4.1.1.x and later).
Round-Trip Time (RTT): For Version 4.1.0.x and earlier, this limit was 68 ms or 34 ms
one way for Fibre Channel extenders, and for SAN routers it was 10 ms one way or 20 ms
round trip. Make sure to use the correct values for the correct versions!
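The version-dependent limits above can be captured in a small lookup. This is only a sketch: the function name and the link-type labels ("fc_extender", "san_router") are hypothetical, and it assumes the 80 ms limit applies to both link types from Version 4.1.1.x onward.

```python
def rtt_limit_ms(version, link_type="fc_extender"):
    """Return the round-trip-time limit (ms) quoted in the text.

    version   -- tuple such as (4, 1, 1)
    link_type -- "fc_extender" or "san_router" (hypothetical labels)
    """
    if version >= (4, 1, 1):
        # Assumption: the 80 ms limit applies regardless of link type.
        return 80
    # Version 4.1.0.x and earlier: limit depends on the link equipment.
    limits = {"fc_extender": 68, "san_router": 20}
    return limits[link_type]


print(rtt_limit_ms((4, 1, 1)))                # 80
print(rtt_limit_ms((4, 1, 0)))                # 68
print(rtt_limit_ms((4, 1, 0), "san_router"))  # 20
```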
The primary component of your round-trip time is the physical distance between sites. For
every 1000 km (621.37 miles), there is a 5 ms delay each way. This delay does not include
the time added by equipment in the path. Every device adds a varying amount of time,
depending on the device, but expect about 25 µs for pure hardware devices. For
software-based functions (such as compression implemented in software), the delay added
tends to be much higher (usually in the millisecond-plus range).
Consider an example. Company A has a production site that is 1900 km distant from their
recovery site. Their network service provider uses a total of five devices to connect the two
sites. In addition to those devices, Company A employs a SAN Fibre Channel Router at each
site to provide FCIP to encapsulate the Fibre Channel traffic between sites. There are now
seven devices, and 1900 km of distance delay. All the devices add 200 µs of delay each way.
The distance adds 9.5 ms each way, for a total of 19 ms. Combined with the device latency,
that is 19.4 ms of physical latency at a minimum. This latency is under the 80 ms limit of
Global Mirror, but this number is the best-case number. Link quality and bandwidth play a
significant role here. Your network provider likely guarantees a latency maximum on your
network link; be sure to stay below the Global Mirror RTT limit. You can easily double or triple
the expected physical latency with a lower quality or lower bandwidth network link. As a result
you are suddenly within range of exceeding the limit the moment a large flood of I/O happens
that exceeds the bandwidth capacity you have in place.
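The Company A arithmetic can be reproduced with a short calculation. This is a minimal sketch using only the figures quoted in the text (5 ms per 1000 km each way, 200 µs of total device delay each way). The function and constant names are illustrative, and the result is a best-case figure that ignores link quality and bandwidth.

```python
MS_PER_1000_KM_ONE_WAY = 5.0  # propagation delay quoted in the text


def best_case_rtt_ms(distance_km, device_delay_us_one_way):
    """Best-case round-trip time in ms from distance and device delay."""
    distance_ms = distance_km / 1000.0 * MS_PER_1000_KM_ONE_WAY
    device_ms = device_delay_us_one_way / 1000.0  # microseconds to ms
    return 2 * (distance_ms + device_ms)          # out and back


rtt = best_case_rtt_ms(1900, 200)
print(f"{rtt:.1f} ms")  # 19.4 ms
```

Even if a poor link tripled this figure, the result would still be under 80 ms, but with far less headroom for bursts of I/O.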
When you get a 1920 error, always check the latency first. Keep in mind that the FCIP routing
layer can introduce latency if it is not properly configured. If your network provider reports a
much lower latency than you observe, the difference could indicate a problem at your FCIP
routing layer. Most FCIP routing devices have built-in tools that allow you to check the RTT.
When checking latency, remember that TCP/IP routing devices (including FCIP routers) report
RTT using standard 64-byte ping packets.
Figure 9-68 shows why the effective transit time should only be measured using packets large
enough to hold a Fibre Channel frame. This packet size is 2148 bytes (2112 bytes of payload
and 36 bytes of header) and you should allow some additional capacity to be safe, as different
switching vendors have optional features that might increase this size. After you have verified
your latency using the correct packet size, proceed with normal hardware troubleshooting.
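One reason packet size matters is serialization delay: the time needed to clock a packet onto the link grows with its size. The sketch below compares a 64-byte ping with a 2148-byte Fibre Channel frame. The link rates are hypothetical examples, and this is only one contributing factor; it does not reproduce Figure 9-68.

```python
PING_BYTES = 64
FC_FRAME_BYTES = 2148  # 2112 bytes of payload + 36 bytes of header


def serialization_delay_us(packet_bytes, link_mbps):
    """Time to place one packet on the wire, in microseconds."""
    # bits divided by (Mbit/s) yields microseconds
    return packet_bytes * 8 / link_mbps


# Hypothetical link rates, chosen only for illustration.
for mbps in (10, 100, 1000):
    small = serialization_delay_us(PING_BYTES, mbps)
    large = serialization_delay_us(FC_FRAME_BYTES, mbps)
    print(f"{mbps:>5} Mbit/s: ping {small:8.2f} us, FC frame {large:8.2f} us")
```

At any link rate the full-size frame takes about 33.6 times as long to serialize as the 64-byte ping, so a ping-based RTT understates what Global Mirror traffic experiences, most noticeably on slower links.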
IBM Flex System V7000 Storage Node Introduction and Implementation Guide
