High Availability - IBM SAN768B Installation, Service And User Manual

High availability

The following features contribute to the SAN768B's high-availability design:
v Redundant, hot-swappable blades and FRUs
v Enhanced data integrity on all data paths
v Fabric Shortest Path First (FSPF) rerouting around failed links
v Integration with Simple Network Management Protocol (SNMP) managers
v Automatic control processor failover
v Nondisruptive "hot" software code loads and activation
v Easy configuration, save, and restore
v Hot-swappable World Wide Name (WWN) cards
The high-availability software architecture of the SAN768B provides a common
framework for all applications that reside on the system, allowing global and local
status to be maintained through any component failure. High-availability elements
consist of the High Availability Manager, the heartbeat, the fault/health framework,
the replicated database, initialization, and software upgrade.
The High Availability Manager controls access to the standby control processor,
facilitates software upgrades, prevents extraneous switchover activity, closes and
flushes streams as needed, provides flow control and message buffering, and
supports a centralized active and standby state.
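The interplay between the heartbeat and the automatic control processor failover can be illustrated with a minimal sketch. This is not the SAN768B firmware: the `ControlProcessor` and `FailoverMonitor` classes, the role names, and the 3-second timeout are hypothetical, chosen only to show how a standby unit might be promoted when the active unit stops sending heartbeats, while avoiding switchover when the standby itself is unhealthy.

```python
from dataclasses import dataclass


@dataclass
class ControlProcessor:
    """Hypothetical model of one control processor (CP)."""
    name: str
    role: str                  # "active", "standby", or "faulted"
    last_heartbeat: float = 0.0


class FailoverMonitor:
    """Promotes the standby CP when the active CP misses heartbeats.

    The timeout and method names are illustrative assumptions only.
    """

    def __init__(self, active, standby, timeout=3.0):
        self.active = active
        self.standby = standby
        self.timeout = timeout

    def heartbeat(self, cp, now):
        """Record that `cp` reported in at time `now` (seconds)."""
        cp.last_heartbeat = now

    def check(self, now):
        """Return True if a failover was performed at time `now`."""
        active_alive = now - self.active.last_heartbeat <= self.timeout
        standby_alive = now - self.standby.last_heartbeat <= self.timeout
        # Switch over only when the active CP is silent AND the standby is
        # healthy, so that no extraneous switchover takes place.
        if active_alive or not standby_alive:
            return False
        self.active.role = "faulted"
        self.standby.role = "active"
        self.active, self.standby = self.standby, self.active
        return True
```

In this sketch the caller supplies timestamps explicitly, which keeps the logic deterministic and easy to test; a real system would read a monotonic clock and would also replicate state to the standby before promoting it.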

Reliability

The SAN768B uses the following error detection and correction mechanisms to
ensure reliability of data:
v Data is protected by the Error Detection and Correction mechanism, which
  checks for encoder errors and fault isolation, such as cyclic redundancy
  checking (CRC), parity checking, checksum, and illegal address checking
v Power-on self test (POST)
v Dual control processors that enable hot, nondisruptive fast firmware upgrades
v Each control processor contains one serial port and two Ethernet ports, for
  management and for service. Offline control processor diagnostics and remote
  diagnostics simplify troubleshooting. The standby control processor monitors
  diagnostics to ensure it is operational, should a failover be necessary.
v Bus monitoring and control of blades and other field-replaceable units (FRUs).

Serviceability

The SAN768B provides the following features to enhance and ensure serviceability:
v Modular design with hot-swappable components
v Flash memory that stores two firmware images per control processor
v Nonvolatile random-access memory (NVRAM), containing the OEM serial
  number, IBM serial number, revision information, and part number information
v Background health-check daemon
v Memory scrubber, self test, and bus ping to determine if a bus is not functioning
v RASlog messages
v SMI-S compliant
v Watchdog timers
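
The error detection checks named under Reliability (parity checking, checksum, and CRC) differ in how much corruption they can catch. The following sketch, written in Python for illustration only, compares them in software; the SAN768B performs such checks in hardware, and the function names here (`even_parity_bit`, `checksum8`, `detects_corruption`) are hypothetical. The CRC uses the standard-library `zlib.crc32`, which is not necessarily the polynomial the product uses.

```python
import zlib


def even_parity_bit(data: bytes) -> int:
    """Return the bit that makes the total count of 1 bits even."""
    ones = sum(bin(byte).count("1") for byte in data)
    return ones % 2


def checksum8(data: bytes) -> int:
    """Simple additive checksum, truncated to 8 bits."""
    return sum(data) % 256


def crc32(data: bytes) -> int:
    """CRC-32 as computed by the standard-library zlib module."""
    return zlib.crc32(data)


def detects_corruption(original: bytes, received: bytes) -> dict:
    """Report which checks notice that `received` differs from `original`."""
    return {
        "parity": even_parity_bit(original) != even_parity_bit(received),
        "checksum": checksum8(original) != checksum8(received),
        "crc32": crc32(original) != crc32(received),
    }
```

For example, a single flipped bit (`b"\x00\x01"` received as `b"\x00\x03"`) is caught by all three checks, whereas two swapped bytes (`b"\x01\x02"` received as `b"\x02\x01"`) slip past both the parity bit and the additive checksum and are caught only by the CRC. This is one reason stronger checks are layered on top of simple ones.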
