About Online System Health Management; System Health Initiation - Cisco AP776A - Nexus Converged Network Switch 5020 Configuration Manual

Cisco MDS 9000 Family CLI Configuration Guide - Release 4.x (OL-18084-01, February 2009)

Online System Health Management
Send documentation comments to mdsfeedback-doc@cisco.com

About Online System Health Management

The Online Health Management System (OHMS) is a hardware fault detection and recovery feature. It
runs on all Cisco MDS switching, services, and supervisor modules and ensures the general health of
any switch in the Cisco MDS 9000 Family. The OHMS monitors system hardware in the following ways:

• The OHMS component running on the active supervisor maintains control over all other OHMS
  components running on the other modules in the switch.
• The system health application running in the standby supervisor module only monitors the standby
  supervisor module, if that module is available in the HA standby mode. See the "HA Switchover
  Characteristics" section on page 10-2.

The OHMS application launches a daemon process in all modules and runs multiple tests on each module
to test individual module components. The tests run at preconfigured intervals, cover all major fault
points, and isolate any failing component in the MDS switch. The OHMS running on the active
supervisor maintains control over all other OHMS components running on all other modules in the
switch.

On detecting a fault, the system health application attempts the following recovery actions:

• Performs additional testing to isolate the faulty component.
• Attempts to reconfigure the component by retrieving its configuration information from persistent
  storage.
• If unable to recover, sends Call Home notifications, system messages, and exception logs; and shuts
  down and discontinues testing the failed module or component (such as an interface).
• Sends Call Home and system messages and exception logs as soon as it detects a failure.
• Shuts down the failing module or component (such as an interface).
• Isolates failed ports from further testing.
• Reports the failure to the appropriate software component.
• Switches to the standby supervisor module, if an error is detected on the active supervisor module
  and a standby supervisor module exists in the Cisco MDS switch. After the switchover, the new
  active supervisor module restarts the active supervisor tests.
• Reloads the switch if a standby supervisor module does not exist in the switch.
• Provides CLI support to view, test, and obtain test run statistics or change the system health test
  configuration on the switch.
• Performs tests to focus on the problem area.

Each module is configured to run the test relevant to that module. You can change the default parameters
of the test in each module as required.

System Health Initiation

By default, the system health feature is enabled in each switch in the Cisco MDS 9000 Family.

Other topics in this chapter include Performing Serdes Loopbacks, Interpreting the Current Status, and
Displaying System Health.

Cisco MDS 9000 Family CLI Configuration Guide, Chapter 59: Monitoring System Processes and Logs
(OL-18084-01, Cisco MDS NX-OS Release 4.x)
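
The supervisor-related recovery actions described for OHMS (shut down a failing component, switch to the standby supervisor when one exists, reload the switch otherwise) can be read as a small decision rule. The Python sketch below only illustrates that reading and is not Cisco's implementation: the names `Fault`, `Action`, and `choose_recovery_action` are hypothetical, and it assumes that the reload action applies to the same active-supervisor fault as the switchover action.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Action(Enum):
    """Hypothetical labels for three recovery actions named in the guide."""
    SHUT_DOWN_COMPONENT = auto()    # shut down the failing module or component
    SWITCHOVER_TO_STANDBY = auto()  # switch to the standby supervisor module
    RELOAD_SWITCH = auto()          # reload when no standby supervisor exists


@dataclass(frozen=True)
class Fault:
    """Hypothetical fault record; the fields are illustrative assumptions."""
    component: str
    on_active_supervisor: bool


def choose_recovery_action(fault: Fault, standby_present: bool) -> Action:
    """Pick a final recovery action for a fault that could not be repaired.

    Assumption: faults outside the active supervisor end with the failing
    component being shut down; faults on the active supervisor end with a
    switchover if a standby supervisor exists, and a reload otherwise.
    """
    if not fault.on_active_supervisor:
        return Action.SHUT_DOWN_COMPONENT
    if standby_present:
        return Action.SWITCHOVER_TO_STANDBY
    return Action.RELOAD_SWITCH


if __name__ == "__main__":
    sup_fault = Fault(component="active supervisor", on_active_supervisor=True)
    port_fault = Fault(component="interface", on_active_supervisor=False)
    print(choose_recovery_action(sup_fault, standby_present=True).name)
    print(choose_recovery_action(sup_fault, standby_present=False).name)
    print(choose_recovery_action(port_fault, standby_present=True).name)
```

The sketch leaves out notification (Call Home, system messages, exception logs) and the additional isolation tests, which the guide lists as separate actions.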
