System Software Exception Recovery Behavior; Redundant MSM Behavior - Extreme Networks BlackDiamond 6804 Troubleshooting Manual

Advanced system diagnostics and troubleshooting guide

Software Exception Handling
The system-watchdog feature is enabled by default. The CLI commands related to system-watchdog
operation are:
enable system-watchdog
disable system-watchdog
NOTE
During the reboot cycle, network redundancy protocols will work to recover the network. The impact on
the network depends on the network topology and configuration (for example, OSPF ECMP versus a
large STP network on a single domain).
Also, if the system-watchdog feature is not enabled, error conditions might lead to extensive service
outages. All routing and redundancy protocols use the CPU to calculate proper states. Take the OSPF
ECMP and STP networks as general examples. If the CPU becomes trapped in a loop, a system in an
OSPF network cannot process OSPF control messages properly, which corrupts its routing tables. In
an STP network, spanning tree BPDUs are not processed, so all paths are placed in the forwarding
state. The resulting broadcast storms cause not only data loss, but loss of general connectivity
as well.
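
The way a watchdog catches a trapped CPU can be modeled in a few lines. The following is a minimal Python sketch, not ExtremeWare code: the names WatchdogTimer, run, and hang_at, and the tick-based timing, are assumptions made purely for illustration. Healthy software resets the timer on every tick. Once the simulated CPU "hangs" and stops resetting it, the timer counts down and expires, which is the point at which a real system would start its configured recovery action.

```python
class WatchdogTimer:
    """Toy model of a watchdog: it expires unless it is reset in time."""

    def __init__(self, timeout_ticks):
        self.timeout_ticks = timeout_ticks
        self.remaining = timeout_ticks
        self.expired = False

    def reset(self):
        # Called by healthy software to show it is still making progress.
        if not self.expired:
            self.remaining = self.timeout_ticks

    def tick(self):
        # Advance simulated time by one tick; expiry is latched at zero.
        if self.expired:
            return
        self.remaining -= 1
        if self.remaining <= 0:
            self.expired = True


def run(ticks, timeout_ticks, hang_at=None):
    """Simulate `ticks` ticks; the CPU stops resetting the timer at `hang_at`.

    Returns the tick on which the watchdog expired, or None if it never did.
    """
    wd = WatchdogTimer(timeout_ticks)
    for t in range(ticks):
        if hang_at is None or t < hang_at:
            wd.reset()  # normal operation: the timer is reset every tick
        wd.tick()
        if wd.expired:
            return t  # a real system would trigger recovery here
    return None


if __name__ == "__main__":
    print(run(ticks=20, timeout_ticks=3))             # healthy CPU
    print(run(ticks=20, timeout_ticks=3, hang_at=5))  # CPU loops from tick 5
```

With the watchdog disabled there is no equivalent of the expiry branch, so in this model a hung CPU would simply stay hung, which corresponds to the extended outage described in the note above.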

System Software Exception Recovery Behavior

ExtremeWare provides commands to configure system recovery behavior when a software exception
occurs.
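• Recovery behavior: configure sys-recovery-level command
• Reboot behavior: configure reboot-loop-protection command
• System dump behavior: configure system-dump server command, configure system-dump timeout
command, and upload system-dump command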
These commands and their uses are described in these sections:
• "Configuring System Recovery Actions" on page 40
• "Configuring Reboot Loop Protection" on page 42
• "Dumping the System Memory" on page 44

Redundant MSM Behavior

A number of events can cause an MSM failover to occur, including:
• Software exception; system watchdog timer expiry
• Diagnostic failure (extended diagnostics, transceiver check/scan, FDB scan failure/remap)
• Hot removal of the master MSM or hard-reset of the master MSM
The MSM failover behavior depends on the following factors:
• Platform type and equipage
• Software configuration settings for the software exception handling options such as system
watchdog, system recovery level, and reboot loop protection. (For more information on the
configuration settings, see Chapter 4, "Software Exception Handling.")
In normal operation, the master MSM continuously resets the watchdog timer. If the watchdog timer
expires, the slave MSM will either 1) reboot the chassis and take over as the master MSM (when the
