Failure Modes; Transient Failures; Systematic Failures; Soft-State Failures - Extreme Networks BlackDiamond 6804 Troubleshooting Manual

Advanced system diagnostics and troubleshooting guide

Packet Errors and Packet Error Detection
example, the system health check facility can be configured such that ExtremeWare will enter a message
into the system log that a checksum error has been detected.

Failure Modes

Although packet errors are extremely rare, they can occur anywhere along the data path, along the
control path, or while a packet is stored in packet memory. A checksum mismatch can result from a
fault in any component between the ingress and egress points, including, but not limited to, the
packet memory (SRAM), ASICs, MAC, or bus transceiver components.
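As an illustration of how a checksum mismatch exposes a fault between ingress and egress, the following minimal sketch computes a checksum over a payload, flips one bit to simulate a memory fault, and compares the result. It assumes CRC-32 (via Python's standard zlib module) purely for illustration; the checksum that the switch hardware actually uses is not specified here, and the function names are hypothetical.

```python
import zlib


def checksum(payload: bytes) -> int:
    """Return a CRC-32 over the payload (an illustrative choice of checksum)."""
    return zlib.crc32(payload) & 0xFFFFFFFF


def flip_bit(payload: bytes, bit_index: int) -> bytes:
    """Simulate a single-bit fault, such as one introduced in packet memory."""
    corrupted = bytearray(payload)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


def is_intact(payload: bytes, expected: int) -> bool:
    """True when the checksum computed at egress matches the ingress value."""
    return checksum(payload) == expected


packet = b"example packet payload"
ingress_checksum = checksum(packet)   # recorded when the packet enters
damaged = flip_bit(packet, 13)        # hypothetical fault along the path

print(is_intact(packet, ingress_checksum))   # True
print(is_intact(damaged, ingress_checksum))  # False
```

The comparison only shows that the data changed somewhere between the two points; it does not identify which component introduced the fault.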
Many causes and conditions can lead to packet error events. They fall into the following
categories:
• Transient errors
• Systematic errors
  - Soft-state errors
  - Permanent errors
The failure modes that can result in the above categories are described in the sections that follow.

Transient Failures

Transient failures are errors that occur as one-time events during normal system processing. These
errors occur as single events or recur only for short durations. Because transient events usually
occur randomly throughout the network, there is usually no single locus of packet errors. They are
temporary (they do not persist), have no noticeable impact on network functionality, and require no
user intervention to correct: there is no need to swap a hardware module or other equipment.
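One way to check for a "single locus" is to tally error events by the component that reported them. The sketch below is only an illustration: the source labels (such as "slot2"), the function name, and both thresholds are hypothetical and are not values defined by ExtremeWare.

```python
from collections import Counter


def find_error_locus(error_sources, min_events=5, dominance=0.8):
    """Return the source that dominates the error events, or None.

    min_events and dominance are hypothetical thresholds chosen for
    illustration; tune them to the environment being examined.
    """
    counts = Counter(error_sources)
    total = sum(counts.values())
    if total < min_events:
        return None  # too few events to draw any conclusion
    source, hits = counts.most_common(1)[0]
    return source if hits / total >= dominance else None


# Errors scattered across the system: no single locus.
scattered = ["slot1", "slot3", "slot2", "slot4", "slot1", "slot3"]
# Errors concentrated on one component: a candidate locus.
concentrated = ["slot2"] * 9 + ["slot4"]

print(find_error_locus(scattered))     # None
print(find_error_locus(concentrated))  # slot2
```

A result of None is consistent with transient behavior, while a dominant source suggests that the errors deserve closer attention.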

Systematic Failures

Systematic errors are repeatable events: some hardware device or component is malfunctioning in such a
way that it persistently exhibits incorrect behavior. In the context of the ExtremeWare Advanced System
Diagnostics, the appearance of a checksum error message in the system log, for example, indicates
that the normal error detection mechanisms in the switch have detected that the data in a packet has
been modified inappropriately. While checksums provide a strong check of data integrity, checksum
errors must be qualified according to their risk to the system and by what you can do to resolve the
problem.
Systematic errors can be subdivided into two subgroups:
• Soft-state failures
• Permanent, or hard, failures

Soft-State Failures

These error events are characterized by a prolonged period of reported error messages and might or
might not be accompanied by noticeable degradation of network service. They require user
intervention to correct, but are resolved without replacing hardware.
Failures of this type result from software or hardware systems entering an abnormal operating state
in which normal switch operation might or might not be impaired.
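The distinctions drawn in this section can be summarized as a rough decision procedure. The sketch below is a minimal illustration under stated assumptions: the five-minute window separating "short duration" from "prolonged" is hypothetical, the function and parameter names are invented for this example, and treating errors that survive a non-hardware intervention as "permanent" is an assumption rather than a definition taken from this section.

```python
from datetime import datetime, timedelta
from typing import Optional, Sequence


def classify_failure(
    timestamps: Sequence[datetime],
    transient_window: timedelta = timedelta(minutes=5),  # hypothetical cutoff
    cleared_by_intervention: Optional[bool] = None,
) -> str:
    """Label a series of error events using the categories in this section.

    cleared_by_intervention: True if the errors stopped after a user action
    that did not replace hardware, False if they continued, None if no such
    action has been tried yet.
    """
    if not timestamps:
        return "no errors"
    duration = max(timestamps) - min(timestamps)
    if duration <= transient_window:
        return "transient"
    if cleared_by_intervention is None:
        return "systematic (unclassified)"
    # Assumption: errors that persist after intervention point to hardware.
    return "soft-state" if cleared_by_intervention else "permanent"


start = datetime(2024, 1, 1, 12, 0, 0)
burst = [start + timedelta(seconds=s) for s in (0, 10, 25)]
prolonged = [start + timedelta(minutes=m) for m in range(0, 120, 10)]

print(classify_failure(burst))                                     # transient
print(classify_failure(prolonged))                                 # systematic (unclassified)
print(classify_failure(prolonged, cleared_by_intervention=True))   # soft-state
```

The error timestamps would come from the system log; how they are extracted depends on the log format and is outside the scope of this sketch.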