Software Resiliency; Software Component Health Monitoring; System Health Monitoring; Failure And Event Logging - Dell S4820T Configuration Manual


Dell Networking OS supports graceful restart for the following protocols:
Border Gateway Protocol (BGP)
Open Shortest Path First (OSPF)
Protocol Independent Multicast sparse mode (PIM-SM)
Intermediate System to Intermediate System (IS-IS)

Software Resiliency

During normal operations, Dell Networking OS monitors the health of both hardware and software
components in the background to identify potential failures, even before these failures manifest.

Software Component Health Monitoring

Each line card and the stack unit run a number of software components. Dell
Networking OS periodically checks the health of each component by querying the status of
a flag, which the corresponding component must reset within a specified time.
If any health check on the stack unit fails, Dell Networking OS fails over to the standby stack unit. If any
health check on a line card fails, Dell Networking OS resets the card to bring it back to the correct state.
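
The flag-based check described above can be sketched as a simple polling loop. This is a minimal illustration, not Dell Networking OS code: the `Component` class, the `health_check` function, the location labels, and the action strings are all hypothetical names chosen for the example. It assumes the monitor arms the flag on each cycle and that a healthy component clears it before the next cycle.

```python
from dataclasses import dataclass


@dataclass
class Component:
    """A monitored software component (hypothetical model)."""
    name: str
    location: str        # "stack-unit" or "line-card" (assumed labels)
    flag: bool = False   # armed by the monitor, cleared by a healthy component

    def service(self) -> None:
        """A healthy component calls this within the check interval."""
        self.flag = False


def health_check(components):
    """Run one polling cycle and return a list of (name, action) pairs."""
    actions = []
    for c in components:
        if c.flag:
            # The flag is still armed from the previous cycle, so the
            # component missed its deadline and the check fails.
            if c.location == "stack-unit":
                actions.append((c.name, "failover-to-standby"))
            else:
                actions.append((c.name, "reset-line-card"))
        # Re-arm the flag for the next cycle.
        c.flag = True
    return actions
```

In this sketch, a component that never calls `service()` between two cycles is reported on the second cycle, and the recovery action depends only on where the component runs, mirroring the failover and reset behavior described above.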

System Health Monitoring

Dell Networking OS also monitors the overall health of the system.
Key parameters such as CPU utilization, free memory, and error counters (for example, CRC failures and
packet loss) are measured. When a parameter crosses its threshold, the system can use that measurement to initiate a recovery mechanism.
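
A threshold check of this kind might look like the following sketch. The parameter names, the limit values, and the `breached` function are assumptions made for illustration; the manual does not state the actual thresholds or how they are configured. Free memory is treated as a lower bound, while the other parameters are upper bounds.

```python
# Hypothetical limits: ("max", x) alarms above x, ("min", x) alarms below x.
LIMITS = {
    "cpu_utilization_pct": ("max", 85.0),
    "free_memory_mb": ("min", 128.0),
    "crc_errors": ("max", 100.0),
    "packets_lost": ("max", 1000.0),
}


def breached(sample, limits=LIMITS):
    """Return the names of parameters in `sample` that cross their limit."""
    out = []
    for name, (kind, limit) in limits.items():
        value = sample.get(name)
        if value is None:
            continue  # parameter not measured in this sample
        if (kind == "max" and value > limit) or (kind == "min" and value < limit):
            out.append(name)
    return out
```

A caller could pass each periodic measurement to `breached` and start a recovery action only when the returned list is non-empty.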

Failure and Event Logging

Dell Networking systems provide multiple options for logging failures and events.
Trace Log
Developers interlace messages with software code to track the execution of a program.
These messages are called trace messages. They are primarily used for debugging and provide
lower-level information than event messages, which system administrators primarily use. Dell Networking OS
retains executed trace messages for hardware and software and stores them in files (logs) on the internal
flash.
NV Trace Log: contains line card bootup trace messages that Dell Networking OS never overwrites.
It is stored in internal flash under the directory NVTRACE_LOG_DIR.
Trace Log: contains trace messages related to software and hardware events, state, and errors.
Trace Logs are stored in internal flash under the directory TRACE_LOG_DIR.
Crash Log: contains trace messages related to IPC and IRC timeouts and task crashes on line cards.
It is stored under the directory CRASH_LOG_DIR.
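
The split between the three logs can be summarized as a small lookup. This is only a sketch: the event labels and the `log_directory` function are hypothetical, and only the three directory names come from the text above.

```python
# Event labels are assumed for illustration; directory names are from the manual.
CRASH_EVENTS = {"ipc-timeout", "irc-timeout", "task-crash"}


def log_directory(event_kind: str) -> str:
    """Return the directory name where a trace message of this kind is kept."""
    if event_kind == "line-card-bootup":
        return "NVTRACE_LOG_DIR"   # NV Trace Log, never overwritten
    if event_kind in CRASH_EVENTS:
        return "CRASH_LOG_DIR"     # Crash Log
    return "TRACE_LOG_DIR"         # general software and hardware trace messages
```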