Dell Force10 S4810P Configuration Manual page 443

High-density, 1ru 48-port 10gbe switch

C-Series RPMs have one CPU, the Control Processor (CP). The CP on the RPM communicates with the LP
via IPC. As on the E-Series, the CP monitors the health of the other processors by sending heartbeat
messages. If any CPU fails to acknowledge a set number of consecutive heartbeat messages, or the CP itself
fails to send heartbeat messages (IPC timeout), the primary RPM requests a failover to the standby RPM,
and FTOS displays a message similar to Message 4.
Message 4 RPM Failover due to IPC Timeout
%RPM1-P:CP %IPC-2-STATUS: target rp2 not responding
%RPM0-S:CP %RAM-6-FAILOVER_REQ: RPM failover request from active peer: Auto failover on failure
%RPM0-S:CP %RAM-6-ELECTION_ROLE: RPM0 is transitioning to Primary RPM.
%RPM0-P:CP %TSM-6-SFM_SWITCHFAB_STATE: Switch Fabric: UP
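
The heartbeat rule described above (a timeout is declared after a run of consecutive unacknowledged heartbeats) can be illustrated with a minimal Python sketch. This is not FTOS code. The class and function names are hypothetical, and the threshold of 3 is an assumed value, since the manual does not state how many misses trigger a timeout.

```python
class HeartbeatMonitor:
    """Counts consecutive unacknowledged heartbeats for one monitored peer."""

    def __init__(self, max_missed: int = 3) -> None:
        # max_missed is a hypothetical threshold; the manual gives no value.
        self.max_missed = max_missed
        self.missed = 0

    def record(self, acked: bool) -> bool:
        """Record one heartbeat result; return True once a timeout is declared."""
        if acked:
            self.missed = 0  # any acknowledgement resets the consecutive count
        else:
            self.missed += 1
        return self.missed >= self.max_missed


def first_timeout(acks, max_missed=3):
    """Return the index of the heartbeat at which the timeout fires, or None."""
    monitor = HeartbeatMonitor(max_missed)
    for i, acked in enumerate(acks):
        if monitor.record(acked):
            return i
    return None
```

In this sketch a single acknowledgement clears the counter, so isolated losses never add up to a timeout. Only an unbroken run of misses does, which matches the "consecutive" wording above.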
In addition to IPC, the CP on each RPM sends heartbeat messages to the CP on its peer RPM via a
process called Inter-RPM Communication (IRC). If the primary RPM fails to acknowledge a set number of
consecutive heartbeat messages (IRC timeout), the standby RPM assumes the role of primary
RPM, and FTOS displays a message similar to Message 5.
Message 5 RPM Failover due to IRC Timeout
20:29:07: %RPM1-S:CP %IRC-4-IRC_WARNLINKDN: Keepalive packet 7 to peer RPM is lost
20:29:07: %RPM1-S:CP %IRC-4-IRC_COMMDOWN: Link to peer RPM is down
%RPM1-S:CP %RAM-4-MISSING_HB: Heartbeat lost with peer RPM. Auto failover on heart beat lost.
%RPM1-S:CP %RAM-6-ELECTION_ROLE: RPM1 is transitioning to Primary RPM.
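
When scanning logs for these failover messages, it can help to split each line into its fields. The Python sketch below uses a pattern inferred only from the sample messages shown in Message 4 and Message 5; it is not an official FTOS log grammar. The field names (`unit`, `role`, `cpu`, `facility`, `level`, `mnemonic`) are hypothetical labels, and reading `P`/`S` as primary/standby is an assumption based on how the samples change after the role transition.

```python
import re

# Pattern inferred from the sample messages; not an official FTOS grammar.
LOG_RE = re.compile(
    r"^(?:(?P<time>\d{2}:\d{2}:\d{2}): )?"             # optional hh:mm:ss prefix
    r"%(?P<unit>RPM\d)-(?P<role>[PS]):(?P<cpu>\w+) "   # e.g. %RPM1-S:CP
    r"%(?P<facility>[A-Z]+)-(?P<level>\d)-(?P<mnemonic>[A-Z_]+): "
    r"(?P<text>.*)$"
)


def parse_message(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_RE.match(line.strip())
    return match.groupdict() if match else None


def is_role_transition(line):
    """True if the line reports an RPM election role change (ELECTION_ROLE)."""
    fields = parse_message(line)
    return fields is not None and fields["mnemonic"] == "ELECTION_ROLE"
```

For example, `parse_message("%RPM1-P:CP %IPC-2-STATUS: target rp2 not responding")` yields the facility `IPC` and the mnemonic `STATUS`, with `time` set to `None` because that sample has no timestamp.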
IPC and IRC timeouts and failover behavior
IPC or IRC timeouts can occur because heartbeat messages and acknowledgements are lost or arrive out of
sequence, or a software or hardware failure occurs that impacts IPC or IRC.
Table 21-2, "Failover Behaviors," in High Availability describes the failover behavior for the possible
failure scenarios.

Table 21-2. Failover Behaviors

Platform: C-Series, E-Series
Failover Trigger: CP task crash on the primary RPM
Failover Behavior: The standby RPM detects the IRC timeout and initiates failover, and the failed RPM
reboots itself after saving a CP application core dump.

Platform: C-Series, E-Series
Failover Trigger: CP IRC timeout for a non-task crash reason on the primary RPM
Failover Behavior: The standby RPM detects the IRC timeout and initiates failover. FTOS saves a CP trace
log, the CP IPC-related system status, and a CP application core dump. Then the failed RPM reboots itself.

Platform: E-Series
Failover Trigger: RP task or kernel crash on the primary RPM
Failover Behavior: The CP on the primary RPM detects the RP IPC timeout and notifies the standby RPM.
The standby RPM initiates a failover. FTOS saves an RP application or kernel core dump, the CP trace
log, and the CP IPC-related system status. Then the new primary RPM reboots the failed RPM.
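
For scripts that triage failover events, the rows of Table 21-2 can be kept as a small lookup structure. The Python sketch below is only an illustration of that idea. The keys, field names, and the `triggers_for` helper are hypothetical, and the platform letters assume that the table's "c" and "e" markers stand for C-Series and E-Series.

```python
from typing import NamedTuple, Tuple


class FailoverCase(NamedTuple):
    platforms: Tuple[str, ...]   # "C" = C-Series, "E" = E-Series (assumed)
    detected_by: str             # which side notices the failure
    saved: Tuple[str, ...]       # diagnostics FTOS saves
    rebooted_by: str             # who reboots the failed RPM


# One entry per row of Table 21-2; the key names are hypothetical labels.
CASES = {
    "cp_task_crash": FailoverCase(
        ("C", "E"), "standby RPM (IRC timeout)",
        ("CP application core dump",), "failed RPM"),
    "cp_irc_timeout_non_task_crash": FailoverCase(
        ("C", "E"), "standby RPM (IRC timeout)",
        ("CP trace log", "CP IPC-related system status",
         "CP application core dump"), "failed RPM"),
    "rp_task_or_kernel_crash": FailoverCase(
        ("E",), "CP on primary RPM (RP IPC timeout)",
        ("RP application or kernel core dump", "CP trace log",
         "CP IPC-related system status"), "new primary RPM"),
}


def triggers_for(platform):
    """List the trigger keys that apply to a platform letter, in table order."""
    return [key for key, case in CASES.items() if platform in case.platforms]
```

Calling `triggers_for("C")` returns only the two CP-related triggers, because the table lists the RP task or kernel crash row for the E-Series alone.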