High availability overview
Communication interruptions can seriously affect widely deployed value-added services such as IPTV and
video conferencing. The underlying network infrastructure must therefore provide high availability.
The following are effective ways to improve availability:
• Increasing fault tolerance
• Speeding up fault recovery
• Reducing the impact of faults on services
Availability requirements
Availability requirements fall into three levels based on purpose and implementation, as shown in
Table 1.
Table 1 Availability requirements
Level  Requirement
1      Decrease system software and hardware faults
2      Protect system functions from being affected if faults occur
3      Enable the system to recover as fast as possible
You should consider the Level 1 availability requirement during the design and production process of
network devices. Consider Level 2 during network design. Finally, consider Level 3 during network
deployment, according to the network infrastructure and service characteristics.
Availability evaluation
Mean time between failures (MTBF) and mean time to repair (MTTR) are used to evaluate the availability
of a network.
MTBF
MTBF is the predicted elapsed time between inherent failures of a system during operation. It is typically
expressed in hours. A higher MTBF means higher availability.
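As a rough illustration, MTBF is often estimated from field data as total operating time divided by the number of failures observed. The sketch below assumes that estimator; the function name and the sample figures are hypothetical and do not come from this guide.

```python
def estimate_mtbf_hours(total_operating_hours: float, failure_count: int) -> float:
    """Estimate MTBF as operating hours per observed failure."""
    if failure_count <= 0:
        raise ValueError("at least one observed failure is needed for this estimate")
    return total_operating_hours / failure_count


# Hypothetical sample: 10 devices running 8,760 hours each, 4 failures in total.
mtbf = estimate_mtbf_hours(10 * 8760, 4)
print(f"Estimated MTBF: {mtbf:.0f} hours")
```

With few observed failures the estimate is noisy, so it is best treated as an indication rather than a guarantee.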
MTTR
MTTR is the average time required to repair a failed system. MTTR in a broad sense also involves spare
parts management and customer services.
MTTR = fault detection time + hardware replacement time + system initialization time + link recovery
time + routing time + forwarding recovery time. A smaller value of each item means a smaller MTTR and
a higher availability.
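The sum above can be sketched in code together with the commonly used steady-state relation availability = MTBF / (MTBF + MTTR). That relation is a standard reliability formula rather than something stated in this guide, and the class name, field names, and timings below are illustrative assumptions.

```python
from dataclasses import dataclass, fields


@dataclass
class RepairTimes:
    """MTTR components, all in seconds (hypothetical field names)."""
    fault_detection: float
    hardware_replacement: float
    system_initialization: float
    link_recovery: float
    routing: float
    forwarding_recovery: float

    def mttr_seconds(self) -> float:
        # MTTR is the sum of every component listed in the fields above.
        return sum(getattr(self, f.name) for f in fields(self))


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Illustrative values only, not measurements from any real device.
times = RepairTimes(
    fault_detection=30,
    hardware_replacement=3000,
    system_initialization=300,
    link_recovery=60,
    routing=180,
    forwarding_recovery=30,
)
mttr_hours = times.mttr_seconds() / 3600
print(f"MTTR: {mttr_hours:.2f} h")
print(f"Availability: {availability(20000, mttr_hours):.6f}")
```

Reducing any single component lowers the MTTR and raises the computed availability, which matches the statement that a smaller value for each item means higher availability.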
Table 1 Availability requirements (Solution column)
Level  Solution
1      • Hardware: Simplifying circuit design, enhancing production techniques, and performing reliability tests
       • Software: Reliability design and testing
2      Deploying device and link redundancy and switchover strategies
3      Using fault detection, diagnosis, isolation, and recovery technologies