Availability; Serviceability - Fujitsu Sun Oracle SPARC Enterprise M3000 Overview Manual

2.4.2 Availability

Availability represents the proportion of time during which the server is
accessible and usable. The operating ratio is used as its index.
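
As a rough illustration of the operating ratio, the sketch below uses the common definition of uptime divided by total observed time. The function names and the sample figures are assumptions made for this example; the manual does not give a formula or any measured values.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def operating_ratio(uptime_hours, downtime_hours):
    """Return the fraction of observed time during which the server was usable."""
    total = uptime_hours + downtime_hours
    if total <= 0:
        raise ValueError("total observed time must be positive")
    return uptime_hours / total


def yearly_downtime_hours(ratio):
    """Return the downtime per year implied by a given operating ratio."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1")
    return (1.0 - ratio) * HOURS_PER_YEAR


# Hypothetical year: 8750 hours of service and 10 hours of outage.
ratio = operating_ratio(8750, 10)
print(f"operating ratio: {ratio:.5f}")
print(f"implied downtime: {yearly_downtime_hours(ratio):.1f} hours/year")
```

Expressing the index as a ratio makes it easy to compare configurations, since every shortened reboot or avoided outage shows up directly as a smaller downtime term.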
Hardware and software problems in the system cannot be eliminated completely. To
provide high availability, the system must incorporate mechanisms that keep it
operating even if a failure occurs in hardware, such as components and devices,
or in software, such as the OS or business application software.
The M3000 server provides the following functions to implement high availability:
- Supports redundant configurations and active/hot replacement of power supply
  units and fan units
- Supports redundant configurations and active/hot replacement of hard disk
  drives by RAID technology
- Extends the range of automatic correction of temporary faults in memory, system
  buses, and LSI internal data
- Supports the enhanced retry function and degradation function for detected faults
- Shortens the system downtime by using automatic system reboot
- Shortens the time taken for system startup
- Collects fault information by the XSCF, and provides preventive maintenance
  using different types of warnings
- Supports the advanced ECC in the memory subsystem, which applies single-bit
  error correction so that processing can continue despite continuous burst read
  errors caused by memory device failures
- Supports the memory patrol function implemented in hardware, which detects and
  corrects memory errors without affecting software processing
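
The idea behind single-bit error correction and memory patrol can be sketched in software. The example below uses a textbook Hamming(7,4) code purely as a stand-in; it is not the ECC scheme used by the M3000 hardware, and the names `encode`, `correct`, `decode`, and `patrol` are hypothetical.

```python
def encode(nibble):
    """Encode 4 data bits into a 7-bit Hamming codeword (bit positions 1..7)."""
    d = [(nibble >> i) & 1 for i in range(4)]  # d[0] is the least significant bit
    code = [0] * 8  # index 0 is unused so that indices match bit positions
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]  # parity over positions 1, 3, 5, 7
    code[2] = code[3] ^ code[6] ^ code[7]  # parity over positions 2, 3, 6, 7
    code[4] = code[5] ^ code[6] ^ code[7]  # parity over positions 4, 5, 6, 7
    return code[1:]


def correct(word):
    """Return (corrected word, position of the flipped bit, or 0 if none)."""
    code = [0] + list(word)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos  # XOR of set-bit positions is 0 for a valid codeword
    if syndrome:
        code[syndrome] ^= 1  # a single-bit error sits at the syndrome position
    return code[1:], syndrome


def decode(word):
    """Extract the 4 data bits from a (corrected) codeword."""
    code = [0] + list(word)
    data = [code[3], code[5], code[6], code[7]]
    return sum(bit << i for i, bit in enumerate(data))


def patrol(memory):
    """Scan every stored word, repair single-bit errors in place, count repairs."""
    repaired = 0
    for addr, word in enumerate(memory):
        fixed, syndrome = correct(word)
        if syndrome:
            memory[addr] = fixed
            repaired += 1
    return repaired


memory = [encode(n) for n in range(16)]
memory[9][4] ^= 1  # inject a single-bit fault into one stored word
print("words repaired:", patrol(memory))
print("value at address 9:", decode(memory[9]))
```

In this sketch the patrol pass repairs the stored word before any reader sees the error, which mirrors the manual's point that correction happens without affecting software processing.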
In addition, combining the server with clustering software or operations
management software can provide even higher availability.
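
To show why redundancy and clustering raise availability, the following sketch estimates the combined availability of several units when any one of them is enough to keep the service running. It assumes independent failures and ignores failover time, which real clusters do not fully satisfy; the function name and the 99% figure are assumptions for illustration only.

```python
def redundant_availability(unit_availability, units):
    """Availability of `units` redundant units when one working unit suffices.

    Assumes failures are independent, so the service is down only when
    every unit is down at the same time.
    """
    if not 0.0 <= unit_availability <= 1.0:
        raise ValueError("unit_availability must be between 0 and 1")
    if units < 1:
        raise ValueError("units must be at least 1")
    return 1.0 - (1.0 - unit_availability) ** units


# Hypothetical node with 99% availability, alone and in small clusters.
for nodes in (1, 2, 3):
    print(nodes, "node(s):", f"{redundant_availability(0.99, nodes):.6f}")
```

The same reasoning applies to redundant power supply units and fan units inside a single server, under the same independence assumption.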
2.4.3 Serviceability

Serviceability is characterized by how easily a server fault can be diagnosed, and
how quickly the server can be recovered from the fault or how easily the fault can be
corrected.
To implement high serviceability, it must be possible to easily determine the
components or devices that caused faults. Furthermore, to recover from failures, the
system must be able to determine the cause of the failures and isolate the faulty