Reliability, Availability, And Serviceability - IBM System x3650 M5 Installation And Service Manual

Type 5462

Information Center at
http://pic.dhe.ibm.com/infocenter/director/pubs/index.jsp?topic=%2Fcom.ibm.director.main.helps.doc%2Ffqm0_main.html,
and the Systems Management website at http://www.ibm.com/systems/management/,
which presents an overview of IBM Systems Management and IBM Systems
Director.

Reliability, availability, and serviceability

Three important computer design features are reliability, availability, and
serviceability (RAS). The RAS features help to ensure the integrity of the data that
is stored in the server, the availability of the server when you need it, and the ease
with which you can diagnose and correct problems.
Your server has the following RAS features:
• 3-year parts and 3-year labor limited warranty (Machine Type 5462)
• 24-hour support center
• Automatic error retry and recovery
• Automatic restart on nonmaskable interrupt (NMI)
• Automatic restart after a power failure
• Backup basic input/output system switching under the control of the integrated
  management module (IMM)
• Built-in monitoring for fan, power, temperature, voltage, and power-supply
  redundancy
• Cable-presence detection on most connectors
• Chipkill memory protection
• Double-device data correction (DDDC) for x4 DRAM technology DIMMs.
  Ensures that data is available on a single x4 DRAM DIMM after a hard failure of
  up to two DRAM DIMMs. One x4 DRAM DIMM in each rank is reserved as a
  spare device.
• Diagnostic support for ServeRAID and Ethernet adapters
• Error codes and messages
• Error correcting code (ECC) L3 cache and system memory
• Full Array Memory Mirroring (FAMM) redundancy
• Hot-swap cooling fans with speed-sensing capability
• Hot-swap hard disk drives
• Information and LCD system information display panel
• Integrated Management Module (IMM)
• LCD system information display panel for memory DIMMs, microprocessors,
  hard disk drives, solid state drives, power supplies, and fans
• Memory mirroring and memory sparing support
• Memory error correcting code and parity test
• Memory down sizing (non-mirrored memory). When the server restarts after
  the memory controller has detected a non-mirrored uncorrectable error from which
  it cannot recover operationally, the IMM logs the uncorrectable error and
  informs POST. POST logically maps out the memory with the uncorrectable error,
  and the server restarts with the remaining installed memory.
• Menu-driven setup, system configuration, and redundant array of independent
  disks (RAID) configuration programs
• Microprocessor built-in self-test (BIST), internal error signal monitoring, internal
  thermal trip signal monitoring, configuration checking, and microprocessor and
  voltage regulator module failure identification through LCD system information
  display panel
• Nonmaskable interrupt (NMI) button
• Parity checking on the small computer system interface (SCSI) bus and PCI-E
  and PCI buses
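
The memory down-sizing behavior in the feature list can be illustrated with a short sketch. This is not IBM firmware code. It is a simplified Python model, and the Dimm class, the uncorrectable_error flag, and the usable_memory_after_restart function are hypothetical names introduced only for illustration. It also assumes that memory is mapped out per DIMM, which the list above does not specify.

```python
from dataclasses import dataclass


@dataclass
class Dimm:
    """Hypothetical model of one installed DIMM (not an IBM data structure)."""
    slot: int
    size_gb: int
    # Assumed flag: set when an uncorrectable error was logged for this DIMM.
    uncorrectable_error: bool = False


def usable_memory_after_restart(dimms):
    """Split DIMMs into those kept and those mapped out after the restart.

    Assumption: mapping out happens per DIMM; real granularity may differ.
    """
    usable = [d for d in dimms if not d.uncorrectable_error]
    mapped_out = [d for d in dimms if d.uncorrectable_error]
    return usable, mapped_out


# Example: three 16 GB DIMMs, with an uncorrectable error logged on slot 2.
dimms = [Dimm(1, 16), Dimm(2, 16, uncorrectable_error=True), Dimm(3, 16)]
usable, mapped_out = usable_memory_after_restart(dimms)
total_gb = sum(d.size_gb for d in usable)
print(f"Restarting with {total_gb} GB; mapped out slots: "
      f"{[d.slot for d in mapped_out]}")
```

In this model the server continues with the two healthy DIMMs, which mirrors the statement that the server restarts with the remaining installed memory.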
Chapter 1. The System x3650 M5 server