Reliability, Availability, And Serviceability - IBM System x3850 X6 Installation And Service Manual

Reliability, availability, and serviceability

This topic provides an overview of the server's reliability, availability, and
serviceability (RAS) features.

Reliability, availability, and serviceability (RAS) are three important computer
design features. The RAS features help to ensure the integrity of the data that
is stored in the server, the availability of the server when you need it, and the ease
with which you can diagnose and correct problems.

Your server has the following RAS features:
• 3-year parts and 3-year labor limited warranty (Machine Type 3837) or 4-year
  parts and 4-year labor limited warranty (Machine Type 3839)
• 24-hour support center
• Automatic error retry and recovery
• Automatic restart on nonmaskable interrupt (NMI)
• Automatic restart after a power failure
• Backup basic input/output system switching under the control of the integrated
  management module (IMM)
• Built-in monitoring for fan, power, temperature, voltage, and power-supply
  redundancy
• Cable-presence detection on most connectors
• Chipkill memory protection
• Corrected machine check interrupt (CMCI)
• Single-device data correction (SDDC) for x4 DRAM technology DIMMs
  (available on 16 GB DIMMs only). Ensures that data is available on a single x4
  DRAM DIMM after a hard failure of up to two DRAM DIMMs. One x4 DRAM
  DIMM in each rank is reserved as a spare device.
• Diagnostic support for ServeRAID and Ethernet adapters
• DRAM single device data correction (SDDC)
• Dynamic memory migration
• Enhanced DRAM single device data correction (SDDC+1)
• Enhanced DRAM double device data correction (DDDC+1)
• Error codes and messages
• Error correcting code (ECC) L3 cache and system memory
• Failed DIMM identification
• Full Array Memory Mirroring (FAMM) redundancy
• Hot-swap cooling fans with speed-sensing capability
• Hot-swap hard disk drives
• Hot-swap and redundant power supplies
• Integrated baseboard management controller (BMC) subsystem
• Integrated management module (IMM)
• LCD system information display panel
• Light path LEDs for DIMMs, microprocessors, PCIe adapters, hard disk drives,
  solid state drives, power supplies, fans, PCIe modules, and I/O modules
• Memory address parity protection
• Memory demand and patrol scrubbing
• Memory error correcting code and parity test
• Memory downsizing (non-mirrored memory). If the memory controller detects a
  non-mirrored uncorrectable error and cannot recover operationally, the IMM
  logs the uncorrectable error and informs POST. POST logically maps out the
  memory with the uncorrectable error, and the server restarts with the
  remaining installed memory.
• Memory mirroring and memory rank sparing support
• Memory thermal throttling

System x3850 X6 and x3950 X6 Types 3837 and 3839: Installation and Service Guide
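
The memory downsizing behavior described in the list above can be pictured with a small conceptual model. The following Python sketch is only an illustration and is not IBM firmware or an IMM interface: the class names, the method names, the log message text, and the choice to map out memory one whole DIMM at a time are all assumptions made for the example. It shows the sequence the manual describes: the error is logged, the affected memory is logically mapped out, and the server continues with the remaining installed memory.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Dimm:
    """Hypothetical model of one installed DIMM."""
    slot: int
    size_gb: int
    mapped_out: bool = False


@dataclass
class Server:
    """Hypothetical model of a server with non-mirrored memory."""
    dimms: List[Dimm]
    event_log: List[str] = field(default_factory=list)

    def usable_memory_gb(self) -> int:
        # Only memory that has not been mapped out is available after restart.
        return sum(d.size_gb for d in self.dimms if not d.mapped_out)

    def handle_uncorrectable_error(self, slot: int) -> None:
        # Step 1 (IMM role): log the uncorrectable error.
        self.event_log.append(f"Uncorrectable memory error in DIMM slot {slot}")
        # Step 2 (POST role): logically map out the affected memory.
        # Assumption: the granularity here is a whole DIMM.
        for dimm in self.dimms:
            if dimm.slot == slot:
                dimm.mapped_out = True


# Four 16 GB DIMMs; an uncorrectable error is reported for slot 3.
server = Server([Dimm(slot=n, size_gb=16) for n in range(1, 5)])
server.handle_uncorrectable_error(slot=3)
print(server.usable_memory_gb())  # 48
```

In this model the server comes back with 48 GB of the original 64 GB, and the event log keeps one entry identifying the slot that was mapped out.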