Reliability, Availability, And Serviceability - IBM x3850 X5 7145 Installation And User Manual

  The DSA Preboot diagnostic programs are stored in integrated USB memory and
  collect and analyze system information to aid in diagnosing server problems. The
  diagnostic programs collect the following information about the server:
  – Event logs for ServeRAID controllers and service processors
  – Hard disk drive health
  – Installed hardware
  – Light path diagnostics status
  – Network interfaces and settings
  – RAID controller configuration
  – Service processor status and configuration
  – System configuration
  – Vital product data, firmware, and UEFI configuration
  For additional information about DSA, see the Problem Determination and
  Service Guide on the IBM Documentation CD.
• Redundant connection
  The addition of an optional network interface card (NIC) provides a failover
  capability to a redundant Ethernet connection. If a problem occurs with the
  primary Ethernet connection, all Ethernet traffic that is associated with the
  primary connection is automatically switched to the redundant NIC. If the
  applicable device drivers are installed, this switching occurs without data loss and
  without user intervention.
• Redundant cooling and power capabilities
  The redundant cooling of the fans in the server enables continued operation if
  one of the fans fails. The server supports up to two hot-swap power supplies,
  which provide redundant power for many server configurations.
• ServeRAID support
  The server supports ServeRAID controllers to create redundant array of
  independent disks (RAID) configurations.
• Symmetric multiprocessing (SMP)
  The server supports up to four multi-core Intel Xeon microprocessors. One or
  more multi-core microprocessors provides SMP capability.

Reliability, availability, and serviceability

Three important server design features are reliability, availability, and serviceability
(RAS). The RAS features help to ensure the integrity of the data that is stored in
the server, the availability of the server when you need it, and the ease with which
you can diagnose and correct problems.
The server has the following RAS features:
• Advanced memory features:
  – Single-bit memory error detection
  – Single-bit memory error hardware correction
  – Multi single-bit memory error recovery and corrections
  – Uncorrectable error (UE) detection
  – Full array memory mirroring (FAMM) redundancy
  – Automatic failover recovery for UEs when FAMM is configured
  – Automated logical removal of failed DIMMs on reboots prior to replacement
  – Automatic address parity checking during writes and reads
• Automatic BIOS recovery (ABR) for UEFI
• Automatic error retry and recovery
• Automatic restart after a power failure
• Availability of microcode and diagnostic levels
• Integrated management module (service processor)
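
The advanced memory features listed for the server include single-bit error correction and uncorrectable error (UE) detection. The following Python sketch illustrates the general idea with a toy extended Hamming code over 4 data bits. It is an illustration only and does not describe the ECC scheme that the server hardware actually uses; the function names encode and decode are hypothetical.

```python
def encode(nibble):
    """Encode 4 data bits (0-15) into an 8-bit codeword (list of bits)."""
    d = [(nibble >> i) & 1 for i in range(4)]  # d[0] is the least significant bit
    w = [0] * 8                                # index 0 holds the overall parity bit
    w[3], w[5], w[6], w[7] = d                 # data bits sit at positions 3, 5, 6, 7
    w[1] = w[3] ^ w[5] ^ w[7]                  # parity over positions 1, 3, 5, 7
    w[2] = w[3] ^ w[6] ^ w[7]                  # parity over positions 2, 3, 6, 7
    w[4] = w[5] ^ w[6] ^ w[7]                  # parity over positions 4, 5, 6, 7
    w[0] = sum(w[1:]) % 2                      # makes the total number of 1 bits even
    return w


def decode(word):
    """Return (nibble, status); status is 'ok', 'corrected', or 'uncorrectable'."""
    w = list(word)
    syndrome = 0
    for pos in range(1, 8):
        if w[pos]:
            syndrome ^= pos                    # XOR of the positions of all set bits
    odd_parity = sum(w) % 2                    # 1 when an odd number of bits flipped
    if syndrome == 0 and odd_parity == 0:
        status = "ok"
    elif odd_parity == 1:
        # Single-bit error: the syndrome names the flipped position
        # (0 means the overall parity bit itself flipped).
        w[syndrome] ^= 1
        status = "corrected"
    else:
        # Nonzero syndrome with even parity: two bits flipped, cannot be repaired.
        return None, "uncorrectable"
    nibble = w[3] | (w[5] << 1) | (w[6] << 2) | (w[7] << 3)
    return nibble, status


if __name__ == "__main__":
    word = encode(0b1011)
    word[6] ^= 1                               # simulate a single-bit memory error
    print(decode(word))
```

In this sketch, a single flipped bit is repaired silently, while two flipped bits are reported as uncorrectable, which is the kind of event that a mirrored memory configuration is intended to recover from.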
Chapter 1. The System x3850 X5 and x3950 X5 server

This manual is also suitable for:

x3850 X5 7146, x3950 X5 7145, x3950 X5 7146