Reliability, Availability, And Serviceability - IBM System x3755 M3 7164 Installation And User Manual

Reliability, availability, and serviceability

Three important server design features are reliability, availability, and serviceability
(RAS). The RAS features help to ensure the integrity of the data that is stored in
the server, the availability of the server when you need it, and the ease with which
you can diagnose and correct problems.
The server has the following RAS features:
• Advanced memory features:
  – Single-bit memory error detection
  – Single-bit memory error hardware correction
  – Multi single-bit memory error recovery and corrections
  – Uncorrectable error (UE) detection
  – Full array memory mirroring (FAMM) redundancy
  – Automatic failover recovery for UEs when FAMM is configured
  – Automated logical removal of failed DIMMs on reboots prior to replacement
  – Automatic address parity checking during writes and reads
• Automatic BIOS recovery (ABR) for UEFI
• Automatic error retry and recovery
• Automatic restart after a power failure
• Availability of microcode and diagnostic levels
• Integrated baseboard management controller (service processor)
• Built-in, menu-driven electrically erasable programmable ROM (EEPROM) based
  setup, system configuration, and diagnostic programs
• Built-in monitoring for fan, power, temperature, voltage, and power-supply
  redundancy
• Error codes and messages
• Error correcting code (ECC) L2 cache and system memory
• Fault-resistant startup
• Hot-swap hard disk drives
• IBM Systems Director workgroup-hardware-management tool
• Information and light path diagnostics LED panels
• Service processor adapter for remote systems management
• Parity checking on the SAS bus and PCI Express buses
• Power managed and Advanced Configuration and Power Interface (ACPI)
  compliant
• Power-on self-test (POST)
• Predictive Failure Analysis (PFA) alerts
• Redundant hot-swap capability:
  – Cooling fans with speed-sensing capability (depending on the model)
  – Power supplies
• Remind button to temporarily flash the system-error LED
• Remote system problem-determination support
• ROM-based diagnostic programs
• Standby voltage for systems-management features and monitoring
• Startup (boot) from LAN using Preboot Execution Environment (PXE) protocol
• System auto-configuring from the configuration menu
• System error logging
• Upgradeable microcode for POST, iBMC, diagnostics, service processor, and
  read-only memory (ROM) resident code, locally or over the LAN
• Vital product data (VPD) on microprocessors, system boards, power supplies,
  and SAS (hot-swap-drive) backplane
• Wake on LAN capability
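
The memory features listed above distinguish single-bit errors, which the hardware corrects, from uncorrectable errors (UEs), which it can only detect. The following Python sketch illustrates that distinction with a textbook extended Hamming code over 4 data bits. It is an illustration only: this guide does not describe the ECC scheme that the server actually uses, and the function names `encode` and `decode` are hypothetical.

```python
from functools import reduce
from operator import xor


def encode(data):
    """Encode 4 data bits into an 8-bit word (extended Hamming code).

    Illustrative layout (an assumption, not the server's real scheme):
    index 0 holds an overall parity bit, indexes 1, 2, and 4 hold Hamming
    parity bits, and indexes 3, 5, 6, and 7 hold the data bits.
    """
    word = [0] * 8
    word[3], word[5], word[6], word[7] = data
    word[1] = word[3] ^ word[5] ^ word[7]
    word[2] = word[3] ^ word[6] ^ word[7]
    word[4] = word[5] ^ word[6] ^ word[7]
    word[0] = reduce(xor, word[1:])  # makes the parity of all 8 bits even
    return word


def decode(word):
    """Return (data_bits, status) for a possibly corrupted 8-bit word."""
    word = list(word)
    # The syndrome is the XOR of the indexes (1..7) of all set bits.
    # It is 0 for a valid word and equals the index of a single flipped bit.
    syndrome = 0
    for pos in range(1, 8):
        if word[pos]:
            syndrome ^= pos
    overall = reduce(xor, word)  # 1 when an odd number of bits flipped

    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Single-bit error: flip it back. A syndrome of 0 here means the
        # overall parity bit at index 0 is the one that flipped.
        word[syndrome] ^= 1
        status = "corrected"
    else:
        # Nonzero syndrome with even overall parity: two bits flipped.
        # The error is detected but cannot be corrected (a UE).
        status = "uncorrectable"
    return [word[3], word[5], word[6], word[7]], status
```

In this sketch, flipping any one bit of an encoded word still yields the original data with the status `corrected`, while flipping any two bits yields the status `uncorrectable`.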
