Troubleshooting The Server Memory; Memory Dimm Load Order; Memory Subsystem Behaviors; Customer Messaging Policy - HP Integrity rx2800 - i2 User's & Service Manual

Rx2800 i2 user service guide

Table 36 CPU events that may light SID LEDs (continued)

Diagnostic LEDs   Sample IPMI Events
CPUs              Type E0h, 33d:26d BOOT_CPU_EARLY_TEST_FAIL
CPUs              Type 02h, 25h:71h:80h MISSING_FRU_DEVICE

Troubleshooting the server memory

Memory DIMM load order

For a minimally loaded server, two equal-size DIMMs must be installed in the DIMM slots, as specified
in Table 14 (page 35).

Memory subsystem behaviors

The CPU and its integrated memory controller provide increased DIMM reliability. The memory
controller built into the 9300 series CPU doubles memory rank error correction from 4 bytes to 8
bytes of a 128 byte cache line, during cache line misses initiated by CPU cache controllers and
by Direct Memory Access (DMA) operations initiated by I/O devices. This feature is called double
DRAM sparing, because 2 of the 72 DRAMs in any DIMM pair can fail without any loss of server
performance.
Corrective action (DIMM or memory expander replacement) is required in three cases: when a
threshold is reached for multiple double-byte errors from one or more DIMMs in the same rank,
when any uncorrectable memory error (more than 2 bytes) occurs, or when no pair of like DIMMs is
loaded in rank 0 of side 0. All other causes of memory DIMM errors are corrected by the CPU and
reported to the Page Deallocation Table (PDT) / diagnostic LED panel.
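
The three corrective-action conditions described above can be sketched as a small decision function. This is only an illustration, not the firmware's actual logic: the names RankStatus and needs_corrective_action are hypothetical, and the threshold value is a placeholder because the manual does not state the real one.

```python
from dataclasses import dataclass

# Placeholder value: the manual does not give the actual threshold.
DOUBLE_BYTE_THRESHOLD = 3


@dataclass
class RankStatus:
    """Hypothetical summary of one memory rank's health."""
    double_byte_errors: int          # double-byte errors counted in this rank
    max_error_bytes: int             # widest single error observed, in bytes
    like_pair_in_rank0_side0: bool   # True if a matched DIMM pair is loaded there


def needs_corrective_action(status: RankStatus,
                            threshold: int = DOUBLE_BYTE_THRESHOLD) -> bool:
    """Return True when any of the three documented conditions holds."""
    if not status.like_pair_in_rank0_side0:
        return True                  # no pair of like DIMMs in rank 0 of side 0
    if status.max_error_bytes > 2:
        return True                  # uncorrectable error (more than 2 bytes)
    # Threshold reached for multiple double-byte errors in the same rank.
    return status.double_byte_errors >= threshold


if __name__ == "__main__":
    print(needs_corrective_action(RankStatus(1, 2, True)))
    print(needs_corrective_action(RankStatus(0, 3, True)))
```

Anything that does not trigger one of these conditions falls into the "corrected by the CPU" case and is only reported.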

Customer messaging policy

Light a diagnostic LED for a memory DIMM error only when the error is isolated to a specific
memory DIMM. If there is any uncertainty about which DIMM is at fault, point the customer to the
SEL for any action and do not light the suspect DIMM CRU LED on the System Insight Display.
For configuration errors, for example, no DIMMs installed in 0A and 0B, follow the HP ProLiant
policy of lighting the CRU LEDs on the diagnostic LED panel for all of the DIMMs that are
missing.
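
The LED policy above amounts to a three-way choice, which the following sketch expresses. It is an illustrative reading of the policy only: the function name leds_to_light and its parameters are hypothetical, and slot labels other than 0A and 0B are made up for the example.

```python
def leds_to_light(isolated_dimm=None, missing_dimms=()):
    """Return (slots whose CRU LED should be lit, refer_customer_to_sel).

    isolated_dimm: slot label if the error was isolated to one DIMM, else None.
    missing_dimms: slot labels reported missing by a configuration error.
    """
    if missing_dimms:
        # Configuration error: light the LED of every missing DIMM.
        return sorted(missing_dimms), False
    if isolated_dimm is not None:
        # Error isolated to a specific DIMM: light only that LED.
        return [isolated_dimm], False
    # Uncertain isolation: light nothing and point the customer to the SEL.
    return [], True


if __name__ == "__main__":
    print(leds_to_light(missing_dimms={"0A", "0B"}))
    print(leds_to_light(isolated_dimm="1A"))   # "1A" is a made-up slot label
    print(leds_to_light())
```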
No diagnostic messages are reported for single-byte errors that are corrected in both ICH10
caches and DIMMs during corrected platform error (CPE) events. Diagnostic messages are
reported for CPE events when thresholds are exceeded for both single-byte and double-byte
errors; all fatal memory subsystem errors cause global MCA events.
PDT logs for all double-byte errors are permanent; single-byte errors are initially logged as
transient errors. If the server logs 2 single-byte errors within 24 hours, they are upgraded to
permanent in the PDT.
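
The PDT logging rule above can be modeled roughly as follows. This is a sketch under stated assumptions, not the actual PDT format: the PdtLog class and its fields are hypothetical, errors are assumed to arrive in chronological order, and the 24-hour rule is applied server-wide because the text does not say whether it is tracked per DIMM or per address.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(hours=24)   # window stated in the policy


class PdtLog:
    """Hypothetical in-memory model of PDT entries."""

    def __init__(self):
        self.entries = []      # each entry: {"time", "double_byte", "state"}

    def record(self, when: datetime, double_byte: bool) -> str:
        """Log one corrected error and return the state it was given."""
        if double_byte:
            state = "permanent"            # double-byte errors are always permanent
        else:
            # Earlier single-byte errors that fall within 24 hours of this one.
            recent = [e for e in self.entries
                      if not e["double_byte"]
                      and timedelta(0) <= when - e["time"] <= WINDOW]
            state = "permanent" if recent else "transient"
            for e in recent:
                e["state"] = "permanent"   # upgrade the earlier transient entries
        self.entries.append({"time": when, "double_byte": double_byte,
                             "state": state})
        return state


if __name__ == "__main__":
    log = PdtLog()
    start = datetime(2020, 1, 1)
    print(log.record(start, double_byte=False))
    print(log.record(start + timedelta(hours=2), double_byte=False))
```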
Table 36 (continued): Cause, Source, and Notes columns

Sample IPMI Event          Cause                                           Source
BOOT_CPU_EARLY_TEST_FAIL   A logical CPU (thread) failed early self test   SFW
MISSING_FRU_DEVICE         No physical CPU cores present                   BMC

Notes: Possible seating or failed CPU
