Customer Messaging Policy - HP AB464-9003F Service Manual

HPE Integrity rx6600 server user

Customer Messaging Policy

Light a diagnostic LED for a memory DIMM error only when the error is isolated to a specific
DIMM. If there is any uncertainty about which DIMM is at fault, point the customer to the
system event log (SEL) for any action and do not light the suspect DIMM CRU LED on the
diagnostic panel.
For configuration-style errors (for example, no memory DIMMs installed in rank 0 of side 0),
follow the Hewlett Packard Enterprise policy of lighting the CRU LEDs on the diagnostic
LED panel for all of the DIMMs that are missing.
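The two LED rules above can be sketched as a small decision function. This is only an illustration of the policy as written, not HPE firmware code: the `DimmFault` record, its field names, and the slot labels such as "0A" are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DimmFault:
    """Hypothetical fault description; not an HPE data structure."""
    kind: str                              # "runtime" or "configuration"
    isolated_dimm: Optional[str] = None    # set only when isolation is certain
    missing_dimms: List[str] = field(default_factory=list)


def leds_to_light(fault: DimmFault) -> List[str]:
    """Return the DIMM CRU LEDs to light for a fault, per the policy text."""
    if fault.kind == "configuration":
        # Configuration-style error: light the LEDs of all missing DIMMs.
        return list(fault.missing_dimms)
    if fault.isolated_dimm is not None:
        # Error isolated to one specific DIMM: light only that LED.
        return [fault.isolated_dimm]
    # Any uncertainty: light nothing and refer the customer to the SEL.
    return []


def refer_to_sel(fault: DimmFault) -> bool:
    """True when the customer should be pointed to the SEL instead of an LED."""
    return fault.kind != "configuration" and fault.isolated_dimm is None
```

For example, a runtime error with no isolated DIMM yields an empty LED list and `refer_to_sel` returns `True`, while a configuration error with two missing DIMMs lights both of their LEDs.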
No diagnostic messages are reported for single-byte errors that are corrected in both zx2
caches and memory DIMMs during corrected platform error (CPE) events. Diagnostic
messages are reported for CPE events when thresholds are exceeded for both single-byte
and double-byte errors. All fatal memory subsystem errors cause global machine check
abort (MCA) events.
PDT entries for all double-byte errors are permanent; single-byte errors are initially logged
as transient errors. If the server logs two single-byte errors within 24 hours, upgrade them to
permanent in the PDT.
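The transient-to-permanent rule can be illustrated with a short sketch. It assumes a simple in-memory list of entries; the `PdtEntry` record and `log_error` function are hypothetical names, and the source does not say whether the 24-hour count applies per page, per DIMM, or server-wide, so grouping entries by scope is left to the caller. The 24-hour boundary is treated as inclusive, which is also an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List

UPGRADE_WINDOW = timedelta(hours=24)  # window stated in the policy text


@dataclass
class PdtEntry:
    """Hypothetical PDT record; field names are illustrative only."""
    timestamp: datetime
    double_byte: bool
    permanent: bool = False


def log_error(entries: List[PdtEntry], timestamp: datetime,
              double_byte: bool) -> PdtEntry:
    """Append an entry and apply the transient/permanent rules."""
    # Double-byte errors are permanent from the start; single-byte start transient.
    entry = PdtEntry(timestamp, double_byte, permanent=double_byte)
    entries.append(entry)
    if not double_byte:
        # Single-byte entries at most 24 hours older than the new one (itself included).
        recent = [e for e in entries
                  if not e.double_byte
                  and timedelta(0) <= timestamp - e.timestamp <= UPGRADE_WINDOW]
        if len(recent) >= 2:
            for e in recent:
                e.permanent = True
    return entry
```

A single-byte error logged alone stays transient; a second one logged within 24 hours of it marks both as permanent, while older single-byte entries outside the window are left unchanged.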
Table 54 lists the memory subsystem events that light the diagnostic panel LEDs.

Table 54 Memory Subsystem Events That Light Diagnostic Panel LEDs

| Diagnostic LEDs | Sample IPMI Events | Cause | Source | Notes |
|---|---|---|---|---|
| Memory Carriers | Type 02h, 02h:07h:03h VOLTAGE_DEGRADES_TO_NON_RECOVERABLE | Voltage on memory expander is inadequate | BMC | A voltage on the memory expander is out of range (likely too low) |
| DIMMs | Type E0h, 208d:04d MEM_NO_DIMMS_INSTALLED | No memory DIMMs installed (in rank 0 of cell 0) | SFW | Light all DIMM LEDs in rank 0 of cell 0 |
| DIMMs | Type E0h, 172d:04d MEM_DIMM_SPD_CHECKSUM | A DIMM has a serial presence detect (SPD) EEPROM with a bad checksum | SFW | Either EEPROM is misprogrammed or this DIMM is incompatible |
| DIMMs | Type E0h, 4652d:26d WIN_AGT_PREDICT_MEM_FAIL | This memory rank is correcting too many single-bit errors | WIN Agent | Memory rank is about to fail or environmental conditions are causing more errors than usual |

Table 55 lists the memory subsystem events that may light the diagnostic panel LEDs.

Table 55 Memory Subsystem Events That May Light Diagnostic Panel LEDs

| Diagnostic LEDs | Sample IPMI Events | Cause | Source | Notes |
|---|---|---|---|---|
| Processor Carrier | Type E0h, 189d:26d MEM_ERR_LOG_FAILED_TO_CLEAR | Unable to clear the platform error logs in CEC | SFW | |
| Processor Carrier | Type E0h, 181d:26d MEM_ECC_MBE_SIGNAL_TST_FAILED | Self-test of CEC multi-bit error signaling has failed | SFW | |
| Processor Carrier | Type E0h, 160d:26d MEM_BIB_REG_FAILURE | The CEC failed the register test | SFW | |
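The sample IPMI event identifiers in the tables mix two number notations, for example `Type 02h, 02h:07h:03h` and `Type E0h, 208d:04d`. The sketch below converts such a string to integers, assuming the conventional reading that an `h` suffix marks hexadecimal and a `d` suffix marks decimal. The tables do not say what the individual colon-separated fields mean, so the code only converts notation and assigns no meaning to them; `parse_number` and `parse_event` are hypothetical helper names.

```python
import re
from typing import List, Tuple

# Digits followed by a one-letter radix suffix: "E0h" (hex) or "208d" (decimal).
_TOKEN = re.compile(r"^([0-9A-Fa-f]+)([hd])$")


def parse_number(token: str) -> int:
    """Convert a suffixed token such as 'E0h' or '208d' to an int."""
    match = _TOKEN.match(token.strip())
    if match is None:
        raise ValueError(f"unrecognized token: {token!r}")
    digits, suffix = match.groups()
    return int(digits, 16 if suffix == "h" else 10)


def parse_event(text: str) -> Tuple[int, List[int]]:
    """Split 'Type E0h, 208d:04d' into (type value, list of field values)."""
    head, sep, tail = text.partition(",")
    parts = head.split()
    if sep != "," or len(parts) != 2 or parts[0] != "Type":
        raise ValueError(f"unrecognized event string: {text!r}")
    fields = [parse_number(tok) for tok in tail.strip().split(":")]
    return parse_number(parts[1]), fields


print(parse_event("Type E0h, 208d:04d"))
print(parse_event("Type 02h, 02h:07h:03h"))
```

Under these assumptions, `Type E0h, 208d:04d` becomes type 224 with fields 208 and 4, which can make it easier to match the table entries against tools that display event data in a single radix.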