System Event Log Troubleshooting Guide for EPSD Platforms Based on Intel® Xeon® Processor E5 4600/2600/2400/1600/1400 Product Families
Intel order number G90620-002
Revision 1.1, September 2013
Enterprise Platforms and Services Division – Marketing...
Revision History
Date            Revision Number   Modifications
January 2013                      Initial release
September 2013                    Added MIC Thermal Margin sensors C4 through C7.
INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. A "Mission Critical Application" is any application in which failure of the Intel Product could result, directly or indirectly, in personal injury or death. SHOULD YOU PURCHASE OR USE INTEL'S PRODUCTS FOR ANY SUCH MISSION...
Table of Contents
1. Introduction
Purpose
Industry Standard
1.2.1 Intelligent Platform Management Interface (IPMI)
1.2.2...
5.2.1 Threshold-based Temperature Sensors
5.2.2 Thermal Margin Sensors
5.2.3 Processor Thermal Control Sensors
5.2.4...
9.2.1 System Firmware Progress (Formerly Post Error) – Next Steps
10. Chassis Subsystem
10.1 Physical Security
13.5.1 Node Manager Alert Threshold Exceeded – Next Steps
14. Microsoft Windows* Records
14.1 Boot up Event Records
List of Tables
Table 1. SEL Record Format
Table 2: Event Request Message Event Data Field Contents
Table 3: OEM SEL Record (Type C0h-DFh)
Table 39: Thermal Margin Sensors Event Triggers – Description
Table 40: Thermal Margin Sensors – Next Steps
Table 41: Processor Thermal Control Sensors Typical Characteristics
Table 79: SMI Timeout Sensor Typical Characteristics
Table 80: System Event Log Cleared Sensor Typical Characteristics
Table 81: System Event –...
W2600CR Workstation Boards
Purpose
This document lists all events that the Intel platform can generate. Other sources outside Intel's control may also generate events; those events are not described in this document.
Industry Standard
1.2.1 Intelligent Platform Management Interface (IPMI)
The key characteristic of the Intelligent Platform Management Interface (IPMI) is that the inventory, monitoring, logging, and recovery control functions are available independently of the main processors, BIOS, and operating system.
The BMC allows access to the SEL through both in-band and out-of-band mechanisms. Various tools and utilities can be used to access the SEL, including the Intel® SELView utility and multiple open-source IPMI tools.
Basic Decoding of a SEL Record
The System Event Log (SEL) record format is defined in the IPMI Specification. The following section provides a basic definition for each of the fields in a SEL.

Byte  Field  Description
Generator ID (GID)
  RqSA and LUN if event was generated from IPMB.
  Software ID if event was generated from system software.

Byte  Field  Description
[6:0] – Event Type Codes
  01h = Threshold (States = 0x00-0x0b)
  02h-0ch = Discrete...
Table 2: Event Request Message Event Data Field Contents
Sensor Class    Event Data
Threshold       Event Data 1
                [7:6] –...

Sensor Class    Event Data
                11b = Reserved
                [5:4] – 00b = Unspecified Event Data 3...
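As a minimal sketch of how the bit fields quoted in these tables can be separated, the following Python helpers split the Event Direction and Event Type byte and the Event Data 1 byte. The function names, dictionary keys, and the `KNOWN_CODES` table are hypothetical. Only the two codes that this guide spells out (00b = Unspecified, 10b = OEM code) are labeled; any other code is returned as a raw value because its meaning depends on the sensor class.

```python
# Only the codes quoted in this guide are labeled; others stay raw.
KNOWN_CODES = {0b00: "Unspecified", 0b10: "OEM code"}


def split_event_dir_type(value):
    """Split the Event Direction and Event Type byte.

    Bit [7]: 0b = assertion event, 1b = deassertion event.
    Bits [6:0]: event type code (for example 01h = Threshold).
    """
    return {"deassertion": bool(value & 0x80), "event_type": value & 0x7F}


def split_event_data1(ed1):
    """Split Event Data 1 into its [7:6], [5:4], and [3:0] fields."""
    ed2_code = (ed1 >> 6) & 0x03  # [7:6]: what Event Data 2 holds
    ed3_code = (ed1 >> 4) & 0x03  # [5:4]: what Event Data 3 holds
    return {
        "ed2_code": ed2_code,
        "ed2_label": KNOWN_CODES.get(ed2_code, "see the sensor's table"),
        "ed3_code": ed3_code,
        "ed3_label": KNOWN_CODES.get(ed3_code, "see the sensor's table"),
        "offset": ed1 & 0x0F,  # [3:0]: event trigger offset
    }
```

For instance, `split_event_data1(0xA3)` reports an OEM code in both Event Data 2 and Event Data 3 with event trigger offset 3.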
Byte  Field  Description
OEM Defined
  OEM Defined. This is defined according to the manufacturer identified by the Manufacturer ID field.
Notes on SEL Logs and Collecting SEL Information
Whenever you capture the SEL, always collect both the text (human-readable) version and the hex version. Because some of the data is OEM-specific, some utilities cannot decode the information correctly.
RID (Record ID) = 011Ah
RT (Record Type) = 02h = system event record
TS (Timestamp) = 4E6A4957h
GID (Generator ID) = 0001h = BIOS POST
ER (Event Message Revision) = 04 = IPMI v2.0...

RT (Record Type) = 02h = system event record
TS (Timestamp) = 502E9B0Ah
GID (Generator ID) = 0033h = BIOS SMI Handler
ER (Event Message Revision) = 04 = IPMI v2.0...
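The field-by-field decoding shown in these examples can be sketched in Python. This is an illustration under stated assumptions, not a replacement for a SEL utility: it assumes the standard 16-byte IPMI system event record (Record Type 02h) with multi-byte fields stored least significant byte first, and it treats the timestamp as seconds since 1970-01-01 interpreted as UTC, which may not match how a given BMC clock was set. The function name and dictionary keys are hypothetical.

```python
import struct
from datetime import datetime, timezone

# Assumed layout of a 16-byte system event record, little-endian:
# Record ID (2), Record Type (1), Timestamp (4), Generator ID (2),
# Event Message Revision (1), Sensor Type (1), Sensor Number (1),
# Event Direction/Event Type (1), Event Data 1-3 (3).
SEL_FORMAT = "<HBIHBBBBBBB"


def decode_sel_record(raw):
    """Decode one raw 16-byte SEL record into named fields."""
    if len(raw) != struct.calcsize(SEL_FORMAT):
        raise ValueError("expected a 16-byte SEL record")
    (rid, rt, ts, gid, er, sensor_type, sensor_num,
     dir_type, ed1, ed2, ed3) = struct.unpack(SEL_FORMAT, raw)
    return {
        "record_id": rid,
        "record_type": rt,
        "timestamp_raw": ts,
        # Assumption: the BMC clock counts seconds from the Unix epoch in UTC.
        "timestamp": datetime.fromtimestamp(ts, tz=timezone.utc),
        "generator_id": gid,
        "evm_rev": er,
        "sensor_type": sensor_type,
        "sensor_number": sensor_num,
        "event_dir_type": dir_type,
        "event_data": (ed1, ed2, ed3),
    }
```

To try it, pass the raw bytes taken from the hex version of the log, for example `decode_sel_record(bytes.fromhex("1a01 02 57496a4e 0100 04 0f 06 6f a0 00 12"))`. The first ten bytes correspond to the RID, RT, TS, GID, and ER values of the first example; the last six bytes are placeholders chosen only for illustration.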
Sensor Cross Reference List
This section contains a cross reference to help find details on any specific SEL entry.

Columns: Sensor Number, Sensor Name, Details Section, Next Steps
BMC Watchdog BMC Watchdog Sensor – Next Steps BMC Watchdog Sensor...
PCI Riser 4 Temperature Threshold-based Temperature Table 37: Temperature Sensors – Next Steps...
PCI Riser 2 Temperature Threshold-based Temperature Table 37: Temperature Sensors – Next Steps...
Power Supply 2 Temperature Table 27: Power Supply Temperature Sensor – Event Trigger Offset – Next...
Processor 3 ERR2 Timeout Processor ERR2 Timeout – Next Steps...
Processor 1 Memory VRD Hot 0-1 Table 45: Discrete Thermal Sensors – Next Steps...
Power Supply 2 Fan Tachometer 2 Power Supply Fan Tachometer Power Supply Fan Tachometer Sensors –...
Processor 4 DIMM Aggregate Thermal Margin 2 Table 40: Thermal Margin Sensors – Next Steps...
Baseboard +3.3V Threshold-based Voltage Table 13: Threshold-based Voltage Sensors – Next Steps Sensors (BB +3.3V)...
Baseboard +1.35V P1 Low Voltage Threshold-based Voltage Memory AB VDDQ Table 13: Threshold-based Voltage Sensors –...
BIOS POST owned Sensors (GID = 0001h)
The following table can be used to find the details of sensors owned by BIOS POST.

Columns: Sensor Number, Sensor Name, Details Section, Next Steps
Intel® Quick Path Interface QPI Correctable Error Sensor – Next Steps...

Microsoft* OS owned Events (GID = 0041)
The following table can be used to find the details of records that are owned by the Microsoft* Operating System (OS).
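Because the cross-reference tables are organized by Generator ID, a small lookup can route a record to the right table. The sketch below uses only the three Generator IDs quoted in this guide and assumes that the Microsoft* OS value 0041 is hexadecimal like the other two. The names `GENERATOR_OWNERS` and `owner_for_gid` are hypothetical, and Generator IDs not quoted in this section are reported as unknown rather than guessed.

```python
# Generator IDs quoted in this guide; 0041 is assumed to be hexadecimal.
GENERATOR_OWNERS = {
    0x0001: "BIOS POST",
    0x0033: "BIOS SMI Handler",
    0x0041: "Microsoft OS",
}


def owner_for_gid(gid):
    """Return the owner name for a Generator ID, or a marked unknown."""
    try:
        return GENERATOR_OWNERS[gid]
    except KeyError:
        return f"Unknown generator (0x{gid:04X})"
```

Returning a marked unknown instead of raising keeps a log-processing loop running when it meets a Generator ID outside this short table.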
Power Subsystems
The BMC monitors the power subsystem including power supplies, select onboard voltages, and related sensors.

Threshold-based Voltage Sensors
The BMC monitors the main voltage sources in the system, including the baseboard, memory, and processors, using IPMI-compliant analog/threshold sensors.

Table 12: Threshold-based Voltage Sensors Event Triggers – Description
Columns: Event Trigger Description, Assertion Severity, Deassert Severity, Description
Event Trigger Description: Lower non-critical
Assertion Severity: Degraded
Description: The voltage has dropped below its lower non-critical threshold.
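A threshold event such as the one described in Table 12 can be interpreted with a short lookup. The sketch below assumes the generic threshold offsets 00h through 0Bh defined by the IPMI specification (this guide quotes the range as States = 0x00-0x0b) and assumes that Event Data 2 carries the reading that triggered the event and Event Data 3 the threshold value, which only holds when the corresponding Event Data 1 fields say so. The function name and keys are hypothetical, and the reading and threshold are returned raw because converting them to volts requires the sensor's conversion factors, which are not covered here.

```python
# Generic threshold event offsets as defined by the IPMI specification.
THRESHOLD_OFFSETS = {
    0x00: "Lower non-critical, going low",
    0x01: "Lower non-critical, going high",
    0x02: "Lower critical, going low",
    0x03: "Lower critical, going high",
    0x04: "Lower non-recoverable, going low",
    0x05: "Lower non-recoverable, going high",
    0x06: "Upper non-critical, going low",
    0x07: "Upper non-critical, going high",
    0x08: "Upper critical, going low",
    0x09: "Upper critical, going high",
    0x0A: "Upper non-recoverable, going low",
    0x0B: "Upper non-recoverable, going high",
}


def describe_threshold_event(dir_type, ed1, ed2, ed3):
    """Summarize a threshold-class event from its last four record bytes."""
    if dir_type & 0x7F != 0x01:
        raise ValueError("not a threshold event (Event Type must be 01h)")
    offset = ed1 & 0x0F
    return {
        "asserted": not (dir_type & 0x80),  # bit [7] set means deassertion
        "trigger": THRESHOLD_OFFSETS.get(offset, f"reserved offset {offset:#04x}"),
        "raw_reading": ed2,    # assumed: reading that triggered the event
        "raw_threshold": ed3,  # assumed: threshold value that was crossed
    }
```

For a voltage sensor, a result whose trigger is "Lower non-critical, going low" with `asserted` set to true corresponds to the Degraded row in Table 12.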
Columns: Sensor Number, Sensor Name, Next Steps

+12V is supplied by the power supplies. +12V is used by SATA drives, Fans, and PCI cards. In addition it is used to generate various processor voltages.

Sensor Name: Baseboard +3.3V Auxiliary
+3.3V AUX is supplied by the main board. +3.3V AUX is used by the BMC, clock chips, PCI-E Slot, on-board NIC, Intel® C600 series Chipset, and ICH. Ensure all cables are connected correctly.

This 1.5V line is supplied by the main board. This 1.5V line is used by processor 2 memory slots A and B.

This 1.35V line is supplied by the main board. This 1.35V line is used by processor 1 memory slots C and D.

+0.9V Core IB is supplied by the main board on specific platforms. +0.9V Core IB is used by the on-board Infiniband* controller on those specific platforms.
If the SystemPowerGood signal has not asserted by the time the VR Watchdog Timer expires, the FW powers down the system, logs a SEL entry, and emits a beep code (1-5-1-2).
Table 15: Power Unit Status Sensors Typical Characteristics
Byte  Field  Description
Sensor Type: 09h = Power Unit
Sensor Number...

Columns: Sensor Specific Offset Description, Description, Next Steps
Sensor Specific Offset Description: Soft Power Control Failure
Description: Generally means power good was lost in the system, causing a shutdown.
Next Steps: This could be caused by the power supply subsystem or system

Table 18: Power Unit Redundancy Sensor – Event Trigger Offset – Next Steps
Columns: Event Trigger Offset, Description, Next Steps...

Byte  Field  Description
[6:0] Event Type = 03h (“digital” discrete)
Event Data 1
  [7:6] – 00b = Unspecified Event Data 2
  [5:4] –...

Byte  Field  Description
[6:0] Event Type = 6Fh (Sensor Specific)
Event Data 1
  [7:6] – ED2 data in Table 21
  [5:4] –...
Columns: Sensor Specific Offset Description, Description, Next Steps
Predictive Check the data in ED2
10b = OEM code in Event Data 2...
4.4.2 Power Supply Power In Sensors
These sensors log an event when a power supply in the system exceeds its AC power-in threshold.

4.4.3 Power Supply Current Out % Sensors
PMBus*-compliant power supplies may monitor the current output of the main 12V voltage rail and report the current usage as a percentage of the maximum power output for that rail.

4.4.4 Power Supply Temperature Sensors
The BMC monitors one or two power supply temperature sensors for each installed PMBus*-compliant power supply.

4.4.5 Power Supply Fan Tachometer Sensors
The BMC polls each installed power supply using the PMBus* fan status commands to check for failure conditions for the power supply fans.
Cooling Subsystem
Fan Sensors
There are three types of fan sensors that can be present on Intel® Server Systems: speed, presence, and redundancy. The last two are only present in the systems with hot-swap redundant fans.

5.1.1 Fan Tachometer Sensors
Fan tachometer sensors monitor the rpm signal on the relevant fan headers on the platform.

Fan presence sensors are only implemented for hot-swap fans, and require an additional pin on the fan header. Fan redundancy is an aggregate of the fan presence sensors and will warn when redundancy is lost. Typically the redundancy mode on Intel® servers is an n+1 redundancy (if one fan fails there are still sufficient fans to cool the system, but it is no longer redundant), although other modes are also possible.
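The n+1 idea can be illustrated with a short Python sketch that aggregates fan presence readings into a redundancy state. This is only a model of the description above, not the BMC's actual logic: the function name, the state strings, and the `required` parameter (the number of fans needed to cool the system) are all hypothetical.

```python
def fan_redundancy_state(present, required):
    """Aggregate fan presence readings into an n+1 redundancy state.

    present  -- iterable of booleans, one per fan presence sensor
    required -- hypothetical count of fans needed to cool the system (n)
    """
    working = sum(1 for fan_present in present if fan_present)
    if working > required:
        # At least n+1 fans: one can fail without losing cooling.
        return "fully redundant"
    if working == required:
        # Exactly n fans: still cooled, but no spare remains.
        return "redundancy lost, sufficient cooling"
    return "insufficient cooling"
```

With six fan slots and `required=5`, losing one fan moves the state from "fully redundant" to "redundancy lost, sufficient cooling", which matches the parenthetical explanation of n+1 above.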
Byte  Field  Description
Event Direction and Event Type
  [7] Event direction
    0b = Assertion Event
    1b = Deassertion Event
  [6:0] Event Type = 0Bh (Generic Discrete)
[7:6] –...

Temperature Sensors
A variety of temperature sensors can be implemented on Intel® Server Systems. They are split into several types, each with its own events that can be logged.
Threshold-based Temperature ...

Byte  Field  Description
Event Data 3: Threshold value that triggered event

Table 36: Temperature Sensors Event Triggers – Description...

Columns: Sensor Number, Sensor Name, Next Steps
Baseboard Temperature 2
Baseboard Temperature 3
Baseboard Temperature 4
I/O Mod Temp
PCI Riser 1 Temp...

Byte  Field  Description
Event Direction and Event Type
  [7] Event direction
    0b = Assertion Event
    1b = Deassertion Event
  [6:0] Event Type = 01h (Threshold)
[7:6] –...
Columns: Sensor Number, Sensor Name, Next Steps
Ensure the air used to cool the system is within the thermal specifications for the system (typically below 35 °C).

Xeon® processor E5-4600/2600/2400/1600 v2 product families incorporate a DTS-based thermal specification. This allows much more accurate control of the thermal solution and enables lower fan speeds and lower fan power consumption. For Intel® Xeon® processor E5-4600/2600/2400/1600 product families, this requires significant BMC FW calculations to derive the sensor value.
Table 44: Discrete Thermal Sensors Typical Characteristics
Byte  Field  Description
Sensor Type: 01h = Temperature
Sensor Number: See Table 45...

Columns: Sensor Number, Sensor Name, Event Type, Event Trigger Offset Description, Description, Next Steps
P2 Mem23 VRD Hot: Processor 2 Memory 2/3 voltage regulator...

Byte  Field  Description
[3:0] – Event Trigger Offset = 0A = Critical over temperature
Event Data 2: Not used...
Processor Subsystem
Intel® servers report multiple processor-centric sensors in the SEL.

Processor Status Sensor
The BMC provides an IPMI sensor of type processor for monitoring status information for each processor slot. If an event state (sensor offset) has been asserted, it remains asserted until one of the following happens: ...

Table 48: Processor Status Sensors – Next Steps
Columns: Event Trigger Offset, Processor Status, Next Steps
Cross test the processors.

Catastrophic Error Sensor
When the Catastrophic Error signal (CATERR#) stays asserted, it is a sign that something serious has gone wrong in the hardware.

Columns: Description, Next Steps
This error is typically caused by other platform components. Check for other errors near the time of the CATERR event.

6.3.1 CPU Missing Sensor – Next Steps
Verify the processor is installed in the correct slot.

Quick Path Interconnect Sensors...
Byte  Field  Description
1h = Reduced to ½ width
2h = Reduced to ¼ width
Event Data 2: 0-3 = CPU1-4...

Byte  Field  Description
Event Data 2: 0-3 = CPU1-4
Event Data 3: Not used

6.4.2.1 QPI Correctable Error Sensor – Next Steps
This is an Informational event only.
2. Take note of all IPMI activity that was occurring around the time of the failure. Capture a System BMC Debug Log as soon as you can after experiencing this failure. This log can be captured from the Integrated BMC Web Console or by using the Intel® Syscfg utility (syscfg /sbmcdl private filename.zip).
Memory Subsystem
Intel® servers report memory errors, status, and configuration in the SEL.

Memory RAS Configuration Status
A Memory RAS Configuration Status event is logged after an AC power-on occurs, only if any RAS Mode is currently configured, and only if RAS Mode is successfully initiated.

Byte  Field  Description
Event Direction and Event Type
  [7] Event direction
    0b = Assertion Event
    1b = Deassertion Event
  [6:0] Event Type = 09h (digital Discrete)
[7:6] –...

Memory RAS Mode Select
Memory RAS Mode Select events are logged to record changes in RAS Mode. When a RAS Mode selection is made that changes the RAS Mode (including selecting a RAS Mode from or to Independent Channel Mode), that change is logged to SEL in a Memory RAS Mode Select event message, which records the previous RAS Mode (from) and the newly selected RAS Mode (to).

7.4.1 Sparing Redundancy State Sensor – Next Steps
This event is accompanied by memory errors indicating the source of the issue. Troubleshoot accordingly (probably replace affected DIMM).
Byte  Field  Description
[5:4] – 10b = OEM code in Event Data 3
[3:0] – Event Trigger Offset as described in Table 64
[7:2] –...

Columns: Event Trigger Offset, Description, Next Steps
Consider replacing the DIMM as a preventative measure. For multiple occurrences, replace the DIMM.

Byte  Field  Description
1b = DIMM Slot ID in Event Data 3 Bits[2:0] is valid
[2:0] – Error Type:...

4. Inspect the processor socket this DIMM is connected to for bent pins, and if found, replace the board.
PCI Express* and Legacy PCI Subsystem
The PCI Express* (PCIe) Specification defines standard error types under the Advanced Error Reporting (AER) capabilities. The BIOS logs AER events into the SEL.

Byte  Field  Description
4h = PCI PERR
5h = PCI SERR
Event Data 2: PCI Bus number
[7:3] –...

Byte  Field  Description
Event Data 1
  [7:6] – 10b = OEM code in Event Data 2
  [5:4] –...

Byte  Field  Description
Sensor Number
Event Direction and Event Type
  [7] Event direction
    0b = Assertion Event...

Table 69: PCI Express* Correctable Error Sensor Typical Characteristics
Byte  Field  Description
Generator ID: 0033h = BIOS SMI Handler...

8.1.3.1 PCI Express* Correctable Error Sensor – Next Steps
This is an informational event only. Correctable errors are acceptable and normal at a low rate of occurrence. If the error continues:
1.
System BIOS Events
There are a number of events that are owned by the system BIOS. These events can occur during Power On Self Test (POST) or when coming out of a sleep state.

The timestamp clock synchronization is run and the events are logged by the BIOS POST every time the system boots. In addition, during the shutdown from some Operating Systems the BIOS SMI Handler is called to run timestamp clock synchronization and log the events.

System Firmware Progress (Formerly Post Error)
The BIOS logs any POST errors to the SEL. The 2-byte POST code gets logged in the ED2 and ED3 bytes in the SEL entry. This event will be logged every time a POST error is displayed.
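As a sketch of how the 2-byte POST code can be rebuilt from ED2 and ED3, the Python below combines the two bytes and looks the result up in a short table. The byte order is an assumption: the text above does not say which byte is the low half, so the helper defaults to ED2 as the low byte and lets the caller flip it. The function names are hypothetical, and the messages are a few entries copied from Table 72, not the full list.

```python
# A few entries copied from Table 72; the full table has many more codes.
POST_ERRORS = {
    0x0012: "System RTC date/time not set",
    0x8190: "Watchdog timer failed on last boot",
    0x8198: "OS boot watchdog timer failure",
    0x8605: "BIOS Settings are corrupted",
}


def post_error_code(ed2, ed3, ed2_is_low_byte=True):
    """Combine ED2 and ED3 into the 2-byte POST code.

    Assumption: ED2 holds the low byte unless the caller says otherwise.
    """
    low, high = (ed2, ed3) if ed2_is_low_byte else (ed3, ed2)
    return (high << 8) | low


def describe_post_error(ed2, ed3, ed2_is_low_byte=True):
    """Return the POST code as four hex digits plus its message, if known."""
    code = post_error_code(ed2, ed3, ed2_is_low_byte)
    message = POST_ERRORS.get(code, "not in this excerpt of Table 72")
    return f"{code:04X}: {message}"
```

If a decoded code does not appear in Table 72 at all, trying the opposite byte order is a quick check on the byte-order assumption.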
Table 72: POST Error Codes
Error Code   Error Message                                                   Response
0012         System RTC date/time not set                                    Major
0048...
8190         Watchdog timer failed on last boot                              Major
8198         OS boot watchdog timer failure...
8534         DIMM_G3 failed test/initialization                              Major
8535         DIMM_H1 failed test/initialization                              Major
8536         DIMM_H2 failed test/initialization...
8553         DIMM_G2 disabled                                                Major
8554         DIMM_G3 disabled                                                Major
8555         DIMM_H1 disabled                                                Major
8556...
8572         DIMM_G1 encountered a Serial Presence Detection (SPD) failure   Major
8573         DIMM_G2 encountered a Serial Presence Detection (SPD) failure...
85D1         DIMM_M1 disabled                                                Major
85D2         DIMM_M2 disabled                                                Major
85D3         DIMM_M3 disabled                                                Major
85D4...
8605         BIOS Settings are corrupted                                     Major
8606         NVRAM variable space was corrupted and has been reinitialized...
System Event Log Troubleshooting Guide for EPSD Platforms Based on Intel Xeon Processor E5 4600/2600/2400/1600/1400 Product Families ® ® Chassis Subsystem 10. Chassis Subsystem The BMC monitors several aspects of the chassis. Next to logging when the power and reset buttons get pressed, the BMC also monitors chassis intrusion if a chassis intrusion switch is included in the chassis, as well as looking at the network connections, and logging an event whenever the physical network link is lost.
Byte Field     Description
Event Data 2   Not used
Event Data 3   Not used

Table 74: Physical Security Sensor Event Trigger Offset – Next Steps...
10.3 Button Sensor
The BMC logs each press of the front panel power and reset buttons. These events are purely informational and do not indicate errors.
11. Miscellaneous Events
The miscellaneous events section addresses sensors not easily grouped with other sensor types.
11.1 IPMI Watchdog
EPSD server systems support an IPMI watchdog timer, which can check whether the OS is still responsive.
11.2.1 SMI Timeout – Next Steps
This event normally occurs only after another, more critical event.
1. Check the SEL for any critical interrupts, memory errors, bus errors, PCI errors, or any other serious errors.
This functionality is built into the BMC to allow it to send alerts (SNMP or other) for any event that gets logged to the SEL. PEF filters are turned off by default and have to be enabled manually using Intel® Deployment Assistant, the Intel® Syscfg utility, or an IPMI-aware utility.
2. Take note of all IPMI activity that was occurring around the time of the failure. Capture a System BMC Debug Log as soon as you can after experiencing this failure. This log can be captured from the Integrated BMC Web Console or by using the Intel® Syscfg utility (syscfg /sbmcdl private filename.zip).
2. Take note of all IPMI activity that was occurring around the time of the failure. Capture a System BMC Debug Log as soon as you can after experiencing this failure. This log can be captured from the Integrated BMC Web Console or by using the Intel® Syscfg utility (syscfg /sbmcdl private filename.zip).
11.7 Firmware Update Status Sensor
The BMC FW supports a single Firmware Update Status sensor. This sensor is used to generate SEL events related to updates of embedded firmware on the platform.
11.8 Add-In Module Presence Sensor
Some server boards provide dedicated slots for add-in modules/boards (for example, SAS, IO, and PCIe-riser). For these boards the BMC provides an individual presence sensor to indicate whether the module/board is installed.
The BMC then instantiates its own version of this sensor, which is used for fan speed control.
The thermal margin sensor is the difference between the Core Temp sensor value and the TControl value reported by the Intel® Xeon ™...
Byte Field                       Description
Event Direction and Event Type   [7] Event direction
                                     0b = Assertion Event
                                     1b = Deassertion Event
                                 [6:0] Event Type = 70h (OEM defined)
[7:6] –...
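The bit layout of the Event Direction and Event Type byte described above can be illustrated with a minimal Python sketch. The function name `decode_event_dir_type` and the returned dictionary keys are hypothetical and chosen only for illustration; the bit positions are taken from the table.

```python
def decode_event_dir_type(value: int) -> dict:
    """Split the Event Direction and Event Type byte into its two fields.

    Hypothetical helper for illustration; not part of any Intel utility.
    """
    if not 0 <= value <= 0xFF:
        raise ValueError("expected a single byte (0-255)")
    return {
        # Bit 7: 0b = assertion event, 1b = deassertion event.
        "direction": "deassertion" if value & 0x80 else "assertion",
        # Bits 6:0: event type code (70h for this OEM-defined sensor).
        "event_type": value & 0x7F,
    }


if __name__ == "__main__":
    decoded = decode_event_dir_type(0xF0)
    print(decoded["direction"], hex(decoded["event_type"]))
```

For example, a raw byte of F0h has bit 7 set and the low seven bits equal to 70h, so it would decode as a deassertion of the OEM-defined event type.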
12. Hot-Swap Controller Backplane Events
All new backplanes for EPSD Platforms Based on Intel® Xeon® Processor E5 4600/2600/2400/1600 Product Families follow a hybrid architecture, in which the IPMI functionality previously supported in the HSC is integrated into the BMC FW.
Byte Field     Description
Event Data 3   Not used

Table 90: Hard Disk Drive Monitoring Sensor - Event Trigger Offset – Next Steps...
Byte Field     Description
Event Data 2   Not used
Event Data 3   Not used

12.3.1 HSC Health Sensor – Next Steps
Ensure that all connections to the HSC are well seated.
In the following table, Event Data 3 is noted only for specific errors. If the issue persists, provide the content of Event Data 3 to the Intel support team for interpretation. Event Data 3 codes are generally not documented because their meaning varies, provides only some clues, and usually needs to be interpreted individually.
Table 93: ME Firmware Health Event Sensor – Next Steps
Description   Next Steps
® ME =1 – Image execution failed.
13.2 Node Manager Exception Event
A Node Manager Exception Event is sent each time the maintained policy power limit is exceeded for longer than the Correction Time Limit.
13.3 Node Manager Health Event
A Node Manager Health Event message provides a runtime error indication about Intel® Intelligent Power Node Manager's health. Types of service that can send an error are defined as follows:
• Misconfigured policy
• Error reading power data
...
Byte Field Description
If Error type = 11 <Power Sensor Address>
If Error type = 12 <Inlet Sensor Address>...
13.4 Node Manager Operational Capabilities Change
This message provides a runtime error indication about Intel® Intelligent Power Node Manager's operational capabilities. This applies to all domains.
13.4.1 Node Manager Operational Capabilities Change – Next Steps
Policy Interface available indicates that Intel® Intelligent Power Node Manager is able to respond to the external interface about...
13.5 Node Manager Alert Threshold Exceeded
A Policy Correction Time Exceeded Event is sent each time the maintained policy power limit is exceeded for longer than the Correction Time Limit.
13.5.1 Node Manager Alert Threshold Exceeded – Next Steps
The first occurrence of an unacknowledged event will be retransmitted no faster than every 300 milliseconds.
14. Microsoft Windows* Records
With Microsoft Windows Server 2003* R2 and later versions, an Intelligent Platform Management Interface (IPMI) driver was added.
Table 99: Boot up OEM Event Record Typical Characteristics
Byte Field   Description
Record ID    ID used for SEL Record access
[7:0] –...
14.2 Shutdown Event Records
When the system shuts down from the Microsoft Windows* OS, multiple events can be logged. The first is an OS Stop/Shutdown Event Record;...
Byte Field          Description
IPMI Manufacturer   0137h (311d) = IANA enterprise number for Microsoft
Record ID           Sequential number reflecting the order in which the records are read. The numbers start at 1 for the first entry in the SEL and continue sequentially to n, the number of entries in the SEL.
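As an illustration of the IPMI Manufacturer field above, the following Python sketch checks whether a manufacturer ID read from an OEM SEL record matches the Microsoft value 0137h (311d). It assumes the field is supplied as three raw bytes with the least significant byte first, which is how the IPMI specification lays out manufacturer IDs in OEM records; the function and constant names are hypothetical.

```python
MICROSOFT_IANA = 0x0137  # 311 decimal, the value given for the IPMI Manufacturer field


def manufacturer_id(field: bytes) -> int:
    """Convert a 3-byte manufacturer ID (assumed LS byte first) to an integer."""
    if len(field) != 3:
        raise ValueError("manufacturer ID field must be exactly 3 bytes")
    return int.from_bytes(field, "little")


def is_microsoft_record(field: bytes) -> bool:
    """Return True if the manufacturer ID matches the Microsoft IANA number."""
    return manufacturer_id(field) == MICROSOFT_IANA


if __name__ == "__main__":
    raw = bytes([0x37, 0x01, 0x00])  # 0137h stored least significant byte first
    print(manufacturer_id(raw), is_microsoft_record(raw))
```

A check like this could be used to separate records written by the Windows IPMI driver from OEM records logged by other sources when reviewing a raw SEL dump.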
Byte Field         Description
Shutdown Comment   Shutdown Comment from the registry (LSB first): HKLM/Software/Microsoft/Windows/CurrentVersion/Reliability/shutdown/Comment
Reserved

14.3 Bug Check / Blue Screen Event Records
When the system experiences a bug check (blue screen), multiple records will be written to the event log.
Table 104: Bug Check/Blue Screen code OEM Event Record Typical Characteristics
Byte Field   Description
Record ID    ID used for SEL Record access
[7:0] –...
15. Linux* Kernel Panic Records
The Open IPMI driver supports the ability to put semi-custom and custom events in the system event log if a panic occurs. If you enable the “Generate a panic event to all BMCs on a panic”...
Table 106: Linux* Kernel Panic String Extended Record Characteristics
Byte Field   Description
Record ID    ID used for SEL Record access
[7:0] –...