Machine Checks/Interrupts
The exceptions that result from hardware system errors are called machine
checks/interrupts. They occur when a system error is detected during the processing of a
data request. Four types of machine checks/interrupts are related to system events:
• Processor machine check (SCB 670)
• System machine check (SCB 660)
• Processor-detected correctable error (SCB 630)
• System-detected nonfatal error (SCB 620)
NOTE: A fan failure is a fatal, uncorrectable error, but it is reported as nonfatal so
that the operating system can perform a shutdown.
During error handling, an error is handled first by the appropriate PALcode error
routine and then by the associated operating system error handler. The causes of each
machine check/interrupt are described below. The system control block (SCB) vector
through which PALcode transfers control to the operating system is shown in
parentheses.
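The mapping between SCB vectors and machine check/interrupt types can be sketched as a
small lookup table, for example when annotating error log entries by hand. This is an
illustrative sketch only, not part of PALcode or the operating system: the names
MACHINE_CHECK_VECTORS, HANDLING_STAGES, and describe_vector are hypothetical, and the
vector values are kept as strings exactly as printed in this section.

```python
# Hypothetical lookup table built only from the four vectors listed in this section.
# Vector values are stored as strings exactly as printed in the manual.
MACHINE_CHECK_VECTORS = {
    "670": "Processor machine check",
    "660": "System machine check",
    "630": "Processor-detected correctable error",
    "620": "System-detected nonfatal error",
}

# Order in which an error is handled, as described in the text.
HANDLING_STAGES = ("PALcode error routine", "operating system error handler")


def describe_vector(scb_vector):
    """Return a one-line summary for an SCB vector; raise KeyError if it is not listed."""
    name = MACHINE_CHECK_VECTORS[scb_vector]
    stages = ", then ".join(HANDLING_STAGES)
    return f"SCB {scb_vector}: {name} (handled by {stages})"


if __name__ == "__main__":
    for vector in MACHINE_CHECK_VECTORS:
        print(describe_vector(vector))
```

A vector that is not one of the four listed here raises KeyError, so an unrelated SCB
entry is never labeled as a machine check by mistake.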
Processor Machine Check (SCB: 670)
Processor machine check errors are fatal system errors that result in a system crash. The
error-handling code for these errors is common across all platforms that use the Alpha
21164 microprocessor. The following errors cause a processor machine check:
• I-cache data or tag parity error
• S-cache data parity error—I-stream
• S-cache tag parity error—I-stream
• S-cache data parity error—D-stream Read/Read, READ_DIRTY
• S-cache tag parity error—D-stream or system commands
• D-cache data parity error
• D-cache tag parity error
• I-stream uncorrectable ECC data parity errors (B-cache or memory)
• D-stream uncorrectable ECC data parity errors (B-cache or memory)
• B-cache tag parity errors—I-stream
• B-cache tag parity errors—D-stream
• System command/address parity error
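The causes listed above can be grouped by the unit in which the error is detected,
which may help when scanning an error log for a pattern. The sketch below is
illustrative only: the tuple layout (unit, error, qualifier) and the names
PROCESSOR_MACHINE_CHECK_CAUSES and causes_by_unit are assumptions made for this
example, and the entries simply restate the list above.

```python
from collections import defaultdict

# Each entry restates one cause from the list above as (unit, error, qualifier).
# The three-field layout is an assumption made for this sketch.
PROCESSOR_MACHINE_CHECK_CAUSES = (
    ("I-cache", "data or tag parity error", ""),
    ("S-cache", "data parity error", "I-stream"),
    ("S-cache", "tag parity error", "I-stream"),
    ("S-cache", "data parity error", "D-stream Read/Read, READ_DIRTY"),
    ("S-cache", "tag parity error", "D-stream or system commands"),
    ("D-cache", "data parity error", ""),
    ("D-cache", "tag parity error", ""),
    ("B-cache or memory", "uncorrectable ECC data parity error", "I-stream"),
    ("B-cache or memory", "uncorrectable ECC data parity error", "D-stream"),
    ("B-cache", "tag parity error", "I-stream"),
    ("B-cache", "tag parity error", "D-stream"),
    ("System", "command/address parity error", ""),
)


def causes_by_unit():
    """Group the causes by unit, preserving the order in which they are listed."""
    grouped = defaultdict(list)
    for unit, error, qualifier in PROCESSOR_MACHINE_CHECK_CAUSES:
        label = f"{error} ({qualifier})" if qualifier else error
        grouped[unit].append(label)
    return dict(grouped)


if __name__ == "__main__":
    for unit, errors in causes_by_unit().items():
        print(f"{unit}:")
        for error in errors:
            print(f"  {error}")
```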
Error Log Analysis
DIGITAL Server 3300/3300R 5–3