FATAL MAINFRAME ERRORS.
A fatal mainframe error is a hardware error which, if
undetected, is almost certain to cause a serious system
malfunction and disrupt current user job processing. Many
of these errors can be detected and are reported in the SIC
register of a CYBER 170 Series mainframe. The steps
taken by the system upon detection of a fatal mainframe
error depend on the type of error which was found.
Fatal errors can be divided into two groups, general errors
and specific job errors. General errors are detected only
because they have been recorded in the SIC register.
It is very difficult to trace a general error to a single job.
Specific job errors are those which cause an error exit from
the job and are thus easily traced.
The indicated SIC
register bits are set to detect the following fatal errors on
CYBER 171, 172, 173, 174, 175, 720, 730, 750, and 760
mainframes.
General Errors                    SIC Register Bits Set
CSU address parity error          1, 2†
CSU faults                        8†, 9†
PP stop on CM read error          0, 3, 14-39, 119, 183
PP stop on PP parity error        14-39

Specific Job Errors               SIC Register Bits Set
Double SECDED error               3, 183
CMC input parity error            5, 54, 55, 139
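The table above can be read as a lookup from register bits to error types. The following is a minimal sketch of that lookup, not system code: the names `FATAL_ERROR_BITS`, `UNUSED_ON_700_SERIES`, and `candidate_errors` are hypothetical, and it assumes that an error is a candidate whenever at least one of its listed bits is set. Because some bits (3 and 183, for example) are listed for more than one error, the function may return several candidates.

```python
# Illustrative lookup built from the table for CYBER 171-175 and 720-760.
# All names here are hypothetical; nothing below is part of the system.
FATAL_ERROR_BITS = {
    "CSU address parity error": ("general", {1, 2}),
    "CSU faults": ("general", {8, 9}),
    "PP stop on CM read error": ("general", {0, 3, 119, 183} | set(range(14, 40))),
    "PP stop on PP parity error": ("general", set(range(14, 40))),
    "Double SECDED error": ("specific job", {3, 183}),
    "CMC input parity error": ("specific job", {5, 54, 55, 139}),
}

# Bits the footnote lists as not used on CYBER 720, 730, 750, and 760.
UNUSED_ON_700_SERIES = {2, 8, 9}


def candidate_errors(set_bits, is_700_series=False):
    """Return (error, group) pairs whose listed bits overlap the set bits.

    Assumption: any one listed bit being set makes the error a candidate.
    """
    bits = set(set_bits)
    if is_700_series:
        bits -= UNUSED_ON_700_SERIES
    return [
        (name, group)
        for name, (group, listed) in FATAL_ERROR_BITS.items()
        if bits & listed
    ]
```

For example, `candidate_errors({5})` yields only the CMC input parity error, while `candidate_errors({3})` yields both the PP stop on CM read error and the double SECDED error.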
For a CYBER 176 mainframe, the indicated SIC register
bits are set to detect the following fatal errors.
General Errors                    SIC Register Bits Set
PPU error                         4
PP stop on CM read error          3, 14-39, 183
PP stop on PP parity error        14-39

Specific Job Errors               SIC Register Bits Set
Double SECDED error               3, 183
LCME double SECDED error          11, 196
If the error detected is a specific job error, the system
takes the following steps.
1. The system is checkpointed.
2. The job containing the error is aborted without
   exit processing or a dump.
3. The contents of the SIC register are entered in the
   error log.
4. A system checkpoint is performed.
Following this step, action is the same regardless of the
type of error detected. The system assumes step mode††
and the message
FATAL MAINFRAME ERROR.
appears at the system control point on the job status (B)
display.
Perform a level 3 recovery deadstart to bring up the SIC
register display. For each SIC register bit set, a
descriptive message appears on the screen. The system
clears each fatal error bit automatically when you activate
the deadstart switch.
The analyst or customer engineer should then reconfigure
memory to eliminate the faulty hardware and prevent
further occurrences. (See the appropriate hardware
reference manual for reconfiguration procedures.)
Following reconfiguration, another deadstart is necessary.
If the error detected was of a general type, perform a level 0
initial deadstart. If the error was a specific job type
error, attempt a level 1 recovery deadstart; the system
resumes operation from the point of the malfunction. If
level 1 recovery deadstart fails, perform a level 0 initial
deadstart.
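The choice of deadstart level after reconfiguration can be summarized as a small decision rule. The sketch below is illustrative only; the function name `deadstart_levels` and the group labels are hypothetical, and it simply returns the levels to attempt in order, as described above.

```python
def deadstart_levels(error_group):
    """Return the deadstart levels to attempt, in order, after reconfiguration.

    error_group is "general" or "specific job" (hypothetical labels that
    mirror the two groups of fatal errors described in this section).
    """
    if error_group == "general":
        # General errors call for a level 0 initial deadstart.
        return [0]
    if error_group == "specific job":
        # Try a level 1 recovery deadstart first; fall back to level 0
        # if the recovery deadstart fails.
        return [1, 0]
    raise ValueError(f"unknown error group: {error_group!r}")
```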
† For CYBER 720, 730, 750, and 760 mainframes, bits 2, 8, and 9
  are not used.
†† Actually, the system steps on monitor function 44
   (drop PP). This allows current I/O requests, including device checkpoints
   in progress, to complete.
F-2
60435600 K