IBM System/370 145 Manual page 196

• Error recovery procedures (ERP) to retry failing I/O device and
channel operations (OS and DOS)
• OBR and SDR routines (OS) and RMSR (DOS) to record statistics for
I/O errors
• Environment recording, edit, and print program (EREP) for OS and DOS
to format and print error log records
• I/O RMS routines (OS)--alternate path retry (APR) and dynamic device
reconfiguration (DDR)--to provide additional recovery procedures
after channel or I/O device failures
• Checkpoint/restart (OS and DOS) and warm start facilities (OS) to
simplify and speed up system restart procedures after a failure
necessitates a re-IPL
The following repair features are provided:
• Online Test Executive program (OLTEP) and Online Tests (OLT's) that
execute under operating system control (OS and DOS) and provide
online diagnosis of I/O device errors for most devices that attach
to the Model 145
• Microdiagnostics to locate the malfunctioning field-replaceable unit
These aids are designed to enhance system availability. In many
cases, the system can run in a degraded mode so that maintenance can be
deferred to scheduled maintenance periods. When solid failures do
occur, their impact can be reduced by faster isolation and repair of the
malfunction than is currently possible.
The programmed recovery features discussed in this section are those
for OS MFT and MVT and DOS Versions 3 and 4. The programmed recovery
provided by the virtual storage operating systems is discussed in the
optional programming systems supplements.
50:10 RECOVERY FEATURES
Additional hardware, which attempts correction of most hardware
errors without programming assistance, has been included as part of the
basic Model 145 system. The control program can be notified, via an
interruption, of both intermittent and solid hardware errors so that
error recording and recovery procedures can take place.
AUTOMATIC MICROINSTRUCTION RETRY
Detected CPU hardware errors can be retried automatically by
microinstruction retry hardware. Retry can take place after an error
occurs in any instruction, after failures that occur during interruption
time when status information is being saved, after errors that occur
during status saving for I/O instructions, etc. Even I/O instructions
are retried automatically by the hardware without an intervening I/O
interruption. (Either a machine check or an I/O interruption is taken,
if the CPU is enabled for these interruptions, depending on the success
of the I/O instruction retry and the point in the operation at which the
error occurred.)
Microinstruction retry is accomplished by additional microprogram
routines and hardware included in the Model 145. The failing CPU
operation is retried by the microprogram up to eight times before it is
determined that the error is uncorrectable. Checkpoints are taken and
data is saved in backup locations during the operation of instructions
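
The retry scheme described above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the Model 145 microprogram: the names run_with_retry, HardwareError, and RetryResult are hypothetical, the machine state is reduced to a dictionary, and the limit of eight is read as eight retries after the first failing attempt (the text does not say whether the first attempt counts).

```python
from dataclasses import dataclass
from typing import Callable, Dict

MAX_RETRIES = 8  # from the text: retried "up to eight times"


class HardwareError(Exception):
    """Hypothetical stand-in for a detected CPU hardware error."""


@dataclass
class RetryResult:
    succeeded: bool  # False means the error is treated as uncorrectable
    retries: int     # retries performed after the first failing attempt


def run_with_retry(state: Dict[str, int],
                   operation: Callable[[Dict[str, int]], None]) -> RetryResult:
    # Take a checkpoint: copy the data the operation may change
    # into a backup location before the operation starts.
    checkpoint = dict(state)
    try:
        operation(state)
        return RetryResult(True, 0)
    except HardwareError:
        pass  # fall through to the retry loop

    for attempt in range(1, MAX_RETRIES + 1):
        # Restore the checkpointed data so the retry starts from a
        # clean state, not from a partially updated one.
        state.clear()
        state.update(checkpoint)
        try:
            operation(state)
            return RetryResult(True, attempt)
        except HardwareError:
            continue

    # Every retry failed: leave the restored state and report the
    # error as uncorrectable.
    state.clear()
    state.update(checkpoint)
    return RetryResult(False, MAX_RETRIES)


def make_flaky(failures: int) -> Callable[[Dict[str, int]], None]:
    """Build a test operation that fails a fixed number of times (hypothetical)."""
    remaining = [failures]

    def op(state: Dict[str, int]) -> None:
        state["r1"] += 1  # partial update made before the error is detected
        if remaining[0] > 0:
            remaining[0] -= 1
            raise HardwareError("simulated parity check")
        state["r2"] = state["r1"] * 2

    return op


if __name__ == "__main__":
    regs = {"r1": 1, "r2": 0}
    print(run_with_retry(regs, make_flaky(3)), regs)
```

In this model, an operation that fails three times and then succeeds corresponds to an intermittent error, and one that keeps failing through all eight retries corresponds to a solid error. Restoring the backup copy before each retry is what keeps a partially completed attempt from corrupting the result.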
186
A Guide to the IBM System/370 Model 145