IBM System/370 Manual page 92

Page of GC20-1730-0
Revised 7/14/70
By TNL GN20-2227
• OBR and SDR routines to record statistics for all temporary and
permanent I/O errors
• Environment recording, edit, and print program (EREP) to format
and print error log records
• I/O RMS routines, alternate path retry (APR), and dynamic device
reconfiguration (DDR) to provide additional recovery procedures
after channel or I/O device failures
• Advanced checkpoint/restart and warm start facilities to simplify
and speed up system restart procedures after a failure necessitates
a re-IPL
The following repair features are provided:
• Online Test Executive Program (OLTEP) and Online Tests (OLT's)
that execute under OS control and provide online diagnosis of I/O
device errors for most devices that attach to the Model 165
• Processor Logout Analysis program that operates under OS control
to analyze machine check error records in order to determine
suspected malfunctioning field-replaceable units
• System Test, Channel Test, CPU Test, and Storage Test stand-alone
diagnostic programs to identify failing hardware units
• Microdiagnostics for customer engineer use to locate the
field-replaceable unit within a malfunctioning component
These aids are designed to enhance system availability. In many
cases, the system can run in a degraded mode so that maintenance can
be deferred to scheduled maintenance periods. When solid failures
do occur, their impact can be reduced by faster isolation and repair
of the malfunction than is possible currently.
50:10 RECOVERY FEATURES
Additional hardware that attempts correction of most hardware errors
without programming assistance has been included as part of the basic
Model 165 system. The control program can be notified, via an
interrupt, of both intermittent and solid hardware errors so that error
recording and recovery procedures can take place.
AUTOMATIC CPU RETRY
Detected CPU hardware errors, except those that occur during execution
of certain instructions that have passed beyond a threshold point, can
be retried automatically by CPU retry hardware. A mask bit in a control
register determines whether the CPU retry function is enabled or
disabled. If enabled, retry occurs after instruction errors, after
failures that occur during interrupt time when status information is
being saved, after errors that occur during status saving for I/O
instructions, etc. An I/O instruction, such as START I/O or TEST I/O,
can be retried automatically by the hardware without an intervening
I/O interrupt if the instruction has not proceeded beyond an established
threshold point.
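The retry decision described above can be sketched in software terms.
This is an illustrative model only, not the actual hardware logic: the
mask bit position (RETRY_MASK_BIT), the CpuError record, and the
should_retry function are hypothetical names chosen for this sketch.
The text states only that a control register mask bit enables retry
and that retry is not attempted once an instruction has passed its
threshold point.

```python
from dataclasses import dataclass

# Hypothetical bit position; the text says only that "a mask bit in a
# control register" enables or disables the CPU retry function.
RETRY_MASK_BIT = 0x1


@dataclass
class CpuError:
    """Illustrative description of a detected CPU hardware error."""
    past_threshold: bool  # True if the instruction passed its threshold point


def should_retry(control_register: int, error: CpuError) -> bool:
    """Return True if the modeled retry hardware would retry the operation."""
    retry_enabled = bool(control_register & RETRY_MASK_BIT)
    # Retry is attempted only when it is enabled and the failing
    # instruction has not proceeded beyond its threshold point.
    return retry_enabled and not error.past_threshold
```

In this model, an error on a START I/O that has not yet reached its
threshold point would be retried when the mask bit is on, while the
same error with the mask bit off, or after the threshold point, would
not be retried.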
CPU retry also occurs when an instruction error results from a
buffer malfunction, if the instruction is a retryable type. The buffer
is bypassed while the instruction is retried so that processor storage