IBM 4381 Manual page 109

As data enters and leaves processor storage, ECC logic performs validity checking
on each doubleword. When a doubleword (72 bits) is fetched from processor
storage, the eight-bit ECC code is checked to validate the 64 data bits.
If the data is correct, the appropriate parity bit for each of the eight data
bytes is generated
and the doubleword is reformatted so that each eight data bits are followed by a
parity bit.
If a single-bit error is detected, the identified data bit in error is
corrected automatically by ECC logic with no additional fetch time.
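The fetch-side reformatting described above can be sketched as follows. This is only an illustration, not the 4381 hardware logic: the function names are hypothetical, and odd byte parity is an assumption, since the text does not state the parity sense.

```python
def odd_parity_bit(byte):
    """Return the bit that gives the 9-bit group an odd number of ones.

    Odd parity is an assumption; the text does not state the parity sense.
    """
    ones = bin(byte & 0xFF).count("1")
    return 0 if ones % 2 == 1 else 1


def reformat_doubleword(data64):
    """Split 64 validated data bits into eight (byte, parity) pairs.

    The most significant byte comes first. Each pair models eight data
    bits followed by one parity bit.
    """
    pairs = []
    for shift in range(56, -8, -8):
        byte = (data64 >> shift) & 0xFF
        pairs.append((byte, odd_parity_bit(byte)))
    return pairs


groups = reformat_doubleword(0x00FF010300000000)
```

Each of the eight groups carries nine bits, so the 64 data bits leave storage as a 72-bit parity-protected doubleword.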
When a doubleword is to be placed in processor storage by a program, the eight
parity bits are removed and the eight-bit ECC code is generated and appended to
the 64 data bits. The 72 bits are then stored with the ECC bits.
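To make the idea of an eight-bit check code over 64 data bits concrete, here is a minimal sketch of a textbook extended Hamming (72,64) code. It is an assumption chosen for illustration: the actual ECC matrix of the 4381 is not given in the text, and all names below are hypothetical. The sketch corrects any single-bit error and detects, but does not correct, double-bit errors.

```python
# Positions 1..71 form a Hamming code; position 0 holds an overall parity bit.
CHECK_POSITIONS = (1, 2, 4, 8, 16, 32, 64)
DATA_POSITIONS = [p for p in range(1, 72) if p not in CHECK_POSITIONS]  # 64 slots


def encode(data64):
    """Build a 72-bit codeword (list of bits) from 64 data bits."""
    bits = [0] * 72
    for i, pos in enumerate(DATA_POSITIONS):
        bits[pos] = (data64 >> i) & 1
    for c in CHECK_POSITIONS:
        parity = 0
        for pos in range(1, 72):
            if pos & c and pos != c:
                parity ^= bits[pos]
        bits[c] = parity
    bits[0] = sum(bits[1:]) & 1  # makes the total number of ones even
    return bits


def check(bits):
    """Return (status, bits), correcting a single-bit error if one is found."""
    syndrome = 0
    for pos in range(1, 72):
        if bits[pos]:
            syndrome ^= pos
    overall = sum(bits) & 1
    if syndrome == 0 and overall == 0:
        return "no error", bits
    if overall == 1 and syndrome < 72:
        fixed = list(bits)
        fixed[syndrome] ^= 1  # syndrome 0 points at the overall parity bit
        return "single-bit error corrected", fixed
    if overall == 0:
        return "double-bit error detected", bits
    return "multiple-bit error detected", bits


def extract(bits):
    """Recover the 64 data bits from a codeword."""
    return sum(bits[pos] << i for i, pos in enumerate(DATA_POSITIONS))
```

In this sketch the syndrome directly names the failing bit position, which is why a single-bit correction needs no second fetch.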
If a double- or multiple-bit error occurs during instruction execution and the
instruction is retryable, it is retried one time. If the double-bit or
multiple-bit error
does not recur, processing continues. Otherwise, for a double-bit error, correction
may be attempted depending on the type of double-bit error.
In a dynamic storage, alpha particles (defined as radiation from the packaging
materials and solder used in processor storage) can cause the state of a bit position
to change. That is, alpha particles can remove the charge stored in a capacitor that
represents a bit position. A state change causes a single-bit error (an intermittent
error) in a doubleword that has no solid single-bit error. Such a single-bit error can
be corrected by ECC logic as usual. However, if a doubleword has a solid
single-bit error, a bit change caused by an alpha particle results in a double-bit
error (one solid error and one intermittent error), which is uncorrectable by ECC
logic. Double-bit error correction is implemented primarily to correct double-bit
errors consisting of one intermittent error caused by an alpha particle and one
solid error.
The following procedures are performed to handle double-bit errors that are
detected in a doubleword above or below the ACB in processor storage (that is, in
the user-addressable or auxiliary storage area):
• For one solid and one intermittent error, the syndrome bits are used to correct
  the intermittent error in the doubleword in processor storage. The operation is
  retried and ECC logic corrects the single-bit solid error. A system recovery
  machine check interruption is generated with storage error corrected indicated
  in the machine check code (bit 17 is turned on).
• For two solid errors, the two error bits are corrected in the doubleword in the
  buffer. A machine check interruption is generated with storage degradation
  indicated in the machine check code (bit 19 is turned on). The operating
  system should unload the data with the error to a different location in
  processor storage to avoid repetition of a double-bit error when the
  doubleword is fetched again.
• For two intermittent errors in the program-addressable area, a machine check
  interruption is generated with storage error uncorrected specified in the
  machine check code (bit 16 is turned on). For two intermittent errors in the
  auxiliary storage area, a timing facilities damage machine check interruption is
  generated if the double-bit error occurred in the CPU timer or clock
  comparator area of auxiliary storage. If not, for System/370 mode, a system
  damage machine check interruption is generated. For System/370-XA mode,
  system damage is reported unless the error is located in a doubleword that
  contains channel or subchannel data, in which case channel subsystem damage
  is reported.
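The three cases above amount to a small decision table, sketched below. The enum, the function name, and the returned strings are hypothetical labels chosen for illustration. Only the case structure and the machine check code bit numbers come from the text.

```python
from enum import Enum


class Area(Enum):
    """Where the failing doubleword lives (labels are illustrative)."""
    PROGRAM = "program-addressable storage"
    AUX_TIMING = "auxiliary storage: CPU timer or clock comparator area"
    AUX_CHANNEL = "auxiliary storage: channel or subchannel data"
    AUX_OTHER = "auxiliary storage: other"


def double_bit_outcome(solid_count, area, xa_mode):
    """Map a double-bit error to the reported machine check.

    solid_count is how many of the two failing bits are solid (0, 1, or 2);
    xa_mode is True for System/370-XA mode, False for System/370 mode.
    """
    if solid_count == 1:
        # Intermittent bit fixed via the syndrome, solid bit fixed on retry.
        return "system recovery: storage error corrected (bit 17)"
    if solid_count == 2:
        # Both bits corrected in the buffer copy of the doubleword.
        return "storage degradation (bit 19)"
    if solid_count != 0:
        raise ValueError("a double-bit error has 0, 1, or 2 solid bits")
    if area is Area.PROGRAM:
        return "storage error uncorrected (bit 16)"
    if area is Area.AUX_TIMING:
        return "timing facilities damage"
    if xa_mode and area is Area.AUX_CHANNEL:
        return "channel subsystem damage"
    return "system damage"
```

For example, `double_bit_outcome(0, Area.AUX_CHANNEL, xa_mode=False)` yields system damage, because the channel subsystem damage report applies only in System/370-XA mode.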
100
A Guide to the IBM 4381 Processor
