Memory Ecc Routing; Data Poisoning; Usage Of First-Error And Next-Error - Intel 460GX Software Developer’s Manual


6.2 Memory ECC Routing

The ECC code used in DRAM is the same code used in the Itanium processor, requiring 8 check bits to cover 64 bits of data. On the system bus, this code detects and corrects all single-bit errors and detects double-bit errors.

The system designer has the option of wiring the boards so that the following holds:

• Using x4 DRAMs, multiple errors within one chip are 100% corrected.
• Using x8 DRAMs, all errors within a single chip are 100% detected.

This is done by wiring the board so that each x4 DRAM has one bit in each of the 4 ECC words of a half-line. Since a half-line is 256 bits and each ECC word covers 64 bits, there are 4 ECC words per half-line. For x8 chips, the bits are sliced across the 4 words so that at most 2 bits from any one chip are in one ECC word. The ECC used on the processor will detect all 4-bit nibble errors.
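The wiring rule above can be checked with a small model. The following is a minimal sketch, not the board routing itself: the pin-to-word assignment in `route_pin` is a hypothetical layout chosen only to satisfy the stated constraints, and all function and constant names are illustrative assumptions.

```python
DATA_BITS_PER_WORD = 64                                  # data bits covered by one ECC word
CHECK_BITS_PER_WORD = 8                                  # check bits per ECC word
WORD_BITS = DATA_BITS_PER_WORD + CHECK_BITS_PER_WORD     # 72 bits of data plus ECC
HALF_LINE_DATA_BITS = 256
WORDS_PER_HALF_LINE = HALF_LINE_DATA_BITS // DATA_BITS_PER_WORD  # 4 ECC words


def route_pin(chip, pin, chip_width):
    """Return (ecc_word, bit_in_word) for one DRAM data pin.

    Hypothetical routing: any layout with the same per-chip spread would do.
    """
    if chip_width == 4:
        # Pin p of every x4 chip feeds ECC word p: one bit per chip per word.
        return pin, chip
    if chip_width == 8:
        # Pins p and p+4 of an x8 chip share a word: two bits per chip per word.
        return pin % WORDS_PER_HALF_LINE, 2 * chip + pin // WORDS_PER_HALF_LINE
    raise ValueError("only x4 and x8 parts are modeled")


def chip_count(chip_width):
    """Number of DRAM chips needed to supply one half-line plus its ECC."""
    return WORDS_PER_HALF_LINE * WORD_BITS // chip_width


def max_bits_per_chip_per_word(chip_width):
    """Largest number of bits any single chip contributes to any single ECC word."""
    worst = 0
    for chip in range(chip_count(chip_width)):
        counts = [0] * WORDS_PER_HALF_LINE
        for pin in range(chip_width):
            word, _bit = route_pin(chip, pin, chip_width)
            counts[word] += 1
        worst = max(worst, max(counts))
    return worst


def routing_is_one_to_one(chip_width):
    """Check that every (word, bit) position is driven by exactly one pin."""
    seen = set()
    for chip in range(chip_count(chip_width)):
        for pin in range(chip_width):
            seen.add(route_pin(chip, pin, chip_width))
    return len(seen) == WORDS_PER_HALF_LINE * WORD_BITS
```

With this layout a failed x4 chip puts at most one bad bit into each ECC word, which a single-bit-correcting code can repair, while a failed x8 chip puts at most two bad bits into each word, which the code can detect but not correct.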

6.3 Data Poisoning

When uncorrectable data is received, it is passed on to the next interface as poisoned. The data may have come from memory or from the system bus with uncorrectable ECC errors. All data passes through the data buffer in the SDC. As uncorrectable data is placed in the data buffer, it is marked as having been received bad. When the data is read out of the data buffer and sent on, the parity or ECC generated for it is deliberately forced bad. Data is checked on a 'chunk' boundary, a chunk being 64 bits of data.

Data sent to the system bus or to DRAM has 2 bits of ECC corrupted for each failed chunk of data. These are bits 0 and 1 of the ECC bits, or bits 63 and 71 if looking at the entire 72 bits of data/ECC. For data passed to the private data bus, all the calculated parity bits associated with the failing chunk are inverted, thus passing bad parity to the private data bus.
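The poisoning rules can be sketched in a few lines. This is an illustrative model only: the function names are hypothetical, the correctly computed ECC and parity values are taken as inputs rather than derived (the check-bit equations are not given here), and the number of parity bits per chunk on the private data bus is left as a parameter because the text does not state it.

```python
CHUNK_BITS = 64          # data is checked per 64-bit chunk
ECC_BITS = 8             # check bits per chunk
ECC_POISON_MASK = 0b11   # ECC bits 0 and 1 are the ones forced bad


def poison_ecc(ecc):
    """Flip ECC bits 0 and 1 of a correctly computed 8-bit check field."""
    return (ecc ^ ECC_POISON_MASK) & ((1 << ECC_BITS) - 1)


def poison_parity(parity, parity_bits):
    """Invert every calculated parity bit for the chunk.

    parity_bits is an assumed parameter: the parity width per chunk is not
    specified in the text.
    """
    return ~parity & ((1 << parity_bits) - 1)


def check_bits_to_send(good_ecc, good_parity, parity_bits, marked_bad, to_private_bus):
    """Return the check bits driven with a chunk as it leaves the data buffer.

    marked_bad is the flag recorded when the chunk entered the buffer with an
    uncorrectable error.
    """
    if to_private_bus:
        return poison_parity(good_parity, parity_bits) if marked_bad else good_parity
    # System bus and DRAM both carry ECC.
    return poison_ecc(good_ecc) if marked_bad else good_ecc
```

A chunk that was clean on arrival leaves with its check bits untouched; only chunks flagged in the buffer have their ECC or parity forced bad on the way out.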

6.4 Usage of First-error and Next-error

The first instance of an error is latched in the first-error status register (FERR). The first error does NOT set the bit in the next-error register (NERR). When an error is found, it is latched into the FERR if the FERR has no other bit set. If any bit is already set, then the appropriate bit in NERR is set instead.

Since the system needs to know whether one error or many have occurred, setting the FERR does not set the NERR. If there is another error of any type, including a second occurrence of the first error, then the NERR is set. Software can read both FERR and NERR. If FERR is set but NERR is not, then only one error occurred in the system. If both are set, then multiple errors have occurred.
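The latching rule and the way software interprets the two registers can be modeled as follows. This is a minimal sketch under stated assumptions: `ErrorRegisters`, its methods, and the bit numbers used to exercise it are hypothetical, and the real registers are read through chipset configuration space rather than a Python object.

```python
class ErrorRegisters:
    """Toy model of one FERR/NERR register pair (names are illustrative)."""

    def __init__(self):
        self.ferr = 0  # first-error status register
        self.nerr = 0  # next-error status register

    def report(self, bit):
        """Latch an error whose status bit number is `bit`."""
        mask = 1 << bit
        if self.ferr == 0:
            # First error: latched in FERR only, NERR is left untouched.
            self.ferr = mask
        else:
            # Any later error, including a repeat of the first, goes to NERR.
            self.nerr |= mask

    def classify(self):
        """Interpret the pair the way polling software would."""
        if self.ferr == 0:
            return "no error"
        if self.nerr == 0:
            return "single error"
        return "multiple errors"
```

For example, reporting hypothetical bit 3 twice leaves bit 3 set in both registers, and `classify()` then returns "multiple errors".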

For the first error, as much information as possible is captured. The data, address and command information is captured if available. This allows isolation of errors and possible recovery.

If 2 errors occur in the same cycle, then 2 bits may be set in FERR. This should be a rare case. The other exception is FERR_SAC. If there is a single-bit correctable ECC error from DRAM, then bit SCME is set. This bit does not block other bits in FERR_SAC from being set. This allows software to poll periodically for single-bit errors without preventing other errors from being logged. Other than these two conditions, there should never be more than one bit set in any FERR.
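The two exceptions can be added to a simple register model. The sketch below is an interpretation, not a description of the hardware: the bit position chosen for `SCME` is hypothetical, and the handling of a repeated SCME (routed to NERR) and of an SCME arriving after another error (still latched in FERR) are assumptions that the text does not spell out.

```python
SCME = 1 << 0  # hypothetical bit position for the single-bit correctable memory error


class FerrSac:
    """Toy model of FERR_SAC/NERR_SAC with the non-blocking SCME bit."""

    def __init__(self):
        self.ferr = 0
        self.nerr = 0

    def report(self, mask):
        """Latch every error bit in `mask`, all detected in the same cycle."""
        scme = mask & SCME
        others = mask & ~SCME
        if scme:
            # Assumption: SCME is tracked independently of the other bits.
            if self.ferr & SCME:
                self.nerr |= SCME
            else:
                self.ferr |= SCME
        if others:
            if self.ferr & ~SCME:
                # A non-SCME error is already latched: log in NERR.
                self.nerr |= others
            else:
                # FERR is empty or holds only SCME: same-cycle errors all latch.
                self.ferr |= others
```

Under these assumptions a polled SCME never hides a later, more serious error, and two errors reported in one call both land in FERR, matching the two cases in which FERR may hold more than one bit.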

Intel® 460GX Chipset Software Developer's Manual

Data Integrity and Error Handling
