Simulating TLB Parity Errors for Software Testing - IBM PPC440x5 CPU Core User Manual

User's Manual
PPC440x5 CPU Core
2. MSR[ME] = 1, so the CPU vectors to the machine check handler (i.e., takes the machine check interrupt)
and resets the MSR[ME] bit. Note that even though the parity error causes an asynchronous interrupt,
that interrupt is guaranteed to be taken before the tlbre instruction completes if the CCR0[PRE] (Parity
Recoverability Enable) is set, and so the target register (RT) of the tlbre will not be updated.
3. The Machine Check handler code includes a series of tlbre instructions to query the state of the TLB and
find the erroneous entry. When a tlbre encounters an erroneous entry and MSR[ME] = 0, the parity
exception still happens, setting the MCSR[MCS] and MCSR[TLBE] bits. Additionally, since MSR[ME] = 0,
MCSR[IMCE] is set, indicating that an imprecise machine check was detected. Finally, the instruction
completes (since no interrupt is taken because MSR[ME] = 0), updating the target register with data from
the TLB, including the parity information.
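The two cases in steps 2 and 3 can be summarized as a small behavioral model. The C sketch below is an illustration only, not a description of the hardware. The structure cpu_model, the function model_tlbre, and the flag values assigned to MCS, TLBE and IMCE are hypothetical; in particular, the flag values are not the real MCSR bit positions. The sketch assumes CCR0[PRE] is set, and it assumes that MCSR[MCS] and MCSR[TLBE] are set in both cases, as the word "still" in step 3 suggests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values for this model; NOT the real MCSR bit positions. */
enum {
    MODEL_MCSR_MCS  = 1u << 0,
    MODEL_MCSR_TLBE = 1u << 1,
    MODEL_MCSR_IMCE = 1u << 2
};

struct cpu_model {
    bool     msr_me;    /* MSR[ME]: machine check enable */
    bool     ccr0_pre;  /* CCR0[PRE]: parity recoverability enable */
    uint32_t mcsr;      /* modeled Machine Check Status Register */
    uint32_t rt;        /* target register (RT) of the tlbre */
    bool     took_mc;   /* true once the machine check interrupt is taken */
};

/* Model one tlbre. Returns true if the instruction completed and updated RT. */
static bool model_tlbre(struct cpu_model *cpu, uint32_t entry_data, bool parity_bad)
{
    assert(cpu != NULL);
    assert(cpu->ccr0_pre);  /* assumption: the model only covers CCR0[PRE] = 1 */

    if (!parity_bad) {
        cpu->rt = entry_data;
        return true;
    }

    /* Assumed to be set whenever the parity exception happens. */
    cpu->mcsr |= MODEL_MCSR_MCS | MODEL_MCSR_TLBE;

    if (cpu->msr_me) {
        /* Step 2: interrupt taken before tlbre completes; RT is left unchanged. */
        cpu->took_mc = true;
        cpu->msr_me = false;  /* taking the interrupt resets MSR[ME] */
        return false;
    }

    /* Step 3: no interrupt; imprecise machine check recorded, tlbre completes. */
    cpu->mcsr |= MODEL_MCSR_IMCE;
    cpu->rt = entry_data;
    return true;
}
```

Calling model_tlbre twice on the same bad entry, starting with msr_me set, walks through step 2 (RT untouched, MSR[ME] cleared) and then step 3 (RT updated, IMCE flag set).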
Note that tlbre causes an exception when it detects a parity error, but the icread and dcread
instructions do not. This inconsistency is explained because OS code commonly uses a sequence of tlbsx and
tlbre instructions to update the "changed" bit in the page table entries. (See section 5.10, "Page Reference
and Change Status Management.") Forcing the software to check the parity manually for each tlbre would be
a performance limitation. No such functional use exists for the icread and dcread instructions; they are used
only in debugging contexts with no significant performance requirements.
As is the case for any machine check interrupt, after vectoring to the machine check handler, MCSRR0
contains the address of the oldest "uncommitted" instruction in the pipeline at the time of the exception and
MCSRR1 contains the old (MSR) context. The interrupt handler is able to query the Machine Check Status
Register (MCSR) to find out that it was called due to a TLB parity exception, and then use tlbre instructions
to find the error in the TLB and restore it from a known good copy in main memory.
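The scan-and-restore step can be illustrated with a software model. The C sketch below is a simplification under stated assumptions, not handler code for the core: it models each TLB entry as a single data word with one parity bit, whereas the real entry layout and parity grouping are not described here. The names tlb_entry, make_entry and repair_tlb are hypothetical, and the array walk stands in for the series of tlbre instructions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TLB_ENTRIES 64  /* the text above refers to 64 TLB entries */

/* Hypothetical model of a TLB entry: one data word and one stored parity bit. */
struct tlb_entry {
    uint32_t data;
    unsigned parity;
};

/* Even parity bit for a word: 1 if the word has an odd number of 1 bits. */
static unsigned even_parity32(uint32_t v)
{
    v ^= v >> 16;
    v ^= v >> 8;
    v ^= v >> 4;
    v ^= v >> 2;
    v ^= v >> 1;
    return v & 1u;
}

/* Build a modeled entry with correct (even) parity. */
static struct tlb_entry make_entry(uint32_t data)
{
    struct tlb_entry e = { data, even_parity32(data) };
    return e;
}

/*
 * Walk every modeled entry, and rewrite any entry whose stored parity does
 * not match its data from the known good copy. Returns the repair count.
 */
static int repair_tlb(struct tlb_entry *tlb, const uint32_t *good_copy)
{
    int repaired = 0;

    assert(tlb != NULL && good_copy != NULL);
    for (size_t i = 0; i < TLB_ENTRIES; i++) {
        if (even_parity32(tlb[i].data) != tlb[i].parity) {
            tlb[i] = make_entry(good_copy[i]);
            repaired++;
        }
    }
    return repaired;
}
```

A single parity bit per word only detects an odd number of flipped bits, which is why the sketch restores the whole entry from the good copy rather than trying to correct it in place.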
Note: A parity error on the TLB entry which maps the machine check exception handler code
prevents recovery. In effect, one of the 64 TLB entries is unprotected, in that the machine
cannot recover from an error in that entry. It is possible to add logic to get around this problem,
but the reduction in SER achieved by protecting 63 out of 64 TLB entries is sufficient. Further,
the software technique of simply dedicating a TLB entry to the page that contains the machine
check handler and periodically refreshing that entry from a known good copy can reduce the
probability that the entry will be used with a parity error to near zero.
As mentioned above, any tlbre or tlbsx instruction that causes a machine check interrupt will be flushed
from the pipeline before it completes. Further, any instruction that causes a DTLB or ITLB refill which causes
a TLB parity error will be flushed before it completes.

5.11.2 Simulating TLB Parity Errors for Software Testing

Because parity errors occur in the TLB infrequently and unpredictably, it is desirable to provide users with a
way to simulate the effect of a TLB parity error so that interrupt handling software may be exercised. This is
exactly the purpose of the 4-bit CCR1[MMUPEI] field.
Usually, parity is calculated as the even parity for each set of bits to be protected, which the checking
hardware expects. This calculation is done as the TLB data is stored with a tlbwe instruction. However, if any of
the CCR1[MMUPEI] bits are set, the calculated parity for the corresponding bits of the data being stored
is inverted and stored as odd parity. Then, when the data stored with odd parity is subsequently used to
refill the DTLB or ITLB, or by a tlbsx or tlbre instruction, it will cause a Parity exception type Machine Check
interrupt and exercise the interrupt handling software. The following pseudo-code is an example of how to
use the CCR1[MMUPEI] field to simulate a parity error on a TLB entry:
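mtspr CCR1, Rx     ; Set some CCR1[MMUPEI] bits
isync              ; wait for the CCR1 context to update
tlbwe Rs,Ra,0      ; write some data to the TLB with bad parity
tlbwe Rs,Ra,1      ; write some data to the TLB with bad parity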
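The parity inversion itself can be modeled in a few lines of C. This is a minimal sketch, not the hardware's implementation: it treats one protected group of bits as a plain integer and uses a single flag, invert, as a stand-in for one CCR1[MMUPEI] bit. The function names are hypothetical, and the actual assignment of TLB bits to the four MMUPEI bits is not given in this section.

```c
#include <assert.h>
#include <stdint.h>

/* Even parity bit: 1 if the data has an odd number of 1 bits, else 0. */
static unsigned even_parity(uint32_t data)
{
    unsigned p = 0;

    while (data) {
        p ^= 1u;
        data &= data - 1;  /* clear the lowest set bit */
    }
    return p;
}

/*
 * Parity bit as it would be stored on a write. When invert is 1 (modeling
 * a set MMUPEI bit), the calculated parity is inverted, giving odd parity.
 */
static unsigned stored_parity(uint32_t data, unsigned invert)
{
    assert(invert <= 1u);
    return even_parity(data) ^ invert;
}

/* The checker expects even parity. Returns 1 on a parity error, else 0. */
static unsigned parity_error(uint32_t data, unsigned parity_bit)
{
    return even_parity(data) != (parity_bit & 1u);
}
```

With invert = 0 the stored bit always passes the check; with invert = 1 it always fails, which is what lets test software trigger the machine check path on demand.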
Preliminary
mmu.fm.
September 12, 2002
