Intel ITANIUM ARCHITECTURE - SOFTWARE DEVELOPERS MANUAL VOLUME 1 REV 2.3 Manual page 164

If no matching entry is found, the speculative results need to be recomputed:
• Use a chk.a if a load and some of its uses are speculated. The chk.a jumps to
compiler-generated recovery code to re-execute the load and dependent
instructions.
• Use a ld.c if no uses of the load are speculated. The ld.c reissues the load.
Entries are removed from the ALAT due to:
• Stores that write to addresses overlapping with ALAT entries.
• Other advanced loads that target the same physical registers as ALAT entries.
• Implementation-defined hardware or operating system conditions needed to
maintain correctness.
• Limitations of the capacity, associativity, and matching algorithm used for a given
implementation of the ALAT.
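
The check and removal rules above can be illustrated with a small software model. This is only a sketch under stated assumptions, not a description of any real ALAT: the class name ToyALAT, its methods, the exact address-range overlap test, the oldest-first eviction policy, and the dictionary-based memory are all hypothetical choices made for illustration. Real implementations may use partial address tags and other implementation-defined policies, and the ld.c completer variants are not modeled.

```python
class ToyALAT:
    """Toy model of an ALAT: one entry per target register (assumption)."""

    def __init__(self, capacity=4):
        self.capacity = capacity      # hypothetical capacity limit
        self.entries = {}             # register name -> (address, size)

    def advanced_load(self, reg, addr, size, memory):
        """Model of ld.a: load the value and allocate an entry for reg."""
        # Another advanced load to the same register replaces the old entry.
        self.entries.pop(reg, None)
        if len(self.entries) >= self.capacity:
            # Capacity limit reached: evict the oldest entry (assumed policy).
            oldest = next(iter(self.entries))
            del self.entries[oldest]
        self.entries[reg] = (addr, size)
        return memory.get(addr, 0)

    def store(self, addr, size, value, memory):
        """Model of a store: write memory, drop overlapping entries."""
        memory[addr] = value
        self.entries = {
            r: (a, s)
            for r, (a, s) in self.entries.items()
            if a + s <= addr or addr + size <= a   # keep non-overlapping only
        }

    def check_load(self, reg, addr, size, regs, memory):
        """Model of ld.c: keep the value on a hit, reissue the load on a miss.

        Returns True if the load had to be reissued.
        """
        if self.entries.get(reg) == (addr, size):
            return False              # matching entry: speculation succeeded
        regs[reg] = memory.get(addr, 0)
        return True


# Non-conflicting store: the entry survives and ld.c does nothing.
memory = {0x1000: 11, 0x2000: 22}
regs = {}
alat = ToyALAT()
regs["r6"] = alat.advanced_load("r6", 0x1000, 8, memory)
alat.store(0x2000, 8, 99, memory)
reissued = alat.check_load("r6", 0x1000, 8, regs, memory)   # False

# Conflicting store: the entry is removed and ld.c reloads the new value.
regs["r6"] = alat.advanced_load("r6", 0x1000, 8, memory)
alat.store(0x1000, 8, 77, memory)
reissued_after_conflict = alat.check_load("r6", 0x1000, 8, regs, memory)  # True
```

In this model a missing entry always causes a reload, even when no conflicting store occurred (for example after a capacity eviction). That matches the rule above: an absent entry means the result must be recomputed, not necessarily that the speculative value was wrong.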
3.4.2.1 Advanced Load Example

Advanced loads can reduce the critical path of a sequence of instructions. In the code
below, a load and store may access conflicting memory addresses:

    st8    [r4]=r12         // Cycle 0: ambiguous store
    ld8    r6=[r8];;        // Cycle 0: load to advance
    add    r5=r6,r7;;       // Cycle 2
    st8    [r18]=r5         // Cycle 3

On the generic machine model, the code above would execute in four cycles, but it can
be rewritten using an advanced load and check:

    ld8.a  r6=[r8]          // Cycle -2 or earlier
    // Other instructions
    st8    [r4]=r12         // Cycle 0: ambiguous store
    ld8.c  r6=[r8]          // Cycle 0: check load
    add    r5=r6,r7;;       // Cycle 0
    st8    [r18]=r5         // Cycle 1

The original load has been turned into a check load, and an advanced load has been
scheduled above the ambiguous store. If the speculation succeeds, the execution time
of the remaining non-speculative code is reduced because the latency of the advanced
load is hidden.

3.4.2.2 Recovery Code Example

Consider again the non-speculative code from the last section:

    st8    [r4]=r12         // Cycle 0: ambiguous store
    ld8    r6=[r8];;        // Cycle 0: load to advance
    add    r5=r6,r7;;       // Cycle 2
    st8    [r18]=r5         // Cycle 3

Volume 1, Part 2: Memory Reference 1:153
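
The cycle annotations in the advanced load example (Section 3.4.2.1) can be reproduced with a simple dependency calculation. The sketch below is illustrative only: the function schedule and the LATENCY table are hypothetical, and the latency values (two cycles for a load, one for an add, zero for a check load that hits in the ALAT) are assumptions inferred from the cycle comments in this section, not documented figures for any processor. Issue width, bundling, and stop bits are ignored.

```python
# Result latencies in cycles. These values are assumptions chosen to match
# the cycle comments in the listings, not documented hardware latencies.
LATENCY = {"ld8": 2, "ld8.a": 2, "ld8.c": 0, "add": 1}


def schedule(instrs):
    """Return the issue cycle of each instruction.

    Each instruction is (opcode, dest, sources, earliest), where earliest is
    the first cycle at which the instruction is allowed to issue.
    """
    ready = {}          # register -> cycle at which its new value is available
    issue_cycles = []
    for op, dest, srcs, earliest in instrs:
        # Wait for every source produced earlier in this sequence.
        issue = max([earliest] + [ready[s] for s in srcs if s in ready])
        issue_cycles.append(issue)
        if dest is not None:
            ready[dest] = issue + LATENCY[op]
    return issue_cycles


original = [
    ("st8", None, ["r4", "r12"], 0),   # ambiguous store
    ("ld8", "r6", ["r8"], 0),          # load to advance
    ("add", "r5", ["r6", "r7"], 0),
    ("st8", None, ["r18", "r5"], 0),
]

advanced = [
    ("ld8.a", "r6", ["r8"], -2),       # hoisted above the store
    ("st8", None, ["r4", "r12"], 0),   # ambiguous store
    ("ld8.c", "r6", ["r8", "r6"], 0),  # check load, waits for the ld8.a result
    ("add", "r5", ["r6", "r7"], 0),
    ("st8", None, ["r18", "r5"], 0),
]

original_cycles = schedule(original)   # [0, 0, 2, 3]
advanced_cycles = schedule(advanced)   # [-2, 0, 0, 0, 1]
```

Under these assumptions the original sequence occupies cycles 0 through 3 (four cycles), while the non-speculative part of the rewritten sequence occupies cycles 0 and 1, provided the advanced load is issued at least two cycles before the store.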
