Intel ITANIUM ARCHITECTURE - SOFTWARE DEVELOPERS MANUAL VOLUME 1 REV 2.3 Manual page 171

A disadvantage of post-increment loads is that they create new dependencies between
post-increment loads and the operations that use the post-increment values. In some
cases, the compiler may wish to separate post-increment loads into their component
instructions to improve the overall schedule. Alternatively, the compiler could wait until
after instruction scheduling and then opportunistically find places where post-increment
loads could be substituted for separate load and add instructions.
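
As a rough C-level illustration of this trade-off (the function names are hypothetical, and a compiler is free to emit either machine form for either source), the two routines below compute the same sum. The first fuses the load and the pointer update in one expression, as a post-increment load would; the second keeps them as separate operations that a scheduler could place independently.

```c
#include <assert.h>
#include <stddef.h>

/* Fused form: each iteration loads through p and advances p in a single
   expression, loosely analogous to a post-increment load such as
   "ld4 r1 = [r2], 4" (illustrative only). */
static int sum_post_increment(const int *p, size_t n) {
    assert(p != NULL || n == 0);
    int sum = 0;
    while (n--) {
        sum += *p++;
    }
    return sum;
}

/* Split form: the load and the address update are separate statements,
   so the add on p no longer has to be issued together with the load. */
static int sum_split(const int *p, size_t n) {
    assert(p != NULL || n == 0);
    int sum = 0;
    while (n--) {
        int v = *p;   /* plain load */
        p = p + 1;    /* separate add on the address */
        sum += v;
    }
    return sum;
}
```

Both functions return the same value for the same input; only the shape of the dependency chain through p differs.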
3.5.5 Loop Optimization
In cyclic code, speculation can extend the use of classical loop optimizations like
invariant code motion. Examine this pseudo-code:

    while (cond) {
        c = a + b;     // Probably loop invariant
        *ptr++ = c;    // May point to a or b
    }

The variables a and b are probably loop invariant; however, the compiler must assume
the stores to *ptr will overwrite the values of a and b unless analysis can guarantee
that this can never happen. The use of advanced loads and checks allows code that is
likely to be invariant to be removed from a loop, even when a pointer cannot be
disambiguated:

            ld4.a     r1 = [&a]
            ld4.a     r2 = [&b]
            add       r3 = r1,r2        // Move computation out of loop
            while (cond) {
                chk.a.nc  r1, recover1
    L1:         chk.a.nc  r2, recover2
    L2:         *p++ = r3
            }

At the end of the module:

    recover1:                           // Recover from failed load of a
            ld4.a     r1 = [&a]
            add       r3 = r1, r2
            br.sptk   L1                // Unconditional branch
    recover2:                           // Recover from failed load of b
            ld4.a     r2 = [&b]
            add       r3 = r1, r2
            br.sptk   L2                // Unconditional branch

Using speculation in this loop hides the latency of the calculation of c whenever the
speculated code is successful.

Since checks have both a clear (clr) and no clear (nc) form, the programmer must
decide which to use. This example shows that when checks are moved out of loops, the
no clear version should be used. This is because the clear (clr) version will cause the
corresponding ALAT entry to be removed (which would cause the next check to that
register to fail).
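
The interplay between stores, ALAT entries, and the two check forms can be explored with a toy software model. The C sketch below is an assumption-laden simplification, not a description of real hardware: the names Alat, advanced_load, store, check, run_hoisted, and run_naive are all hypothetical, the table has four slots, and entries match on exact address equality, whereas a real ALAT is implementation specific in size and matching. It runs the hoisted loop with either check form and counts how often recovery code executes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define ALAT_SLOTS 4            /* toy size, chosen for illustration */

typedef struct {
    bool valid;
    int reg;                    /* register targeted by the advanced load */
    const int *addr;            /* address the advanced load read from */
} AlatEntry;

typedef struct { AlatEntry e[ALAT_SLOTS]; } Alat;

/* Models ld.a: load the value and record (reg, addr) in the table. */
static int advanced_load(Alat *t, int reg, const int *addr) {
    size_t slot = 0;            /* falls back to evicting slot 0 if full */
    bool found = false;
    for (size_t i = 0; i < ALAT_SLOTS && !found; i++)
        if (t->e[i].valid && t->e[i].reg == reg) { slot = i; found = true; }
    for (size_t i = 0; i < ALAT_SLOTS && !found; i++)
        if (!t->e[i].valid) { slot = i; found = true; }
    t->e[slot] = (AlatEntry){ true, reg, addr };
    return *addr;
}

/* Models a store: write memory and drop every entry for that address. */
static void store(Alat *t, int *addr, int value) {
    *addr = value;
    for (size_t i = 0; i < ALAT_SLOTS; i++)
        if (t->e[i].valid && t->e[i].addr == addr) t->e[i].valid = false;
}

/* Models chk.a: true if an entry for reg survives. With clear set
   (the clr form), a successful check also removes the entry. */
static bool check(Alat *t, int reg, bool clear) {
    for (size_t i = 0; i < ALAT_SLOTS; i++)
        if (t->e[i].valid && t->e[i].reg == reg) {
            if (clear) t->e[i].valid = false;
            return true;
        }
    return false;
}

/* Hoisted loop from the text; returns how many times recovery code ran. */
static int run_hoisted(const int *a, const int *b, int *p, int n, bool clear) {
    Alat t = {0};
    int recoveries = 0;
    int r1 = advanced_load(&t, 1, a);
    int r2 = advanced_load(&t, 2, b);
    int r3 = r1 + r2;           /* computation moved out of the loop */
    for (int i = 0; i < n; i++) {
        if (!check(&t, 1, clear)) { r1 = advanced_load(&t, 1, a); r3 = r1 + r2; recoveries++; }
        if (!check(&t, 2, clear)) { r2 = advanced_load(&t, 2, b); r3 = r1 + r2; recoveries++; }
        store(&t, p++, r3);
    }
    return recoveries;
}

/* Reference loop: recomputes a + b on every iteration. */
static void run_naive(const int *a, const int *b, int *p, int n) {
    for (int i = 0; i < n; i++) *p++ = *a + *b;
}
```

In this model, four iterations with stores that never touch a or b trigger no recoveries when clear is false, but four recoveries when clear is true, because every successful check discards the entry the next iteration depends on. When the stores do overwrite a and b, the hoisted loop recovers as needed and produces the same memory contents as run_naive.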
Volume 1, Part 2: Memory Reference
