Register Stack - Intel ITANIUM ARCHITECTURE - SOFTWARE DEVELOPERS MANUAL VOLUME 1 REV 2.3 Manual

    if (p5) r1 = r2 + r3
In this example p5 is the controlling predicate that decides whether or not the
instruction executes and updates state. If the predicate value is true, then the
instruction updates state. Otherwise it generally behaves like a nop. Predicates are
assigned values by compare instructions.
Predicated execution avoids branches, and simplifies compiler optimizations by
converting a control dependency to a data dependency. Consider the original code:
    if (a>b) c = c + 1
    else d = d * e + f
The branch at (a>b) can be avoided by converting the code above to the predicated
code:
    pT, pF = compare(a>b)
    if (pT) c = c + 1
    if (pF) d = d * e + f
The predicate pT is set to 1 if the condition evaluates to true, and to 0 if the condition
evaluates to false. The predicate pF is the complement of pT. The control dependency of
the instructions c = c + 1 and d = d * e + f on the branch with the condition (a>b)
is now converted into a data dependency on compare(a>b) through predicates pT and
pF (the branch is eliminated). An added benefit is that the compiler can schedule the
instructions under pT and pF to execute in parallel. Note that several types of compare
instructions write predicates in different ways, including unconditional compares and
parallel compares.
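
The if-conversion described above can be sketched in Python as a rough illustration. The function names (branchy, compare_gt, predicated) are hypothetical and not part of the architecture. Python's conditional expression only stands in for a predicated instruction, which either updates its destination or behaves like a nop.

```python
def branchy(a, b, c, d, e, f):
    """Original code: which statement runs depends on the branch at (a > b)."""
    if a > b:
        c = c + 1
    else:
        d = d * e + f
    return c, d


def compare_gt(a, b):
    """Hypothetical stand-in for a compare that writes a predicate pair."""
    p_true = a > b
    return p_true, not p_true  # pF is the complement of pT


def predicated(a, b, c, d, e, f):
    """If-converted code: both operations appear, each guarded by a predicate."""
    pT, pF = compare_gt(a, b)
    # A false predicate leaves the destination unchanged (nop-like behavior).
    c = c + 1 if pT else c
    d = d * e + f if pF else d
    return c, d


# Both forms compute the same state for either outcome of the comparison.
for args in [(3, 1, 10, 2, 4, 5), (1, 3, 10, 2, 4, 5)]:
    assert predicated(*args) == branchy(*args)
```

The two guarded statements depend only on pT and pF, not on each other, which is why a scheduler is free to issue them in parallel.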
2.7 Register Stack

The Itanium architecture avoids the unnecessary spilling and filling of registers at
procedure call and return interfaces through compiler-controlled renaming. At a call
site, a new frame of registers is available to the called procedure without the need for
register spill and fill (either by the caller or by the callee). Register access occurs by
renaming the virtual register identifiers in the instructions through a base register into
the physical registers. The callee can freely use available registers without having to
spill and eventually restore the caller's registers. The callee executes an alloc
instruction specifying the number of registers it expects to use in order to ensure that
enough registers are available. If sufficient registers are not available (stack overflow),
the alloc stalls the processor and spills the caller's registers until the requested
number of registers is available.
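
As a rough illustration of renaming through a base register and of alloc behavior on overflow, the toy model below may help. Everything in it is an assumption made for the sketch: the RegisterStack class, the eight-entry physical file, the dictionary used as a backing store, and the choice to start the callee's frame immediately after the caller's. It is not the hardware's actual mechanism.

```python
class RegisterStack:
    """Toy model of a stacked register file (sizes and names are assumptions)."""

    def __init__(self, num_physical=8):
        self.phys = [0] * num_physical  # circular physical register file
        self.base = 0                   # absolute index of the frame's register 0
        self.frame_size = 0             # registers allocated to the current frame
        self.oldest = 0                 # oldest absolute index still resident
        self.backing_store = {}         # absolute index -> spilled value

    def rename(self, virtual):
        """Map a frame-relative register number to a physical slot via the base."""
        if not 0 <= virtual < self.frame_size:
            raise IndexError("register outside the current frame")
        return (self.base + virtual) % len(self.phys)

    def alloc(self, count):
        """Request `count` registers; on overflow, spill the oldest registers."""
        if count > len(self.phys):
            raise ValueError("frame larger than the physical register file")
        spilled = 0
        # Overflow: the live span no longer fits, so older registers are spilled.
        while self.base + count - self.oldest > len(self.phys):
            slot = self.oldest % len(self.phys)
            self.backing_store[self.oldest] = self.phys[slot]
            self.oldest += 1
            spilled += 1
        self.frame_size = count
        return spilled

    def call(self):
        """Enter a callee: move the base past the caller's frame."""
        self.base += self.frame_size
        self.frame_size = 0


rs = RegisterStack(num_physical=8)
rs.alloc(6)                        # caller frame fits, nothing is spilled
for v in range(6):
    rs.phys[rs.rename(v)] = 100 + v
rs.call()
spilled = rs.alloc(5)              # 6 + 5 > 8, so three caller registers spill
```

In this model the callee's register 0 renames to physical slot 6 and its register 4 wraps around to slot 2, which is free only because the caller's value there was spilled first.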
At the return site, the base register is restored to the value that the caller was using to
access registers prior to the call. Some of the caller's registers may have been spilled
by the hardware and not yet restored. In this case (stack underflow), the return stalls
the processor until the processor has restored an appropriate number of the caller's
registers. The hardware can exploit the explicit register stack frame information to spill
and fill registers from the register stack to memory at the best opportunity
(independent of the calling and called procedures).
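
The return path can be sketched in the same spirit. The function below is a simplified, hypothetical model: return_to_caller, its parameters, and the dictionary standing in for the memory backing store are all assumptions for illustration. Registers are tracked by an absolute index, and any register whose index is below oldest_resident is taken to be spilled.

```python
def return_to_caller(caller_base, oldest_resident, physical, backing_store):
    """Restore the caller's base and fill any of its registers that were spilled.

    Returns (base, oldest_resident, filled), where `filled` is the number of
    registers reloaded before the caller could resume (the underflow case).
    """
    filled = 0
    # Underflow: part of the caller's frame is still in the backing store.
    while oldest_resident > caller_base:
        oldest_resident -= 1
        slot = oldest_resident % len(physical)
        physical[slot] = backing_store.pop(oldest_resident)
        filled += 1
    return caller_base, oldest_resident, filled


# The callee overwrote slots 0..2 after the caller's values there were spilled.
physical = [-1, -1, -1, 103, 104, 105, 0, 0]
backing_store = {0: 100, 1: 101, 2: 102}
base, oldest, filled = return_to_caller(0, 3, physical, backing_store)
```

When nothing was spilled (oldest_resident equals caller_base), the loop does not run and the return completes without a fill, which corresponds to the common case with no stall.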
Volume 1, Part 1: Introduction to the Intel® Itanium® Architecture
