Branch Instructions; Loops And Software Pipelining; Rotating Registers - Intel ITANIUM ARCHITECTURE - SOFTWARE DEVELOPERS MANUAL VOLUME 1 REV 2.3 Manual


• Using predication to reduce the number of branches in the code. This improves
instruction fetching because there are fewer control flow changes, decreases the
number of branch mispredicts because there are fewer branches, and increases
branch prediction hit rates because there is less competition for prediction resources.
• Providing software hints for branches to improve hardware use of prediction and
prefetching resources.
• Supplying explicit support for software pipelining of loops and exit prediction of
counted loops.
2.7.1 Branch Instructions

Branching in the Itanium architecture is largely expressed the same way as on other
microprocessors. The major difference is that branch triggers are controlled by
predicates rather than conditions encoded in branch instructions. The architecture also
provides a rich set of hints to control branch prediction strategy, prefetching, and
specific branch types like loops, exits, and branches associated with software pipelining.
Targets for indirect branches are placed in branch registers prior to branch instructions.
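
As a rough illustration of the paragraph above, the following Python sketch models a branch whose trigger is a predicate computed earlier by a compare, with the target held in a branch register. It is a toy model, not Itanium code: the function names, the register name `b6`, and the addresses are all hypothetical and chosen only for the example.

```python
def compare_eq(a, b):
    """Model a compare that writes a predicate and its complement."""
    p = (a == b)
    return p, (not p)


def predicated_branch(qualifying_predicate, target, fall_through):
    """Model a branch: taken only when its qualifying predicate is true."""
    return target if qualifying_predicate else fall_through


# Hypothetical branch register file: the indirect target is placed in a
# branch register before the branch itself executes.
branch_regs = {"b6": 0x4000}
FALL_THROUGH = 0x1010  # hypothetical address of the next sequential bundle

p_eq, p_ne = compare_eq(3, 3)
taken_ip = predicated_branch(p_eq, branch_regs["b6"], FALL_THROUGH)
not_taken_ip = predicated_branch(p_ne, branch_regs["b6"], FALL_THROUGH)
```

The condition is evaluated once by the compare. The branch only consumes the resulting predicate, which is the separation the text describes.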
2.7.2 Loops and Software Pipelining

Compilers sometimes try to improve the performance of loops by using unrolling.
However, unrolling is not effective on all loops for the following reasons:
• Unrolling may not fully exploit the parallelism available.
• Unrolling is tailored for a statically defined number of loop iterations.
• Unrolling can increase code size.
To maintain the advantages of loop unrolling while overcoming these limitations, the
Itanium architecture provides architectural support for software pipelining. Software
pipelining enables the compiler to interleave the execution of several loop iterations
without having to unroll a loop. Software pipelining is performed using:
• Loop-branch instructions.
• LC and EC application registers.
• Rotating registers and loop stage predicates.
• Branch hints that can assign a special prediction mechanism to important branches.
In addition to software pipelined while and counted loops, the architecture provides
particular support for simple counted loops using the br.cloop instruction. The cloop
branch instruction uses the 64-bit Loop Count (LC) application register rather than a
qualifying predicate to determine the branch exit condition.
For a complete discussion of software pipelining support, see Chapter 5, "Software
Pipelining and Loop Support."
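
The counted-loop branch can be sketched in Python as follows. This is a simplified model under stated assumptions, not the architectural definition: it assumes the branch is taken and LC is decremented while LC is nonzero, and that the loop falls through once LC reaches zero. The name `run_counted_loop` is hypothetical.

```python
def run_counted_loop(lc, body):
    """Run `body`, closing each iteration with a modeled counted-loop branch.

    `lc` stands in for the Loop Count (LC) application register.
    Returns the final LC value.
    """
    while True:
        body()
        # Modeled counted-loop branch: the exit test reads LC, not a predicate.
        if lc != 0:
            lc -= 1   # branch taken: go around again
        else:
            break     # LC == 0: fall through, loop exits
    return lc


iterations = []
final_lc = run_counted_loop(3, lambda: iterations.append(len(iterations)))
```

Under this model a loop entered with LC = n runs its body n + 1 times, so code that wants N iterations would initialize LC to N - 1.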
2.7.3 Rotating Registers

Rotating registers enable succinct implementation of software pipelining with
predication. Rotating registers are rotated by one register position each time one of
the special loop branches is executed. Thus, after one rotation, the content of register
X will be found in register X+1 and the value of the highest numbered rotating register
will be found in the lowest numbered rotating register.
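
The visible effect of a single rotation can be illustrated with a short Python sketch. It models only what software observes (the value in register X appears in register X+1, with wraparound from the highest to the lowest rotating register). It copies values for clarity and makes no claim about how hardware carries out the rotation. The helper name `rotate` and the sample values are hypothetical.

```python
def rotate(regs):
    """Return the rotating-register contents after one rotation.

    Index i of the list models rotating register i: the value at index i
    moves to index i + 1, and the last value wraps around to index 0.
    """
    if not regs:
        return []
    return [regs[-1]] + regs[:-1]


rotating = ["a", "b", "c", "d"]
after_one = rotate(rotating)

# Rotating once per register returns every value to its starting position.
after_full_cycle = rotating
for _ in range(len(rotating)):
    after_full_cycle = rotate(after_full_cycle)
```

In a software-pipelined loop this means a value written by one stage in one iteration is seen under the next register name in the following iteration, without an explicit move.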
