Intel ITANIUM ARCHITECTURE - SOFTWARE DEVELOPERS MANUAL VOLUME 1 REV 2.3 Manual page 177

The process of predicating instructions in conditional blocks and removing branches is
referred to as if-conversion. Once if-conversion has been performed, instructions can
be scheduled more freely because there are fewer branches to limit code motion, and
there are fewer branches competing for issue slots.
In addition to removing branches, this transformation will make dynamic instruction
fetching more efficient since there are fewer possibilities for control flow changes.
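As a source-level illustration of the transformation described above, the following C sketch contrasts a branching form with an if-converted form. The function names clamp_branchy and clamp_if_converted and the Boolean p1 are hypothetical and chosen only for this example. The sketch models a predicate as an ordinary Boolean value; whether a particular C compiler emits a branch for either function is not guaranteed.

```c
#include <stdbool.h>

/* Branching form: the assignment is control-dependent on the test. */
int clamp_branchy(int x, int limit)
{
    if (x > limit)
        x = limit;
    return x;
}

/* If-converted form: the test is evaluated into a predicate value,
 * and the assignment is guarded by that predicate instead of by a branch.
 * p1 stands in for a predicate register (an assumption of this sketch). */
int clamp_if_converted(int x, int limit)
{
    bool p1 = (x > limit);   /* compare sets the predicate */
    x = p1 ? limit : x;      /* models "(p1) x = limit": no effect when p1 is false */
    return x;
}
```

Both functions return the same result for every input; the difference is that the second expresses the condition as a data dependence on p1 rather than as a change in control flow.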
Under more complex circumstances, several branches can be removed. The following C
code sequence:

    if (r1)
        r2 = r3 + r4;
    else
        r7 = r6 - r5;

can be rewritten in Itanium architecture-based assembly code without branches as:

         cmp.ne p1,p2 = r1,0;;
    (p1) add r2 = r3,r4
    (p2) sub r7 = r6,r5

Since instructions from opposite sides of the conditional are predicated with
complementary predicates they are guaranteed not to conflict, hence the compiler has
more freedom when scheduling to make the best use of hardware resources. The
compiler could also try to schedule these statements with earlier or later code since
several branches and labels have been removed as part of if-conversion.
Since the branches have been removed, no branch misprediction is possible and there
will be no pipeline bubbles due to taken branches. Such effects are significant in many
large applications, and these transformations can greatly reduce branch-induced stalls
or flushes in the pipeline.
Thus, comparing the cost of the code above with the non-predicated version above
shows that:
• Non-predicated code consumes: 2 cycles + (30% * 10 cycles) = 5 cycles.
• Predicated code consumes: 2 cycles.
In this case, predication saves an average of three cycles.

4.2.3.2 Off-path Predication
If a compiler has dynamic profile information, it is possible to form an instruction
schedule based on the control flow path that is most likely to execute; this path is
called the main trace. In some cases, execution paths not on the main trace are still
executed frequently, and thus it may be beneficial to use predication to minimize their
critical paths as well.
The main trace of a flow graph is highlighted in Figure 4-1. Although blocks A and B are
not on the main trace, suppose they are executed a significant number of times.
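To make the notion of a main trace concrete, the following C sketch walks a profiled control-flow graph from an entry block and always follows the outgoing edge with the highest execution count. This is a simple greedy heuristic offered as an assumption, not the method used by any particular compiler, and it does not reproduce the flow graph of Figure 4-1. The struct block layout, the MAX_SUCCS limit, and the function name main_trace are all hypothetical.

```c
#include <stddef.h>

#define MAX_SUCCS 2  /* assumption: at most two successors per block */

/* Hypothetical basic-block record used only for this sketch. */
struct block {
    int succ[MAX_SUCCS];             /* successor block indices, -1 if unused */
    unsigned long count[MAX_SUCCS];  /* profiled execution count of each edge */
};

/* Greedily follow the most frequently executed successor edge starting
 * at `entry`. Visited block indices are written to `trace` (capacity
 * `cap`). The walk stops at a block with no successors, when the trace
 * is full, or when it would revisit a block already on the trace.
 * Returns the number of blocks placed on the trace. */
size_t main_trace(const struct block *cfg, size_t nblocks,
                  int entry, int *trace, size_t cap)
{
    size_t len = 0;
    int cur = entry;

    while (cur >= 0 && (size_t)cur < nblocks && len < cap) {
        /* Stop if the walk has looped back onto the trace. */
        for (size_t i = 0; i < len; i++)
            if (trace[i] == cur)
                return len;

        trace[len++] = cur;

        /* Pick the outgoing edge with the highest profiled count. */
        int best = -1;
        unsigned long best_count = 0;
        for (int s = 0; s < MAX_SUCCS; s++) {
            int next = cfg[cur].succ[s];
            if (next >= 0 && (best < 0 || cfg[cur].count[s] > best_count)) {
                best = next;
                best_count = cfg[cur].count[s];
            }
        }
        cur = best;
    }
    return len;
}
```

Blocks that the walk does not select are off the main trace. If the profile shows that such blocks still execute often, they are the candidates for the off-path predication this section describes.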
Volume 1, Part 2: Predication, Control Flow, and Instruction Stream
