Intel Itanium Architecture Software Developer's Manual, Volume 1, Revision 2.3

The Itanium architecture allows multiple instructions to target the same register in the
same clock provided that only one of the instructions writing the target register is
predicated true in that clock. Similar capabilities exist for writing predicate registers, as
discussed in Section 4.3.1.
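
The rule above can be illustrated with a small Python sketch. It is a toy model rather than a description of the hardware: `check_group` and the (predicate value, target register) tuple layout are hypothetical names introduced here for illustration only.

```python
from collections import Counter


def check_group(writes):
    """Return the registers written by more than one true-predicated instruction.

    `writes` is a list of (predicate_value, target_register) pairs that stand
    in for the instructions issued in one clock (hypothetical layout).
    """
    active = Counter(target for pred, target in writes if pred)
    return {reg for reg, count in active.items() if count > 1}


# p1 and p2 are complementary, so only one writer of r1 is predicated true.
p1 = True
p2 = not p1
ok_group = [(p1, "r1"), (p2, "r1")]

# Both writers are predicated true: this violates the rule.
bad_group = [(True, "r1"), (True, "r1")]

conflicts_ok = check_group(ok_group)
conflicts_bad = check_group(bad_group)
```

In this model an empty result means the group obeys the one-true-writer condition; a non-empty result names the registers that would have conflicting writers.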

4.3.3.2 Reducing Register Usage

In some instances it is possible to use the same register for two separate computations
in the presence of predication. This technique is similar to the technique for allowing
multiple writers to store a value into the same register, although it is a register
allocation optimization rather than a critical path issue.
After if-conversion, it is particularly common for sequences of instructions to be
predicated with complementary predicates. The contrived sequence below shows
instructions predicated by p1 and p2, which are known by the compiler to be
complementary:
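    (p1) add  r1=r2,r3
    (p2) sub  r5=r4,r56
    (p1) ld8  r7=[r2]
    (p2) ld8  r9=[r6];;
    (p1) a use of r1
    (p2) a use of r5
    (p1) a use of r7
    (p2) a use of r9
Assuming registers r1, r5, r7, and r9 are used for compiler temporaries, each of which
is live only until its next use, the preceding code segment can be rewritten as:
    (p1) add  r1=r2,r3
    (p2) sub  r1=r4,r56     // Reuse r1
    (p1) ld8  r7=[r2]
    (p2) ld8  r7=[r6];;     // Reuse r7
    (p1) a use of r1
    (p2) a use of r1
    (p1) a use of r7
    (p2) a use of r7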
The new sequence uses two fewer registers. With the 128 registers defined in the
architecture, this may not seem essential, but reducing register use can still reduce
program and register stack engine spills and fills that can be common in codes with
high instruction-level parallelism.
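
As a rough check of the register-reuse idea, the following Python sketch simulates both instruction sequences under complementary predicates and compares what each path observes. It is a simplified model under stated assumptions: the interpreter `run`, the helper `observed`, the tuple encoding, and the initial register and memory values are all hypothetical and chosen only for illustration.

```python
def run(program, regs, mem, preds):
    """Execute (predicate, op, dest, src_a, src_b) tuples in order."""
    regs = dict(regs)
    for pred, op, dest, a, b in program:
        if not preds[pred]:
            continue  # predicated off: the destination is left unchanged
        if op == "add":
            regs[dest] = regs[a] + regs[b]
        elif op == "sub":
            regs[dest] = regs[a] - regs[b]
        elif op == "ld8":
            regs[dest] = mem[regs[a]]
    return regs


def observed(regs, uses, preds):
    """Values read by the 'use' instructions whose predicate is true."""
    return [regs[reg] for pred, reg in uses if preds[pred]]


original = [("p1", "add", "r1", "r2", "r3"), ("p2", "sub", "r5", "r4", "r56"),
            ("p1", "ld8", "r7", "r2", None), ("p2", "ld8", "r9", "r6", None)]
original_uses = [("p1", "r1"), ("p2", "r5"), ("p1", "r7"), ("p2", "r9")]

rewritten = [("p1", "add", "r1", "r2", "r3"), ("p2", "sub", "r1", "r4", "r56"),
             ("p1", "ld8", "r7", "r2", None), ("p2", "ld8", "r7", "r6", None)]
rewritten_uses = [("p1", "r1"), ("p2", "r1"), ("p1", "r7"), ("p2", "r7")]

# Arbitrary starting state (assumed values, not taken from the manual).
regs0 = {"r2": 8, "r3": 5, "r4": 40, "r56": 2, "r6": 16}
mem0 = {8: 111, 16: 222}


def same_behavior():
    """True if both sequences observe identical values on either path."""
    for p1 in (True, False):
        preds = {"p1": p1, "p2": not p1}  # complementary predicates
        a = observed(run(original, regs0, mem0, preds), original_uses, preds)
        b = observed(run(rewritten, regs0, mem0, preds), rewritten_uses, preds)
        if a != b:
            return False
    return True


temps_original = {dest for _, _, dest, _, _ in original}
temps_rewritten = {dest for _, _, dest, _, _ in rewritten}
```

In this model `temps_original` holds four temporaries and `temps_rewritten` holds two, while `same_behavior()` confirms that each path still reads the values it produced. The equivalence relies on p1 and p2 never being true together, which is the property the compiler must know.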

4.3.4 Improving Instruction Stream Fetching

Instructions flow through the pipeline most efficiently when they are executed in large
blocks with no taken branches. Whenever the instruction pointer needs to be changed,
the hardware may have to insert bubbles into the pipeline either while the target
prediction is taking place or because the target address is not computed until later in
the pipeline.
Volume 1, Part 2: Predication, Control Flow, and Instruction Stream