Intel ITANIUM ARCHITECTURE - SOFTWARE DEVELOPERS MANUAL VOLUME 1 REV 2.3 Manual page 203

Volume 1, Part 2: Software Pipelining and Loop Support    1:192

Table 5-2. wtop Loop Trace

         Port/Instructions                       State before br.wtop
Cycle    M       I      I      M      B          p16    p17    p18    EC
3        ld4.s   cmp    chk    st4    br.wtop    0      1      1      1
...      ...     ...    ...    ...    ...        ...    ...    ...    ...
100      ld4.s   cmp    chk    st4    br.wtop    0      1      1      1
...      ...     ...    ...    ...    ...        ...    ...    ...    ...
199      ld4.s   cmp    chk    st4    br.wtop    0      1      1      1
200      ld4.s   cmp    chk    st4    br.wtop    0      1      1      1
201      ld4.s   cmp    chk    st4    br.wtop    0      0      1      1
                                                 0      0      0      0

The executions of br.wtop in the first two cycles of the prolog do not correspond to any
of the source iterations. Their purpose is simply to continue the kernel loop until the
first valid loop condition can be produced. In cycle one, the branch predicate p17 is
one. For this programming scheme, the branch predicate of the br.wtop is always a
one during the last speculative stage of the first source iteration. During all the prior
stages, the branch predicate is zero. If the branch predicate is zero, the br.wtop
continues the kernel loop only if EC is greater than one. It also decrements EC. Thus EC
must be initialized to (# epilog stages + # speculative pipeline stages). In the above
example, this is 0 + 2 = 2.

In cycle 201, the compare for the 200th source iteration is executed. Since this is the
final source iteration, the result of the compare is a zero and p17 is unmodified. The
zero that was rotated into p17 from p16 causes the br.wtop to fall through to the loop
exit. EC is decremented and the registers are rotated one last time.

In the above example, there are no epilog stages. As soon as the branch predicate
becomes zero, the kernel loop is exited.

5.5.2 Loops with Predicated Instructions

Instructions that already have predicates in the source loop are not assigned stage
predicates. They continue to be controlled by compare instructions in the loop body. For
example, the following loop contains predicated instructions:

L1:     ldfs    f4 = [r5],4
        ldfs    f9 = [r8],4;;
        fcmp.ge.unc p1,p2 = f4,f9;;
(p1)    stfs    [r9] = f4, 4
(p2)    stfs    [r9] = f9, 4
        br.cloop L1 ;;
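
As a reading aid, the effect of the predicated loop body in Section 5.5.2 can be sketched in Python. This is only an illustrative model, not code from the manual: the function name select_larger and the use of Python lists in place of the memory streams addressed by r5, r8, and r9 are assumptions. The trip count is taken from the length of the inputs because the loop counter setup is not shown with the example.

```python
def select_larger(a, b):
    """Model one pass of the predicated loop over two input streams.

    a and b stand in for the values loaded through r5 and r8;
    the returned list stands in for the values stored through r9.
    """
    out = []
    for x, y in zip(a, b):
        p1 = x >= y        # fcmp.ge.unc p1,p2 = f4,f9
        p2 = not p1        # p2 is the complement of p1
        if p1:
            out.append(x)  # (p1) stfs [r9] = f4, 4
        if p2:
            out.append(y)  # (p2) stfs [r9] = f9, 4
    return out


result = select_larger([1.0, 5.0, 2.0], [3.0, 4.0, 2.0])
```

Because p1 and p2 are complementary, exactly one of the two stores takes effect in each iteration, so the output stream advances by one element per source iteration.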

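Returning to the while loop traced in Table 5-2, the br.wtop behavior described there can be checked with a small predicate-level simulation. This is a minimal sketch under stated assumptions, not the manual's code: the name wtop_trace is hypothetical, cycles are numbered from zero, p16 is assumed to start at one with all other rotating predicates zero, and the compare is assumed to be guarded by p18 and to write p17, as the trace suggests. Loads, checks, and stores are not modeled.

```python
def wtop_trace(n_iterations, ec=2):
    """Trace p16, p17, p18 and EC before each br.wtop of the kernel loop.

    Returns (rows, final) where each row is (cycle, p16, p17, p18, ec)
    and final is (p16, p17, p18, ec) after the loop exits.
    """
    rot = [1] + [0] * 47        # rotating predicates p16..p63 (p16 = 1 assumed)
    rows = []
    cycle = 0
    compares_done = 0
    while True:
        if rot[2]:              # stage guarded by p18: one source compare
            compares_done += 1
            if compares_done < n_iterations:
                rot[1] = 1      # loop condition true: compare sets p17
            # final source iteration: p17 is left as rotated in from p16
        rows.append((cycle, rot[0], rot[1], rot[2], ec))

        qp = rot[1]                        # br.wtop is predicated on p17
        taken = bool(qp) or ec > 1         # continue the kernel loop?
        rotate = bool(qp) or ec >= 1       # no rotation once EC is zero
        if not qp and ec >= 1:
            ec -= 1                        # EC decrements only when qp is zero
        if rotate:
            rot = [0] + rot[:-1]           # PR[63] = 0, then rotate by one
        if not taken:
            return rows, (rot[0], rot[1], rot[2], ec)
        cycle += 1


rows, final = wtop_trace(200)
```

With 200 source iterations and EC initialized to 2, the rows for cycles 3 through 200 show p16, p17, p18, EC as 0, 1, 1, 1, the row for cycle 201 shows 0, 0, 1, 1, and the state after the exit is all zeros, in agreement with the table.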