Dependency Graph Of Floating-Point Dot Product - Texas Instruments TMS320C6000 Programmer's Manual

6.3.4.2 Floating-Point Dot Product
Figure 6–2. Dependency Graph of Floating-Point Dot Product
[Figure legend labels: instruction mnemonic; number of cycles required to complete an instruction.]
The dependency graph for this dot product algorithm has two separate parts
because the decrement of the loop counter and the branch do not read or write
any variables from the other part.
The SUB instruction writes to the loop counter, cntr. The output of the SUB
instruction feeds back and creates a loop carry path.
The branch (B) instruction is a child of the loop counter.
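The two-part structure can be checked with a minimal sketch. It assumes the graph contains only the parent/child relationships named in this section; the node labels (LDW_ai, LDW_bi, and so on) are illustrative names, not assembler syntax.

```python
from collections import defaultdict

# (parent, child) edges as described in this section.
# Node labels are illustrative, not assembler syntax.
EDGES = [
    ("LDW_ai", "MPYSP"),
    ("LDW_bi", "MPYSP"),
    ("MPYSP", "ADDSP"),
    ("ADDSP", "ADDSP"),  # loop carry path through sum
    ("SUB", "SUB"),      # loop carry path through cntr
    ("SUB", "B"),
]


def connected_parts(edges):
    """Group nodes into separate parts, ignoring edge direction."""
    neighbors = defaultdict(set)
    for parent, child in edges:
        neighbors[parent].add(child)
        neighbors[child].add(parent)
    seen, parts = set(), []
    for start in sorted(neighbors):
        if start in seen:
            continue
        stack, part = [start], set()
        while stack:
            node = stack.pop()
            if node in part:
                continue
            part.add(node)
            stack.extend(neighbors[node] - part)
        seen |= part
        parts.append(sorted(part))
    return parts


def loop_carried(edges):
    """Return instructions whose output feeds back into themselves."""
    return sorted(parent for parent, child in edges if parent == child)


PARTS = connected_parts(EDGES)
print(PARTS)                # two parts: arithmetic chain, loop control
print(loop_carried(EDGES))  # instructions that lie on a loop carry path
```

Because SUB and B share no variable with the loads, the multiply, or the add, the search never crosses from one part to the other.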
Similarly, Figure 6–2 shows the dependency graph for the floating-point dot
product assembly instructions shown in Example 6–8 and their corresponding
register allocations.
[Figure 6–2 labels: LDW writes ai (A2); functional unit .M1; path labels 5 and 4. Legend label: variable being written.]
The two LDW instructions, which write the values of ai and bi, are parents
of the MPYSP instruction. It takes five cycles for the parent (LDW)
instruction to complete. Therefore, if LDW is scheduled on cycle i, then
its child (MPYSP) cannot be scheduled until cycle i + 5.
The MPYSP instruction, which writes the product pi, is the parent of the
ADDSP instruction. The MPYSP instruction takes four cycles to complete.
The ADDSP instruction adds pi (the result of the MPYSP) to sum. The
output of the ADDSP instruction feeds back to become an input on the next
iteration and, thus, creates a loop carry path. (See section 6.7 on page
6-77 for more information on loop carry paths.)
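The scheduling rule above (a child cannot start until its parent completes) can be sketched as a small earliest-start calculation. This is only an illustration for a single iteration: it uses the LDW and MPYSP cycle counts stated in the text, places both loads on cycle 0 as an assumption, and ignores the loop carry path and functional-unit conflicts. The node labels are illustrative names.

```python
# Cycles each parent needs before its child may be scheduled (from the text).
LATENCY = {"LDW_ai": 5, "LDW_bi": 5, "MPYSP": 4}

# Parents of each instruction within one iteration (loop carry path omitted).
PARENTS = {
    "LDW_ai": [],
    "LDW_bi": [],
    "MPYSP": ["LDW_ai", "LDW_bi"],
    "ADDSP": ["MPYSP"],
}


def earliest_cycles(parents, latency):
    """Earliest cycle for each instruction, with parentless nodes on cycle 0."""
    cycles = {}

    def visit(node):
        if node not in cycles:
            cycles[node] = max(
                (visit(p) + latency[p] for p in parents[node]), default=0
            )
        return cycles[node]

    for node in parents:
        visit(node)
    return cycles


SCHEDULE = earliest_cycles(PARENTS, LATENCY)
print(SCHEDULE)  # {'LDW_ai': 0, 'LDW_bi': 0, 'MPYSP': 5, 'ADDSP': 9}
```

With the loads on cycle i = 0, MPYSP lands on cycle i + 5 and ADDSP four cycles after that, matching the constraints described in the text.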
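[Figure 6–2 labels: LDW writes bi (A5); functional unit .D1 (shown twice); MPYSP writes pi (A6); ADDSP writes sum (A7) on .L1; path labels 5 and 4. Legend label: register allocation.]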
Optimizing Assembly Code via Linear Assembly
Writing Parallel Code
[Figure 6–2 labels: SUB writes cntr (A1) on .S1; B LOOP on .S1; path labels 1 and 1. Legend label: functional unit.]