Software Pipelining
Example 6–31. Assembly Code for Floating-Point Dot Product (Software Pipelined With
Removal of Prolog and Epilog) (Continued)
      [A1]  B       .S2     LOOP            ;*** branch to loop
||    [A1]  SUB     .S1     A1,1,A1         ;**** decrement loop counter

      [A1]  B       .S2     LOOP            ;**** branch to loop
||    [A1]  SUB     .S1     A1,1,A1         ;***** decrement loop counter

LOOP:
            LDDW    .D1     *A4++,A7:A6     ;********* load ai & ai + 1 from memory
||          LDDW    .D2     *B4++,B7:B6     ;********* load bi & bi + 1 from memory
||          MPYSP   .M1X    A6,B6,A5        ;**** pi = a0 * b0
||          MPYSP   .M2X    A7,B7,B5        ;**** pi1 = a1 * b1
||          ADDSP   .L1     A5,A8,A8        ; sum0 += (ai * bi)
||          ADDSP   .L2     B5,B8,B8        ; sum1 += (ai+1 * bi+1)
||    [A1]  B       .S2     LOOP            ;***** branch to loop
||    [A1]  SUB     .S1     A1,1,A1         ;****** decrement loop counter
            ; Branch occurs here

            ADDSP   .L1X    A8,B8,A0        ; sum(0) = sum0(0) + sum1(0)
            ADDSP   .L2X    A8,B8,B0        ; sum(1) = sum0(1) + sum1(1)
            ADDSP   .L1X    A8,B8,A0        ; sum(2) = sum0(2) + sum1(2)
            ADDSP   .L2X    A8,B8,B0        ; sum(3) = sum0(3) + sum1(3)
            NOP                             ; wait for B0
            ADDSP   .L1X    A0,B0,A5        ; sum(01) = sum(0) + sum(1)
            NOP                             ; wait for next B0
            ADDSP   .L2X    A0,B0,B5        ; sum(23) = sum(2) + sum(3)
            NOP     3
            ADDSP   .L1X    A5,B5,A4        ; sum = sum(01) + sum(23)
            NOP     3                       ;
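The reduction order in the listing above can be modeled in C. The following is a minimal sketch, not code from this guide: the function name dotp_model and the round-robin assignment of loop iterations to the four partial-sum slots are assumptions for illustration. It also assumes n is even and ignores the extra loads that the pipelined loop issues once the prolog and epilog are removed. The four slots per accumulator reflect the four partial sums, sum0(0) through sum0(3) and sum1(0) through sum1(3), that the comments in the listing name; they arise because a new ADDSP is issued into each accumulator every cycle while an earlier result is still in flight.

```c
#include <stddef.h>

/* Hypothetical C model of the summation order in the listing.
 * Assumptions: n is even; iteration i of the loop accumulates into
 * slot (i mod 4) of each accumulator. The real slot assignment
 * depends on pipeline timing and may differ. */
float dotp_model(const float *a, const float *b, size_t n)
{
    float sum0[4] = {0.0f, 0.0f, 0.0f, 0.0f};  /* models A8 over four cycles */
    float sum1[4] = {0.0f, 0.0f, 0.0f, 0.0f};  /* models B8 over four cycles */
    float s[4];
    float s01, s23;
    size_t i, iter, k;

    /* Loop kernel: each iteration consumes one pair from a and one from b. */
    for (i = 0, iter = 0; i + 1 < n; i += 2, iter++) {
        k = iter % 4;
        sum0[k] += a[i] * b[i];          /* sum0 += (ai * bi)     */
        sum1[k] += a[i + 1] * b[i + 1];  /* sum1 += (ai+1 * bi+1) */
    }

    /* Final reduction, in the same order as the ADDSPs after the loop. */
    for (k = 0; k < 4; k++)
        s[k] = sum0[k] + sum1[k];        /* sum(k) = sum0(k) + sum1(k) */
    s01 = s[0] + s[1];                   /* sum(01) = sum(0) + sum(1)  */
    s23 = s[2] + s[3];                   /* sum(23) = sum(2) + sum(3)  */
    return s01 + s23;                    /* sum = sum(01) + sum(23)    */
}
```

Because floating-point addition is not associative, this ordering can produce a result that differs in the last bits from a simple left-to-right sum of the same products.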