6.4.5.2 Floating-Point Dot Product
Example 6–20 uses LDDW instructions instead of LDW instructions.

Example 6–20. Assembly Code for Floating-Point Dot Product With LDDW (Before Software Pipelining)

        MVK    .S1    50,A1        ; set up loop counter
||      ZERO   .L1    A7           ; zero out sum0 accumulator
||      ZERO   .L2    B7           ; zero out sum1 accumulator

LOOP:
        LDDW   .D1    *A4++,A2     ; load ai & ai+1 from memory
||      LDDW   .D2    *B4++,B2     ; load bi & bi+1 from memory
        SUB    .S1    A1,1,A1      ; decrement loop counter
        NOP           2
 [A1]   B      .S1    LOOP         ; branch to loop
        MPYSP  .M1X   A2,B2,A6     ; ai * bi
||      MPYSP  .M2X   A3,B3,B6     ; ai+1 * bi+1
        NOP           3
        ADDSP  .L1    A6,A7,A7     ; sum0 += (ai * bi)
||      ADDSP  .L2    B6,B7,B7     ; sum1 += (ai+1 * bi+1)
                                   ; Branch occurs here
        NOP           3
        ADDSP  .L1X   A7,B7,A4     ; sum = sum0 + sum1
        NOP           3
The code in Example 6–20 includes the following optimizations:
- The setup code for the loop is included to initialize the loop counter
  and to clear the accumulators. The setup code assumes that A4 and B4
  have already been initialized to point to arrays a and b, respectively.
- The MVK instruction initializes the loop counter.
- The two ZERO instructions, which execute in parallel, initialize the even
  and odd accumulators (sum0 and sum1) to 0.
- The third ADDSP instruction adds the even and odd accumulators.
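
For readers who find the register-level listing hard to follow, the computation can be sketched in C. This is only an illustrative rendering of what Example 6–20 calculates, not code from this section: the function name dotp_two_acc, its signature, and the requirement that n be even are assumptions made for the sketch. Each pass of the loop consumes two elements from each array, mirroring one pair of LDDW loads.

```c
#include <assert.h>

/* Hypothetical C sketch of the loop in Example 6-20.
   Name and signature are illustrative assumptions. */
float dotp_two_acc(const float *a, const float *b, int n)
{
    float sum0 = 0.0f;   /* even accumulator (A7 in the listing) */
    float sum1 = 0.0f;   /* odd accumulator (B7 in the listing)  */
    int i;

    /* Assumption: n is even, because each pass handles two elements. */
    assert(n % 2 == 0);

    for (i = 0; i < n; i += 2) {
        sum0 += a[i] * b[i];           /* ai * bi       */
        sum1 += a[i + 1] * b[i + 1];   /* ai+1 * bi+1   */
    }

    /* Corresponds to the third ADDSP: sum = sum0 + sum1 */
    return sum0 + sum1;
}
```

With the loop counter of 50 used in the listing, the equivalent call would pass n = 100. Because the products are accumulated in two separate sums, the additions occur in a different order than in a single-accumulator loop, so the floating-point result may differ slightly from that form.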