Writing Parallel Code
6.3.2 Translating C Code to Linear Assembly
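The dot product C code that this section translates is not reproduced on this page. As a rough guide, the sketch below shows one way the two inner loops might look in C, with each statement annotated with the instruction from Examples 6–7 and 6–8 that it roughly corresponds to. The function names dotp_fixed and dotp_float, the parameter names, and the assert precondition are illustrative assumptions, not taken from the manual; the register-to-variable mapping is inferred from the operands in the listings.

```c
#include <assert.h>

/* Hypothetical fixed-point dot product: 16-bit inputs, 32-bit sum.
 * Assumed mapping: a -> A4, b -> A3, count -> A1, sum -> A7. */
int dotp_fixed(const short *a, const short *b, int count)
{
    int sum = 0;

    assert(count >= 0);            /* assumed precondition: a negative count never reaches 0 */
    while (count != 0) {           /* [A1] B  LOOP      branch while the counter is nonzero  */
        short ai = *a++;           /* LDH  *A4++,A2     load ai, postincrement the pointer   */
        short bi = *b++;           /* LDH  *A3++,A5     load bi, postincrement the pointer   */
        int prod = ai * bi;        /* MPY  A2,A5,A6     ai * bi                              */
        sum += prod;               /* ADD  A6,A7,A7     sum += (ai * bi)                     */
        count -= 1;                /* SUB  A1,1,A1      decrement loop counter               */
    }
    return sum;
}

/* Hypothetical floating-point dot product: 32-bit single-precision inputs and sum.
 * Same assumed register mapping as above. */
float dotp_float(const float *a, const float *b, int count)
{
    float sum = 0.0f;

    assert(count >= 0);            /* assumed precondition, as above                         */
    while (count != 0) {           /* [A1] B  LOOP                                           */
        float ai = *a++;           /* LDW   *A4++,A2    load ai, postincrement the pointer   */
        float bi = *b++;           /* LDW   *A3++,A5    load bi, postincrement the pointer   */
        float prod = ai * bi;      /* MPYSP A2,A5,A6    ai * bi                              */
        sum += prod;               /* ADDSP A6,A7,A7    sum += (ai * bi)                     */
        count -= 1;                /* SUB   A1,1,A1     decrement loop counter               */
    }
    return sum;
}
```

One difference is worth noting: the listings test the counter at the bottom of the loop, after the SUB, so they assume at least one iteration. The sketch tests at the top so that a count of 0 is handled safely.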
The first step in optimizing your code is to translate the C code to linear assembly.
6.3.2.1 Fixed-Point Dot Product
Example 6–7 shows the linear assembly instructions used for the inner loop of the fixed-point dot product C code.
Example 6–7. List of Assembly Instructions for Fixed-Point Dot Product
        LDH     .D1     *A4++,A2        ; load ai from memory
        LDH     .D1     *A3++,A5        ; load bi from memory
        MPY     .M1     A2,A5,A6        ; ai * bi
        ADD     .L1     A6,A7,A7        ; sum += (ai * bi)
        SUB     .S1     A1,1,A1         ; decrement loop counter
[A1]    B       .S2     LOOP            ; branch to loop
The load halfword (LDH) instructions step through the a and b arrays. Each LDH postincrements its pointer, so on every iteration the pointer advances to the next halfword (16 bits) in the array. The ADD instruction accumulates the results of the multiply (MPY) instruction into the running sum. The subtract (SUB) instruction decrements the loop counter.
An additional instruction is included to execute the branch back to the top of the loop. The branch (B) instruction is conditional on the loop counter, A1, and is taken only while A1 is nonzero.
6.3.2.2 Floating-Point Dot Product
Example 6–8 shows the linear assembly instructions used for the inner loop of the floating-point dot product C code.
Example 6–8. List of Assembly Instructions for Floating-Point Dot Product
        LDW     .D1     *A4++,A2        ; load ai from memory
        LDW     .D2     *A3++,A5        ; load bi from memory
        MPYSP†  .M1     A2,A5,A6        ; ai * bi
        ADDSP†  .L1     A6,A7,A7        ; sum += (ai * bi)
        SUB     .S1     A1,1,A1         ; decrement loop counter
[A1]    B       .S2     LOOP            ; branch to loop
† ADDSP and MPYSP are 'C67x (floating-point) instructions only.
The load word (LDW) instructions step through the a and b arrays. Each LDW postincrements its pointer, so on every iteration the pointer advances to the next word (32 bits) in the array. The ADDSP instruction