Vector Sum - Texas Instruments TMS320C6000 Programmer's Manual

Refining C/C++ Code
Example 3–31. Vector Sum
void vecsum(short *restrict a, const short *restrict b,
            const short *restrict c, int n)
{
    int i;
    #pragma MUST_ITERATE (20, , 2);
    for (i = 0; i < n; i++) a[i] = b[i] + c[i];
}
<compiler output for above code>
L2:        ; PIPED LOOP KERNEL
           ADD     .L1X    B7,A3,A3         ; |5|
|| [ B0]   B       .S1     L2               ; @|5|
||         LDH     .D1T1   *++A4(4),A3      ; @@|5|
||         LDH     .D2T2   *++B4(4),B7      ; @@|5|

   [!A1]   STH     .D1T1   A3,*++A0(4)      ; |5|
||         ADD     .L2X    B6,A5,B6         ; |5|
||         LDH     .D2T2   *+B4(2),B6       ; @@|5|

   [ A1]   SUB     .L1     A1,1,A1          ;
|| [!A1]   STH     .D2T2   B6,*++B5(4)      ; |5|
|| [ B0]   SUB     .L2     B0,1,B0          ; @@|5|
||         LDH     .D1T1   *+A4(2),A5       ; @@|5|

Example 3–31 shows how the compiler can perform simple loop unrolling by
replicating the loop body. The MUST_ITERATE pragma tells the compiler that
the loop executes at least 20 times and that the iteration count is a
multiple of 2. The compiler unrolls the loop once to take advantage of the
performance gain that results from the unrolling, which is why the kernel
contains two ADD and two STH instructions.

Note: When the interrupt threshold option is used, unrolling can be used
to regain lost performance. Refer to section 7.4.4, Getting the Most
Performance Out of Interruptible Code.

If the compiler does not automatically unroll the loop, you can suggest that the
compiler unroll the loop by using the UNROLL pragma. See the
TMS320C6000 Optimizing C/C++ Compiler User's Guide for more information.
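The unrolling described for Example 3–31 can be sketched in plain C. The code below is an illustration only, not compiler output: vecsum_ref, vecsum_unrolled2, vecsum_hinted, and vecsum_forms_agree are hypothetical names introduced here. The hand-unrolled form assumes the same promise that MUST_ITERATE (20, , 2) makes, so it needs no cleanup iteration. The hinted form assumes the UNROLL pragma takes the unroll factor as its argument; check the exact spelling against the Compiler User's Guide. The pragmas are guarded so that the sketch also builds with a non-TI compiler.

```c
#include <assert.h>

/* Reference form: the loop as written in Example 3-31, without pragmas. */
void vecsum_ref(short *restrict a, const short *restrict b,
                const short *restrict c, int n)
{
    int i;
    for (i = 0; i < n; i++)
        a[i] = b[i] + c[i];
}

/* Hand-unrolled by 2 (hypothetical name). The assert states the contract
 * that MUST_ITERATE (20, , 2) promises: n >= 20 and n is a multiple of 2.
 * Under that contract no leftover iteration has to be handled. */
void vecsum_unrolled2(short *restrict a, const short *restrict b,
                      const short *restrict c, int n)
{
    int i;
    assert(n >= 20 && n % 2 == 0);
    for (i = 0; i < n; i += 2) {
        a[i]     = b[i]     + c[i];        /* original iteration i     */
        a[i + 1] = b[i + 1] + c[i + 1];    /* original iteration i + 1 */
    }
}

/* Hinted form (hypothetical name): leave the loop as written and ask the
 * compiler to unroll it. The pragmas are only seen by a TI compiler;
 * the UNROLL spelling here is an assumption to verify in the User's Guide. */
void vecsum_hinted(short *restrict a, const short *restrict b,
                   const short *restrict c, int n)
{
    int i;
#ifdef __TI_COMPILER_VERSION__
    #pragma MUST_ITERATE (20, , 2);
    #pragma UNROLL (2)
#endif
    for (i = 0; i < n; i++)
        a[i] = b[i] + c[i];
}

/* Returns 1 if all three forms give identical results for n elements,
 * and 0 if they differ or n is outside what this sketch supports. */
int vecsum_forms_agree(int n)
{
    enum { MAX_N = 64 };
    short b[MAX_N], c[MAX_N], r[MAX_N], u[MAX_N], h[MAX_N];
    int i;

    if (n < 20 || n > MAX_N || n % 2 != 0)
        return 0;

    for (i = 0; i < n; i++) {
        b[i] = (short)(i * 3);
        c[i] = (short)(100 - i);
    }

    vecsum_ref(r, b, c, n);
    vecsum_unrolled2(u, b, c, n);
    vecsum_hinted(h, b, c, n);

    for (i = 0; i < n; i++)
        if (r[i] != u[i] || r[i] != h[i])
            return 0;
    return 1;
}
```

Calling vecsum_forms_agree(20) exercises the smallest trip count the pragma allows. An odd count such as 21 is rejected by the helper, because the hand-unrolled loop would then read and write one element past the end of the arrays.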
