Refining C/C++ Code
Example 3–20. Using _nassert() Intrinsic to Generate Word Accesses for FIR Filter
void fir(const short x[restrict], const short h[restrict], short y[restrict],
         int n, int m, int s)
{
    int i, j;
    long y0;
    long round = 1L << (s - 1);

    /* Tell the compiler that x, h, and y are word (4-byte) aligned. */
    _nassert(((int)x & 0x3) == 0);
    _nassert(((int)h & 0x3) == 0);
    _nassert(((int)y & 0x3) == 0);

    for (j = 0; j < m; j++)
    {
        y0 = round;

        #pragma MUST_ITERATE (40, 40);
        for (i = 0; i < n; i++)
            y0 += x[i + j] * h[i];

        y[j] = (int)(y0 >> s);
    }
}
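The arithmetic of the fir() routine in Example 3–20 can be checked on a host compiler before building for the target. The following is a minimal sketch, not part of the original example: ALIGN_ASSERT and fir_host are hypothetical names, and ALIGN_ASSERT is only a run-time stand-in for the _nassert() intrinsic (it checks the condition instead of passing it to the optimizer). The MUST_ITERATE pragma is left out because the sketch is meant to run with small trip counts.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the _nassert() intrinsic: on a host
 * compiler it checks the alignment condition at run time. */
#define ALIGN_ASSERT(cond) assert(cond)

/* Same computation as fir() in Example 3-20. The caller must supply
 * n + m - 1 input samples in x, n coefficients in h, and room for
 * m outputs in y. */
void fir_host(const short *restrict x, const short *restrict h,
              short *restrict y, int n, int m, int s)
{
    int i, j;
    long y0;
    long round = 1L << (s - 1);   /* rounding term added before the shift */

    /* uintptr_t is used instead of int so the cast is safe on 64-bit hosts. */
    ALIGN_ASSERT(((uintptr_t)x & 0x3) == 0);
    ALIGN_ASSERT(((uintptr_t)h & 0x3) == 0);
    ALIGN_ASSERT(((uintptr_t)y & 0x3) == 0);

    for (j = 0; j < m; j++)
    {
        y0 = round;
        for (i = 0; i < n; i++)
            y0 += x[i + j] * h[i];
        y[j] = (short)(y0 >> s);
    }
}
```

When calling fir_host, the arrays should be declared with 4-byte alignment (for example with _Alignas(4) in C11) so that the alignment checks hold, just as the target arrays must be word aligned for the _nassert() statements to be truthful.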
Example 3–21. Compiler Output From Example 3–20
L3:        ; PIPED LOOP KERNEL

   [!B0]   ADD     .L1     A9,A7:A6,A7:A6    ; |21|
||         MPY     .M2X    A3,B3,B2          ; |21|
||         MPYHL   .M1X    B3,A0,A0          ; |21|
|| [ A1]   B       .S2     L3                ; @|21|
||         LDH     .D2T2   *++B9(8),B3       ; @@|21|
||         LDH     .D1T1   *+A8(4),A3        ; @@|21|

   [!B0]   ADD     .L2     B3,B5:B4,B5:B4    ; |21|
||         MPY     .M1X    A0,B1,A9          ; @|21|
||         LDW     .D2T2   *+B8(4),B3        ; @@|21|
||         LDH     .D1T1   *+A8(6),A0        ; @@|21|

   [ B0]   SUB     .S2     B0,1,B0           ;
|| [!B0]   ADD     .L2     B2,B7:B6,B7:B6    ; |21|
|| [!B0]   ADD     .L1     A0,A5:A4,A5:A4    ; |21|
||         MPYHL   .M2     B1,B3,B3          ; @|21|
|| [ A1]   SUB     .S1     A1,1,A1           ; @@|21|
||         LDW     .D2T2   *++B8(8),B1       ; @@@|21|
||         LDH     .D1T1   *++A8(8),A0       ; @@@|21|

As the compiler output in Example 3–21 shows, the code generated for
Example 3–20 is not as efficient as the code produced in Example 3–13,
but it is more efficient than the code in Example 3–12.
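The kernel in Example 3–21 mixes word loads (LDW) with halfword loads (LDH), and pairs MPY with MPYHL to multiply different 16-bit halves of the loaded registers. The idea behind a word access can be sketched in portable C: one 32-bit load fetches two adjacent short values, and each half is then multiplied separately. This is only an illustration under stated assumptions, not the compiler's actual transformation: dot_word_access is a hypothetical helper, short is assumed to be 16 bits wide, and n is assumed to be even.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: dot product of two arrays of 16-bit values,
 * reading two elements per 32-bit load. Assumes short is 16 bits
 * wide and n is even. */
long dot_word_access(const short *a, const short *b, int n)
{
    long sum = 0;
    int i;

    for (i = 0; i < n; i += 2)
    {
        uint32_t wa, wb;
        int16_t a_lo, a_hi, b_lo, b_hi;

        /* One 32-bit load per array brings in elements i and i + 1. */
        memcpy(&wa, &a[i], sizeof wa);
        memcpy(&wb, &b[i], sizeof wb);

        /* Split each word into its two signed 16-bit halves. Which
         * element lands in which half depends on endianness, but both
         * words are split the same way, so the sum is unaffected.
         * The narrowing casts rely on the usual two's-complement
         * wraparound. */
        a_lo = (int16_t)(wa & 0xFFFFu);
        a_hi = (int16_t)(wa >> 16);
        b_lo = (int16_t)(wb & 0xFFFFu);
        b_hi = (int16_t)(wb >> 16);

        sum += (long)a_lo * b_lo;   /* low half times low half   */
        sum += (long)a_hi * b_hi;   /* high half times high half */
    }
    return sum;
}
```

The sketch uses memcpy so that it is valid for any alignment on a host machine. On the target, a single word load is only legal when the address is word aligned, which is the guarantee the _nassert() statements in Example 3–20 give the compiler.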