Example 6–78. Final Assembly Code for FIR Filter (Continued)

           ADD    .L2X  A9,B8,B11       ; sum1 += p17
||         ADD    .L1X  B11,A12,A12     ; sum0 += p06
||         MPY    .M1   A8,A10,A7       ;* p00 = h[i+0]*x[j+i+0]
||         MPYLH  .M2   B7,B9,B13       ;* p12 = h[i+2]*x[j+i+3]
||[A2]     SUB    .S1   A2,1,A2         ;* dec store lp cntr

           ADD    .L1X  B13,A12,A10     ; sum0 += p07
||[!A2]    SHR    .S2   B11,15,B11      ;* (Bsum1 >> 15)
||         MPY    .M2   B7,B9,B9        ;* p02 = h[i+2]*x[j+i+2]
||         MPYH   .M1   A8,A10,A10      ;* p01 = h[i+1]*x[j+i+1]
||[A2]     ADD    .L2   B4,B11,B4       ;* sum1(p10) = p10 + sum1
||         LDW    .D1   *A4++[2],B9     ;** x[j+i+2] & x[j+i+3]
||         LDW    .D2   *B1++[2],A10    ;** x[j+i+0] & x[j+i+1]

           ;Branch occurs here

  [!A2]    SHR    .S1   A10,15,A12      ; (Asum0 >> 15)

  [!A2]    STH    .D2   B11,*B6++[2]    ; y[j+1] = (Bsum1 >> 15)
||[!A2]    STH    .D1   A12,*A6++[2]    ; y[j] = (Asum0 >> 15)

The cycle count of this code is 1612: 50 (8 × 4 + 0) + 12. The overhead due to the outer loop has been completely eliminated.

6.14.9 Comparing Performance

Table 6–28. Comparison of FIR Filter Code

Code Example                                                            Cycles                     Cycle Count
Example 6–61  FIR with redundant load elimination                       50 (16 × 2 + 9 + 6) + 2    2352
Example 6–69  FIR with redundant load elimination and no memory hits    50 (8 × 4 + 10 + 6) + 2    2402
Example 6–71  FIR with redundant load elimination and no memory hits
              with outer loop software-pipelined                        50 (7 × 4 + 6 + 6) + 6     2006
Example 6–74  FIR with redundant load elimination and no memory hits
              with outer loop conditionally executed with inner loop    50 (8 × 4 + 0) + 12        1612
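The cycle counts in Table 6–28 can be checked by evaluating each Cycles expression directly. The following C fragment is a minimal sketch of that arithmetic only. The structure, field names, and the function total_cycles are illustrative assumptions and do not appear in the manual, and the terms inside the parentheses other than the leading product are simply summed into one field without interpreting what each one represents.

```c
#include <stddef.h>

/* One row of Table 6-28, split into the terms of its Cycles expression.
 * Field names are illustrative assumptions, not terms used by the manual. */
struct fir_cycles {
    const char *example;    /* example number as listed in the table       */
    int outer;              /* leading factor (50 in every row)            */
    int iters;              /* first factor inside the parentheses         */
    int cycles_per_iter;    /* second factor inside the parentheses        */
    int per_outer_extra;    /* remaining terms inside the parentheses      */
    int fixed;              /* term added outside the parentheses          */
};

/* Values copied from the Cycles column of Table 6-28. */
const struct fir_cycles table_6_28[] = {
    { "Example 6-61", 50, 16, 2,  9 + 6,  2 },
    { "Example 6-69", 50,  8, 4, 10 + 6,  2 },
    { "Example 6-71", 50,  7, 4,  6 + 6,  6 },
    { "Example 6-74", 50,  8, 4,  0,     12 },
};

const size_t table_6_28_rows = sizeof table_6_28 / sizeof table_6_28[0];

/* Evaluate outer * (iters * cycles_per_iter + per_outer_extra) + fixed. */
int total_cycles(const struct fir_cycles *row)
{
    return row->outer
         * (row->iters * row->cycles_per_iter + row->per_outer_extra)
         + row->fixed;
}
```

Calling total_cycles on each row in turn gives 2352, 2402, 2006, and 1612, which agrees with the Cycle Count column. The last row has no additional term inside the parentheses, which is the arithmetic form of the statement that the outer loop overhead has been eliminated.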