Example 6–69. Final Assembly Code for FIR Filter With Redundant Load Elimination and No Memory Hits
        MVK     .S1     50,A2           ; set up outer loop counter

        MVK     .S1     62,A3           ; used to rst x pointer outloop
||      MVK     .S2     64,B10          ; used to rst h pointer outloop

OUTLOOP:
        LDH     .D1     *A4++,B5        ; x0 = x[j]
||      ADD     .L2X    A4,4,B1         ; set up pointer to x[j+2]
||      ADD     .L1X    B4,2,A8         ; set up pointer to h[1]
||      MVK     .S2     8,B2            ; set up inner loop counter
|| [A2] SUB     .S1     A2,1,A2         ; decrement outer loop counter

        LDH     .D2     *B1++[2],B0     ; x2 = x[j+i+2]
||      LDH     .D1     *A4++[2],A0     ; x1 = x[j+i+1]
||      ZERO    .L1     A9              ; zero out sum0
||      ZERO    .L2     B9              ; zero out sum1

        LDH     .D1     *A8++[2],B6     ; h1 = h[i+1]
||      LDH     .D2     *B4++[2],A1     ; h0 = h[i]

        LDH     .D1     *A4++[2],A5     ; x3 = x[j+i+3]
||      LDH     .D2     *B1++[2],B5     ; x0 = x[j+i+4]

        LDH     .D2     *B4++[2],A7     ; h2 = h[i+2]
||      LDH     .D1     *A8++[2],B8     ; h3 = h[i+3]
|| [B2] SUB     .S2     B2,1,B2         ; decrement loop counter

        LDH     .D2     *B1++[2],B0     ;* x2 = x[j+i+2]
||      LDH     .D1     *A4++[2],A0     ;* x1 = x[j+i+1]

        LDH     .D1     *A8++[2],B6     ;* h1 = h[i+1]
||      LDH     .D2     *B4++[2],A1     ;* h0 = h[i]

        MPY     .M1X    B5,A1,A0        ; x0 * h0
||      MPY     .M2X    A0,B6,B6        ; x1 * h1
||      LDH     .D1     *A4++[2],A5     ;* x3 = x[j+i+3]
||      LDH     .D2     *B1++[2],B5     ;* x0 = x[j+i+4]

   [B2] B       .S1     LOOP            ; branch to loop
||      MPY     .M2     B0,B6,B7        ; x2 * h1
||      MPY     .M1     A0,A1,A1        ; x1 * h0
||      LDH     .D2     *B4++[2],A7     ;* h2 = h[i+2]
||      LDH     .D1     *A8++[2],B8     ;* h3 = h[i+3]
|| [B2] SUB     .S2     B2,1,B2         ;* decrement loop counter

        ADD     .L1     A0,A9,A9        ; sum0 += x0 * h0
||      MPY     .M2X    A5,B8,B8        ; x3 * h3
||      MPY     .M1X    B0,A7,A5        ; x2 * h2
||      LDH     .D2     *B1++[2],B0     ;** x2 = x[j+i+2]
||      LDH     .D1     *A4++[2],A0     ;** x1 = x[j+i+1]
Optimizing Assembly Code via Linear Assembly
Memory Banks
6-129