Hardware Loops
Loop Unrolling
Typical DSP algorithms are coded for speed rather than for small code
size. Loops are often unrolled so that the loop body executes only N-1
times, especially when data is fetched from circular buffers. The initial
data fetch is executed before the loop is entered, and the final
calculation is performed after the loop terminates. For example:
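#define N 1024
global_setup:
    /* Two circular buffers of N 16-bit words each (length N*2 bytes) */
    I0.H = 0xFF80; I0.L = 0x0000; B0 = I0; L0 = N*2 (Z);
    I1.H = 0xFF90; I1.L = 0x0000; B1 = I1; L1 = N*2 (Z);
    P5 = N-1 (Z);               /* loop count: N-1 iterations */
algorithm:
    /* Prologue: clear the accumulator and perform the first fetch */
    A0 = 0 || R0.H = W[I0++] || R1.L = W[I1++];
    LSETUP (lp,lp) LC0 = P5;    /* single-instruction hardware loop */
lp:
    /* Multiply-accumulate the previous operands while fetching the next */
    A0+= R0.H * R1.L || R0.H = W[I0++] || R1.L = W[I1++];
    /* Epilogue: final multiply-accumulate, no further fetch */
    A0+= R0.H * R1.L;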
This technique has the advantage that data is fetched exactly N times, so
the I-registers return to their initial values after processing: each
circular buffer is N*2 bytes long, and N 16-bit fetches wrap the index back
to its base. The "algorithm" sequence can therefore be executed multiple
times without initializing the DAG registers again.
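
The fetch count and the wrap-around can be checked with a small host-side model. The following C sketch is only an illustration, not Blackfin code: the circ_reader type and the functions circ_fetch, unrolled_dot, and run_twice are hypothetical names introduced here, N is reduced to 8, and the multiply is a plain integer product that ignores the processor's fractional arithmetic modes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define N 8  /* small length for illustration; the listing above uses 1024 */

/* Hypothetical model of one circular buffer addressed through I/B/L. */
typedef struct {
    const int16_t *base;   /* models B: start of the buffer */
    size_t index;          /* models I - B, counted in 16-bit elements */
    size_t length;         /* models L, counted in 16-bit elements */
    unsigned fetches;      /* number of reads, kept only for checking */
} circ_reader;

/* Read one element, then post-increment with wrap-around, like W[I++]. */
static int16_t circ_fetch(circ_reader *r)
{
    int16_t value = r->base[r->index];
    r->index = (r->index + 1) % r->length;
    r->fetches++;
    return value;
}

/* Dot product with the first fetch hoisted out of the loop and the
   last multiply-accumulate moved after it, so the loop runs N-1 times. */
static int64_t unrolled_dot(circ_reader *a, circ_reader *b)
{
    int64_t acc = 0;
    int16_t x = circ_fetch(a);          /* prologue: initial fetch */
    int16_t y = circ_fetch(b);
    for (int i = 0; i < N - 1; i++) {
        acc += (int32_t)x * y;          /* use the previous operands */
        x = circ_fetch(a);              /* fetch the next operands */
        y = circ_fetch(b);
    }
    acc += (int32_t)x * y;              /* epilogue: final calculation */
    return acc;
}

/* Run the sequence twice on the same readers without re-initializing. */
static int64_t run_twice(circ_reader *a, circ_reader *b)
{
    int64_t first = unrolled_dot(a, b);
    assert(a->index == 0 && b->index == 0);  /* indices are back at base */
    int64_t second = unrolled_dot(a, b);
    assert(first == second);
    return second;
}
```

Calling unrolled_dot on two readers of length N performs exactly N fetches per buffer and leaves both indices at zero, which is why run_twice can call it a second time on the same readers and obtain the same sum.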
ADSP-BF53x/BF56x Blackfin Processor Programming Reference