Fir_Type2—Original Form - Texas Instruments TMS320C6000 Programmer's Manual

Refining C/C++ Code
Example 3–29. FIR_Type2—Original Form
void fir2(const short input[restrict], const short coefs[restrict],
          short out[restrict])
{
    int i, j;
    int sum;

    for (i = 0; i < 40; i++)
    {
        sum = 0;    /* restart the accumulator for each output sample */
        for (j = 0; j < 16; j++)
            sum += coefs[j] * input[i + 15 - j];    /* 16-tap multiply-accumulate */
        out[i] = (sum >> 15);
    }
}
Software pipelining is performed by the compiler only on inner loops;
therefore, you can increase performance by creating larger inner loops. One
method for creating large inner loops is to completely unroll inner loops that
execute for a small number of cycles.
In Example 3–29, the compiler pipelines the inner loop with a kernel size of one
cycle; therefore, the inner loop completes one iteration (one multiply-accumulate)
every cycle. However, the inner loop runs for only 16 iterations and is entered
40 times, so the overhead of filling and draining the software pipeline on each
entry can be significant, and the remaining outer-loop code is not software
pipelined.
For loops with a simple loop structure, the compiler uses a heuristic to
determine whether it should unroll the loop. Because unrolling can increase code
size, in some cases the compiler does not unroll the loop. If you have
identified this loop as being critical to your application, then unroll the
inner loop in C code, as in Example 3–30.
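As an illustration of what complete unrolling looks like for this filter, the following is a minimal sketch and not a reproduction of the manual's Example 3–30. The names fir2_rolled, fir2_unrolled, NUM_OUT, and NUM_TAPS are hypothetical, and the sketch assumes the accumulator is meant to restart for every output sample. Once the 16 taps are written out by hand, the loop over i becomes the innermost loop, so it is the one the compiler can consider for software pipelining.

```c
enum { NUM_OUT = 40, NUM_TAPS = 16 };   /* sizes taken from Example 3-29 */

/* Rolled form: the 16-iteration j loop is the inner loop.
 * Assumption: the accumulator restarts for each output sample. */
void fir2_rolled(const short input[restrict], const short coefs[restrict],
                 short out[restrict])
{
    int i, j;
    for (i = 0; i < NUM_OUT; i++) {
        int sum = 0;
        for (j = 0; j < NUM_TAPS; j++)
            sum += coefs[j] * input[i + 15 - j];
        out[i] = (short)(sum >> 15);
    }
}

/* Unrolled form: the j loop is written out tap by tap, which leaves
 * the 40-iteration i loop as the only (and therefore innermost) loop. */
void fir2_unrolled(const short input[restrict], const short coefs[restrict],
                   short out[restrict])
{
    int i;
    for (i = 0; i < NUM_OUT; i++) {
        int sum;
        sum  = coefs[0]  * input[i + 15];
        sum += coefs[1]  * input[i + 14];
        sum += coefs[2]  * input[i + 13];
        sum += coefs[3]  * input[i + 12];
        sum += coefs[4]  * input[i + 11];
        sum += coefs[5]  * input[i + 10];
        sum += coefs[6]  * input[i + 9];
        sum += coefs[7]  * input[i + 8];
        sum += coefs[8]  * input[i + 7];
        sum += coefs[9]  * input[i + 6];
        sum += coefs[10] * input[i + 5];
        sum += coefs[11] * input[i + 4];
        sum += coefs[12] * input[i + 3];
        sum += coefs[13] * input[i + 2];
        sum += coefs[14] * input[i + 1];
        sum += coefs[15] * input[i];
        out[i] = (short)(sum >> 15);
    }
}
```

Both functions read input[0] through input[54], so the caller must supply at least 55 input samples, 16 coefficients, and room for 40 outputs. The trade-off is code size: the unrolled body carries 16 explicit multiply-accumulate statements in place of a two-line loop.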
In general, unrolling may be a good idea if you have an uneven partition or if
your loop-carried dependency bound is greater than the partition bound. (Refer
to section 6.7, Loop Carry Paths, and section 3.2 in the TMS320C6000 Optimizing
C/C++ Compiler User's Guide.) This information can be obtained by using the -mw
option and looking at the comment block that precedes the loop in the
compiler's assembly output.
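To see why an uneven partition can favor unrolling, a rough resource-bound calculation helps. The sketch below assumes two identical functional units of one type, one on each side of the data path (the two .M multiply units are one such pair), and a loop body that needs a fixed number of operations of that type per iteration. The helper name cycles_per_iteration is illustrative and not part of any TI tool, and the model deliberately ignores every other constraint, including loop-carried dependencies.

```c
/* Resource-bound sketch (illustrative model, not a TI tool or API).
 * ops:    operations of one unit type needed per original iteration
 * unroll: how many original iterations are combined into one loop body
 * The ops * unroll operations are split as evenly as possible across
 * two units; the busier unit sets the cycle count of the unrolled body. */
double cycles_per_iteration(int ops, int unroll)
{
    int total = ops * unroll;
    int busier_side = (total + 1) / 2;      /* ceil(total / 2) */
    return (double)busier_side / unroll;    /* cycles per original iteration */
}
```

With three multiplies per iteration, the operations split 2/1 across the two units and the bound is 2 cycles per iteration. Unrolled by two, six multiplies split 3/3, and the bound drops to 1.5 cycles per original iteration. With an even count such as two multiplies per iteration, the split is already balanced and unrolling does not improve this bound.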
