Loop Conditionals - Intel PXA270 Optimization Manual


nItersPerBlock;
for (i=0; i<nTotalBlockIters; i+=nItersPerBlock)
{
    // unrolling nItersPerBlock times
    f(i);
    f(i+1);
    f(i+2);
    f(i+3);
}
// any remaining iterations must now be completed
for (; i<nTotalIterations; ++i)
{
    f(i);
}
}
Choosing nItersPerBlock carefully for the task (for example, 8 or 16 when large values of
nTotalIterations are expected) increases the benefit of this technique. The number of copies of
the loop body inside the block must equal nItersPerBlock. As before, performance may decline if
the instructions within the unrolled block do not fit in the instruction cache. Ensure that all
inline functions, inline procedures, and macros used within the block fit within the instruction
cache.
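
As a self-contained illustration of the unrolling pattern described above, here is a minimal sketch with a block size of four. The names ITERS_PER_BLOCK, sum_unrolled, and the array-summing body of f() are hypothetical and chosen only for this example. The computation of the block-aligned bound (rounding the total down to a multiple of the block size) is also an assumption, since the guide's own statement for it is not fully shown here.

```c
#include <stddef.h>

/* Number of copies of the body in the unrolled loop (assumed value). */
#define ITERS_PER_BLOCK 4

/* Hypothetical stand-in for f(i): accumulates data[i] into *acc. */
static void f(const int *data, size_t i, long *acc)
{
    *acc += data[i];
}

/* Sums data[0..n_total) with the main loop unrolled ITERS_PER_BLOCK times. */
long sum_unrolled(const int *data, size_t n_total)
{
    long acc = 0;
    size_t i;
    /* Assumption: largest multiple of ITERS_PER_BLOCK not exceeding n_total. */
    size_t n_block = (n_total / ITERS_PER_BLOCK) * ITERS_PER_BLOCK;

    for (i = 0; i < n_block; i += ITERS_PER_BLOCK) {
        /* Four copies of the body, matching ITERS_PER_BLOCK. */
        f(data, i, &acc);
        f(data, i + 1, &acc);
        f(data, i + 2, &acc);
        f(data, i + 3, &acc);
    }
    /* Complete any remaining iterations (at most ITERS_PER_BLOCK - 1). */
    for (; i < n_total; ++i) {
        f(data, i, &acc);
    }
    return acc;
}
```

If ITERS_PER_BLOCK is changed, the number of f() calls in the unrolled body has to change with it. Otherwise elements are skipped or visited twice.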
5.1.7 Loop Conditionals

Another simple optimization improves the performance of tight loops. When possible, use a
decrementing counter that counts down to zero. On ARM-based cores the decrement instruction can
set the condition flags itself, so the loop does not need a separate compare against the bound.
For example, here is a typical for() loop:
for (i=0; i<1000; ++i)
{
    p1();
}
This code provides the same behavior with less loop overhead, because the body does not depend
on the value of i:
for (i=1000; i>0; --i)
{
    p1();
}
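
The equivalence of the two loop forms can be checked with a small sketch. The call counter g_calls, the body given to p1(), and the helper names run_count_up, run_count_down, and sum_count_down are all hypothetical, introduced only for this illustration. The last helper shows one possible way to keep a count-down loop when the body needs an index, assuming the order of traversal does not matter.

```c
/* Hypothetical stand-in for p1(): records how many times it was called. */
static unsigned g_calls = 0;

static void p1(void)
{
    ++g_calls;
}

/* Typical incrementing loop: each pass compares i against the bound n. */
unsigned run_count_up(unsigned n)
{
    unsigned i;
    g_calls = 0;
    for (i = 0; i < n; ++i) {
        p1();
    }
    return g_calls;
}

/* Decrementing loop: the exit test is a comparison against zero. */
unsigned run_count_down(unsigned n)
{
    unsigned i;
    g_calls = 0;
    for (i = n; i > 0; --i) {
        p1();
    }
    return g_calls;
}

/* Count-down loop whose body uses the index: visits data[n-1] down to data[0]. */
long sum_count_down(const int *data, unsigned n)
{
    long acc = 0;
    unsigned i;
    for (i = n; i > 0; --i) {
        acc += data[i - 1];
    }
    return acc;
}
```

Both run_count_up(1000) and run_count_down(1000) call p1() exactly 1000 times. Whether the count-down form is actually faster depends on the compiler and its optimization settings, so it is worth inspecting the generated code or measuring.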
Intel® PXA27x Processor Family Optimization Guide
High Level Language Optimization

This manual is also suitable for:

PXA271, PXA272, PXA273
