
High Level Language Optimization

for (i = 0; i < 5; i++) {
    f(i);
}

is converted to the faster equivalent

f(0);
f(1);
f(2);
f(3);
f(4);

This optimization eliminates loop overhead.

Loop unrolling can also be applied to loops whose iteration count is not known at compile time.
This technique is often referred to as dynamic loop unrolling. Breaking an arbitrary-size loop
into small unrolled blocks avoids some of the loop overhead.
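As a minimal sketch of the idea, assuming a hypothetical per-iteration routine work() and a block size of four, the C code below contrasts a rolled loop with a dynamically unrolled one. The names work, run_rolled, run_unrolled, g_calls, and g_num_calls are illustrative assumptions and do not come from this guide. Any actual gain depends on the compiler and on the loop body.

```c
#include <stddef.h>

/* Hypothetical per-iteration work: records each index it receives so
 * that the two loop forms can be compared. */
#define MAX_CALLS 64
static int g_calls[MAX_CALLS];
static size_t g_num_calls = 0;

static void work(int i)
{
    if (g_num_calls < MAX_CALLS) {
        g_calls[g_num_calls++] = i;
    }
}

/* Rolled form: one compare, increment, and branch per call. */
static void run_rolled(int n_total)
{
    int i;

    for (i = 0; i < n_total; i++) {
        work(i);
    }
}

/* Dynamically unrolled form: four calls per loop test, followed by a
 * short loop that handles the leftover iterations. */
static void run_unrolled(int n_total)
{
    const int iters_per_block = 4;
    int n_block_iters;
    int i;

    /* Largest multiple of iters_per_block that is <= n_total. */
    n_block_iters = (n_total / iters_per_block) * iters_per_block;

    for (i = 0; i < n_block_iters; i += iters_per_block) {
        work(i);
        work(i + 1);
        work(i + 2);
        work(i + 3);
    }

    /* Remainder: at most iters_per_block - 1 iterations. */
    for (; i < n_total; i++) {
        work(i);
    }
}
```

Calling run_unrolled(10) executes two blocks of four calls and then two remainder calls. The loop condition is evaluated fewer times than in run_rolled(10), while work() receives the same indices 0 through 9 in the same order.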

For example, a compiler is unlikely to unroll the following code.

void f(int nTotalIterations)
{
    int i;

    for (i = 0; i < nTotalIterations; i++) {
        f(i);
    }
}

Replacing this small loop with the considerably larger code segment below can provide a
significant performance improvement at the expense of code size.

void f(int nTotalIterations)
{
    const int nItersPerBlock = 4;
    int nTotalBlockIters;
    int i;

    // find the largest multiple of nItersPerBlock that is less
    // than or equal to nTotalIterations
    nTotalBlockIters = (nTotalIterations / nItersPerBlock) *

Intel® PXA27x Processor Family Optimization Guide
