Intel PXA270 Optimization Manual page 111

When data is not aligned on a cache line boundary, the preload address must be placed at a
correspondingly misaligned address. Consider this example:
struct {
    long ia;
    long ib;
    long ic;
    long id;
} tdata[IMAX];
for (i=0; i<IMAX; i++)
{
    PREFETCH(tdata[i+1]);   /* preload the next element */
    tdata[i].ia = tdata[i].ib + tdata[i].ic + tdata[i].id;
    ....
    tdata[i].id = 0;
}
In this case, if tdata[] is not aligned to a cache line, a prefetch using the address of
tdata[i+1].ia may not bring in element id. If the array were aligned at a cache line boundary plus
12 bytes, the prefetch would have to be placed on &tdata[i+1].id.
If the structure is not sized to a multiple of the cache line size, then the preload address must be
advanced appropriately, which requires extra prefetch instructions. Consider this example:
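The effect of the starting offset can be checked with a little address arithmetic. The following is a minimal sketch, not code from the guide: it assumes a 32-byte cache line and 4-byte fields (int32_t stands in for the 32-bit long), and the names CACHE_LINE, struct rec, line_of, and straddles are hypothetical helpers introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 32u  /* assumed cache line size in bytes */

/* Four 4-byte fields, 16 bytes total; int32_t stands in for a 32-bit long. */
struct rec { int32_t ia, ib, ic, id; };
static_assert(sizeof(struct rec) == 16, "rec is expected to be 16 bytes");

/* Index of the cache line that holds byte address addr. */
static uintptr_t line_of(uintptr_t addr)
{
    return addr / CACHE_LINE;
}

/* Returns 1 if element i of an array starting at byte address base
   spans two cache lines, 0 if it fits within one. */
static int straddles(uintptr_t base, size_t i)
{
    uintptr_t first = base + i * sizeof(struct rec);
    uintptr_t last  = first + sizeof(struct rec) - 1;
    return line_of(first) != line_of(last);
}
```

Under these assumptions, straddles(0, i) is 0 for every i, so one preload per element always covers all four fields. With a base offset of 12, element 1 occupies bytes 28 through 43: ia sits in one line and ib, ic, and id in the next, which is why the preload address has to move to a later field.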
struct {
    long ia;
    long ib;
    long ic;
    long id;
    long ie;
} tdata[IMAX];
ADDRESS predata = &tdata;
for (i=0; i<IMAX; i++)
{
    PREFETCH(predata+=16);   /* advance by half a cache line */
    tdata[i].ia = tdata[i].ib + tdata[i].ic + tdata[i].id +
                  tdata[i].ie;
    ....
    tdata[i].ie = 0;
}
In this case, the preload address is advanced by half a cache line on each iteration, so every other
preload instruction is ignored. Further, an additional register is required to track the next preload
address.
Generally, not aligning and sizing data adds extra computational overhead.
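One way to avoid this overhead is to pad and align each element to a full cache line, so that a single preload per element is enough and no separate preload pointer is needed. The following is a minimal sketch under stated assumptions, not the guide's own code: it assumes a 32-byte cache line, uses C11 alignas for padding and alignment, and uses the GCC/Clang builtin __builtin_prefetch in place of the PREFETCH macro. The names CACHE_LINE, struct padded_rec, process, and the value of IMAX are hypothetical.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 32   /* assumed cache line size in bytes */
#define IMAX 64         /* hypothetical element count */

/* Five 4-byte fields (20 bytes). Aligning the first member to the cache
   line makes the compiler pad the structure to 32 bytes. */
struct padded_rec {
    alignas(CACHE_LINE) int32_t ia;
    int32_t ib, ic, id, ie;
};
static_assert(sizeof(struct padded_rec) == CACHE_LINE,
              "each element is expected to fill exactly one cache line");

static struct padded_rec tdata[IMAX];

/* Sums ib..ie into ia and clears ie, preloading the next element on
   each pass. One preload covers the whole next element. */
static void process(struct padded_rec *t, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (i + 1 < n)
            __builtin_prefetch(&t[i + 1]);
        t[i].ia = t[i].ib + t[i].ic + t[i].id + t[i].ie;
        t[i].ie = 0;
    }
}
```

The trade-off in this sketch is memory: each element carries 12 bytes of padding, so the array occupies more cache lines overall. Whether that is worthwhile depends on the access pattern.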
Intel® PXA27x Processor Family Optimization Guide
High Level Language Optimization
