Instruction Scheduling; Scheduling Loads - Intel IXP45X Developer's Manual

for(i=0; i<NMAX; i++)
{
    prefetch(A[i+1], b[i+1], c[i+1]);
    A[i] = b[i] + c[i];
}
for(i=0; i<NMAX; i++)
{
    prefetch(D[i+1], c[i+1], A[i+1]);
    D[i] = A[i] + c[i];
}
The second loop reuses the data elements A[i] and c[i]. Fusing the loops together
produces:
for(i=0; i<NMAX; i++)
{
    prefetch(D[i+1], A[i+1], c[i+1], b[i+1]);
    ai = b[i] + c[i];
    A[i] = ai;
    D[i] = ai + c[i];
}
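The fusion above can be checked with a small compilable sketch. It is an illustration under assumptions, not code from the manual: the multi-argument prefetch() in the listings is pseudocode, so this version substitutes a hypothetical PREFETCH macro built on the GCC/Clang builtin __builtin_prefetch, and the function names two_loops and fused_loop are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical wrapper: a cache hint only. On GCC/Clang it maps to
   __builtin_prefetch; on other compilers it expands to nothing. */
#if defined(__GNUC__)
#define PREFETCH(addr) __builtin_prefetch((addr))
#else
#define PREFETCH(addr) ((void)0)
#endif

/* Unfused form: A[] and c[] are walked twice, once per loop. */
void two_loops(int *A, int *D, const int *b, const int *c, size_t n)
{
    assert(A && D && b && c);
    for (size_t i = 0; i < n; i++) {
        PREFETCH(&A[i + 1]);
        PREFETCH(&b[i + 1]);
        PREFETCH(&c[i + 1]);
        A[i] = b[i] + c[i];
    }
    for (size_t i = 0; i < n; i++) {
        PREFETCH(&D[i + 1]);
        PREFETCH(&c[i + 1]);
        PREFETCH(&A[i + 1]);
        D[i] = A[i] + c[i];
    }
}

/* Fused form: each element of A[] and c[] is touched in a single pass,
   and the intermediate sum stays in a local instead of being reloaded. */
void fused_loop(int *A, int *D, const int *b, const int *c, size_t n)
{
    assert(A && D && b && c);
    for (size_t i = 0; i < n; i++) {
        PREFETCH(&D[i + 1]);
        PREFETCH(&A[i + 1]);
        PREFETCH(&c[i + 1]);
        PREFETCH(&b[i + 1]);
        int ai = b[i] + c[i];
        A[i] = ai;
        D[i] = ai + c[i];
    }
}
```

Both functions should leave identical contents in A and D for the same inputs, provided the four arrays do not overlap. Any performance difference depends on the cache and on how the compiler treats the hint.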
3.10.4.4.11 Prefetch to Reduce Register Pressure
Prefetch can be used to reduce register pressure. When data is needed for an
operation, the load is normally scheduled far enough in advance to hide the load
latency. However, the load ties up the receiving register until the data can be used.
For example:
    ldr     r2, [r0]
    ; Process code { not yet cached latency > 60 core clocks }
    add     r1, r1, r2
In the above case, r2 is unavailable for processing until the add instruction.
Prefetching the data frees the register for other use. The example code becomes:
    pld     [r0] ;prefetch the data keeping r2 available for use
    ; Process code
    ldr     r2, [r0]
    ; Process code { ldr result latency is 3 core clocks }
    add     r1, r1, r2
With the added prefetch, register r2 can be used for other operations until just
before it is needed.
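The same pattern can be sketched at the C level. This is only an approximation under assumptions: the compiler, not the programmer, decides register allocation and may move the load, so the sketch shows the intended ordering rather than guaranteeing it. PREFETCH is a hypothetical wrapper around the GCC/Clang builtin __builtin_prefetch, add_after_work is an invented name, and src is assumed not to alias scratch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical wrapper: a cache hint only, empty on non-GNU compilers. */
#if defined(__GNUC__)
#define PREFETCH(addr) __builtin_prefetch((addr))
#else
#define PREFETCH(addr) ((void)0)
#endif

/* Adds *src to acc after n units of unrelated work on scratch[].
   Assumes src does not point into scratch[]. */
int add_after_work(int acc, const int *src, int *scratch, size_t n)
{
    assert(src && scratch);

    PREFETCH(src);                    /* start the cache fill; no register holds the value yet */

    for (size_t i = 0; i < n; i++)    /* "process code": does not need the loaded value */
        scratch[i] = scratch[i] * 3 + 1;

    int v = *src;                     /* late load, intended to hit in the cache */
    return acc + v;
}
```

The result is the same as loading *src first; only the point at which a register becomes occupied is meant to change.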
3.10.5 Instruction Scheduling

This section discusses instruction scheduling optimizations. Instruction scheduling
refers to the rearrangement of a sequence of instructions for the purpose of minimizing
pipeline stalls. Reducing the number of pipeline stalls improves application
performance. While making this rearrangement, care should be taken to ensure that
the rearranged sequence of instructions has the same effect as the original sequence of
instructions.
3.10.5.1 Scheduling Loads

On the IXP45X/IXP46X network processors, an LDR instruction has a result latency of
three cycles, assuming the data being loaded is in the data cache. If the instruction
immediately after the LDR uses the result of the load, it stalls for two cycles. Where
possible, the instructions surrounding the LDR should be rearranged so that
independent instructions fill those cycles.
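The two-cycle figure follows from simple arithmetic, which the toy model below reproduces. It is a sketch under stated assumptions, not a description of the actual pipeline: it assumes one instruction issues per cycle in program order, that the only hazard is waiting on a source register, and that ordinary ALU instructions have a result latency of one cycle. The names struct insn and count_stalls are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_REGS 16
#define NO_REG   (-1)

/* One instruction in a toy in-order, single-issue model. */
struct insn {
    int dst;             /* register written, or NO_REG */
    int src1, src2;      /* registers read, or NO_REG */
    int result_latency;  /* cycles from issue until dst can be consumed */
};

/* Returns the total number of stall cycles for the sequence. */
int count_stalls(const struct insn *seq, size_t n)
{
    int ready[NUM_REGS] = {0};  /* cycle at which each register value is available */
    int cycle = 0;              /* cycle at which the next instruction issues if nothing stalls */
    int stalls = 0;

    assert(seq != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        int issue = cycle;

        /* Delay issue until both source operands are available. */
        if (seq[i].src1 != NO_REG && ready[seq[i].src1] > issue)
            issue = ready[seq[i].src1];
        if (seq[i].src2 != NO_REG && ready[seq[i].src2] > issue)
            issue = ready[seq[i].src2];

        stalls += issue - cycle;
        if (seq[i].dst != NO_REG)
            ready[seq[i].dst] = issue + seq[i].result_latency;
        cycle = issue + 1;
    }
    return stalls;
}
```

In this model, an LDR with latency 3 followed directly by its consumer yields two stall cycles. Placing one independent instruction between them leaves one stall, and placing two leaves none.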
Intel® IXP45X and Intel® IXP46X Product Line of Network Processors
Developer's Manual
August 2006
Order Number: 306262-004US
