Scheduling the MIA and MIAPH Instructions; Scheduling MRS and MSR Instructions - Intel PXA255 User Manual

XScale Microarchitecture

Optimization Guide
The stalls incurred by the code shown above can be prevented by rearranging the code:
mra r6, r7, acc0
add r2, r2, #1
mov r0, r6
mov r1, r7
The MAR (MCRR) instruction has an issue latency, a result latency, and a resource latency of 2
cycles. Because of the 2-cycle issue latency, the pipeline always stalls for 1 cycle following a
MAR instruction. The MAR instruction should therefore be used only where
absolutely necessary.
A.5.6 Scheduling the MIA and MIAPH Instructions

The MIA instruction has an issue latency of 1 cycle. The result and resource latency can vary from
1 to 3 cycles depending on the values in the source register.
Consider the following code sample:
mia acc0, r2, r3
mia acc0, r4, r5
The second MIA instruction above can stall from 0 to 2 cycles depending on the values in the
registers r2 and r3 due to the 1 to 3 cycle resource latency.
Similarly, consider the following code sample:
mia acc0, r2, r3
mra r4, r5, acc0
The MRA instruction above can stall from 0 to 2 cycles depending on the values in the registers r2
and r3 due to the 1 to 3 cycle result latency.
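
The stall ranges quoted for the two MIA samples above follow from simple arithmetic: a dependent instruction that issues immediately after the producer waits for the latency minus one cycle. The C sketch below illustrates that relationship only. The function name stall_cycles and its gap parameter are hypothetical (they do not come from the manual), and the model assumes every intervening instruction issues in 1 cycle and causes no stall of its own.

```c
/* Hypothetical helper, not part of the manual.
 * latency: result or resource latency of the producing instruction, in cycles.
 * gap:     how many issue slots later the dependent instruction appears
 *          (1 = the very next instruction).
 * Assumes each intervening instruction issues in 1 cycle without stalling. */
int stall_cycles(int latency, int gap)
{
    int stall = latency - gap;
    return stall > 0 ? stall : 0;   /* a stall can never be negative */
}
```

Under these assumptions, stall_cycles(1, 1) and stall_cycles(3, 1) give the 0 and 2 cycle bounds quoted for a back-to-back MIA pair, and a gap of 2 (one independent instruction in between) hides a 2-cycle latency completely.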
The MIAPH instruction has an issue latency of 1 cycle, result latency of 2 cycles and a resource
latency of 2 cycles.
Consider the code sample shown below:
add r1, r2, r3
miaph acc0, r3, r4
miaph acc0, r5, r6
mra r6, r7, acc0
sub r8, r3, r4
The second MIAPH instruction stalls for 1 cycle because of the 2-cycle resource latency of the
first. The MRA instruction stalls for 1 cycle because of the 2-cycle result latency of the second
MIAPH. These stalls can be avoided by rearranging the code as follows:
miaph acc0, r3, r4
add r1, r2, r3
miaph acc0, r5, r6
sub r8, r3, r4
mra r6, r7, acc0
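
The effect of the rearrangement can be checked with a small issue-cycle model. The C sketch below is an illustration under stated assumptions, not a description of the actual pipeline: the names op_kind and count_stalls are hypothetical, only the 2-cycle MIAPH result and resource latency quoted above is modelled, and every other instruction is assumed to issue in 1 cycle with no dependencies.

```c
#include <stddef.h>

/* Hypothetical instruction classes for this sketch only. */
enum op_kind { OP_OTHER, OP_MIAPH, OP_MRA };

/* Counts stall cycles in a straight-line sequence. A MIAPH or MRA may not
 * issue until 2 cycles after the previous MIAPH issued (the result and
 * resource latency quoted in the text); OP_OTHER never waits. */
int count_stalls(const enum op_kind *ops, size_t n)
{
    const int MIAPH_LATENCY = 2;
    int cycle = 0;      /* earliest cycle at which the next instruction can issue */
    int acc_ready = 0;  /* cycle at which the accumulator path is free again */
    int stalls = 0;

    for (size_t i = 0; i < n; i++) {
        if (ops[i] != OP_OTHER && cycle < acc_ready) {
            stalls += acc_ready - cycle;   /* wait for the previous MIAPH */
            cycle = acc_ready;
        }
        if (ops[i] == OP_MIAPH)
            acc_ready = cycle + MIAPH_LATENCY;
        cycle += 1;                        /* 1-cycle issue latency */
    }
    return stalls;
}
```

Encoding the first sample as OTHER, MIAPH, MIAPH, MRA, OTHER yields 2 stall cycles in this model, and the rearranged order MIAPH, OTHER, MIAPH, OTHER, MRA yields 0, which agrees with the text.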
A.5.7 Scheduling MRS and MSR Instructions

The MRS instruction has an issue latency of 1 cycle and a result latency of 2 cycles. The MSR
instruction has an issue latency of 2 cycles (6 if updating the mode bits) and a result latency of 1
cycle.
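
The cost of an MSR can be estimated the same way the manual reasons about MAR, where a 2-cycle issue latency means a 1-cycle stall for the following instruction. The C sketch below applies that reasoning to the MSR figures quoted above. The function name msr_issue_stall is hypothetical, and extending the MAR reasoning to MSR is an assumption rather than a statement from the manual.

```c
/* Hypothetical helper, not part of the manual.
 * Returns the number of cycles the instruction following an MSR is held
 * back, assuming an issue latency of N cycles delays the next instruction
 * by N - 1 cycles (the reasoning the text applies to MAR). */
int msr_issue_stall(int updates_mode_bits)
{
    /* Issue latency from the text: 2 cycles, or 6 when the mode bits change. */
    int issue_latency = updates_mode_bits ? 6 : 2;
    return issue_latency - 1;
}
```

Under this assumption, an MSR that changes the mode bits costs 5 stall cycles instead of 1, so mode changes are best kept out of inner loops where possible.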
Intel® XScale™ Microarchitecture User's Manual
