Scheduling The Tmia Instruction - Intel PXA270 Optimization Manual

Intel XScale® Microarchitecture & Intel® Wireless MMX™ Technology Optimization
It is often possible to interleave instructions and effectively overlap their execution with
multicycle instructions that utilize the multiply pipeline. The 2-cycle WMAC instruction may be
easily interleaved with operations which do not utilize the same resources:
WMACS wR14, wR2, wR3
WLDRD wR3, [r4], #8
WMACS wR15, wR1, wR0
WALIGNI wR5, wR6, wR7, #4
WMACS wR15, wR5, wR0
WLDRD wR0, [r3], #8
In the above example, the WLDRD and WALIGNI instructions do not incur a stall since they are
utilizing the memory and execution pipelines respectively and there are no data dependencies.
When utilizing both Intel XScale® Microarchitecture and Intel® Wireless MMX™ Technology
execution resources, it is also possible to overlap the multicycle instructions. The ADD instruction
in the following example executes with no stalls.
WMACS wR14, wR1, wR2
ADD R1, R2, R3
Refer to Section 4.8, "Instruction Latencies for Intel XScale® Microarchitecture" for more
information on instruction latencies for various multiply instructions. The multiply instructions
should be scheduled taking into consideration their respective instruction latencies.
4.3.2.3 Scheduling the TMIA Instruction

The issue latency of the TMIA instruction is one cycle, and its result latency and resource latency
are two cycles each. The second TMIA instruction in the following example stalls for one cycle due
to the two-cycle resource latency.
TMIA wR0, r2, r3
TMIA wR1, r4, r5
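The one-cycle stall can be reproduced with a small cycle-count model. The Python sketch below is illustrative only: it assumes in-order, single-issue behavior, and the function name count_stalls, the pipe labels "multiply" and "core", and the one-cycle default for instructions other than TMIA are assumptions made for this example, not definitions from this guide.

```python
def count_stalls(program, resource_latency):
    """Return the stall cycles seen by each instruction.

    program: list of (mnemonic, pipe) tuples, in program order.
    resource_latency: mnemonic -> cycles the pipe stays busy after issue.
    Simplified model: one instruction issues per cycle, in order.
    """
    pipe_free_at = {}   # pipe name -> first cycle the pipe can accept work
    cycle = 0           # earliest cycle the next instruction could issue
    stalls = []
    for mnemonic, pipe in program:
        start = max(cycle, pipe_free_at.get(pipe, 0))
        stalls.append(start - cycle)
        # Assumed default: unlisted instructions occupy their pipe for 1 cycle.
        pipe_free_at[pipe] = start + resource_latency.get(mnemonic, 1)
        cycle = start + 1   # issue latency of one cycle
    return stalls


RESOURCE_LATENCY = {"TMIA": 2}   # two-cycle resource latency, as stated above

# Two TMIA instructions back to back, as in the example above.
back_to_back = [("TMIA", "multiply"), ("TMIA", "multiply")]

# Hypothetical reordering: an independent core ADD placed between them.
interleaved = [("TMIA", "multiply"), ("ADD", "core"), ("TMIA", "multiply")]

print(count_stalls(back_to_back, RESOURCE_LATENCY))
print(count_stalls(interleaved, RESOURCE_LATENCY))
```

Under these assumptions the model reports a one-cycle stall on the second TMIA of the back-to-back pair and no stall in the interleaved sequence, which suggests placing an independent instruction between consecutive TMIA instructions where the algorithm allows.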
The WADD instruction in the following example reads wR0, the result of the preceding TMIA, and
therefore stalls for one cycle due to the two-cycle result latency.
TMIA wR0, r2, r3
WADD wR1, wR0, wR2
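The result-latency stall can be modeled in the same spirit. The sketch below is a simplified, hypothetical model rather than a description of the hardware: it tracks only when each destination register becomes available, ignores the fact that TMIA also accumulates into its destination, assumes a one-cycle result latency for every instruction other than TMIA, and uses WOR purely as an example of an independent filler instruction.

```python
def count_result_stalls(program, result_latency):
    """Return the stall cycles caused by waiting on source registers.

    program: list of (mnemonic, dest, sources) tuples, in program order.
    result_latency: mnemonic -> cycles from issue until dest is readable.
    Simplified model: one instruction issues per cycle, in order.
    """
    ready_at = {}   # register -> first cycle its value can be read
    cycle = 0       # earliest cycle the next instruction could issue
    stalls = []
    for mnemonic, dest, sources in program:
        start = max([cycle] + [ready_at.get(reg, 0) for reg in sources])
        stalls.append(start - cycle)
        # Assumed default: unlisted instructions produce results in 1 cycle.
        ready_at[dest] = start + result_latency.get(mnemonic, 1)
        cycle = start + 1   # issue latency of one cycle
    return stalls


RESULT_LATENCY = {"TMIA": 2}   # two-cycle result latency, as stated above

# The dependent pair from the example above.
dependent = [
    ("TMIA", "wR0", ("r2", "r3")),
    ("WADD", "wR1", ("wR0", "wR2")),
]

# Hypothetical reordering: an unrelated WOR fills the latency slot.
separated = [
    ("TMIA", "wR0", ("r2", "r3")),
    ("WOR", "wR4", ("wR5", "wR6")),
    ("WADD", "wR1", ("wR0", "wR2")),
]

print(count_result_stalls(dependent, RESULT_LATENCY))
print(count_result_stalls(separated, RESULT_LATENCY))
```

In this model the dependent pair shows a one-cycle stall on WADD, while the separated sequence shows none, because wR0 is already available by the time WADD issues.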
Refer to Section 4.8, "Instruction Latencies for Intel XScale® Microarchitecture" for more
information on instruction latencies for various multiply instructions. The multiply instructions
should be scheduled taking into consideration their respective instruction latencies.
Intel® PXA27x Processor Family Optimization Guide
