Behavioral Description; Basic Optimizations; Conditional Instructions; Optimizing Condition Checks - Intel PXA255 User Manual

A.2.5.1 Behavioral Description

The execution of the MAC unit starts at the beginning of the M1 pipestage, where it receives two
32-bit source operands. Results are completed N cycles later (where N is dependent on the operand
size) and returned to the register file. For more information on MAC instruction latencies, refer to
Section 11.2, "Instruction Latencies".
An instruction that occupies the M1 or M2 pipestage will also occupy the X1 or X2 pipestage,
respectively. Each cycle, a MAC operation progresses from M1 toward M5. A MAC operation may
complete anywhere from M2 to M5. If a MAC operation enters M3-M5, it is considered committed
because it will modify architectural state regardless of subsequent events.
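The stage progression and commit rule described above can be sketched as a toy model in C. This is only an illustration under stated assumptions: the stage numbering (1 to 5 for M1 to M5) and the names `mac_stage_after` and `mac_is_committed` are hypothetical, and the model ignores early completion, stalls, and the X1/X2 pairing.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical numbering: 1..5 stand for pipestages M1..M5. */
enum { MAC_FIRST_STAGE = 1, MAC_COMMIT_STAGE = 3, MAC_LAST_STAGE = 5 };

/* A MAC operation advances one M stage per cycle, so cycles_elapsed
 * cycles after entering M1 (0 = the cycle it enters M1) it occupies
 * stage M(1 + cycles_elapsed). Only 0..4 are meaningful here. */
int mac_stage_after(int cycles_elapsed)
{
    assert(cycles_elapsed >= 0 &&
           cycles_elapsed <= MAC_LAST_STAGE - MAC_FIRST_STAGE);
    return MAC_FIRST_STAGE + cycles_elapsed;
}

/* Per the text, an operation that has entered M3..M5 is committed:
 * it will modify architectural state regardless of later events. */
bool mac_is_committed(int stage)
{
    return stage >= MAC_COMMIT_STAGE && stage <= MAC_LAST_STAGE;
}
```

In this sketch an operation is still uncommitted one cycle after entering M1 (it is in M2) and becomes committed on the following cycle, when it reaches M3.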

A.3 Basic Optimizations

This chapter outlines optimizations specific to the ARM* architecture. These optimizations have
been modified to suit the Intel® XScale™ core where needed.

A.3.1 Conditional Instructions

The Intel® XScale™ core architecture provides the ability to execute instructions conditionally.
This feature, combined with the ability of the Intel® XScale™ core instructions to modify the
condition codes, makes a wide array of optimizations possible.

A.3.1.1 Optimizing Condition Checks

The Intel® XScale™ core instructions can selectively modify the state of the condition codes.
When generating code for if-else and loop conditions it is often beneficial to make use of this
feature to set condition codes, thereby eliminating the need for a subsequent compare instruction.
Consider the C code segment:
    if (a + b)
Code generated for the if condition without using an add instruction to set condition codes is:
    ;Assume r0 contains the value a, and r1 contains the value b
    add     r0,r0,r1
    cmp     r0, #0
However, code can be optimized as follows making use of an ADD instruction to set condition
codes:
    ;Assume r0 contains the value a, and r1 contains the value b
    adds    r0,r0,r1
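To see why the separate compare is redundant for this condition, the flag behavior can be modeled in C. The following is a minimal sketch, not part of the manual: the `cond_flags` type and the `adds_model` and `if_body_taken` functions are hypothetical names, and the model covers only a 32-bit flag-setting add.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the four ARM condition flags. */
typedef struct {
    bool n; /* result is negative (bit 31 set) */
    bool z; /* result is zero */
    bool c; /* unsigned carry out of bit 31 */
    bool v; /* signed overflow */
} cond_flags;

/* Models "adds rd, rn, rm": stores the 32-bit sum in *rd and
 * returns the flags the instruction would set. */
cond_flags adds_model(uint32_t rn, uint32_t rm, uint32_t *rd)
{
    uint32_t sum = rn + rm; /* wraps modulo 2^32, as the hardware does */
    cond_flags f;
    f.n = (sum >> 31) != 0;
    f.z = (sum == 0);
    f.c = sum < rn; /* a wrapped sum is smaller than either operand */
    /* Overflow: operands share a sign and the result's sign differs. */
    f.v = ((~(rn ^ rm) & (rn ^ sum)) >> 31) != 0;
    *rd = sum;
    return f;
}

/* The condition in "if (a + b)" is true exactly when Z is clear (NE),
 * so the flags from the add alone are enough to decide the branch. */
bool if_body_taken(uint32_t a, uint32_t b)
{
    uint32_t r0;
    cond_flags f = adds_model(a, b, &r0);
    return !f.z;
}
```

Note that only the Z and N flags match what `cmp r0, #0` would have produced. The C and V flags reflect the addition itself, so a condition that depends on them (for example a signed greater-than test) is not covered by this substitution without further care.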
The instructions that increment or decrement the loop counter can also be used to modify the
condition codes. This eliminates the need for a subsequent compare instruction. A conditional
branch instruction can then be used to exit or continue with the next loop iteration.
Consider the following C code segment:
    for (i = 10; i != 0; i--)
    {
        do something;
    }
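The count-down pattern can be written in C so that the decrement and the exit test are a single expression, which is the shape that a flag-setting decrement followed by a conditional branch implements. This is a sketch under assumptions: `count_down_iterations` is a hypothetical helper, the body is reduced to a counter, and the assembly in the comment is an illustration of a typical mapping rather than output from any particular compiler.

```c
#include <assert.h>
#include <stdint.h>

/* Counts how many times the body of a count-down loop executes. */
unsigned count_down_iterations(uint32_t start)
{
    assert(start != 0); /* the do-while form runs the body at least once */
    unsigned iterations = 0;
    uint32_t i = start;
    do {
        iterations++; /* stands in for "do something" */
        /* The decrement-and-test below would typically map to
         *     subs rN, rN, #1   ; decrement the counter and set Z
         *     bne  loop         ; repeat while the counter is non-zero
         * (register choice and label name are hypothetical). */
    } while (--i != 0);
    return iterations;
}
```

Because the decrement itself decides whether the loop continues, no separate comparison against zero is needed at the bottom of the loop.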
Intel® XScale™ Microarchitecture User's Manual
Optimization Guide