Intel PXA270 Optimization Manual page 5

PXA27x Processor Family
Table of Contents


4
    4.1 Introduction ........ 4-1
    4.2 General Optimization Techniques ........ 4-1
        4.2.1 Conditional Instructions and Loop Control ........ 4-1
        4.2.2 Program Flow and Branch Instructions ........ 4-2
        4.2.3 Optimizing Complex Expressions ........ 4-5
            4.2.3.1 Bit Field Manipulation ........ 4-6
        4.2.4 Optimizing the Use of Immediate Values ........ 4-6
        4.2.5 Optimizing Integer Multiply and Divide ........ 4-7
        4.2.6 Effective Use of Addressing Modes ........ 4-8
    4.3 MMX™ Technology ........ 4-8
        4.3.1
            4.3.1.1 Scheduling Loads ........ 4-8
            4.3.1.2 Increasing Load Throughput ........ 4-11
            4.3.1.3 Increasing Store Throughput ........ 4-12
            4.3.1.4
            4.3.1.5 Scheduling Load and Store Multiple (LDM/STM) ........ 4-14
            4.3.1.6 Scheduling Data-Processing ........ 4-15
            4.3.1.7 Scheduling Multiply Instructions ........ 4-15
            4.3.1.8 Scheduling SWP and SWPB Instructions ........ 4-16
            4.3.1.9
            4.3.1.10 Scheduling MRS and MSR Instructions ........ 4-17
            4.3.1.11 Scheduling Coprocessor 15 Instructions ........ 4-18
        4.3.2
            4.3.2.1
            4.3.2.2 Scheduling the WMAC Instructions ........ 4-19
            4.3.2.3 Scheduling the TMIA Instruction ........ 4-20
            4.3.2.4 Scheduling the WMUL and WMADD Instructions ........ 4-21
    4.4 SIMD Optimization Techniques ........ 4-21
        4.4.1 Software Pipelining ........ 4-21
            4.4.1.1 General Remarks on Software Pipelining ........ 4-23
        4.4.2 Multi-Sample Technique ........ 4-23
            4.4.2.1 General Remarks on Multi-Sample Technique ........ 4-25
        4.4.3 Data Alignment Techniques ........ 4-25
    4.5 Porting Existing Intel® MMX™ Technology Code to Intel® Wireless MMX™ Technology ........ 4-26
        4.5.1 Intel® Wireless MMX™ Technology Instruction Mapping ........ 4-27
        4.5.2 Unsigned Unpack Example ........ 4-28
        4.5.3 Signed Unpack Example ........ 4-29
        4.5.4 Interleaved Pack with Saturation Example ........ 4-29
    4.6 Optimizing Libraries for System Performance ........ 4-29
        4.6.1 Case Study 1: Memory-to-Memory Copy ........ 4-29
        4.6.2 Case Study 2: Optimizing Memory Fill ........ 4-30
        4.6.3 Case Study 3: Dot Product ........ 4-31
        4.6.4 Case Study 4: Graphics Object Rotation ........ 4-32
        4.6.5 Case Study 5: 8x8 Block 1/2X Motion Compensation ........ 4-33
    4.7 Intel® Performance Primitives ........ 4-34
    4.8 Instruction Latencies for Intel XScale® Microarchitecture ........ 4-35
        4.8.1 Performance Terms ........ 4-35
        4.8.2 Branch Instruction Timings ........ 4-37

Intel® PXA27x Processor Family Optimization Guide


This manual is also suitable for:

PXA271, PXA272, PXA273
