If-Else Versus Switch Statements; Nested If-Else And Switch Statements; Locality In Source Code; Choosing Data Types - Intel PXA270 Optimization Manual

High Level Language Optimization
For the first loop, most compilers generate an instruction to subtract i from 1000. An instruction is
then created to compare the result with 0. In the second loop, a subtraction instruction is not
needed, and the 1000 constant does not need to be stored in a register. Freeing that register might
save a stack operation for every iteration of the loop, significantly enhancing performance.
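
A minimal sketch of the two loop shapes being contrasted, assuming the first loop counts up to 1000 and the second counts down to zero. The function names and the summing loop body are hypothetical and serve only to give the loops something to do.

```c
#include <assert.h>
#include <stddef.h>

enum { N = 1000 };  /* loop bound taken from the text */

/* Count-up form: the loop test compares i against N on every
   iteration, so N typically occupies a register for the whole loop. */
long sum_count_up(const int *data)
{
    assert(data != NULL);
    long sum = 0;
    for (int i = 0; i < N; i++)
        sum += data[i];
    return sum;
}

/* Count-down form: the loop test is a comparison with zero, which the
   decrement itself can provide, so N is needed only for initialization. */
long sum_count_down(const int *data)
{
    assert(data != NULL);
    long sum = 0;
    for (int i = N; i > 0; i--)
        sum += data[i - 1];
    return sum;
}
```

Both functions visit the same 1000 elements; only the direction of the counter and the form of the loop test differ.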

5.1.8 If-else versus Switch Statements

Compilers can often generate jump tables for switch statements that can jump to specific code
faster than traversing the conditionals of a cascading if-else statement. Some compilers, however,
may simply expand the switch statement into a cascading if-else statement. In general, using switch
statements where possible and always placing the most frequently traversed paths higher up in
either the switch statement or the cascading if-else code leads to the best optimization by the
compiler.
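
The same dispatch can be written both ways, as in the sketch below. The message codes, their relative frequencies, and the return values are hypothetical; whether the switch actually becomes a jump table depends on the compiler and its options.

```c
#include <assert.h>

/* Hypothetical message codes. Small consecutive values give the
   compiler the option of building a jump table. */
enum msg { MSG_DATA = 0, MSG_ACK = 1, MSG_NAK = 2, MSG_RESET = 3 };

/* Cascading if-else: conditions are tested in source order, so the
   case assumed to be most frequent (MSG_DATA) is placed first. */
int handler_if_else(int m)
{
    if (m == MSG_DATA)       return 10;
    else if (m == MSG_ACK)   return 20;
    else if (m == MSG_NAK)   return 30;
    else if (m == MSG_RESET) return 40;
    else                     return -1;
}

/* Switch form: same behavior, with the frequent case still listed
   first in case the compiler expands it into a comparison chain. */
int handler_switch(int m)
{
    switch (m) {
    case MSG_DATA:  return 10;
    case MSG_ACK:   return 20;
    case MSG_NAK:   return 30;
    case MSG_RESET: return 40;
    default:        return -1;
    }
}
```

The two functions return the same value for every input, so the choice between them is purely a matter of what code the compiler can generate.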

5.1.9 Nested If-Else and Switch Statements

Using nested if-else and switch statements can greatly reduce the number of comparison
instructions that are generated by the compiler. For example, consider a switch statement
containing 256 case statements. Without knowing whether the compiler will generate a jump table or a
cascading if-else statement, the processor might potentially have to do 256 comparisons only to
find that not a single condition is met.
By breaking the switch into two or more levels, the worst-case lookup is dramatically reduced.
Using a switch statement with 16 case statements to jump to 16 other switch statements, each with
16 cases, reduces the lookup for a non-existent case to 16 comparisons and the worst-case lookup to 32
comparisons.
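
One way to split a 256-way dispatch into two levels is to switch on the high four bits of the value first and on the low four bits second. The sketch below assumes an 8-bit opcode and fills in only a few hypothetical cases to stay short; a full version would list 16 cases at each level.

```c
#include <assert.h>
#include <stdint.h>

/* Second-level switch for opcodes 0x10..0x1F (hypothetical group). */
static int decode_group_1(uint8_t low)
{
    switch (low) {
    case 0x0: return 100;
    case 0x3: return 103;
    case 0xF: return 115;
    default:  return -1;   /* no such opcode in this group */
    }
}

/* Second-level switch for opcodes 0xA0..0xAF (hypothetical group). */
static int decode_group_a(uint8_t low)
{
    switch (low) {
    case 0x1: return 201;
    case 0x2: return 202;
    default:  return -1;
    }
}

/* First-level switch: selects a group from the high nibble, then lets
   the group's own switch examine the low nibble. */
int decode(uint8_t opcode)
{
    uint8_t high = (uint8_t)(opcode >> 4);
    uint8_t low  = (uint8_t)(opcode & 0x0F);

    switch (high) {
    case 0x1: return decode_group_1(low);
    case 0xA: return decode_group_a(low);
    default:  return -1;   /* unknown group: rejected at the first level */
    }
}
```

An opcode whose group does not exist is rejected without entering any second-level switch, which is where the reduction in the non-existent case lookup comes from.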

5.1.10 Locality in Source Code

On many different levels, code that is cohesive, modular, and decoupled allows compilers to
optimize the code to the greatest extent. In C++, these attributes are encouraged by the language. In
C, it is very important to keep closely related code and data definitions in the same file as much as
possible. The compiler can optimize the code more efficiently this way, and similar data has a
higher degree of spatial locality, which makes better use of the data cache.
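
As an illustration in C, the sketch below keeps a small ring buffer and the only functions that touch it in one source file. The file name `ring.c` and all identifiers are hypothetical. Giving the data and the helper internal linkage with `static` lets the compiler see every use within the translation unit, and defining the variables together makes it likely, though not guaranteed, that they are placed near each other in memory.

```c
/* ring.c (hypothetical file): data and the code that uses it together. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { RING_SIZE = 16 };   /* power of two so the index can be masked */

/* Related data defined side by side and private to this file. */
static uint8_t  ring_data[RING_SIZE];
static unsigned ring_head;   /* next slot to write */
static unsigned ring_tail;   /* next slot to read  */

/* Private helper: visible only here, so it is easy to inline. */
static unsigned ring_next(unsigned i)
{
    return (i + 1u) & (RING_SIZE - 1u);
}

/* Store one byte; returns 1 on success, 0 if the buffer is full. */
int ring_put(uint8_t byte)
{
    unsigned next = ring_next(ring_head);
    if (next == ring_tail)
        return 0;
    ring_data[ring_head] = byte;
    ring_head = next;
    return 1;
}

/* Fetch one byte; returns 1 on success, 0 if the buffer is empty. */
int ring_get(uint8_t *out)
{
    assert(out != NULL);
    if (ring_tail == ring_head)
        return 0;
    *out = ring_data[ring_tail];
    ring_tail = ring_next(ring_tail);
    return 1;
}
```

Only `ring_put` and `ring_get` are visible to other files; everything else stays local to the translation unit.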

5.1.11 Choosing Data Types

Many applications inherently use sub-word data sizes. Packing a set of such values into a single word
is beneficial for memory accesses and memory bandwidth. Packed data formats can also be
processed using the Intel® Wireless MMX™ Technology. The Intel XScale® Microarchitecture
performs best on word-size data aligned on a 4-byte boundary. Intel® Wireless MMX™
Technology requires data to be aligned on an 8-byte boundary.
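
A minimal sketch of packing in portable C11, assuming four unsigned 8-bit samples per 32-bit word. The function and buffer names are hypothetical, and the code uses plain shifts rather than any Wireless MMX instruction; `alignas(8)` only requests the 8-byte alignment mentioned above for the buffer.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

/* Pack four 8-bit samples into one 32-bit word, s0 in the low byte. */
uint32_t pack4_u8(uint8_t s0, uint8_t s1, uint8_t s2, uint8_t s3)
{
    return (uint32_t)s0
         | ((uint32_t)s1 << 8)
         | ((uint32_t)s2 << 16)
         | ((uint32_t)s3 << 24);
}

/* Extract one 8-bit sample; lane 0 is the low byte. */
uint8_t unpack_u8(uint32_t word, unsigned lane)
{
    assert(lane < 4);
    return (uint8_t)(word >> (8u * lane));
}

/* Buffer of packed words, requested on an 8-byte boundary so that
   pairs of words can be treated as one aligned 64-bit quantity. */
alignas(8) uint32_t packed_samples[64];
```

One word load then brings in four samples instead of one, which is the memory bandwidth benefit described above.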

5.1.12 Data Alignment For Maximizing Cache Usage

Cache lines begin on 32-byte address boundaries. To maximize cache line use and minimize cache
pollution, data structures should be aligned on 32-byte boundaries and sized to a multiple of the
cache line size. Aligning data structures on cache line boundaries simplifies later addition of
preload instructions to optimize performance.
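
In C11 this can be sketched with `alignas` and a compile-time size check. The structure and its fields are hypothetical; the only value taken from the text is the 32-byte line size.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

enum { CACHE_LINE_BYTES = 32 };   /* line size stated in the text */

/* Hypothetical per-channel record laid out to fill exactly one line. */
struct channel_state {
    alignas(CACHE_LINE_BYTES) uint32_t gain[4];  /* 16 bytes */
    uint32_t offset[2];                          /*  8 bytes */
    uint16_t flags;                              /*  2 bytes */
    uint16_t id;                                 /*  2 bytes */
    uint8_t  pad[4];                             /*  explicit padding to 32 */
};

/* Fail the build if a later edit breaks the size or alignment rule. */
static_assert(sizeof(struct channel_state) % CACHE_LINE_BYTES == 0,
              "channel_state must be a multiple of the cache line size");
static_assert(alignof(struct channel_state) == CACHE_LINE_BYTES,
              "channel_state must be cache-line aligned");

/* Every element of this array starts on its own cache line. */
struct channel_state channels[8];
```

Because each element begins on a line boundary and occupies a whole number of lines, touching one record never pulls part of a neighboring record into the cache.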
Intel® PXA27x Processor Family Optimization Guide
