Branch Prediction; Eliminating Branches - Intel ARCHITECTURE IA-32 Reference Manual

Branch Prediction

Branch optimizations have a significant impact on performance. By
understanding the flow of branches and improving the predictability of
branches, you can increase the speed of code significantly.
Optimizations that help branch prediction are:
- Keep code and data on separate pages (a very important item, see more details in the "Memory Accesses" section).
- Whenever possible, eliminate branches.
- Arrange code to be consistent with the static branch prediction algorithm.
- Use the pause instruction in spin-wait loops.
- Inline functions and pair up calls and returns.
- Unroll as necessary so that repeatedly-executed loops have sixteen or fewer iterations, unless this causes an excessive code size increase.
- Separate branches so that they occur no more frequently than every three μops where possible.
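
The spin-wait guideline can be illustrated with a small C sketch. It is not taken from the manual: the helper name spin_wait, the CPU_RELAX macro, and the bounded spin budget are assumptions made for illustration. On x86 targets the macro maps to the _mm_pause() intrinsic, which emits the pause instruction; on other targets it falls back to a no-op.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#if defined(__i386__) || defined(__x86_64__)
#include <immintrin.h>            /* declares _mm_pause() */
#define CPU_RELAX() _mm_pause()   /* emits the pause instruction */
#else
#define CPU_RELAX() ((void)0)     /* assumption: no-op on non-x86 targets */
#endif

/* Hypothetical helper: poll *flag until it becomes nonzero or until
 * max_spins polls have been made. Returns 1 if the flag was observed
 * set, 0 if the spin budget ran out first. */
static int spin_wait(atomic_int *flag, unsigned long max_spins)
{
    assert(flag != NULL);
    for (unsigned long i = 0; i < max_spins; ++i) {
        if (atomic_load_explicit(flag, memory_order_acquire) != 0)
            return 1;
        CPU_RELAX();              /* pause hint between successive polls */
    }
    /* One final check after the budget is exhausted. */
    return atomic_load_explicit(flag, memory_order_acquire) != 0;
}
```

In real code another thread would set the flag with a release store. The spin budget is only there so that the sketch cannot hang; a production loop would typically fall back to a blocking wait instead.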

Eliminating Branches

Eliminating branches improves performance because it:
- reduces the possibility of mispredictions
- reduces the number of required branch target buffer (BTB) entries; conditional branches, which are never taken, do not consume BTB resources

There are four principal ways of eliminating branches:
- arrange code to make basic blocks contiguous
- unroll loops, as discussed in the "Loop Unrolling" section
- use the cmov instruction
- use the setcc instruction
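
The cmov and setcc items can be sketched in C. This is an illustration under assumptions, not code from the manual: the function names are hypothetical, and whether a compiler actually emits cmov or setcc for these forms depends on the compiler, target, and optimization level, so the generated assembly should be inspected before relying on it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Branchy reference version: typically compiled to a compare followed
 * by a conditional jump when optimization is off. */
static int32_t max_branchy(int32_t a, int32_t b)
{
    if (a > b)
        return a;
    return b;
}

/* Select form: both operands are already available, so an optimizing
 * x86 compiler commonly lowers this to cmp + cmov (not guaranteed). */
static int32_t max_select(int32_t a, int32_t b)
{
    return (a > b) ? a : b;
}

/* Counts elements below limit without an if statement. The comparison
 * result (0 or 1) is used as a value, which maps naturally onto setcc;
 * a vectorizing compiler may choose SIMD compares instead. */
static size_t count_below(const int32_t *v, size_t n, int32_t limit)
{
    assert(v != NULL || n == 0);
    size_t count = 0;
    for (size_t i = 0; i < n; ++i)
        count += (size_t)(v[i] < limit);
    return count;
}
```

Both forms replace control flow with data flow: the result is computed on every call instead of being predicted. That tends to help when the condition is hard to predict and may hurt when the branch would have been predicted well.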
General Optimization Guidelines (Chapter 2, page 2-15)