Optimize Branch Predictability; Optimize Memory Access - Intel ARCHITECTURE IA-32 Reference Manual

Optimize Branch Predictability

• Improve branch predictability and optimize instruction prefetching by arranging code to be consistent with the static branch prediction assumption: backward taken and forward not taken.
• Avoid mixing near calls, far calls and returns.
• Avoid implementing a call by pushing the return address and jumping to the target. The hardware can pair up call and return instructions to enhance predictability.
• Use the pause instruction in spin-wait loops (see the sketch following this list).
• Inline functions according to coding recommendations.
• Whenever possible, eliminate branches.
• Avoid indirect calls.
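
As an illustration of the spin-wait recommendation above, the following C sketch uses the _mm_pause() compiler intrinsic, which compiles to the pause instruction. The flag name, initial value, and loop structure are hypothetical; this is a minimal sketch under those assumptions, not code taken from the manual.

    #include <immintrin.h>    /* _mm_pause() */
    #include <stdatomic.h>

    /* Hypothetical lock flag; any shared atomic word works the same way. */
    static atomic_int lock_held = 1;

    /* Spin until the flag clears.  The pause instruction hints to the
       processor that this is a spin-wait loop, which avoids the
       memory-order-violation penalty when the loop finally exits and
       reduces the power consumed while spinning. */
    static void spin_wait(void)
    {
        while (atomic_load_explicit(&lock_held, memory_order_acquire))
            _mm_pause();      /* emits the pause instruction */
    }
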

Optimize Memory Access

• Observe store-forwarding constraints.
• Ensure proper data alignment to prevent data from being split across a cache line boundary. This includes data on the stack and passed parameters.
• Avoid mixing code and data (self-modifying code).
• Choose data types carefully (see next bullet below) and avoid type casting.
• Employ data structure layout optimization to ensure efficient use of the 64-byte cache line size (illustrated in the first sketch following this list).
• Favor parallel data accesses, which mask latency, over dependent data accesses, which expose latency.
• For cache-miss data traffic, favor smaller cache-miss strides to avoid frequent DTLB misses.
• Use prefetching appropriately.
• Use the following techniques to enhance locality: blocking, hardware-friendly tiling, loop interchange, loop skewing (blocking is illustrated in the second sketch following this list).
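
The two C sketches below illustrate the cache-line-oriented structure layout and the blocking technique from the list above. The structure fields, array dimensions, and block size are hypothetical choices made for the example, not values taken from the manual.

    /* Group the frequently used ("hot") fields so they share one 64-byte
       cache line; colder fields follow and do not dilute that line.
       Field names are hypothetical. */
    struct particle {
        float x, y, z;          /* hot: read every iteration        */
        float vx, vy, vz;       /* hot: read every iteration        */
        char  debug_name[32];   /* cold: touched only when logging  */
    };

    /* Align the array so element 0 starts on a cache line boundary. */
    _Alignas(64) static struct particle particles[1024];

The second sketch applies blocking (loop tiling) to a matrix transpose so that the data touched by the inner loops stays resident in cache while it is reused, instead of streaming entire rows and columns on every pass.

    #include <stddef.h>

    #define N     1024
    #define BLOCK   64          /* tile edge; tune to the cache size */

    void transpose_blocked(float dst[N][N], const float src[N][N])
    {
        for (size_t ii = 0; ii < N; ii += BLOCK)
            for (size_t jj = 0; jj < N; jj += BLOCK)
                for (size_t i = ii; i < ii + BLOCK; i++)
                    for (size_t j = jj; j < jj + BLOCK; j++)
                        dst[j][i] = src[i][j];
    }
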