Align Branch Targets In Program Hot Spots; Use Short Instruction Lengths - AMD Athlon Processor x86 Optimization Manual


AMD Athlon™ Processor x86 Code Optimization

Align Branch Targets in Program Hot Spots

Use Short Instruction Lengths

Example 1 (Avoid):
FLD    QWORD PTR [foo]
FIMUL  DWORD PTR [bar]
FIADD  DWORD PTR [baz]
Example 2 (Preferred):
FILD   DWORD PTR [bar]
FILD   DWORD PTR [baz]
FLD    QWORD PTR [foo]
FMULP  ST(2), ST
FADDP  ST(1), ST
In program hot spots (i.e., innermost loops in the absence of
profiling data), place branch targets at or near the beginning of
16-byte aligned code windows. This technique helps to
maximize the number of instructions that are filled into the
instruction-byte queue while preserving I-cache space in
branch-intensive code.
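The alignment advice can be sketched as follows. This is a minimal illustration, not taken from the manual: it assumes a MASM-style assembler whose ALIGN directive is available and a code segment that is itself aligned to at least 16 bytes. The label $sum_loop and the loop body are hypothetical.

```asm
; Hypothetical inner loop: sums ECX dwords starting at [ESI] into EAX.
; Assumes ECX > 0 on entry and a MASM-style ALIGN directive.
        xor     eax, eax            ; clear the running sum
        ALIGN   16                  ; pad so the loop entry starts a 16-byte window
$sum_loop:
        add     eax, [esi]          ; accumulate the current element
        add     esi, 4              ; advance to the next dword
        dec     ecx                 ; one fewer element remaining
        jnz     $sum_loop           ; branch target is the aligned label above
```

The padding inserted by ALIGN occupies code bytes of its own, which is one reason to apply it only to branch targets in hot spots rather than to every label.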
Assemblers and compilers should generate the tightest code
possible to optimize use of the I-cache and increase average
decode rate. Wherever possible, use instructions with shorter
lengths. Using shorter instructions increases the number of
instructions that can fit into the instruction-byte queue. For
example, use 8-bit displacements as opposed to 32-bit
displacements. In addition, use the single-byte format of simple
integer instructions whenever possible, as opposed to the
2-byte opcode ModR/M format.
Example 1 (Avoid):
81 C0 78 56 34 12    add eax, 12345678h  ;uses 2-byte opcode form (with ModR/M)
81 C3 FB FF FF FF    add ebx, -5         ;uses 32-bit immediate
0F 84 05 00 00 00    jz  $label1         ;uses 2-byte opcode, 32-bit immediate
22007E/0—November 1999
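For contrast, the same three instructions can be encoded more compactly. The listing below is a sketch based on the standard x86 encodings rather than a quotation from this page, and it assumes that $label1 is close enough to be reached with an 8-bit displacement.

```asm
; Sketch of shorter encodings for the three instructions in Example 1.
; Assumes $label1 lies within range of an 8-bit displacement.
05 78 56 34 12       add eax, 12345678h  ;single-byte opcode form for EAX, no ModR/M
83 C3 FB             add ebx, -5         ;8-bit sign-extended immediate
74 05                jz  $label1         ;1-byte opcode, 8-bit displacement
```

Under these assumptions the sequence shrinks from 18 bytes to 10, so more instructions fit into the instruction-byte queue at once.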
