Use MOVZX and MOVSX; Minimize Pointer Arithmetic in Loops - AMD Athlon Processor x86 Optimization Manual

22007E/0—November 1999

Use MOVZX and MOVSX

Minimize Pointer Arithmetic in Loops

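Example 1 (Avoid):
ADD EBX, ECX                  ;inst 1
MOV EAX, DWORD PTR [10h]      ;inst 2 (fast address calc.)
MOV ECX, DWORD PTR [EAX+EBX]  ;inst 3 (slow address calc.)
MOV EDX, DWORD PTR [24h]      ;this load is stalled from
                              ; accessing data cache due
                              ; to long latency for
                              ; generating address for
                              ; inst 3
Example 2 (Preferred):
ADD EBX, ECX                  ;inst 1
MOV EAX, DWORD PTR [10h]      ;inst 2
MOV EDX, DWORD PTR [24h]      ;place load above inst 3
                              ; to avoid address
                              ; generation interlock stall
MOV ECX, DWORD PTR [EAX+EBX]  ;inst 3

Use MOVZX and MOVSX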
Use the MOVZX and MOVSX instructions to zero-extend and
sign-extend byte-size and word-size operands to doubleword
length. For example, typical code for zero extension creates a
superset dependency when the zero-extended value is used: the
MOV writes only AL, so a later read of the full EAX register
depends on both the XOR and the MOV, as in the following code:
Example 1 (Avoid):
XOR EAX, EAX
MOV AL, [MEM]
Example 2 (Preferred):
MOVZX EAX, BYTE PTR [MEM]
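
The same widening can be sketched at the C level, shown here in C rather than assembly so the behavior can be compiled and checked. The function names are hypothetical and chosen for illustration. That a compiler emits a single MOVZX or MOVSX for these casts is typical for x86 targets but is an assumption about code generation, not something the C language guarantees.

```c
#include <stdint.h>

/* Zero-extend an unsigned byte loaded from memory to 32 bits.
   Typical x86 code generation (assumed, not guaranteed):
       MOVZX EAX, BYTE PTR [mem]                              */
uint32_t zero_extend_byte(const uint8_t *mem)
{
    return (uint32_t)*mem;
}

/* Sign-extend a signed byte loaded from memory to 32 bits.
   Typical x86 code generation (assumed, not guaranteed):
       MOVSX EAX, BYTE PTR [mem]                              */
int32_t sign_extend_byte(const int8_t *mem)
{
    return (int32_t)*mem;
}

/* Word-size variants of the same two operations. */
uint32_t zero_extend_word(const uint16_t *mem)
{
    return (uint32_t)*mem;
}

int32_t sign_extend_word(const int16_t *mem)
{
    return (int32_t)*mem;
}
```

Each function produces the full doubleword result in one step, so no later use of the value has to wait on a separate clearing instruction followed by a partial-register write.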

Minimize Pointer Arithmetic in Loops
Minimize pointer arithmetic in loops, especially if the loop
body is small. In this case, the pointer arithmetic would cause
significant overhead. Instead, take advantage of the complex
addressing modes and use the loop counter to index into
memory arrays. Using complex addressing modes does not have
any negative impact on execution speed, and the reduced
number of instructions preserves decode bandwidth.
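
As a minimal sketch of the difference, the two C functions below add two arrays element by element. The function names and the array-addition task are assumptions made for illustration. The first form advances three pointers on every iteration; the second uses a single loop counter as an index, which maps naturally onto a scaled-index addressing mode such as [base + index*4]. An optimizing compiler may transform either form, so the comments describe a straightforward translation rather than guaranteed output.

```c
#include <stddef.h>

/* Pointer-arithmetic form: a, b, and c are each incremented per
   iteration, which in a straightforward translation costs three
   extra ADD instructions in a very small loop body. */
void add_arrays_pointers(const int *a, const int *b, int *c, size_t n)
{
    while (n-- > 0) {
        *c++ = *a++ + *b++;
    }
}

/* Indexed form: one loop counter indexes all three arrays, so the
   address computation can be folded into the load and store
   instructions through a complex addressing mode. */
void add_arrays_indexed(const int *a, const int *b, int *c, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        c[i] = a[i] + b[i];
    }
}
```

Both functions compute the same result; the indexed form simply leaves less bookkeeping for the loop to do on each pass.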