AMD Athlon Processor x86 Optimization Manual page 68

AMD Athlon™ Processor x86 Code Optimization
Narrow-to-Wide Store-Buffer Data Forwarding Restriction

If the following conditions are present, there is a narrow-to-wide store-buffer data forwarding restriction:
- The operand size of the store data is smaller than the operand size of the load data.
- The range of addresses spanned by the store data covers some sub-region of the range of addresses spanned by the load data.
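
The two conditions above can be read as a simple predicate over a store and a later load. The following is a minimal sketch in C, not anything taken from the processor documentation: the `mem_access` type and the function names are hypothetical, and "covers some sub-region" is interpreted here as "shares at least one byte address with the load", which is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical record for one memory access: start address and size in bytes. */
typedef struct {
    uint32_t addr;
    uint32_t size;
} mem_access;

/* True when the two byte ranges share at least one address. */
static bool ranges_overlap(mem_access a, mem_access b)
{
    return a.addr < b.addr + b.size && b.addr < a.addr + a.size;
}

/* Both listed conditions: the store is narrower than the load, and at
   least one of the store's bytes lies inside the load's range
   (assumed reading of "covers some sub-region"). */
bool narrow_to_wide_restricted(mem_access store, mem_access load)
{
    return store.size < load.size && ranges_overlap(store, load);
}

/* Sample cases; 10h is an illustrative address. */
void check_sample_cases(void)
{
    mem_access dword_load  = { 0x10, 4 };
    mem_access word_store  = { 0x10, 2 };  /* narrower, same start     */
    mem_access byte_store  = { 0x13, 1 };  /* narrower, last load byte */
    mem_access dword_store = { 0x10, 4 };  /* same size as the load    */
    mem_access far_store   = { 0x20, 2 };  /* narrower, but no overlap */

    assert(narrow_to_wide_restricted(word_store, dword_load));
    assert(narrow_to_wide_restricted(byte_store, dword_load));
    assert(!narrow_to_wide_restricted(dword_store, dword_load));
    assert(!narrow_to_wide_restricted(far_store, dword_load));
}
```

Calling `check_sample_cases()` from a test harness exercises the predicate. A store of the same size and address as the load does not trigger this particular restriction in the model.
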
Avoid the type of code shown in the following two examples.

Example 1 (Avoid):
MOV EAX, 10h
MOV WORD PTR [EAX], BX          ;word store
...
MOV ECX, DWORD PTR [EAX]        ;doubleword load
                                ;cannot forward upper
                                ; byte from store buffer

Example 2 (Avoid):
MOV EAX, 10h
MOV BYTE PTR [EAX + 3], BL      ;byte store
...
MOV ECX, DWORD PTR [EAX]        ;doubleword load
                                ;cannot forward upper byte
                                ; from store buffer

Wide-to-Narrow Store-Buffer Data Forwarding Restriction

If the following conditions are present, there is a wide-to-narrow store-buffer data forwarding restriction:
- The operand size of the store data is greater than the operand size of the load data.
- The start address of the store data does not match the start address of the load.

Example 3 (Avoid):
MOV EAX, 10h
ADD DWORD PTR [EAX], EBX        ;doubleword store
MOV CX, WORD PTR [EAX + 2]      ;word load-cannot forward high
                                ; word from store buffer

Use example 5 instead of example 4.

Example 4 (Avoid):
MOVQ [foo], MM1                 ;store upper and lower half
...
ADD EAX, [foo]                  ;fine
ADD EDX, [foo+4]                ;uh-oh!
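
The "fine" and "uh-oh!" outcomes in Example 4, and the failing word load in Example 3, follow from the two wide-to-narrow conditions. A minimal sketch of those conditions in C is given here as an illustration only: the `mem_access` type and function names are hypothetical, the address 1000h standing in for `foo` is made up, and the extra overlap check is an assumption (the text lists only the size and start-address conditions).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical record for one memory access: start address and size in bytes. */
typedef struct {
    uint32_t addr;
    uint32_t size;
} mem_access;

/* True when the two byte ranges share at least one address. */
static bool ranges_overlap(mem_access a, mem_access b)
{
    return a.addr < b.addr + b.size && b.addr < a.addr + a.size;
}

/* Listed conditions: the store is wider than the load and the start
   addresses differ. The overlap test is an added assumption so that
   unrelated accesses are not flagged. */
bool wide_to_narrow_restricted(mem_access store, mem_access load)
{
    return store.size > load.size &&
           store.addr != load.addr &&
           ranges_overlap(store, load);
}

/* Cases taken from Examples 3 and 4; FOO is a made-up address for [foo]. */
void check_example_cases(void)
{
    enum { FOO = 0x1000 };

    mem_access dword_store = { 0x10, 4 };     /* ADD DWORD PTR [EAX], EBX   */
    mem_access word_load   = { 0x12, 2 };     /* MOV CX, WORD PTR [EAX + 2] */
    mem_access qword_store = { FOO, 8 };      /* MOVQ [foo], MM1            */
    mem_access low_load    = { FOO, 4 };      /* ADD EAX, [foo]             */
    mem_access high_load   = { FOO + 4, 4 };  /* ADD EDX, [foo+4]           */

    assert(wide_to_narrow_restricted(dword_store, word_load));
    assert(!wide_to_narrow_restricted(qword_store, low_load));   /* "fine"   */
    assert(wide_to_narrow_restricted(qword_store, high_load));   /* "uh-oh!" */
}
```

In this model a narrower load that starts at the same address as the wider store is not restricted, which matches the first load in Example 4.
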
Store-to-Load Forwarding Restrictions
22007E/0—November 1999
