Summary Of Store-To-Load Forwarding Pitfalls To Avoid; Stack Alignment Considerations - AMD Athlon Processor x86 Optimization Manual


AMD Athlon™ Processor x86 Code Optimization
One Supported Store-to-Load Forwarding Case

Summary of Store-to-Load Forwarding Pitfalls to Avoid

Stack Alignment Considerations

Extend to 32 Bits Before Pushing onto Stack
There is one case of mismatched store-to-load forwarding that
the AMD Athlon processor supports: the lower 32 bits of an
aligned QWORD write may feed a DWORD read.
Example 8 (Allowed):
    MOVQ  [AlignedQword], mm0   ;store 64 bits to an aligned quadword
    ...
    MOV   EAX, [AlignedQword]   ;load the lower 32 bits of that quadword
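The access pattern of Example 8 can also be written in C. This is only a sketch of the pattern: the names aligned_qword and store_qword_load_dword are hypothetical, a compiler is free to keep the value in a register instead of issuing a real store and load, and the returned value assumes a little-endian target such as x86.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the manual's AlignedQword variable. */
static _Alignas(8) uint64_t aligned_qword;

/* Store 64 bits, then load 32 bits from the same start address.
   On a little-endian target this yields the low 32 bits of the
   stored value, which is the pattern Example 8 shows. */
uint32_t store_qword_load_dword(uint64_t value)
{
    uint32_t low;

    aligned_qword = value;                     /* QWORD store */
    memcpy(&low, &aligned_qword, sizeof low);  /* DWORD load, same address */
    return low;
}
```

A load from the upper half of the quadword (aligned_qword's address plus 4) would not match this pattern.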
To avoid store-to-load forwarding pitfalls, code should conform
to the following guidelines:
- Maintain consistent use of operand size across all loads and
  stores. Preferably, use doubleword or quadword operand
  sizes.
- Avoid misaligned data references.
- Avoid narrow-to-wide and wide-to-narrow forwarding cases.
- When using word or byte stores, avoid loading data from
  anywhere in the same doubleword of memory other than the
  identical start addresses of the stores.
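The guidelines above, together with the one supported mismatch, can be expressed as a small predicate. This is a simplified sketch for reviewing store/load pairs by hand, not a model of the processor's load-store unit; the type mem_access and the function names are hypothetical, and sizes are assumed to be 1, 2, 4, or 8 bytes.

```c
#include <stdbool.h>
#include <stdint.h>

/* One memory access: start address and size in bytes (1, 2, 4, or 8). */
typedef struct {
    uint32_t addr;
    uint32_t size;
} mem_access;

/* True when the access is naturally aligned for its size. */
static bool is_aligned(mem_access a)
{
    return (a.addr & (a.size - 1u)) == 0;
}

/* Returns true when a load that follows an earlier store stays within
   the guidelines listed above. */
bool load_follows_guidelines(mem_access store, mem_access load)
{
    /* Byte and word stores are checked against their whole doubleword. */
    uint64_t s_lo = store.size < 4 ? (store.addr & ~(uint64_t)3) : store.addr;
    uint64_t s_hi = store.size < 4 ? s_lo + 4 : s_lo + store.size;
    uint64_t l_lo = load.addr;
    uint64_t l_hi = l_lo + load.size;

    if (l_hi <= s_lo || s_hi <= l_lo)
        return true;                 /* load does not touch the store's region */
    if (!is_aligned(store) || !is_aligned(load))
        return false;                /* misaligned data reference */
    if (store.addr == load.addr && store.size == load.size)
        return true;                 /* same operand size, same start address */

    /* The one supported mismatch: an aligned QWORD store feeding a
       DWORD load of its lower 32 bits. */
    return store.size == 8 && load.size == 4 && load.addr == store.addr;
}
```

For instance, a byte store at 0x1001 followed by a byte load at 0x1002 is rejected, because the load reads from the same doubleword at a different start address.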
Make sure the stack is suitably aligned for the local variable
with the largest base type. Then, using the technique described
in "C Language Structure Component Considerations" on page
55, all variables can be properly aligned with no padding.
Function arguments smaller than 32 bits should be extended to
32 bits before being pushed onto the stack, which ensures that
the stack is always doubleword aligned on entry to a function.
If a function has no local variables with a base type larger than
doubleword, no further work is necessary. If the function does
have local variables whose base type is larger than a
doubleword, additional code should be inserted to ensure
proper alignment of the stack. For example, the following code
achieves quadword alignment:
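Separately from the manual's assembly listing, the address arithmetic behind these two rules can be modeled in C. This is a sketch under stated assumptions: the stack pointer is treated as a plain 32-bit number, the stack grows downward, and the function names are hypothetical. Sign extension is shown; an unsigned argument would be zero-extended instead.

```c
#include <assert.h>
#include <stdint.h>

/* Extend a 16-bit argument to 32 bits before it is pushed. */
int32_t extend_arg16(int16_t arg)
{
    return (int32_t)arg;             /* sign extension */
}

/* Model of pushing one doubleword: the stack pointer moves down by 4,
   so a doubleword-aligned stack stays doubleword aligned. */
uint32_t push_dword(uint32_t esp)
{
    return esp - 4u;
}

/* Reserve room for local variables, then round the stack pointer down
   to a multiple of `alignment` (a power of two, e.g. 8 for quadword). */
uint32_t reserve_aligned_locals(uint32_t esp, uint32_t size_of_locals,
                                uint32_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1u)) == 0);
    return (esp - size_of_locals) & ~(alignment - 1u);
}
```

Rounding down can reserve a few bytes more than size_of_locals, which is harmless because the stack grows toward lower addresses.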
22007E/0—November 1999