Example 2-12 Several Situations Of Small Loads After Large Store - Intel ARCHITECTURE IA-32 Reference Manual

A load that forwards from a store must wait for the store's data to be
written to the store buffer before proceeding, but other, unrelated loads
need not wait.
Assembly/Compiler Coding Rule 20. (H impact, ML generality) If it is
necessary to extract a non-aligned portion of stored data, read out the smallest
aligned portion that completely contains the data and shift/mask the data as
necessary.
This is better than incurring the penalties of a failed store-forward.
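
As a rough illustration of Rule 20, the following C sketch reads the aligned 32-bit word that contains a 16-bit field at byte offsets 1 and 2, then shifts and masks the value in a register instead of issuing a separate non-aligned 16-bit load. The function name extract_middle16 and the field layout are assumptions made for this example and do not come from the manual. A little-endian (IA-32) layout is assumed, and a compiler is free to emit different loads than the source suggests.

```c
#include <stdint.h>

/* Hypothetical helper (not from the manual): return the 16-bit field held in
   bits 8..23 of a 32-bit word. On little-endian IA-32 these are the bytes at
   offsets 1 and 2 of the word in memory. */
uint16_t extract_middle16(const uint32_t *aligned_word)
{
    uint32_t w = *aligned_word;            /* one aligned 32-bit load */
    return (uint16_t)((w >> 8) & 0xFFFFu); /* shift and mask in a register */
}
```

If the word was written by a recent 32-bit store to the same address, the single load has the same size and alignment as that store, which is the case the manual describes as forwarding without a problem.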
Assembly/Compiler Coding Rule 21. (MH impact, ML generality) Avoid
several small loads after large stores to the same area of memory by using a
single large read and register copies as needed.
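
A minimal C sketch of Rule 21, assuming the data was just written with a single 32-bit store: the value is read back once at the same size, and the individual bytes are then taken from the register copy rather than loaded one by one from memory. The name split_bytes is hypothetical, and an optimizing compiler may already perform this transformation on its own.

```c
#include <stdint.h>

/* Hypothetical helper (not from the manual): split a stored 32-bit value
   into its four bytes, least significant byte first. */
void split_bytes(const uint32_t *p, uint8_t out[4])
{
    uint32_t w = *p;             /* single load, same size and address as the store */
    out[0] = (uint8_t)(w);       /* the remaining work uses register copies only */
    out[1] = (uint8_t)(w >> 8);
    out[2] = (uint8_t)(w >> 16);
    out[3] = (uint8_t)(w >> 24);
}
```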
Example 2-12 shows several store-forwarding situations in which small
loads follow a large store. The three loads from [EBP + 1] through
[EBP + 3] illustrate the situation described in Rule 21 and are blocked.
The loads from [EBP], including the last one, get their data through
store forwarding without a problem.

Example 2-12 Several Situations of Small Loads After Large Store

mov [EBP], 'abcd'
mov AL, [EBP]        ; not blocked - same alignment
mov BL, [EBP + 1]    ; blocked
mov CL, [EBP + 2]    ; blocked
mov DL, [EBP + 3]    ; blocked
mov AL, [EBP]        ; not blocked - same alignment
                     ; n.b. passes older blocked loads

Example 2-13 illustrates a store-forwarding situation in which a large load
follows several small stores. The data needed by the load cannot be
forwarded because it is spread across several stores and is not contained
in any single store-buffer entry. Avoid large loads after small stores to
the same area of memory.
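
One way to avoid a large load after several small stores, sketched here in C under the assumption that the bytes are available in registers before they are written: combine them into one 32-bit value and store that value once, so that a later 32-bit load from the same address matches the store in size and alignment. The name pack_and_store is hypothetical and not part of the manual, and this is only one possible approach.

```c
#include <stdint.h>

/* Hypothetical helper (not from the manual): combine four bytes in a register
   and write them with one 32-bit store instead of four byte stores. */
uint32_t pack_and_store(uint32_t *dst, uint8_t b0, uint8_t b1,
                        uint8_t b2, uint8_t b3)
{
    uint32_t w = (uint32_t)b0
               | ((uint32_t)b1 << 8)
               | ((uint32_t)b2 << 16)
               | ((uint32_t)b3 << 24);
    *dst = w;     /* one large store */
    return *dst;  /* a later 32-bit load has the same size and address */
}
```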
General Optimization Guidelines (Chapter 2), page 2-35
