IA-32 Intel® Architecture Optimization

Store-forwarding Restriction on Data Availability

The value to be stored must be available before the load operation can
be completed. If this restriction is violated, the execution of the load will
be delayed until the data is available. This delay causes some execution
resources to be used unnecessarily, and that can lead to sizable but
non-deterministic delays. However, the overall impact of this problem is
much smaller than that from size and alignment requirement violations.

The Pentium 4 and Intel Xeon processors predict when loads are both
dependent on and get their data forwarded from preceding stores. These
predictions can significantly improve performance. However, if a load is
scheduled too soon after the store it depends on or if the generation of
the data to be stored is delayed, there can be a significant penalty.

There are several cases where data is passed through memory, where the
store may need to be separated from the load:

• spills, save and restore registers in a stack frame
• parameter passing
• global and volatile variables
• type conversion between integer and floating point
• when compilers do not analyze code that is inlined, forcing
  variables that are involved in the interface with inlined code to be in
  memory, creating more memory variables and preventing the
  elimination of redundant loads
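
As an illustration of the "global and volatile variables" case, the following is a minimal sketch in C, not taken from the manual. The names g_total, sum_via_memory and sum_via_register are hypothetical. The first function updates a volatile global on every iteration, so each addition is expected to load the value that the previous iteration just stored. The second keeps the running total in a local variable that a compiler can normally hold in a register. Whether a given compiler emits exactly this pattern is an assumption.

```c
#include <stddef.h>

/* Hypothetical global accumulator. The volatile qualifier forces every
   read and write of g_total to go through memory. */
static volatile long g_total;

/* Each iteration loads g_total, adds to it, and stores it back, so the
   next iteration's load depends on the store that immediately precedes it. */
long sum_via_memory(const int *v, size_t n)
{
    g_total = 0;
    for (size_t i = 0; i < n; i++)
        g_total += v[i];
    return g_total;
}

/* The running total is a local that can stay in a register; the global
   is written once, after the loop has finished. */
long sum_via_register(const int *v, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    g_total = total;
    return total;
}
```

Both functions return the same result; they differ only in whether the intermediate totals travel through memory.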

Assembly/Compiler Coding Rule 22. (H impact, MH generality) Where it
is possible to do so without incurring other penalties, prioritize the
allocation of variables to registers, as in register allocation and for
parameter passing, to minimize the likelihood and impact of
store-forwarding problems. Try not to store-forward data generated from a
long latency instruction, e.g., mul, div. Avoid store-forwarding data for
variables with the shortest store-load distance. Avoid store-forwarding
data for variables with many and/or long dependence chains, and especially
avoid including a store forward on a loop-carried dependence chain.
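
The advice about loop-carried dependence chains can be sketched in C as follows. This is an illustrative example under stated assumptions, not code from the manual, and the function names are hypothetical. In the first version each iteration reads back the element that the previous iteration stored, so the multiply chain may run through a store-to-load forward. In the second version the previous result is carried in a local variable and the array is only written. An optimizing compiler may already perform this transformation on its own.

```c
#include <stddef.h>

/* Recurrence through memory: out[i - 1] is loaded right after the
   previous iteration stored it, and the stored value comes from a
   multiply. */
void running_product_reload(const unsigned *in, unsigned *out, size_t n)
{
    if (n == 0)
        return;
    out[0] = in[0];
    for (size_t i = 1; i < n; i++)
        out[i] = out[i - 1] * in[i];
}

/* Recurrence through a local: prev carries the dependence chain, and
   the stores to out[] are no longer part of it. */
void running_product_carried(const unsigned *in, unsigned *out, size_t n)
{
    unsigned prev = 1;
    for (size_t i = 0; i < n; i++) {
        prev *= in[i];
        out[i] = prev;
    }
}
```

Both versions fill out[] with the same running products; unsigned arithmetic is used so that overflow wraps in a well-defined way.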
