Intel ITANIUM ARCHITECTURE - SOFTWARE DEVELOPERS MANUAL VOLUME 1 REV 2.3 Manual page 168

3.5 Optimization of Memory References
Speculation can increase parallelism and help to hide latency by enabling more code
motion than can be performed on traditional architectures. Speculation can increase the
application of traditional loop optimizations such as invariant code motion and common
subexpression elimination. The Itanium architecture also offers post-increment loads
and stores that improve instruction throughput without increasing code size.
Memory reference optimization should take several factors into account, including:
• Difference between the execution costs of speculative and non-speculative code.
• Code size.
• Interference probabilities and properties of the ALAT (for data speculation).
The remainder of this chapter discusses these factors and optimizations relating to
memory accesses.
3.5.1 Speculation Considerations
Data speculation requires more attention than control speculation.
In part this is because one control speculative load cannot inadvertently
cause another control speculative load to fail. Such an effect is possible with data
speculative loads since the ALAT has limited capacity and the replacement policy of
ALAT entries is implementation dependent. For example, if an advanced load is issued
and there are no unused ALAT entries, the hardware may choose to invalidate an
existing entry to make room for a new one.
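The capacity effect described above can be illustrated with a toy model. This is only a sketch under stated assumptions: the class name ToyALAT, the two-entry capacity, the FIFO replacement policy, and the exact-address matching are all hypothetical choices made for illustration. Real ALAT size, replacement policy, and matching are implementation dependent.

```python
from collections import OrderedDict


class ToyALAT:
    """Toy ALAT model (hypothetical): maps a target register to the loaded address."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # register -> address, oldest entry first

    def advanced_load(self, reg, addr):
        """Model ld.a: allocate an entry, evicting the oldest one if the table is full."""
        self.entries.pop(reg, None)  # a new advanced load to the same register replaces its entry
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # ASSUMPTION: FIFO replacement
        self.entries[reg] = addr

    def store(self, addr):
        """Model a store: invalidate every entry whose address matches exactly."""
        for reg in [r for r, a in self.entries.items() if a == addr]:
            del self.entries[reg]

    def check(self, reg):
        """Model the check: True if the entry survived, False if recovery is needed."""
        return reg in self.entries


alat = ToyALAT(capacity=2)
alat.advanced_load("r10", 0x1000)
alat.advanced_load("r11", 0x2000)
alat.advanced_load("r12", 0x3000)  # table is full, so the entry for r10 is evicted

failed_by_capacity = not alat.check("r10")  # fails although no store touched 0x1000
alat.store(0x2000)                          # a conflicting store invalidates r11
failed_by_store = not alat.check("r11")
survived = alat.check("r12")
```

In this model the check for r10 fails purely because a third advanced load needed an entry, which is the kind of interaction that has no counterpart in control speculation.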
Moreover, exceptions associated with control speculative calculations are uncommon in
correct code since they are related to events such as page faults and TLB misses.
However, excessive control speculation can be expensive as associated instructions fill
issue slots.
Although data speculation may shorten the static critical path of a program,
its dynamic cost and benefit depend on the following factors:
• The probability that an intervening store will interfere with an advanced load.
• The cost of recovering from a failed advanced load.
• The specific microarchitectural implementation of the ALAT: its size, associativity,
and matching algorithm.
Determining interference probabilities can be difficult, but dynamic memory profiling
can help to predict how often ambiguous loads and stores will conflict.
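As a rough illustration of what such profiling measures, the sketch below estimates an interference rate from a recorded event trace. The trace format, the event names ("load", "store", "check"), and the function name interference_rate are hypothetical, and the exact-address comparison is a simplification: a real tool would also account for access sizes and partial overlap.

```python
def interference_rate(trace):
    """Return the fraction of load-to-check windows hit by a store to the loaded address.

    trace is a list of (kind, address) tuples for one candidate load, where kind
    is "load", "store", or "check" (a hypothetical format, not a real tool's output).
    """
    windows = 0
    conflicts = 0
    load_addr = None  # address of the load whose window is currently open
    hit = False
    for kind, addr in trace:
        if kind == "load":
            load_addr, hit = addr, False
        elif kind == "store" and load_addr is not None:
            if addr == load_addr:
                hit = True
        elif kind == "check" and load_addr is not None:
            windows += 1
            conflicts += hit
            load_addr = None
    return conflicts / windows if windows else 0.0


trace = [
    ("load", 0x100), ("store", 0x200), ("check", None),  # store to another address
    ("load", 0x100), ("store", 0x100), ("check", None),  # store conflicts with the load
    ("load", 0x108), ("store", 0x100), ("check", None),  # store to another address
    ("load", 0x110), ("check", None),                    # no intervening store
]
rate = interference_rate(trace)  # one conflicting window out of four
```

A rate measured this way reflects only true address conflicts. It does not capture failures caused by ALAT capacity or by the matching algorithm of a particular implementation.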
When using advanced loads, there should be case-by-case consideration as to whether
advancing only a load and using a ld.c might be preferable to advancing both a load
and its uses, which would require the use of the potentially more expensive chk.a.
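One way to frame this trade-off is as an expected-value comparison. The sketch below is a simple model, not a method taken from this manual, and every cycle count in it is a hypothetical placeholder rather than a measurement of any Itanium implementation. It assumes that advancing only the load saves fewer cycles but recovers cheaply (ld.c reissues the load), while advancing the load and its uses saves more cycles but recovers through the costlier chk.a path.

```python
def expected_gain(saved, p_fail, recovery):
    """Expected net cycles gained per execution: cycles saved minus expected recovery cost."""
    return saved - p_fail * recovery


def break_even_probability(saved_a, recovery_a, saved_b, recovery_b):
    """Failure probability at which strategies A and B have the same expected gain."""
    return (saved_b - saved_a) / (recovery_b - recovery_a)


# HYPOTHETICAL costs, chosen only to illustrate the comparison.
LDC = {"saved": 2, "recovery": 3}     # advance the load only, check with ld.c
CHKA = {"saved": 6, "recovery": 20}   # advance the load and its uses, check with chk.a

low_p = 0.05
ldc_gain_low = expected_gain(LDC["saved"], low_p, LDC["recovery"])
chka_gain_low = expected_gain(CHKA["saved"], low_p, CHKA["recovery"])

high_p = 0.5
ldc_gain_high = expected_gain(LDC["saved"], high_p, LDC["recovery"])
chka_gain_high = expected_gain(CHKA["saved"], high_p, CHKA["recovery"])

p_star = break_even_probability(LDC["saved"], LDC["recovery"],
                                CHKA["saved"], CHKA["recovery"])
```

With these placeholder numbers, advancing the uses wins when failures are rare and loses when they are common. The crossover point depends entirely on the costs assumed, which is why the decision has to be made case by case.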
Even when recovery code is not executed, its presence extends the lifetimes of
registers used in data and control speculation, thus increasing register pressure and
possibly the cost of register movement by the Register Stack Engine (RSE). See
Section 3.5.3 for information on considerations for recovery code placement.
