Spill Scheduling; Scheduling Rules For The Pentium 4 Processor Decoder - Intel ARCHITECTURE IA-32 Reference Manual

IA-32 Intel® Architecture Optimization

Spill Scheduling

The spill scheduling algorithm used by a code generator is affected
by the Pentium 4 processor memory subsystem. A spill scheduling
algorithm selects which values to spill to memory when there are too
many live values to fit in registers. Consider the code in Example 2-26,
where it is necessary to spill either A, B, or C.

Example 2-26 Spill Scheduling Example Code

LOOP
    C := ...
    B := ...
    A := A + ...

For the Pentium 4 processor, using dependence depth information in
spill scheduling is even more important than in previous processors. The
loop-carried dependence in A makes it especially important that A not be
spilled. Not only would a store/load be placed in the dependence chain,
but there would also be a data-not-ready stall of the load, costing further
cycles.
Assembly/Compiler Coding Rule 62. (H impact, MH generality) For small
loops, placing loop invariants in memory is better than spilling loop-carried
dependencies.
A possibly counter-intuitive result: in such a situation it is better to put
loop invariants in memory than in registers, since loop invariants never
have a load blocked by store data that is not ready.
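
The preference order described above can be sketched in C. This is only an
illustration, not the algorithm of any particular code generator: the
LiveValue structure, the spill_rank and choose_spill functions, and the
three-level ranking are all hypothetical names and simplifications
introduced here. The sketch assumes a small loop in which each live value
has already been classified as loop-invariant, loop-carried, or neither.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one live value in a small loop. */
typedef struct {
    const char *name;
    bool loop_carried;   /* feeds its own definition in the next iteration, like A */
    bool loop_invariant; /* never redefined inside the loop */
} LiveValue;

/* Lower rank means a better spill candidate (assumed ordering):
 *   0: loop invariant, whose reload never waits on store data
 *   1: value defined in the loop but not loop-carried (like B or C)
 *   2: loop-carried value, where a spill lengthens the dependence chain */
static int spill_rank(const LiveValue *v)
{
    if (v->loop_invariant) return 0;
    if (!v->loop_carried)  return 1;
    return 2;
}

/* Returns the index of the value to spill, or -1 if n is 0.
 * Ties are broken in favor of the earliest entry. */
int choose_spill(const LiveValue *vals, size_t n)
{
    assert(vals != NULL || n == 0);
    int best = -1;
    for (size_t i = 0; i < n; i++) {
        if (best < 0 || spill_rank(&vals[i]) < spill_rank(&vals[best]))
            best = (int)i;
    }
    return best;
}
```

For the values of Example 2-26, this ranking would select B or C rather than
A, and would select a loop invariant ahead of all three if one were live.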

Scheduling Rules for the Pentium 4 Processor Decoder

The Pentium 4 and Intel Xeon processors have a single decoder that can
decode instructions at the maximum rate of one instruction per clock.
Complex instructions must enlist the help of the microcode ROM; see
Chapter 1, "IA-32 Intel® Architecture Processor Family Overview" for
details.
