Table Footnotes - Intel ARCHITECTURE IA-32 Reference Manual

Architecture optimization
Table Footnotes

The following footnotes refer to all tables in this appendix.
1.
Latency figures for many complex instructions (> 4 μops) are
conservative, worst-case estimates. Actual performance of these
instructions by the out-of-order core execution unit can range from
somewhat faster to significantly faster than the nominal latency data
shown in these tables.
2.
The names of execution units apply only to processor implementations
of the Intel NetBurst microarchitecture with a CPUID signature of
family 15, model encoding = 0, 1, 2. They include: ALU, FP_EXECUTE,
FPMOVE, MEM_LOAD, MEM_STORE. See Figure 1-4 for execution units and
ports in the out-of-order core. Note the following:
• The FP_EXECUTE unit is actually a cluster of execution units,
roughly consisting of seven separate execution units.
• The FP_ADD unit handles x87 and SIMD floating-point add and
subtract operations.
• The FP_MUL unit handles x87 and SIMD floating-point multiply
operations.
• The FP_DIV unit handles x87 and SIMD floating-point divide and
square-root operations.
• The MMX_SHFT unit handles shift and rotate operations.
• The MMX_ALU unit handles SIMD integer ALU operations.
• The MMX_MISC unit handles reciprocal MMX computations and
some integer operations.
• The FP_MISC designates other execution units in port 1 that are
separated from the six units listed above.
3.
It may be possible to construct repetitive calls to some IA-32
instructions in code sequences to achieve latency that is one or two
clock cycles faster than the more realistic number listed in this
table.
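Footnote 2 ties the execution unit names to a CPUID signature of family 15, model 0, 1, or 2. The following is a minimal sketch of how such a check might look in C, not code taken from the manual. It assumes the caller already holds the EAX value returned by CPUID leaf 1 and decodes it with the usual base and extended family/model fields. The names cpu_signature, decode_signature, and unit_names_apply are hypothetical helpers introduced only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Family and model decoded from the EAX value returned by CPUID leaf 1. */
typedef struct {
    unsigned family;
    unsigned model;
} cpu_signature;

/* Hypothetical helper: combines the base and extended signature fields. */
static cpu_signature decode_signature(uint32_t eax)
{
    unsigned base_model  = (eax >> 4) & 0xF;   /* bits 7:4   */
    unsigned base_family = (eax >> 8) & 0xF;   /* bits 11:8  */
    unsigned ext_model   = (eax >> 16) & 0xF;  /* bits 19:16 */
    unsigned ext_family  = (eax >> 20) & 0xFF; /* bits 27:20 */

    cpu_signature sig;
    /* The extended family is added only when the base family is 15. */
    sig.family = (base_family == 0xF) ? base_family + ext_family
                                      : base_family;
    /* The extended model supplies the high nibble for families 6 and 15. */
    sig.model = (base_family == 0x6 || base_family == 0xF)
                    ? ((ext_model << 4) | base_model)
                    : base_model;
    return sig;
}

/* Hypothetical check for footnote 2: family 15 with model 0, 1, or 2. */
static bool unit_names_apply(uint32_t eax)
{
    cpu_signature sig = decode_signature(eax);
    return sig.family == 15 && sig.model <= 2;
}
```

With GCC or Clang, the EAX value could be obtained through `__get_cpuid(1, &eax, &ebx, &ecx, &edx)` from `<cpuid.h>`; other toolchains provide their own intrinsics. Keeping the decoding separate from the CPUID call lets the check be exercised with fixed signature values.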
