Floating-Point Stalls; X87 Floating-Point Operations With Integer Operands; X87 Floating-Point Comparison Instructions; Transcendental Functions - Intel ARCHITECTURE IA-32 Reference Manual


IA-32 Intel® Architecture Optimization

Floating-Point Stalls

Floating-point instructions have a latency of at least two cycles.
However, because Pentium II and subsequent processors execute
instructions out of order, stalls will not necessarily occur on an
instruction or µop basis. If an instruction has a very long latency,
such as fdiv, then scheduling can improve the throughput of the overall
application.
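
The scheduling idea above can be sketched at the source level in C. This is only an illustration under stated assumptions: the function name dot_over_norm and its parameters are hypothetical, and a compiler is free to reorder the code on its own. The point is the shape of the dependency chain: the divide is issued early and its result is consumed late, so independent work can overlap with it.

```c
#include <stddef.h>

/* Hypothetical helper, not from the manual: returns dot(a, b) / norm_sq. */
double dot_over_norm(const double *a, const double *b, size_t n,
                     double norm_sq)
{
    /* Issue the long-latency divide first; nothing in the loop below
       depends on its result. */
    double inv = 1.0 / norm_sq;

    /* Independent multiply-add work that can proceed while the divide
       is still in flight. */
    double dot = 0.0;
    for (size_t i = 0; i < n; ++i)
        dot += a[i] * b[i];

    /* The quotient is first consumed here, after the independent work. */
    return dot * inv;
}
```

Multiplying by the reciprocal can differ from a direct division in the last bit of the result, so this form suits code that tolerates that difference.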

x87 Floating-point Operations with Integer Operands

For the Pentium 4 processor, splitting floating-point operations (fiadd,
fisub, fimul, and fidiv) that take 16-bit integer operands into two
instructions (fild and a floating-point operation) is more efficient.
However, for floating-point operations with 32-bit integer operands,
using fiadd, fisub, fimul, and fidiv is equally efficient compared
with using separate instructions.

Assembly/Compiler Coding Rule 36. (M impact, L generality) Try to use
32-bit operands rather than 16-bit operands for fild. However, do not do so
at the expense of introducing a store forwarding problem by writing the two
halves of the 32-bit memory operand separately.
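
A rough C-level sketch of the rule follows. The function names scale_sample and combine_halves are hypothetical, and C cannot force a particular instruction: whether the compiler emits fild with a 32-bit operand depends on the target and flags. The sketch only shows the data-flow pattern, which is to widen or assemble the integer in a register so that memory sees a single 32-bit value instead of two 16-bit halves written separately.

```c
#include <stdint.h>

/* Hypothetical helper: widen a 16-bit sample to 32 bits before the
   integer-to-floating-point conversion. */
double scale_sample(int16_t sample, double gain)
{
    /* Sign-extend in a register; if the value is spilled, it is written
       as one full 32-bit operand. */
    int32_t wide = sample;
    return (double)wide * gain;
}

/* Hypothetical helper: build a 32-bit integer from two 16-bit halves. */
int32_t combine_halves(uint16_t lo, uint16_t hi)
{
    /* Combine with shift and OR in a register, rather than storing the
       halves to memory separately and reloading them as 32 bits. */
    return (int32_t)(((uint32_t)hi << 16) | lo);
}
```

The second function is the part that relates to store forwarding: two narrow stores followed by one wide load of the same location is the pattern the rule warns against.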

x87 Floating-point Comparison Instructions

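On Pentium II and subsequent processors, the fcomi and fcmov
instructions should be used when performing floating-point
comparisons. Using the fcom, fcomp, and fcompp instructions typically
requires an additional instruction such as fstsw. The latter alternative
causes more µops to be decoded and should be avoided.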
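
As an illustration, a simple select in C is the kind of code a compiler targeting x87 on these processors may lower to fcomi and fcmov. The function name select_larger is hypothetical, and the instruction sequences in the comment are a sketch of typical output, not a guarantee; the actual code depends on the compiler and its target options.

```c
/* Hypothetical helper: returns the larger of a and b (returns b when
   the comparison is unordered, for example when either is NaN). */
double select_larger(double a, double b)
{
    /* With a in st(0) and b in st(1), one possible x87 lowering is:
     *
     *     fcomi  st(0), st(1)   ; sets EFLAGS directly
     *     fcmovb st(0), st(1)   ; st(0) = b if a < b
     *
     * The older style needs an extra step to reach the flags:
     *
     *     fcom   st(1)
     *     fstsw  ax             ; copy the x87 status word to AX
     *     sahf                  ; then load it into EFLAGS
     *     jcc    ...            ; and branch on the result
     */
    return (a > b) ? a : b;
}
```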

Transcendental Functions

If an application needs to emulate math functions in software due to
performance or other reasons (see the "Guidelines for Optimizing
Floating-point Code" section), it may be worthwhile to inline math
library calls because the call and the prologue/epilogue involved with
such calls can significantly affect the latency of operations.
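
A minimal sketch of an inlined software math routine is shown below, in C. It is an assumption for illustration only: the name sin_poly7 is hypothetical, and the body is a plain degree-7 Taylor polynomial rather than a tuned approximation. It is reasonable only for small arguments (roughly |x| <= pi/4, where the truncation error is on the order of 1e-7) and performs no range reduction.

```c
/* Hypothetical inline replacement for a sin() library call on a small
   input range. Marked static inline so the compiler can expand it at
   the call site and avoid the call and prologue/epilogue overhead. */
static inline double sin_poly7(double x)
{
    double x2 = x * x;

    /* Horner form of x - x^3/3! + x^5/5! - x^7/7!. */
    return x * (1.0 + x2 * (-1.0 / 6.0
                    + x2 * (1.0 / 120.0
                    + x2 * (-1.0 / 5040.0))));
}
```

Whether this is worthwhile depends on the accuracy the application needs; a production routine would normally add range reduction and use fitted coefficients.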
