Use Load-Execute Floating-Point Instructions With Floating-Point Operands; Avoid Load-Execute Floating-Point Instructions With Integer Operands - AMD Athlon Processor x86 Optimization Manual

X86 code optimization


22007E/0—November 1999

Use Load-Execute Floating-Point Instructions with Floating-Point Operands

Avoid Load-Execute Floating-Point Instructions with Integer Operands

Load-Execute Instruction Usage

When operating on single-precision or double-precision floating-point data, wherever possible use floating-point load-execute instructions to increase code density.

Note: This optimization applies only to floating-point instructions with floating-point operands and not with integer operands, as described in the next optimization.

This coding style helps in two ways. First, denser code allows more work to be held in the instruction cache. Second, the denser code generates fewer internal OPs and, therefore, the FPU scheduler holds more work, which increases the chances of extracting parallelism from the code.

Example 1 (Avoid):

FLD   QWORD PTR [TEST1]
FLD   QWORD PTR [TEST2]
FMUL  ST, ST(1)

Example 2 (Preferred):

FLD   QWORD PTR [TEST1]
FMUL  QWORD PTR [TEST2]
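The same pattern carries over to single-precision data and to longer expressions. The following is only a minimal sketch, assuming three hypothetical single-precision memory variables A, B, and C (these names do not appear in the original text). It computes A * B + C using the load-execute forms throughout:

```asm
; A, B, C are hypothetical single-precision (32-bit) memory variables.
FLD   DWORD PTR [A]     ; ST(0) = A
FMUL  DWORD PTR [B]     ; ST(0) = A * B      (load-execute, memory operand)
FADD  DWORD PTR [C]     ; ST(0) = A * B + C  (load-execute, memory operand)
```

Written with discrete loads, the same expression would need two additional FLD instructions and two additional stack slots.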

Do not use load-execute floating-point instructions with integer operands: FIADD, FISUB, FISUBR, FIMUL, FIDIV, FIDIVR, FICOM, and FICOMP. Remember that floating-point instructions can have integer operands, while integer instructions cannot have floating-point operands.

Floating-point computations involving integer-memory operands should use separate FILD and arithmetic instructions. This optimization has the potential to increase decode bandwidth and OP density in the FPU scheduler. The floating-point load-execute instructions with integer operands are VectorPath and generate two OPs in a cycle, while the discrete equivalent enables a third DirectPath instruction to be decoded in the same cycle. In some situations this optimization can also reduce execution time if the FILD can be scheduled several instructions ahead of the arithmetic instruction in order to cover the FILD latency.
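The contrast can be sketched as follows. This is an illustrative fragment only, assuming a hypothetical double-precision variable X and hypothetical 32-bit integer variables N and M (none of these names come from the text above). Both sequences leave X * N + M in ST(0):

```asm
; Avoid: load-execute forms with integer memory operands.
FLD    QWORD PTR [X]     ; ST(0) = X
FIMUL  DWORD PTR [N]     ; ST(0) = X * N
FIADD  DWORD PTR [M]     ; ST(0) = X * N + M

; Preferred: separate FILD instructions, issued ahead of the arithmetic.
FILD   DWORD PTR [N]     ; ST(0) = N
FILD   DWORD PTR [M]     ; ST(0) = M, ST(1) = N
FLD    QWORD PTR [X]     ; ST(0) = X, ST(1) = M, ST(2) = N
FMULP  ST(2), ST         ; ST(2) = N * X, then pop: ST(0) = M, ST(1) = N * X
FADDP  ST(1), ST         ; ST(1) = N * X + M, then pop: ST(0) = X * N + M
```

The preferred sequence is longer and temporarily occupies three x87 stack slots instead of one, so it assumes that enough stack registers are free at that point in the code.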

AMD Athlon™ Processor x86 Code Optimization
