AMD Athlon Processor x86 Optimization Manual page 117

22007E/0—November 1999
Minimize Floating-Point-to-Integer Conversions
FPU into truncating mode, and performing all of the
conversions before restoring the original control word.
The speed of the above code is somewhat dependent on the
nature of the code surrounding it. For applications in which the
speed of floating-point-to-integer conversions is extremely
critical for application performance, experiment with either of
the following substitutions, which may or may not be faster than
the code above.
The first substitution simulates a truncating floating-point-to-
integer conversion provided that there are no NaNs, infinities,
or overflows. This conversion is therefore not IEEE-754
compliant. This code works properly only if the current FPU
rounding mode is round-to-nearest-even, which is usually the
case.
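As a reference model for checking the logic, the round-then-correct idea behind this substitution can be sketched in Python. The function name trunc_via_rndint is hypothetical, and Python's built-in round(), which rounds halves to even, stands in for an FPU store in round-to-nearest-even mode. This is an illustrative sketch under those assumptions, not a processor instruction sequence.

```python
def trunc_via_rndint(x: float) -> int:
    """Truncate x toward zero by rounding to nearest, then correcting.

    Assumes x is finite and fits in a signed 32-bit integer
    (no NaNs, infinities, or overflows).
    """
    i = round(x)       # rndint(X), round-half-to-even
    diff = x - i       # DIFF = X - rndint(X)
    if diff == 0:
        return i       # X was already an integer; default result is OK

    # All-ones (-1) when X is negative, else 0.
    sign_x = -1 if x < 0 else 0
    # All-ones (-1) when sign(X) != sign(DIFF), i.e. rounding moved
    # the result away from zero; else 0.
    signs_differ = -1 if (x < 0) != (diff < 0) else 0
    # -1 for negative X, +1 otherwise.
    step = 2 * sign_x + 1
    # Correction is -1, 0, or 1.
    correction = signs_differ & step
    return i - correction  # trunc(X) = rndint(X) - correction
```

For instance, trunc_via_rndint(2.7) rounds to 3, sees that DIFF is negative while X is positive, and subtracts 1 to give 2.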
Example 2 (Potentially faster).
FLD    QWORD PTR [X]       ;load double to be converted
FST    DWORD PTR [TX]      ;store X because sign(X) is needed
FIST   DWORD PTR [I]       ;store rndint(X) as default result
FISUB  DWORD PTR [I]       ;compute DIFF = X - rndint(X)
FSTP   DWORD PTR [DIFF]    ;store DIFF as we need sign(DIFF)
MOV    EAX, [TX]           ;X
MOV    EDX, [DIFF]         ;DIFF
TEST   EDX, EDX            ;DIFF == 0 ?
JZ     $DONE               ;default result is OK, done
XOR    EDX, EAX            ;need correction if sign(X) != sign(DIFF)
SAR    EAX, 31             ;(X<0) ? 0xFFFFFFFF : 0
SAR    EDX, 31             ;sign(X)!=sign(DIFF) ? 0xFFFFFFFF : 0
LEA    EAX, [EAX+EAX+1]    ;(X<0) ? 0xFFFFFFFF : 1
AND    EDX, EAX            ;correction: -1, 0, 1
SUB    [I], EDX            ;trunc(X)=rndint(X)-correction
$DONE:
The second substitution simulates a truncating floating-point-to-
integer conversion using only integer instructions and therefore
works correctly regardless of the FPU's current rounding
mode. It does not handle NaNs, infinities, or overflows
according to the IEEE-754 standard. Note that the first
instruction of this code may cause a store-to-load forwarding
(STLF) size mismatch, resulting in performance degradation, if
the variable to be converted has been stored recently.
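The integer-only approach of the second substitution can be illustrated with a small Python model that decodes the IEEE-754 bit pattern directly. This is a sketch of the general technique under the stated restrictions (finite input, magnitude below 2^31), not the manual's instruction sequence; the name trunc_int_only is hypothetical.

```python
import struct


def trunc_int_only(x: float) -> int:
    """Truncate a double toward zero using only integer operations.

    Assumes x is finite and |x| < 2**31; NaNs, infinities, and
    overflows are not handled.
    """
    # Reinterpret the 64-bit IEEE-754 double as an unsigned integer.
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]

    negative = bits >> 63                     # sign bit
    exponent = ((bits >> 52) & 0x7FF) - 1023  # unbiased exponent
    if exponent < 0:
        return 0                              # |x| < 1.0 truncates to 0

    # Restore the implicit integer bit above the 52 stored mantissa bits.
    mantissa = (bits & ((1 << 52) - 1)) | (1 << 52)
    # Shift out the fractional mantissa bits.
    magnitude = mantissa >> (52 - exponent)
    return -magnitude if negative else magnitude
```

Because the result depends only on the stored bits, it is the same whatever rounding mode the FPU happens to be in.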