AMD Athlon Processor x86 Optimization Manual page 78

AMD Athlon™ Processor x86 Code Optimization
Example 2:

C code:
    float x,z;
    z = fabs(x);
    if (z >= 1) {
        z = 1/z;
    }

3DNow! code:
    ;in:  MM0 = x
    ;out: MM0 = z
    MOVQ     MM5, mabs   ;0x7fffffff
    PAND     MM0, MM5    ;z = abs(x)
    PFRCP    MM2, MM0    ;1/z approx
    MOVQ     MM1, MM0    ;save z
    PFRCPIT1 MM0, MM2    ;1/z step
    PFRCPIT2 MM0, MM2    ;1/z final
    PFMIN    MM0, MM1    ;z = z < 1 ? z : 1/z

Example 3:

C code:
    float x,z,r,res;
    z = fabs(x);
    if (z < 0.575) {
        res = r;
    }
    else {
        res = PI/2 - 2*r;
    }

3DNow! code:
    ;in:  MM0 = x
    ;     MM1 = r
    ;out: MM0 = res
    MOVQ     MM7, mabs   ;mask for absolute value
    PAND     MM0, MM7    ;z = abs(x)
    MOVQ     MM2, bnd    ;0.575
    PCMPGTD  MM2, MM0    ;z < 0.575 ? 0xffffffff : 0
    MOVQ     MM3, pio2   ;pi/2
    MOVQ     MM0, MM1    ;save r
    PFADD    MM1, MM1    ;2*r
    PFSUBR   MM1, MM3    ;pi/2 - 2*r
    PAND     MM0, MM2    ;z < 0.575 ? r : 0
    PANDN    MM2, MM1    ;z < 0.575 ? 0 : pi/2 - 2*r
    POR      MM0, MM2    ;z < 0.575 ? r : pi/2 - 2*r
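The mask-and-select pattern in Examples 2 and 3 can also be traced in portable scalar C, which may help when checking the logic on a machine without 3DNow! support. The following is a minimal sketch, not code from this manual: the function names (example2, example3, example3_branchy, check_examples) and the helpers f2u/u2f are hypothetical, each function handles one float rather than a packed pair, and an ordinary division stands in for the PFRCP/PFRCPIT1/PFRCPIT2 sequence. The integer compare that models PCMPGTD relies on the fact that non-negative IEEE-754 single-precision values order the same way as their bit patterns.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers: reinterpret float bits as an integer and back. */
static uint32_t f2u(float f) { uint32_t u; memcpy(&u, &f, sizeof u); return u; }
static float u2f(uint32_t u) { float f; memcpy(&f, &u, sizeof f); return f; }

#define MABS 0x7fffffffu     /* mabs: clears the sign bit */
#define BND  0.575f          /* bnd */
#define PIO2 1.57079632679f  /* pio2: pi/2 rounded to float */

/* Example 2: z = fabs(x); if (z >= 1) z = 1/z; */
static float example2(float x)
{
    float z  = u2f(f2u(x) & MABS);  /* PAND: z = abs(x) */
    float rz = 1.0f / z;            /* assumption: division replaces the
                                       PFRCP, PFRCPIT1, PFRCPIT2 sequence */
    return rz < z ? rz : z;         /* models PFMIN: min(1/z, z) */
}

/* Example 3: res = (fabs(x) < 0.575) ? r : pi/2 - 2*r, via a bit mask. */
static float example3(float x, float r)
{
    uint32_t zbits = f2u(x) & MABS;               /* PAND: z = abs(x) */
    /* Models PCMPGTD: all ones if bnd > z, else zero. Both operands have
       a clear sign bit, so a signed integer compare orders them correctly. */
    uint32_t mask = 0u - (uint32_t)((int32_t)f2u(BND) > (int32_t)zbits);
    uint32_t alt  = f2u(PIO2 - (r + r));          /* PFADD, then PFSUBR */
    return u2f((f2u(r) & mask) | (alt & ~mask));  /* PAND, PANDN, POR */
}

/* Branching reference for Example 3, used only for comparison. */
static float example3_branchy(float x, float r)
{
    float z = x < 0.0f ? -x : x;
    return z < BND ? r : PIO2 - (r + r);
}

/* Compare the mask-based version with the branching one on a few inputs. */
static void check_examples(void)
{
    const float xs[] = { -2.0f, -0.575f, -0.25f, 0.0f, 0.5f, 0.575f, 3.0f };
    for (size_t i = 0; i < sizeof xs / sizeof xs[0]; ++i) {
        assert(example3(xs[i], 0.3f) == example3_branchy(xs[i], 0.3f));
    }
    assert(example2(-4.0f) == 0.25f);  /* |x| >= 1: reciprocal is taken */
    assert(example2(0.5f) == 0.5f);    /* |x| <  1: value is unchanged */
}
```

Calling check_examples() from a test driver exercises both sides of each condition, including the boundary value 0.575. The scalar version only illustrates the selection logic; it says nothing about the accuracy or timing of the actual 3DNow! sequence.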
Replace Branches with Computation in 3DNow!™ Code
22007E/0—November 1999
