Intel ARCHITECTURE IA-32 Reference Manual page 527

Architecture optimization

IA-32 Instruction Latency and Throughput

Table C-4 Streaming SIMD Extension Single-precision Floating-point Instructions (continued)

Instruction              Latency¹          Throughput    Execution Unit²
MOVLHPS³ xmm, xmm        4, 4              2, 2          MMX_SHFT
MOVMSKPS r32, xmm        6, 6              2, 2          FP_MISC
MOVSS xmm, xmm           4, 4              2, 2          MMX_SHFT
MOVUPS xmm, xmm          6, 6              1, 1          FP_MOVE
MULPS xmm, xmm           7, 6, 4+1         2, 2, 2       FP_MUL
MULSS xmm, xmm           7, 6, 4           2, 2, 2       FP_MUL
ORPS³ xmm, xmm           4, 2              2, 2          MMX_ALU
RCPPS³ xmm, xmm          6, 6, 2           4, 4, 2       MMX_MISC
RCPSS³ xmm, xmm          6, 6, 1           2, 2, 1       MMX_MISC, MMX_SHFT
RSQRTPS³ xmm, xmm        6, 6, 2           4, 4, 2       MMX_MISC
RSQRTSS³ xmm, xmm        6, 6              4, 4, 1       MMX_MISC, MMX_SHFT
SHUFPS³ xmm, xmm, imm8   6, 6, 2           2, 2, 2       MMX_SHFT
SQRTPS xmm, xmm          40, 39, 29+28     40, 39, 58    FP_DIV
SQRTSS xmm, xmm          32, 23, 30        32, 23, 29    FP_DIV
SUBPS xmm, xmm           5, 4, 4           2, 2, 2       FP_ADD
SUBSS xmm, xmm           5, 4, 3           2, 2, 1       FP_ADD
UCOMISS xmm, xmm         7, 6, 1           2, 2, 1       FP_ADD, FP_MISC
UNPCKHPS³ xmm, xmm       6, 6, 3           2, 2, 2       MMX_SHFT
UNPCKLPS³ xmm, xmm       4, 4, 3           2, 2, 2       MMX_SHFT
XORPS³ xmm, xmm          4, 4, 2           2, 2, 2       MMX_ALU
FXRSTOR                  150
FXSAVE                   100

See "Table Footnotes"
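
As a rough illustration of how latency and throughput figures like these can be combined, the sketch below estimates cycle counts for a run of MULPS instructions. It assumes that latency is the number of cycles until a result is available and that throughput is the number of cycles before the same instruction can be issued again. It uses the first MULPS values listed in the table (latency 7, throughput 2). The function names are hypothetical, and the model ignores every other bottleneck (decode, ports shared with other instructions, memory), so treat the results as a back-of-the-envelope estimate only.

```python
def dependent_chain_cycles(n, latency):
    """Estimate cycles for n ops where each op consumes the previous result.

    Every op must wait for its predecessor to finish, so the latencies add up.
    """
    return n * latency


def independent_ops_cycles(n, latency, throughput):
    """Estimate cycles for n ops with no data dependencies between them.

    A new op can issue every `throughput` cycles; the last op issued still
    needs `latency` cycles before its result is available.
    """
    if n == 0:
        return 0
    return (n - 1) * throughput + latency


# Assumed inputs: first listed MULPS latency and throughput from Table C-4.
MULPS_LATENCY = 7
MULPS_THROUGHPUT = 2

chained = dependent_chain_cycles(8, MULPS_LATENCY)
independent = independent_ops_cycles(8, MULPS_LATENCY, MULPS_THROUGHPUT)
print(f"8 dependent MULPS:   about {chained} cycles")
print(f"8 independent MULPS: about {independent} cycles")
```

Under this simplified model, eight chained multiplies cost roughly 56 cycles while eight independent ones cost roughly 21, which is why breaking long dependency chains into independent accumulators tends to pay off for high-latency instructions.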
