Flush-to-Zero and Denormals-are-Zero Modes; SIMD Floating-Point Programming Using SSE3 - Intel ARCHITECTURE IA-32 Reference Manual

IA-32 Intel® Architecture Optimization
avoided since there is a penalty associated with writing this register;
typically, through the use of the cvttps2pi and cvttss2si instructions,
the rounding control in MXCSR can always be set to round-nearest.
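
The distinction between truncating and mode-dependent conversion can be sketched with compiler intrinsics. The helper names below are hypothetical; the sketch assumes an x86 compiler that provides `<xmmintrin.h>` and that MXCSR still holds its default round-to-nearest setting.

```c
#include <xmmintrin.h>  /* SSE intrinsics */

/* Hypothetical helper: CVTTSS2SI always truncates toward zero,
   whatever the MXCSR rounding control says. */
int float_to_int_truncate(float x)
{
    return _mm_cvttss_si32(_mm_set_ss(x));
}

/* Hypothetical helper: CVTSS2SI follows the MXCSR rounding control,
   which is round-to-nearest (ties to even) by default. */
int float_to_int_mxcsr(float x)
{
    return _mm_cvtss_si32(_mm_set_ss(x));
}
```

Because the truncating form does not consult the rounding control, code that needs truncation in some places and round-to-nearest in others does not have to rewrite MXCSR between them.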

Flush-to-Zero and Denormals-are-Zero Modes

The flush-to-zero (FTZ) and denormals-are-zero (DAZ) modes are not
compatible with IEEE Standard 754. They are provided to improve
performance for applications where underflow is common and where
the generation of a denormalized result is not necessary. See
"Floating-point Modes and Exceptions" in Chapter 2.
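
As a rough illustration, both modes can be switched on through the MXCSR helper macros that common x86 compilers ship in `<xmmintrin.h>` and `<pmmintrin.h>`. This is a minimal sketch, not the manual's own code: `set_ftz_daz` and `sse_mul` are hypothetical helpers, and the sketch assumes a processor that supports the DAZ bit.

```c
#include <xmmintrin.h>  /* _mm_getcsr, _MM_SET_FLUSH_ZERO_MODE */
#include <pmmintrin.h>  /* _MM_SET_DENORMALS_ZERO_MODE */

/* Hypothetical helper: switch FTZ and DAZ on or off together.
   Returns the previous MXCSR value so the caller can restore it
   later with _mm_setcsr(). */
unsigned int set_ftz_daz(int on)
{
    unsigned int saved = _mm_getcsr();
    _MM_SET_FLUSH_ZERO_MODE(on ? _MM_FLUSH_ZERO_ON : _MM_FLUSH_ZERO_OFF);
    _MM_SET_DENORMALS_ZERO_MODE(on ? _MM_DENORMALS_ZERO_ON
                                   : _MM_DENORMALS_ZERO_OFF);
    return saved;
}

/* Hypothetical helper: one scalar SSE multiply (MULSS). The volatile
   temporaries keep the compiler from folding the product at compile
   time or moving it across the MXCSR writes. */
float sse_mul(float a, float b)
{
    volatile float va = a, vb = b;
    volatile float r = _mm_cvtss_f32(_mm_mul_ss(_mm_set_ss(va), _mm_set_ss(vb)));
    return r;
}
```

With both modes on, `sse_mul(1e-30f, 1e-10f)` returns zero because the denormal result is flushed (FTZ), and `sse_mul(1e-40f, 1e10f)` returns zero because the denormal input is read as zero (DAZ). Since writing MXCSR carries a penalty, a reasonable pattern is to call `set_ftz_daz(1)` once before a compute phase and restore the saved value afterwards.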

SIMD Floating-point Programming Using SSE3

SSE3 enhances SSE and SSE2 with 9 instructions targeted for SIMD
floating-point programming. In contrast to many SSE and SSE2
instructions offering homogeneous arithmetic operations on parallel
data elements (see Figure 5-1) and favoring the vertical computation
model, SSE3 offers instructions that perform asymmetric arithmetic
operations and arithmetic operations on horizontal data elements.
ADDSUBPS and ADDSUBPD are two instructions with asymmetric
arithmetic processing capability (see Figure 5-4). HADDPS, HADDPD,
HSUBPS and HSUBPD offer horizontal arithmetic processing
capability (see Figure 5-5). In addition, MOVSLDUP, MOVSHDUP
and MOVDDUP can load data from memory (or XMM register) and
replicate data elements at once.
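
The lane behavior of these instructions can be sketched with the SSE3 intrinsics from `<pmmintrin.h>`. The function names and the `SSE3_FN` macro are hypothetical; the macro assumes GCC or Clang and enables SSE3 per function, so the sketch builds without a global `-msse3` flag. Complex numbers are assumed to be stored as interleaved (real, imaginary) pairs.

```c
#include <pmmintrin.h>  /* SSE3 intrinsics */

#if defined(__GNUC__)
#define SSE3_FN __attribute__((target("sse3")))  /* assumption: GCC/Clang */
#else
#define SSE3_FN
#endif

/* ADDSUBPS (asymmetric): out = { a0-b0, a1+b1, a2-b2, a3+b3 } */
SSE3_FN void addsub4(const float a[4], const float b[4], float out[4])
{
    _mm_storeu_ps(out, _mm_addsub_ps(_mm_loadu_ps(a), _mm_loadu_ps(b)));
}

/* HADDPS (horizontal): two passes reduce four lanes to one sum. */
SSE3_FN float hsum4(const float a[4])
{
    __m128 v = _mm_loadu_ps(a);
    v = _mm_hadd_ps(v, v);  /* { a0+a1, a2+a3, a0+a1, a2+a3 } */
    v = _mm_hadd_ps(v, v);  /* every lane holds a0+a1+a2+a3  */
    return _mm_cvtss_f32(v);
}

/* Multiply two pairs of complex numbers, (a+bi)(c+di) = (ac-bd) + (ad+bc)i,
   combining MOVSLDUP, MOVSHDUP and ADDSUBPS. */
SSE3_FN void cmul2(const float x[4], const float y[4], float out[4])
{
    __m128 vx = _mm_loadu_ps(x);
    __m128 vy = _mm_loadu_ps(y);
    __m128 re = _mm_moveldup_ps(vx);                              /* { a, a, ... } */
    __m128 im = _mm_movehdup_ps(vx);                              /* { b, b, ... } */
    __m128 ys = _mm_shuffle_ps(vy, vy, _MM_SHUFFLE(2, 3, 0, 1));  /* { d, c, ... } */
    __m128 t1 = _mm_mul_ps(re, vy);                               /* { ac, ad, ... } */
    __m128 t2 = _mm_mul_ps(im, ys);                               /* { bd, bc, ... } */
    _mm_storeu_ps(out, _mm_addsub_ps(t1, t2));                    /* { ac-bd, ad+bc, ... } */
}
```

For example, `cmul2` applied to {1, 2, 3, 4} and {5, 6, 7, 8} computes (1+2i)(5+6i) and (3+4i)(7+8i) in one pass. The duplicate-and-addsub pattern avoids the separate sign-flip and shuffle steps a purely vertical formulation would need.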
