Optimize Floating-Point Performance; Optimize Instruction Selection - Intel ARCHITECTURE IA-32 Reference Manual

IA-32 Intel® Architecture Optimization
Minimize use of global variables and pointers.
Use the const modifier; use the static modifier for global variables.
Use new cacheability instructions and memory-ordering behavior.
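
As a minimal sketch of the const and static guidance above (the names scale_table, call_count, and scaled_sum are hypothetical and do not come from the manual), read-only data can be declared static const at file scope, mutable file-scope state can be declared static, and running totals can be kept in locals rather than globals. How much this helps depends on the compiler; the intent is only to give it more freedom to keep values in registers.

```c
#include <stddef.h>

/* Read-only lookup data: `const` tells the compiler the values never change,
 * `static` limits visibility to this translation unit. (Hypothetical table.) */
static const int scale_table[4] = {1, 2, 4, 8};

/* Mutable file-scope state marked `static`, so no other file can modify it. */
static int call_count;

/* Sums values[i] * scale_table[i % 4] using a local accumulator instead of
 * updating a global inside the loop. */
int scaled_sum(const int *values, size_t n)
{
    int sum = 0;
    for (size_t i = 0; i < n; ++i)
        sum += values[i] * scale_table[i % 4];
    call_count++;
    return sum;
}

/* Accessor for the file-local counter. */
int scaled_sum_calls(void)
{
    return call_count;
}
```

For example, scaled_sum applied to five ones multiplies them by 1, 2, 4, 8, and 1 again.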

Optimize Floating-point Performance

Avoid exceeding representable ranges during computation, since
handling these cases can have a performance impact. Do not use a
larger precision format (double-extended floating point) unless
required, since this increases memory size and bandwidth
utilization.
Use FISTTP to avoid changing rounding mode when possible or use
optimized fldcw; avoid changing floating-point control/status
registers (rounding modes) between more than two values.
Use efficient conversions, such as those that implicitly include a
rounding mode, in order to avoid changing control/status registers.
Take advantage of the SIMD capabilities of Streaming SIMD
Extensions (SSE) and of Streaming SIMD Extensions 2 (SSE2)
instructions. Enable flush-to-zero mode and DAZ mode when using
SSE and SSE2 instructions.
Avoid denormalized input values, denormalized output values, and
explicit constants that could cause denormal exceptions.
Avoid excessive use of the fxch instruction.
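
The conversion and flush-to-zero/DAZ guidelines above can be sketched in C with compiler intrinsics. This is an illustration under stated assumptions, not code from the manual: the function names, the SKETCH_HAVE_SSE2 macro, and the MXCSR_FTZ/MXCSR_DAZ constants are hypothetical, and the code assumes a processor that supports the DAZ bit (on older processors that support should be checked through the MXCSR_MASK field written by FXSAVE before the bit is set). On builds without SSE2 the sketch falls back to plain C and leaves the control register alone.

```c
#include <float.h>

#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>          /* SSE2 intrinsics; also provides the SSE ones */
#define SKETCH_HAVE_SSE2 1
#else
#define SKETCH_HAVE_SSE2 0
#endif

#define MXCSR_FTZ 0x8000u       /* MXCSR bit 15: flush-to-zero */
#define MXCSR_DAZ 0x0040u       /* MXCSR bit 6: denormals-are-zero */

/* Truncate toward zero with a conversion whose rounding is implicit
 * (CVTTSD2SI), so the rounding-mode control bits are never rewritten. */
int truncate_to_int(double x)
{
#if SKETCH_HAVE_SSE2
    return _mm_cvttsd_si32(_mm_set_sd(x));
#else
    return (int)x;              /* a C cast also truncates toward zero */
#endif
}

/* Turn on FTZ and DAZ for SSE/SSE2 arithmetic on the calling thread.
 * Returns 1 if the bits were set, 0 if this build has no SSE2 support. */
int enable_ftz_daz(void)
{
#if SKETCH_HAVE_SSE2
    _mm_setcsr(_mm_getcsr() | MXCSR_FTZ | MXCSR_DAZ);
    return 1;
#else
    return 0;
#endif
}

/* Multiply two floats with a scalar SSE instruction (MULSS), so the result
 * is subject to the FTZ/DAZ settings in MXCSR. */
float sse_multiply(float a, float b)
{
#if SKETCH_HAVE_SSE2
    volatile float va = a, vb = b;   /* keep the compiler from folding constants */
    return _mm_cvtss_f32(_mm_mul_ss(_mm_set_ss(va), _mm_set_ss(vb)));
#else
    return a * b;
#endif
}
```

With the default masked underflow exception, calling enable_ftz_daz() and then sse_multiply(FLT_MIN, 0.5f) should produce zero rather than a denormal result. Both modes trade IEEE 754 conformance for speed, so they suit code that can tolerate tiny values becoming zero.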

Optimize Instruction Selection

Focus instruction selection on the path length of a sequence of
instructions rather than on individual instruction choices;
minimize the number of uops and the data/register dependencies
accumulated along the path, and maximize retirement throughput.
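
One possible illustration of reducing data dependencies along a path, written in C rather than assembly: the two functions below (hypothetical names, not taken from the manual) compute the same integer sum, but the second splits the work across two accumulators so that consecutive additions do not all wait on a single register. Whether this is faster depends on the compiler and the processor, and a compiler may already perform the transformation itself, so treat it as a sketch to measure rather than a rule.

```c
#include <stddef.h>

/* Single accumulator: each addition depends on the result of the previous one,
 * forming one long dependency chain. */
long sum_single_chain(const int *v, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; ++i)
        sum += v[i];
    return sum;
}

/* Two accumulators: the additions into `even` and `odd` are independent of
 * each other, so the two chains can progress in parallel. */
long sum_two_chains(const int *v, size_t n)
{
    long even = 0, odd = 0;
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        even += v[i];
        odd  += v[i + 1];
    }
    if (i < n)                  /* leftover element when n is odd */
        even += v[i];
    return even + odd;
}
```

Integer addition is used here so that both versions return identical results; with floating-point data, reordering the additions can change the rounding of the final sum.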