Optimizations With The Intel ® Performance Libraries - Intel ARCHITECTURE IA-32 Reference Manual

IA-32 Intel® Architecture Optimization
Performance: Highly optimized routines with a C interface that give
assembly-level performance in a C/C++ development environment
(MKL also supports a Fortran interface).
Platform tuned: Processor-specific optimizations that yield the best
performance for each Intel processor.
Compatibility: Processor-specific optimizations with a single
application programming interface (API) to reduce development
costs while providing optimum performance.
Threaded application support: Applications can be threaded with the
assurance that the MKL and IPP functions are safe for use in a
threaded environment.
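The "single API, processor-specific implementation" idea in the points above can be sketched in C as a function pointer that is bound once to a tuned routine. This is only an illustration under stated assumptions: the names lib_init, lib_dot, and cpu_level_t are hypothetical and are not part of MKL or IPP, the "tuned" routine is a plain unrolled loop standing in for a processor-specific version, and the caller passes the processor level in by hand so that the sketch stays portable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical processor levels; not taken from MKL or IPP. */
typedef enum { CPU_GENERIC, CPU_SSE, CPU_SSE2 } cpu_level_t;

typedef float (*dot_fn)(const float *, const float *, size_t);

/* Baseline implementation that runs on any processor. */
static float dot_generic(const float *x, const float *y, size_t n)
{
    float s = 0.0f;
    for (size_t i = 0; i < n; ++i)
        s += x[i] * y[i];
    return s;
}

/* Stand-in for a processor-tuned variant: unrolled by two. */
static float dot_tuned(const float *x, const float *y, size_t n)
{
    float s0 = 0.0f, s1 = 0.0f;
    size_t i = 0;
    for (; i + 2 <= n; i += 2) {
        s0 += x[i] * y[i];
        s1 += x[i + 1] * y[i + 1];
    }
    for (; i < n; ++i)          /* leftover element, if n is odd */
        s0 += x[i] * y[i];
    return s0 + s1;
}

/* The implementation currently bound to the public entry point. */
static dot_fn dot_impl = dot_generic;

/* Bind the implementation once; the level is supplied by the caller here. */
void lib_init(cpu_level_t level)
{
    dot_impl = (level >= CPU_SSE) ? dot_tuned : dot_generic;
}

/* The one entry point that application code calls on every processor. */
float lib_dot(const float *x, const float *y, size_t n)
{
    assert(n == 0 || (x != NULL && y != NULL));
    return dot_impl(x, y, n);
}
```

Application code calls lib_dot the same way regardless of which routine is bound, which is the property the compatibility point describes.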
Optimizations with the Intel® Performance Libraries
The Intel Performance Libraries implement a number of optimizations
that are discussed throughout this manual. Examples include
architecture-specific tuning such as loop unrolling, instruction pairing
and scheduling; and memory management with explicit and implicit
data prefetching and cache tuning.
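As a rough illustration of two of the techniques named above, loop unrolling and explicit data prefetching, the following C sketch sums an array with four independent accumulators and issues a prefetch hint one cache line ahead. It is not code from the libraries. The function name sum_unrolled, the unroll factor of four, and the prefetch distance of 16 floats (64 bytes) are assumptions chosen for the example, and the hint relies on the GCC/Clang builtin __builtin_prefetch, compiling to nothing elsewhere.

```c
#include <assert.h>
#include <stddef.h>

/* Prefetch hint where the compiler provides one; a no-op otherwise. */
#if defined(__GNUC__) || defined(__clang__)
#define PREFETCH_READ(p) __builtin_prefetch((p), 0, 3)
#else
#define PREFETCH_READ(p) ((void)0)
#endif

/* Illustrative prefetch distance: 16 floats = 64 bytes ahead (assumed). */
#define PREFETCH_AHEAD 16

/* Sum n floats with the loop unrolled by four. */
float sum_unrolled(const float *a, size_t n)
{
    assert(n == 0 || a != NULL);

    /* Four accumulators break the dependency chain between additions. */
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        /* Only form addresses that lie inside the array. */
        if (i + PREFETCH_AHEAD < n)
            PREFETCH_READ(&a[i + PREFETCH_AHEAD]);
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }

    float total = (s0 + s1) + (s2 + s3);
    for (; i < n; ++i)          /* remaining 0 to 3 elements */
        total += a[i];
    return total;
}
```

Because the additions are regrouped, the result can differ in the last bits from a strictly sequential floating-point sum; whether the prefetch helps at all depends on the processor and the access pattern.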
The libraries exploit the parallelism of SIMD instructions through MMX
technology, Streaming SIMD Extensions (SSE), Streaming SIMD
Extensions 2 (SSE2), and Streaming SIMD Extensions 3 (SSE3).
These techniques improve the performance of computationally intensive
algorithms and deliver hand-coded performance in a high-level language
development environment.
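To make the SIMD parallelism concrete, here is a minimal C sketch that adds two float arrays four elements at a time with SSE intrinsics, the kind of hand-written code the libraries are meant to spare the developer. The function name add_arrays is an assumption for this example and is not a library routine. The intrinsics path is compiled only when the compiler defines __SSE__; otherwise the scalar loop handles every element.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__SSE__)
#include <xmmintrin.h>   /* SSE single-precision intrinsics */
#endif

/* dst[i] = a[i] + b[i] for i in [0, n). */
void add_arrays(float *dst, const float *a, const float *b, size_t n)
{
    assert(n == 0 || (dst != NULL && a != NULL && b != NULL));

    size_t i = 0;
#if defined(__SSE__)
    /* One 128-bit register holds four floats, so each pass adds four pairs. */
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);    /* unaligned load of 4 floats */
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(dst + i, _mm_add_ps(va, vb));
    }
#endif
    /* Scalar tail, or the whole array when SSE is unavailable. */
    for (; i < n; ++i)
        dst[i] = a[i] + b[i];
}
```

The unaligned load and store forms are used so the sketch makes no assumption about how the caller allocated the arrays; aligned variants exist but require 16-byte-aligned data.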
For performance-sensitive applications, the Intel Performance Libraries
free the application developer from the time-consuming task of
assembly-level programming for a multitude of frequently used
functions. The time required for prototyping and implementing new
application features is substantially reduced and, most importantly,
time to market is substantially improved. Finally, applications