Assembly
Key loops can be coded directly in assembly language, either with an
assembler or as inlined assembly (C-asm) in C/C++ code. The Intel
compiler or assembler recognizes the new instructions and registers and
directly generates the corresponding code. This model offers the
opportunity to attain the greatest performance, but that performance is
not portable across different processor architectures.
Example 3-9 shows the Streaming SIMD Extensions inlined assembly
encoding.
Example 3-9
Streaming SIMD Extensions Using Inlined Assembly Encoding
void add(float *a, float *b, float *c)
{
    __asm {
        mov     eax, a                     // eax = address of a
        mov     edx, b                     // edx = address of b
        mov     ecx, c                     // ecx = address of c
        movaps  xmm0, XMMWORD PTR [eax]    // load four floats from a (16-byte aligned)
        addps   xmm0, XMMWORD PTR [edx]    // add the four floats from b
        movaps  XMMWORD PTR [ecx], xmm0    // store the four sums to c
    }
}
Intrinsics
Intrinsics provide access to the ISA functionality using C/C++ style
coding instead of assembly language. Intel has defined three sets of
intrinsic functions that are implemented in the Intel® C++ Compiler to
support the MMX technology, Streaming SIMD Extensions and
Streaming SIMD Extensions 2. Four new C data types, representing
64-bit and 128-bit objects, are used as the operands of these intrinsic
functions. __m64 is used for MMX integer SIMD, __m128 is used for
single-precision floating-point SIMD, __m128i is used for Streaming
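
As a rough illustration of the intrinsics model, the add routine of Example 3-9 might be written with Streaming SIMD Extensions intrinsics along the following lines. This is a minimal sketch, not the manual's own listing: the function name add_intrin and the local variable names are hypothetical, and it assumes a compiler that provides the xmmintrin.h header and that all three pointers refer to 16-byte aligned storage, as the movaps instruction in Example 3-9 also requires.

```c
#include <xmmintrin.h>  /* SSE intrinsics and the __m128 data type */

/* Hypothetical intrinsics counterpart of add() from Example 3-9.
 * Computes c[i] = a[i] + b[i] for i = 0..3.
 * Assumption: a, b and c each point to 16-byte aligned storage,
 * because _mm_load_ps and _mm_store_ps are the aligned forms. */
void add_intrin(const float *a, const float *b, float *c)
{
    __m128 va   = _mm_load_ps(a);      /* load four packed floats from a */
    __m128 vb   = _mm_load_ps(b);      /* load four packed floats from b */
    __m128 vsum = _mm_add_ps(va, vb);  /* element-wise single-precision add */
    _mm_store_ps(c, vsum);             /* store the four sums to c */
}
```

The compiler chooses the registers and schedules the instructions, so no register names appear in the source. A caller has to guarantee the alignment itself, for example by declaring the arrays with a 16-byte alignment specifier.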