Example 5-11 Multiplication Of Two Pair Of Single-Precision Complex Number - Intel ARCHITECTURE IA-32 Reference Manual

IA-32 Intel® Architecture Optimization
Example 5-11 demonstrates using SSE3 instructions to perform
multiplications of single-precision complex numbers. Example 5-12
demonstrates using SSE3 instructions to perform division of complex
numbers.
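
For reference, the two operations can be written out in terms of real and imaginary parts. This is the standard algebra rather than anything specific to the manual; the symbols a, b, c, d follow the notation used in the example's comments, where the operands are a + i b and c + i d.

```latex
\begin{align*}
(a + i b)(c + i d) &= (ac - bd) + i\,(ad + bc) \\
\frac{a + i b}{c + i d} &= \frac{(ac + bd) + i\,(bc - ad)}{c^{2} + d^{2}}
\end{align*}
```

The multiplication needs a subtraction in the real part and an addition in the imaginary part, which is the pattern an alternating add/subtract instruction produces in one step.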

Example 5-11 Multiplication of Two Pair of Single-precision Complex Number

// Multiplication of (ak + i bk ) * (ck + i dk )
// a + i b can be stored as a data structure
movsldup xmm0, Src1        ; load real parts into the destination,
                           ; a1, a1, a0, a0
movaps   xmm1, src2        ; load the 2nd pair of complex values,
                           ; i.e. d1, c1, d0, c0
mulps    xmm0, xmm1        ; temporary results, a1d1, a1c1, a0d0,
                           ; a0c0
shufps   xmm1, xmm1, 0B1h  ; reorder the real and imaginary
                           ; parts, c1, d1, c0, d0
movshdup xmm2, Src1        ; load the imaginary parts into the
                           ; destination, b1, b1, b0, b0
mulps    xmm2, xmm1        ; temporary results, b1c1, b1d1, b0c0,
                           ; b0d0
addsubps xmm0, xmm2        ; b1c1+a1d1, a1c1-b1d1, b0c0+a0d0,
                           ; a0c0-b0d0

In both of these examples, the complex numbers are stored in arrays of
structures. The MOVSLDUP, MOVSHDUP and the asymmetric ADDSUBPS
instructions allow performing complex arithmetic on two pairs of
single-precision complex numbers simultaneously and without any
unnecessary swizzling between data elements. The coding technique
demonstrated in these two examples can be easily extended to perform
complex arithmetic on double-precision complex numbers. In the case of
double-precision complex arithmetic, multiplication or division is
done on one pair of complex numbers at a time.
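
The data flow of Example 5-11 can be traced lane by lane with a small scalar model. The sketch below is not SSE3 code and is not taken from the manual: it emulates each instruction on a four-element array so the intermediate values can be inspected on any compiler. The type `vec4`, the helper names, and `complex_mul_pairs` are hypothetical names introduced for illustration. Lane 0 is the lowest element, so the arrays read in the opposite order from the listing's comments, which list the highest element first.

```c
#include <assert.h>

/* Scalar model of a 128-bit register holding four floats.
   f[0] is the lowest lane. Hypothetical type, for illustration only. */
typedef struct { float f[4]; } vec4;

/* MOVSLDUP: duplicate the even-indexed lanes (0 and 2). */
static vec4 movsldup(vec4 s) {
    vec4 r = {{ s.f[0], s.f[0], s.f[2], s.f[2] }};
    return r;
}

/* MOVSHDUP: duplicate the odd-indexed lanes (1 and 3). */
static vec4 movshdup(vec4 s) {
    vec4 r = {{ s.f[1], s.f[1], s.f[3], s.f[3] }};
    return r;
}

/* MULPS: lane-wise multiply. */
static vec4 mulps(vec4 a, vec4 b) {
    vec4 r;
    for (int i = 0; i < 4; i++) r.f[i] = a.f[i] * b.f[i];
    return r;
}

/* SHUFPS x, x, 0B1h: swap the two lanes inside each 64-bit half. */
static vec4 shufps_b1(vec4 s) {
    vec4 r = {{ s.f[1], s.f[0], s.f[3], s.f[2] }};
    return r;
}

/* ADDSUBPS: subtract in even lanes, add in odd lanes. */
static vec4 addsubps(vec4 a, vec4 b) {
    vec4 r = {{ a.f[0] - b.f[0], a.f[1] + b.f[1],
                a.f[2] - b.f[2], a.f[3] + b.f[3] }};
    return r;
}

/* Multiply two pairs of complex numbers stored as {re0, im0, re1, im1}.
   src1 holds a0 + i b0 and a1 + i b1; src2 holds c0 + i d0 and c1 + i d1. */
vec4 complex_mul_pairs(vec4 src1, vec4 src2) {
    vec4 x0 = movsldup(src1);   /* a0,   a0,   a1,   a1   */
    vec4 x1 = src2;             /* c0,   d0,   c1,   d1   */
    x0 = mulps(x0, x1);         /* a0c0, a0d0, a1c1, a1d1 */
    x1 = shufps_b1(x1);         /* d0,   c0,   d1,   c1   */
    vec4 x2 = movshdup(src1);   /* b0,   b0,   b1,   b1   */
    x2 = mulps(x2, x1);         /* b0d0, b0c0, b1d1, b1c1 */
    return addsubps(x0, x2);    /* a0c0-b0d0, a0d0+b0c0, a1c1-b1d1, a1d1+b1c1 */
}

/* Worked check: (1 + 2i)(3 + 4i) = -5 + 10i and (0.5 - i)(2 + 0i) = 1 - 2i. */
void self_check(void) {
    vec4 s1 = {{ 1.0f, 2.0f, 0.5f, -1.0f }};
    vec4 s2 = {{ 3.0f, 4.0f, 2.0f,  0.0f }};
    vec4 r = complex_mul_pairs(s1, s2);
    assert(r.f[0] == -5.0f && r.f[1] == 10.0f);
    assert(r.f[2] ==  1.0f && r.f[3] == -2.0f);
}
```

Calling `self_check()` from a test harness exercises both complex products at once, mirroring how the listing handles two pairs per pass. The model makes no claim about performance; it only shows why no extra shuffling is needed beyond the single swap of real and imaginary parts in the second operand.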
