IA-32 Intel® Architecture Optimization
Data Deswizzling
In the deswizzle operation, we want to arrange the SoA format back into
AoS format so that the xxxx, yyyy, zzzz data are rearranged and stored in
memory as xyz. To do this, we can use the unpcklps/unpckhps
instructions to regenerate the xyxy layout and then store each half (xy)
into its corresponding memory location using movlps/movhps, followed
by another movlps/movhps to store the z component.

Example 5-5 illustrates the deswizzle function:

Example 5-5 Deswizzling Single-Precision SIMD Data

void deswizzle_asm(Vertex_soa *in, Vertex_aos *out)
{
__asm {
    mov      ecx, in            // load structure addresses
    mov      edx, out
    movaps   xmm7, [ecx]        // load x1 x2 x3 x4 => xmm7
    movaps   xmm6, [ecx+16]     // load y1 y2 y3 y4 => xmm6
    movaps   xmm5, [ecx+32]     // load z1 z2 z3 z4 => xmm5
    movaps   xmm4, [ecx+48]     // load w1 w2 w3 w4 => xmm4
    // START THE DESWIZZLING HERE
    movaps   xmm0, xmm7         // xmm0= x1 x2 x3 x4
    unpcklps xmm7, xmm6         // xmm7= x1 y1 x2 y2
    movlps   [edx], xmm7        // v1 = x1 y1 -- --
    movhps   [edx+16], xmm7     // v2 = x2 y2 -- --
    unpckhps xmm0, xmm6         // xmm0= x3 y3 x4 y4
    movlps   [edx+32], xmm0     // v3 = x3 y3 -- --
    movhps   [edx+48], xmm0     // v4 = x4 y4 -- --
    movaps   xmm0, xmm5         // xmm0= z1 z2 z3 z4
continued
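
For readers who prefer compiler intrinsics to inline assembly, the same unpack-and-store pattern can be sketched in C as follows. This is a minimal illustration, not the manual's code. The struct layouts are assumptions inferred from the 16-byte offsets used in Example 5-5 (four floats per vertex), and the name deswizzle_intrin is hypothetical. Interleaving z with w for the second pair of stores is also an assumption, since the remainder of the example is not shown on this page. Unaligned loads are used so the sketch does not depend on 16-byte alignment; movaps in the example requires aligned data.

```c
#include <xmmintrin.h>  /* SSE intrinsics */

/* Assumed layouts (not defined in this section): 4 vertices, 4 floats each. */
typedef struct { float x[4], y[4], z[4], w[4]; } Vertex_soa;
typedef struct { float x, y, z, w; } Vertex_aos;

/* Hypothetical intrinsics counterpart of deswizzle_asm: SoA in, 4 AoS vertices out. */
void deswizzle_intrin(const Vertex_soa *in, Vertex_aos *out)
{
    __m128 x = _mm_loadu_ps(in->x);        /* x1 x2 x3 x4 */
    __m128 y = _mm_loadu_ps(in->y);        /* y1 y2 y3 y4 */
    __m128 z = _mm_loadu_ps(in->z);        /* z1 z2 z3 z4 */
    __m128 w = _mm_loadu_ps(in->w);        /* w1 w2 w3 w4 */

    __m128 xy_lo = _mm_unpacklo_ps(x, y);  /* unpcklps: x1 y1 x2 y2 */
    __m128 xy_hi = _mm_unpackhi_ps(x, y);  /* unpckhps: x3 y3 x4 y4 */
    __m128 zw_lo = _mm_unpacklo_ps(z, w);  /* unpcklps: z1 w1 z2 w2 */
    __m128 zw_hi = _mm_unpackhi_ps(z, w);  /* unpckhps: z3 w3 z4 w4 */

    /* movlps/movhps: store each 64-bit half into the matching vertex. */
    _mm_storel_pi((__m64 *)&out[0].x, xy_lo);  /* v1 = x1 y1 -- -- */
    _mm_storeh_pi((__m64 *)&out[1].x, xy_lo);  /* v2 = x2 y2 -- -- */
    _mm_storel_pi((__m64 *)&out[2].x, xy_hi);  /* v3 = x3 y3 -- -- */
    _mm_storeh_pi((__m64 *)&out[3].x, xy_hi);  /* v4 = x4 y4 -- -- */

    _mm_storel_pi((__m64 *)&out[0].z, zw_lo);  /* v1 = -- -- z1 w1 */
    _mm_storeh_pi((__m64 *)&out[1].z, zw_lo);  /* v2 = -- -- z2 w2 */
    _mm_storel_pi((__m64 *)&out[2].z, zw_hi);  /* v3 = -- -- z3 w3 */
    _mm_storeh_pi((__m64 *)&out[3].z, zw_hi);  /* v4 = -- -- z4 w4 */
}
```

Each unpack produces two interleaved component pairs, so one unpcklps/unpckhps pair followed by four 64-bit stores fills the same two components of all four output vertices.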