Example 5-3 Swizzling Data - Intel ARCHITECTURE IA-32 Reference Manual

IA-32 Intel® Architecture Optimization
To gather data from 4 different memory locations on the fly, follow
these steps:
1. Identify the first half of the 128-bit memory location.
2. Group the different halves together using movlps and movhps to
   form an xyxy layout in two registers.
3. From the 4 attached halves, get the xxxx by using one shuffle, the
   yyyy by using another shuffle.
The zzzz is derived the same way but only requires one shuffle.
Example 5-3 illustrates the swizzle function.

Example 5-3 Swizzling Data
// AoS structure declaration
typedef struct _VERTEX_AOS {
    float x, y, z, color;
} Vertex_aos;
// SoA structure declaration
typedef struct _VERTEX_SOA {
    float x[4], y[4], z[4];
    float color[4];
} Vertex_soa;
void swizzle_asm (Vertex_aos *in, Vertex_soa *out)
{
// in mem: x1y1z1w1-x2y2z2w2-x3y3z3w3-x4y4z4w4-
// SWIZZLE XYZW --> XXXX
    asm {
        mov ecx, in     // get structure addresses
        mov edx, out
continued
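
The three steps described above can also be sketched with SSE compiler intrinsics instead of inline assembly. The block below is an illustrative sketch only, not the continuation of Example 5-3: the function name swizzle_intrin is hypothetical, the struct tags are omitted, and it assumes an x86 compiler that provides <xmmintrin.h>. _mm_loadl_pi and _mm_loadh_pi are the intrinsic forms of movlps and movhps, and _mm_shuffle_ps is the intrinsic form of shufps.

```c
#include <xmmintrin.h>

// Same layouts as the AoS and SoA declarations in Example 5-3.
typedef struct { float x, y, z, color; } Vertex_aos;
typedef struct { float x[4], y[4], z[4], color[4]; } Vertex_soa;

// Hypothetical helper: converts exactly 4 AoS vertices to one SoA block.
void swizzle_intrin(const Vertex_aos *in, Vertex_soa *out)
{
    __m128 xy01 = _mm_setzero_ps();
    __m128 xy23 = _mm_setzero_ps();
    __m128 zw01 = _mm_setzero_ps();
    __m128 zw23 = _mm_setzero_ps();

    // Steps 1 and 2: load the first 64-bit half (x, y) of each vertex
    // and pair the halves up, giving an xyxy layout in two registers.
    xy01 = _mm_loadl_pi(xy01, (const __m64 *)&in[0].x);  // x0 y0 -- --
    xy01 = _mm_loadh_pi(xy01, (const __m64 *)&in[1].x);  // x0 y0 x1 y1
    xy23 = _mm_loadl_pi(xy23, (const __m64 *)&in[2].x);  // x2 y2 -- --
    xy23 = _mm_loadh_pi(xy23, (const __m64 *)&in[3].x);  // x2 y2 x3 y3

    // Step 3: one shuffle selects the even elements (xxxx),
    // another selects the odd elements (yyyy).
    _mm_storeu_ps(out->x, _mm_shuffle_ps(xy01, xy23, _MM_SHUFFLE(2, 0, 2, 0)));
    _mm_storeu_ps(out->y, _mm_shuffle_ps(xy01, xy23, _MM_SHUFFLE(3, 1, 3, 1)));

    // The second halves (z, color) are gathered the same way.
    zw01 = _mm_loadl_pi(zw01, (const __m64 *)&in[0].z);  // z0 c0 -- --
    zw01 = _mm_loadh_pi(zw01, (const __m64 *)&in[1].z);  // z0 c0 z1 c1
    zw23 = _mm_loadl_pi(zw23, (const __m64 *)&in[2].z);  // z2 c2 -- --
    zw23 = _mm_loadh_pi(zw23, (const __m64 *)&in[3].z);  // z2 c2 z3 c3

    // zzzz needs a single shuffle; color is extracted here as well
    // so that every field of the SoA block is written.
    _mm_storeu_ps(out->z,     _mm_shuffle_ps(zw01, zw23, _MM_SHUFFLE(2, 0, 2, 0)));
    _mm_storeu_ps(out->color, _mm_shuffle_ps(zw01, zw23, _MM_SHUFFLE(3, 1, 3, 1)));
}
```

Unaligned stores are used so the sketch does not depend on the alignment of the output structure; if out is known to be 16-byte aligned, _mm_store_ps could be used instead.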