Data Alignment for MMX Technology; Example 3-13 C Algorithm for 64-bit Data Alignment - Intel ARCHITECTURE IA-32 Reference Manual

Functions that use Streaming SIMD Extensions or Streaming SIMD
Extensions 2 data need to provide a 16-byte aligned stack frame.
The __m128* parameters need to be aligned to 16-byte boundaries,
possibly creating "holes" (due to padding) in the argument block.
The new conventions presented in this section, as implemented by the
Intel C++ Compiler, can also be used as a guideline for assembly language
code. In many cases, this section assumes the use of the __m128*
data types, as defined by the Intel C++ Compiler, which represents an
array of four 32-bit floats.
For more details on the stack alignment for Streaming SIMD Extensions
and SSE2, see Appendix D, "Stack Alignment."
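As a rough illustration of the padding "holes" described above, the following C11 sketch uses a struct layout rather than a real argument block. The names vec4f, arg_block, and hole_size are hypothetical, and _Alignas(16) only stands in for the 16-byte alignment that the Intel C++ Compiler gives __m128. The actual calling convention is the one described in Appendix D.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for __m128: four 32-bit floats, 16-byte aligned. */
typedef struct {
    _Alignas(16) float f[4];
} vec4f;

/* Hypothetical "argument block": a small integer followed by a 16-byte
   aligned vector. The compiler must insert padding after count so that
   v starts on a 16-byte boundary. */
typedef struct {
    int   count;
    vec4f v;
} arg_block;

/* Number of padding bytes (the "hole") between count and v. */
static size_t hole_size(void)
{
    return offsetof(arg_block, v) - sizeof(int);
}
```

On a typical IA-32 compiler with a 4-byte int, v would sit at offset 16, leaving a 12-byte hole after count.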

Data Alignment for MMX Technology

Many compilers provide controls for aligning variables. These controls
align variables on the boundaries appropriate to their bit lengths. If some
of the variables are not aligned as specified, you can align them using
the C algorithm shown in Example 3-13.

Example 3-13 C Algorithm for 64-bit Data Alignment

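#include <stdint.h>  /* uintptr_t */
#include <stdlib.h>  /* malloc */

/* Make newp a pointer to a 64-bit aligned array */
/* of NUM_ELEMENTS 64-bit elements. */
double *p, *newp;
p = (double*)malloc (sizeof(double)*(NUM_ELEMENTS+1));
/* Round the byte address (not the element index) up to a multiple of 8. */
newp = (double*)(((uintptr_t)p + 7) & ~(uintptr_t)0x7);
/* Pass p, not newp, to free(). */

The algorithm in Example 3-13 aligns an array of 64-bit elements on a
64-bit boundary. The constant 7 is one less than the number of bytes in a
64-bit element (8 - 1): adding 7 to the address and then clearing the low
three bits rounds it up to the next multiple of 8. The extra element in
the malloc call guarantees that the rounded-up array still fits in the
allocation. Aligning data in this manner avoids the significant
performance penalties that can occur when an access crosses a cache line
boundary.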
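The same rounding step works for any power-of-two boundary, such as the 16-byte alignment used for Streaming SIMD Extensions data. The following is a minimal sketch under that assumption, not code from the manual. The helper names align_up and alloc_aligned_doubles are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Round addr up to the next multiple of align.
   align must be a nonzero power of two. */
static uintptr_t align_up(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + (align - 1)) & ~(align - 1);
}

/* Allocate room for n doubles plus align-1 bytes of slack and return a
   pointer aligned to align bytes. *raw receives the original malloc
   pointer, which is the one that must later be passed to free(). */
static double *alloc_aligned_doubles(size_t n, size_t align, void **raw)
{
    *raw = malloc(n * sizeof(double) + (align - 1));
    if (*raw == NULL)
        return NULL;
    return (double *)align_up((uintptr_t)*raw, (uintptr_t)align);
}
```

With align set to 8, the returned pointer plays the role of newp in Example 3-13 and *raw plays the role of p. Release the memory with free(raw), never with the aligned pointer.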
