IBM Power7 Optimization And Tuning Manual page 59

Here are examples of vector initialization using initializer lists:
vector unsigned int v1 = {1};       // initialize the first 4 bytes of v1 with 1
                                    // and the remaining 12 bytes with zeros
vector unsigned int v2 = {1,2};     // initialize the first 8 bytes of v2 with 1 and 2
                                    // and the remaining 8 bytes with zeros
vector unsigned int v3 = {1,2,3,4}; // equivalent to the vector literal
                                    // (vector unsigned int) (1,2,3,4)
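The zero-fill behavior described in the comments above can be checked with a small sketch along the following lines. The names vec_u32, unpack, and init_examples are hypothetical helpers introduced only for this illustration. The non-POWER branch is an assumption: it substitutes the generic GCC/Clang vector extension so that the same source also builds on hosts without VMX.

```c
#include <assert.h>
#include <string.h>

#if defined(__ALTIVEC__) || defined(__VEC__)
#include <altivec.h>
typedef vector unsigned int vec_u32;   /* the AltiVec type used in the text */
#else
/* Assumption: GCC/Clang generic 16-byte vector as a stand-in on non-POWER hosts. */
typedef unsigned int vec_u32 __attribute__((vector_size(16)));
#endif

/* Copy the four 32-bit elements of *v into a plain array for inspection. */
static void unpack(const vec_u32 *v, unsigned int out[4])
{
    assert(sizeof *v == 4 * sizeof(unsigned int));   /* 16 bytes, 4 elements */
    memcpy(out, v, sizeof *v);
}

/* Hypothetical helper: rows 0, 1, 2 receive the elements of v1, v2, v3. */
void init_examples(unsigned int rows[3][4])
{
    vec_u32 v1 = {1};           /* expected elements: 1, 0, 0, 0 */
    vec_u32 v2 = {1, 2};        /* expected elements: 1, 2, 0, 0 */
    vec_u32 v3 = {1, 2, 3, 4};  /* all four elements given explicitly */

    unpack(&v1, rows[0]);
    unpack(&v2, rows[1]);
    unpack(&v3, rows[2]);
}
```

Calling init_examples on a 3x4 array of unsigned int and printing each row is one way to confirm that elements without an initializer are set to zero.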
How to use vector capability in POWER7
When you target a POWER processor that supports VMX or VSX, you can request that the
compiler transform code into VMX or VSX instructions. These machine instructions can run
up to 16 operations in parallel (for example, on the sixteen 8-bit elements of a 128-bit
vector register). This transformation mostly applies to loops that iterate over
contiguous array data and perform calculations on each element. You can use the NOSIMD
directive to prevent the transformation of a particular loop. The vector capability can be
used in the following ways:
Using a compiler: Compiler versions that recognize the POWER7 architecture are XL
C/C++ 11.1 and XL Fortran 13.1, or recent versions of GCC, including the Advance
Toolchain and the SLES 11 SP1 or RHEL 6 GCC compilers:
– For C:
  • xlc -qarch=pwr7 -qtune=pwr7 -O3 -qhot -qsimd
  • gcc -mcpu=power7 -mtune=power7 -O3
– For Fortran:
  • xlf -qarch=pwr7 -qtune=pwr7 -O3 -qhot -qsimd
  • gfortran -mcpu=power7 -mtune=power7 -O3
Using the Engineering and Scientific Subroutine Library (ESSL) with vectorization support:
– Select routines have vector analogs in the library
– These include key FFT and BLAS routines
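As an illustration of the kind of loop the compiler options above are meant for, the following sketch shows a loop over contiguous arrays with independent iterations, plus a second copy that opts out of the transformation. The function names scale_add and scale_add_nosimd are hypothetical. The opt-out uses the XL C form of the NOSIMD directive (#pragma nosimd), guarded on the __IBMC__ macro so that other compilers do not see it. Whether a given compiler actually vectorizes the first loop depends on its version and options, so treat this as a sketch, not a guarantee.

```c
#include <assert.h>
#include <stddef.h>

/* Candidate for automatic SIMD transformation: contiguous data,
 * independent iterations, and restrict-qualified pointers that tell
 * the compiler x and y do not overlap. */
void scale_add(size_t n, float a, const float *restrict x, float *restrict y)
{
    assert(n == 0 || (x != NULL && y != NULL));
    for (size_t i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}

/* Same computation, but the loop is excluded from the transformation
 * when compiled with XL C. Other compilers skip the guarded pragma. */
void scale_add_nosimd(size_t n, float a, const float *restrict x, float *restrict y)
{
    assert(n == 0 || (x != NULL && y != NULL));
#if defined(__IBMC__)
#pragma nosimd
#endif
    for (size_t i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}
```

Both functions compute the same result; only the generated instructions may differ. Comparing the compiler listing or disassembly of the two functions is one way to see whether the transformation took place.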
Vector capability support in AIX
A program can determine whether a system supports the vector extension by reading the
vmx_version field of the _system_configuration structure. If this field is non-zero, then the
system processor chips and operating system contain support for the vector extension. A
__power_vmx() macro is provided in /usr/include/sys/systemcfg.h for performing this test.
A vmx_version value of 2 means that the processor chip is both VMX and VSX capable.
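The check described above might be wrapped as in the following sketch. The enum vec_level and the functions classify_vmx_version and query_vec_level are hypothetical names introduced for this illustration. Only the vmx_version field of _system_configuration and the header path come from the text. The non-AIX branch is an assumption added so that the file still compiles elsewhere.

```c
#include <assert.h>
#if defined(_AIX)
#include <sys/systemcfg.h>   /* declares _system_configuration on AIX */
#endif

/* Hypothetical classification of the vmx_version field. */
enum vec_level { VEC_NONE, VEC_VMX, VEC_VMX_VSX };

/* 0 means no vector support, 2 means VMX and VSX; any other non-zero
 * value is treated here as "vector extension supported". */
enum vec_level classify_vmx_version(int vmx_version)
{
    assert(vmx_version >= 0);
    if (vmx_version == 0)
        return VEC_NONE;
    if (vmx_version == 2)
        return VEC_VMX_VSX;
    return VEC_VMX;
}

/* Returns 0 and stores the level on AIX; returns -1 on other systems,
 * where this structure is not available. */
int query_vec_level(enum vec_level *out)
{
#if defined(_AIX)
    *out = classify_vmx_version(_system_configuration.vmx_version);
    return 0;
#else
    (void)out;
    return -1;
#endif
}
```

A program could call query_vec_level once at startup and select a vector or scalar code path based on the result.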
The AIX Application Binary Interface (ABI) is extended to support the addition of vector
register state and conventions. AIX supports the AltiVec programming interface specification.
AIX provides a set of malloc subroutines (vec_malloc, vec_free, vec_realloc, and vec_calloc)
that return 16-byte aligned allocations. Vector-enabled compilation, with _VEC_ implicitly
defined by the compiler, results in any calls to the older malloc and calloc being redirected to
their vector-safe counterparts, vec_malloc and vec_calloc. Non-vector code can also be
compiled to pick up these same malloc and calloc redirections by explicitly defining
__AIXVEC.
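A minimal sketch of allocating a 16-byte aligned buffer follows. The wrapper names alloc_vec_floats and free_vec_floats are hypothetical. The sketch assumes that vec_malloc and vec_free are declared through <stdlib.h> on AIX. The posix_memalign branch is a substitute for other systems and is not part of the AIX interface described above.

```c
#if !defined(_AIX)
#define _POSIX_C_SOURCE 200112L   /* for posix_memalign on non-AIX hosts */
#endif
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define VEC_ALIGN 16   /* vector loads and stores expect 16-byte alignment */

/* Allocate room for n floats on a 16-byte boundary; returns NULL on failure. */
float *alloc_vec_floats(size_t n)
{
    void *p = NULL;
#if defined(_AIX)
    p = vec_malloc(n * sizeof(float));            /* assumption: via <stdlib.h> */
#else
    if (posix_memalign(&p, VEC_ALIGN, n * sizeof(float)) != 0)
        p = NULL;
#endif
    assert(p == NULL || (uintptr_t)p % VEC_ALIGN == 0);
    return p;
}

/* Release a buffer obtained from alloc_vec_floats. */
void free_vec_floats(float *p)
{
#if defined(_AIX)
    vec_free(p);
#else
    free(p);
#endif
}
```

Keeping allocation and release behind one pair of wrappers means the rest of the program does not need to know which allocator is in use.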
Chapter 2. The POWER7 processor