Parallel Programming Options On The Intel® Xeon Phi™ Coprocessor; Parallel Programming On The Intel® Xeon Phi™ Coprocessor: OpenMP* - Intel Xeon Phi Developer's Quick Start Manual


scp /opt/intel/composerxe/lib/mic/libiomp5.so mic0:/tmp/libiomp5.so
5. Connect to the coprocessor with ssh and export the local directory so that the application can find any
shared libraries it uses (in this case the OpenMP* runtime library):
ssh mic0
export LD_LIBRARY_PATH=/tmp
6. This application may generate a segmentation fault if the stack size is not set correctly. To modify the
stack size, use:
ulimit -s unlimited
7. Go to /tmp and run a.out:
cd /tmp
./a.out
Parallel Programming Options on the Intel® Xeon Phi™ Coprocessor
Most of the parallel programming options available on the host systems are available for the Intel® Xeon Phi™
Coprocessor. These include the following:
1. Intel® Threading Building Blocks (Intel® TBB)
2. OpenMP*
3. Intel® Cilk Plus
4. pthreads*
The following sections discuss the use of these parallel programming models in code that uses the offload
extensions. Code that runs natively on the Intel® Xeon Phi™ Coprocessor can use these parallel programming
models just as it would on the host, with no unusual complications beyond the larger number of threads.
Parallel Programming on the Intel® Xeon Phi™ Coprocessor: OpenMP*
There is no correspondence between OpenMP* threads on the host CPU and on the Intel® Xeon Phi™
Coprocessor. Because an OpenMP* parallel region within an offload pragma is offloaded as a unit, the offload
compiler creates a team of threads based on the resources available on the Intel® Xeon Phi™ Coprocessor. Since
the entire OpenMP* construct is executed on the Intel® Xeon Phi™ Coprocessor, the usual OpenMP* semantics of
shared and private data apply within the construct.
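As a minimal sketch of the pattern described above, the following C function wraps a single OpenMP* parallel loop in an offload pragma. The function name offload_sum and the choice of data clauses are illustrative assumptions, not taken from this guide. The offload pragma is specific to the Intel® compiler; a compiler that does not recognize it ignores the pragma and the loop runs on the host.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: sums n doubles inside one offloaded parallel region. */
double offload_sum(const double *data, size_t n)
{
    assert(data != NULL || n == 0);
    double sum = 0.0;

    /* Copy n elements of data to the coprocessor; sum is copied in and back out.
     * The whole block, including the OpenMP region, is offloaded as a unit. */
    #pragma offload target(mic) in(data : length(n)) inout(sum)
    {
        /* Inside the construct the usual OpenMP rules apply: the loop index
         * is private to each thread, and the reduction clause combines the
         * per-thread partial sums into the shared variable sum. */
        #pragma omp parallel for reduction(+ : sum)
        for (long i = 0; i < (long)n; ++i) {
            sum += data[i];
        }
    }
    return sum;
}
```

The thread team that executes the loop is created on whichever device runs the block, so the host's OpenMP* thread count has no bearing on how many threads the coprocessor uses.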
Multiple host CPU threads can offload to the Intel® Xeon Phi™ Coprocessor at any time. If a CPU thread
attempts to offload to the Intel® Xeon Phi™ Coprocessor and resources are not available on the coprocessor,
the code meant to be offloaded may be executed on the host instead. When a thread on the coprocessor reaches
the "omp parallel" directive, it creates a team of threads based on the resources available on the coprocessor.
The theoretical maximum number of hardware threads that can be created is 4 times the number of cores in your
Intel® Xeon Phi™ Coprocessor. For offloaded code, the practical limit is four fewer than this, because the
first core (and with it four hardware threads) is reserved for the uOS and its services.
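The thread-count arithmetic in the paragraph above can be written out as a short sketch. The function names are hypothetical, and the core counts in the usage note are example inputs only, not a statement about any particular coprocessor model.

```c
#include <assert.h>

/* Hardware threads per core, as stated in the text above. */
#define THREADS_PER_CORE 4

/* Theoretical maximum: every core contributes four hardware threads. */
int max_hw_threads(int cores)
{
    assert(cores >= 1);
    return THREADS_PER_CORE * cores;
}

/* Practical limit for offloaded code: one core is reserved for the uOS,
 * so its four hardware threads are not available. */
int max_offload_threads(int cores)
{
    assert(cores >= 1);
    return THREADS_PER_CORE * (cores - 1);
}
```

For an example input of 60 cores, max_hw_threads returns 240 and max_offload_threads returns 236.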
Intel® Xeon Phi™ Coprocessor Developer's Quick Start Guide