Intel® MKL Automatic Offload Model - Intel Xeon Phi Developer's Quick Start Manual

Step 4: Free the memory you copied to the card in step 2. The alloc_if(0) qualifier is used to reuse
the data on the card on entering the offload section, and the free_if(1) qualifier is used to free the
data on the card on exit.
#pragma offload target(mic:PHI_DEV) \
    in(A:length(matrix_elements) alloc_if(0) free_if(1)) \
    in(B:length(matrix_elements) alloc_if(0) free_if(1)) \
    in(C:length(matrix_elements) alloc_if(0) free_if(1))
{
}
Code Example 16: Set the Copied Memory Free
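Read together with the earlier steps, the alloc_if and free_if qualifiers describe the lifetime of a buffer on the card: allocate and keep, reuse, then free. The following is a minimal sketch of that lifetime, not code from this guide. The function name scale_twice_on_card and the device index 0 are assumptions made for illustration, and the middle step uses nocopy so that the resident data is reused without a transfer. A compiler without offload support ignores the pragmas and runs each block on the host, which gives the same result.

```c
#include <stddef.h>

/* Coprocessor index; 0 is an assumption made for this sketch. */
#define PHI_DEV 0

/* Hypothetical helper: multiplies every element of x by factor twice,
   keeping the buffer resident on the card between the offloads. */
void scale_twice_on_card(float *x, int n, float factor)
{
    /* Allocate on the card and copy the data in; keep it on exit. */
    #pragma offload target(mic:PHI_DEV) \
        in(x:length(n) alloc_if(1) free_if(0))
    {
    }

    /* Reuse the resident buffer: no allocation, no transfer, no free. */
    #pragma offload target(mic:PHI_DEV) \
        in(n, factor) \
        nocopy(x:length(n) alloc_if(0) free_if(0))
    {
        for (int i = 0; i < n; ++i)
            x[i] *= factor;
    }

    /* Reuse once more, copy the result back, and free the card memory. */
    #pragma offload target(mic:PHI_DEV) \
        in(n, factor) \
        out(x:length(n) alloc_if(0) free_if(1))
    {
        for (int i = 0; i < n; ++i)
            x[i] *= factor;
    }
}
```

Calling scale_twice_on_card(data, 3, 2.0f) on {1, 2, 3} leaves {4, 8, 12} in data. The last offload frees the buffer with free_if(1), so a later offload that names x must allocate it again with alloc_if(1).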
As with Intel® MKL on any platform, it is possible to limit the number of threads it uses by setting the number
of allowed OpenMP threads before executing the MKL function within the offloaded code.
#pragma offload target(mic:PHI_DEV) \
    in(transa, transb, N, alpha, beta) \
    nocopy(A: alloc_if(0) free_if(0)) nocopy(B: alloc_if(0) free_if(0)) \
    out(C:length(matrix_elements) alloc_if(0) free_if(0)) // output data
{
    omp_set_num_threads(64); // limit MKL to 64 OpenMP threads on the coprocessor
    sgemm(&transa, &transb, &N, &N, &N, &alpha, A, &N, B, &N,
          &beta, C, &N);
}
Code Example 17: Controlling Threads on the Intel® Xeon Phi™ Coprocessor Using omp_set_num_threads()
Intel® MKL Automatic Offload Model
A few of the host Intel® MKL functions are Automatic Offload-aware; you call them as you normally would on
the host. However, if you precede the library call with a call to mkl_mic_enable(), Intel MKL decides
at runtime whether some or all of the work required to complete the call should be
divided between the host and the Intel® Xeon Phi™ Coprocessor. It bases this decision on problem size, the
load on both processors, and other metrics. Turn this functionality off with mkl_mic_disable().
Automatic Offload applies only to select host Intel MKL library calls made outside of code run on the Intel®
Xeon Phi™ Coprocessor via _Cilk_offload or #pragma offload. As a result, take care to avoid
transferring the same data twice: once in Automatic Offload calls and again in code run on the coprocessor by
_Cilk_offload or #pragma offload. At present, there is no way to keep common data on the
coprocessor between automatic MKL offloads and explicit programmer-controlled offloads (via
_Cilk_offload or #pragma offload).
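The call sequence for Automatic Offload can be sketched as below. This is an illustration under stated assumptions, not code from this guide: the function name multiply_square and the build macro USE_MKL_AO are hypothetical, and the sketch assumes an Intel MKL release with coprocessor support in which mkl_mic_enable() returns 0 on success. Without the macro, the function falls back to a plain host loop so that the example compiles anywhere.

```c
#include <stddef.h>

#ifdef USE_MKL_AO   /* hypothetical build macro for this sketch */
#include <mkl.h>
#endif

/* Hypothetical helper: computes C = A * B for n x n column-major matrices.
   Returns 1 if Automatic Offload was enabled for the call, 0 otherwise. */
int multiply_square(int n, const float *A, const float *B, float *C)
{
#ifdef USE_MKL_AO
    char transa = 'N', transb = 'N';
    float alpha = 1.0f, beta = 0.0f;
    MKL_INT N = n;

    /* Assumption: 0 means Automatic Offload is now enabled. */
    int status = mkl_mic_enable();

    /* An ordinary host call, made outside any #pragma offload region.
       MKL decides at runtime whether to share the work with the card. */
    sgemm(&transa, &transb, &N, &N, &N, &alpha, A, &N, B, &N, &beta, C, &N);

    mkl_mic_disable();
    return status == 0;
#else
    /* Host-only fallback so the sketch runs without MKL. */
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < n; ++i) {
            float sum = 0.0f;
            for (int k = 0; k < n; ++k)
                sum += A[i + (size_t)k * n] * B[k + (size_t)j * n];
            C[i + (size_t)j * n] = sum;
        }
    }
    return 0;
#endif
}
```

Because the decision to offload depends on problem size, a small matrix is likely to stay on the host even with Automatic Offload enabled. The sgemm call itself is the same either way.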