High performance switch tuning and debug guide

2.1.5 MP_TASK_AFFINITY

Setting MP_TASK_AFFINITY to SNI tells the Parallel Operating Environment (POE) to bind each
task to the MCM containing the HPS adapter it will use, so that the adapter, CPU, and memory
used by any task are all local to the same MCM. To prevent multiple tasks from sharing the same
CPU, do not set MP_TASK_AFFINITY to SNI if more than four tasks share any HPS adapter.
In that case, set MP_TASK_AFFINITY to MCM instead, which allows each MPI task to use CPUs
and memory from the same MCM, even if the adapter it uses is attached to a remote MCM. If
MP_TASK_AFFINITY is set to either MCM or SNI, MEMORY_AFFINITY should also be set
to MCM.
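
The sketch below is not from this guide; it is a minimal, illustrative check that each MPI task can
run to report the affinity-related variables it was started with, which can help confirm that the
settings above reached every task. The variable names are the ones discussed here; everything
else in the program is an assumption made for the example.

    /* Illustrative sketch: report the affinity settings each task sees. */
    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char *argv[])
    {
        MPI_Init(&argc, &argv);

        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        /* Read the environment as inherited from POE. */
        const char *task_aff = getenv("MP_TASK_AFFINITY");
        const char *mem_aff  = getenv("MEMORY_AFFINITY");

        printf("task %d: MP_TASK_AFFINITY=%s MEMORY_AFFINITY=%s\n",
               rank,
               task_aff ? task_aff : "(unset)",
               mem_aff  ? mem_aff  : "(unset)");

        MPI_Finalize();
        return 0;
    }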

2.1.6 MP_CSS_INTERRUPT

The MP_CSS_INTERRUPT variable allows you to control the interrupts triggered by packet arrivals.
Setting this variable to no means that the application runs in polling mode. This setting is
appropriate for applications whose communication is mostly synchronous. Even applications that
make heavy use of MPI_ISEND/MPI_IRECV should be considered synchronous unless there is
significant computation between the ISEND/IRECV postings and the MPI_WAITALL. The
default value for MP_CSS_INTERRUPT is no.
For applications with an asynchronous communication pattern (one that uses non-blocking MPI
calls), it might be more appropriate to set this variable to yes. Setting MP_CSS_INTERRUPT to
yes can cause your application to be interrupted when new packets arrive, which can be
helpful if a receiving MPI task is likely to be in the middle of a long numerical computation at the
time when data from a remote blocking send arrives.
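
The following C sketch illustrates the asynchronous pattern described above; the ring of
neighbor ranks, the message size, and the placeholder computation are assumptions made for the
example. Because a long computation separates the non-blocking postings from MPI_WAITALL,
the receiving task is unlikely to be inside an MPI call when remote data arrives, which is the
situation in which MP_CSS_INTERRUPT=yes can help the transfer make progress.

    #include <mpi.h>

    #define N 100000   /* hypothetical message size in doubles */

    static double recvbuf[N], sendbuf[N];

    /* Placeholder for a long numerical computation. */
    static void long_computation(void) { /* ... */ }

    int main(int argc, char *argv[])
    {
        MPI_Init(&argc, &argv);

        int rank, ntasks;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &ntasks);

        /* Hypothetical ring of neighbors. */
        int right = (rank + 1) % ntasks;
        int left  = (rank + ntasks - 1) % ntasks;

        MPI_Request req[2];
        MPI_Irecv(recvbuf, N, MPI_DOUBLE, left,  0, MPI_COMM_WORLD, &req[0]);
        MPI_Isend(sendbuf, N, MPI_DOUBLE, right, 0, MPI_COMM_WORLD, &req[1]);

        /* Significant work between the postings and the wait: while this
           runs, an arriving packet can only be serviced promptly if
           interrupts are enabled. */
        long_computation();

        MPI_Waitall(2, req, MPI_STATUSES_IGNORE);

        MPI_Finalize();
        return 0;
    }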

2.2 MPI-IO

The most effective use of MPI-IO is when an application takes advantage of file views and
collective operations to read or write a file in which each task's data is dispersed across the file.
To simplify the discussion we focus on reads, but writes are similar.
An example is reading a matrix with application-wide scope from a single file, with each task
needing a different fragment of that matrix. To bring in the fragment needed by each task,
several disjoint chunks must be read. If every task were to do a POSIX read() of each chunk, the
GPFS file system would handle it correctly. However, because each read() is independent, there is
little chance to apply an effective strategy.
When the same set of reads is done with collective MPI-IO, every task specifies all the chunks it
needs in a single MPI-IO call. Because the call is collective, the requirements of all the tasks are
known at one time. As a result, MPI can use a broad strategy for doing the I/O.
When MPI-IO is used but each call to read or write a file is local or specifies only a single chunk
of data, there is much less chance for MPI-IO to do anything more than a simple POSIX read()
would do. Also, when the file is organized by task rather than globally, there is less that MPI-IO
can do to help. This is the case when each task's fragment of the matrix is stored contiguously in
the file rather than the matrix being stored as a whole.
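
As an illustration of the collective approach, the sketch below reads a block-distributed matrix
through a subarray file view and a single collective call. The file name, matrix size, and the
2 x 2 task grid are assumptions made for the example; they are not taken from this guide.

    /* Sketch: collective read of a block-distributed gsize x gsize matrix
       of doubles stored row-major in "matrix.dat".  Assumes the job is
       started with 4 tasks arranged as a 2 x 2 grid. */
    #include <mpi.h>
    #include <stdlib.h>

    int main(int argc, char *argv[])
    {
        MPI_Init(&argc, &argv);

        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        int pdim  = 2;              /* 2 x 2 task grid (assumed)        */
        int gsize = 1024;           /* global matrix dimension (assumed) */
        int lsize = gsize / pdim;   /* each task reads an lsize x lsize block */

        int gsizes[2] = { gsize, gsize };
        int lsizes[2] = { lsize, lsize };
        int starts[2] = { (rank / pdim) * lsize, (rank % pdim) * lsize };

        /* The file view describes every disjoint chunk this task needs:
           lsize rows of lsize doubles, each separated by a full file row. */
        MPI_Datatype filetype;
        MPI_Type_create_subarray(2, gsizes, lsizes, starts,
                                 MPI_ORDER_C, MPI_DOUBLE, &filetype);
        MPI_Type_commit(&filetype);

        MPI_File fh;
        MPI_File_open(MPI_COMM_WORLD, "matrix.dat",
                      MPI_MODE_RDONLY, MPI_INFO_NULL, &fh);
        MPI_File_set_view(fh, 0, MPI_DOUBLE, filetype, "native",
                          MPI_INFO_NULL);

        /* One collective call presents every task's chunks to MPI-IO at
           the same time. */
        double *local = malloc((size_t)lsize * lsize * sizeof(double));
        MPI_File_read_all(fh, local, lsize * lsize, MPI_DOUBLE,
                          MPI_STATUS_IGNORE);

        MPI_File_close(&fh);
        MPI_Type_free(&filetype);
        free(local);
        MPI_Finalize();
        return 0;
    }

Because MPI_File_read_all is collective, the disjoint chunks required by all tasks are visible to
the library at once, which gives it room to combine many small requests into fewer, larger file
system operations.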