IBM Power7 Optimization And Tuning Manual page 64

Bits 61:63 – DPFD – Default Prefetch Depth
Supplies a prefetch depth for hardware-detected streams and for software-defined
streams for which a depth of zero is specified, or for which dcbt or dcbtst with TH=1010 is
not used in their description.
Bits 55:57 – URG – Depth Attainment Urgency
This field was added in the POWER7+ processor. It indicates how quickly
the prefetch depth should be reached for hardware-detected streams. Values and their
meanings are as follows:
– 0: Default
– 1: Not urgent
– 2: Least urgent
– 3: Less urgent
– 4: Medium
– 5: Urgent
– 6: More urgent
– 7: Most urgent
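The two field descriptions above can be turned into a small encoding sketch. It assumes IBM bit numbering, in which bit 0 is the most significant bit of the 64-bit register and bit 63 the least significant, so DPFD (bits 61:63) occupies the low three bits and URG (bits 55:57) sits six bits above them. The macro and function names are hypothetical and do not come from any system header.

```c
#include <assert.h>
#include <stdint.h>

/* IBM bit numbering: bit 63 is the least significant bit, so a field
   whose last bit is N starts (63 - N) bits above the LSB.
   All names below are illustrative, not taken from a system header. */
#define EX_DPFD_SHIFT 0u               /* DPFD: bits 61:63 -> 63 - 63 = 0 */
#define EX_URG_SHIFT  6u               /* URG:  bits 55:57 -> 63 - 57 = 6 */
#define EX_FIELD_MASK UINT64_C(0x7)    /* both fields are three bits wide */

/* Return a copy of dscr with the DPFD field replaced by depth (0..7). */
static uint64_t ex_set_dpfd(uint64_t dscr, unsigned depth)
{
    assert(depth <= 7);
    dscr &= ~(EX_FIELD_MASK << EX_DPFD_SHIFT);
    return dscr | ((uint64_t)depth << EX_DPFD_SHIFT);
}

/* Return a copy of dscr with the URG field replaced by urgency (0..7). */
static uint64_t ex_set_urg(uint64_t dscr, unsigned urgency)
{
    assert(urgency <= 7);
    dscr &= ~(EX_FIELD_MASK << EX_URG_SHIFT);
    return dscr | ((uint64_t)urgency << EX_URG_SHIFT);
}

/* Extract the DPFD field from a DSCR value. */
static unsigned ex_get_dpfd(uint64_t dscr)
{
    return (unsigned)((dscr >> EX_DPFD_SHIFT) & EX_FIELD_MASK);
}

/* Extract the URG field from a DSCR value. */
static unsigned ex_get_urg(uint64_t dscr)
{
    return (unsigned)((dscr >> EX_URG_SHIFT) & EX_FIELD_MASK);
}
```

For example, `ex_set_urg(ex_set_dpfd(0, 7), 5)` builds a value with DPFD set to 7 and URG set to 5 ("Urgent" in the list above) and leaves every other bit of the register at zero. The other DSCR fields, such as the stream-enable bits, are outside the scope of this sketch.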
The ability to enable or disable the three types of streams that the hardware can detect (load
streams, store streams, or stride-N streams), or to set the default prefetch depth, allows
empirical testing of any application. There are no simple rules for determining which settings
are optimal for an application: the performance of prefetching depends on many
characteristics of the application in addition to the characteristics of the specific
system and its configuration. Data prefetches are purely speculative: they can
improve performance greatly when the prefetched data is in fact referenced by the
application later, but they can also degrade performance by expending bandwidth on cache lines
that are never referenced, or by displacing cache lines that the program references later.
Similarly, setting DPFD to a deeper depth tends to improve performance for data streams that
are predominately sourced from memory because the longer the latency to overcome, the
deeper the prefetching must be to maximize performance. But deeper prefetching also
increases the possibility of stream overshoot, that is, prefetching lines beyond the end of the
stream that are not later referenced. Prefetching in multi-core processor implementations has
implications for other threads or processes that are sharing cache (in SMT mode) or the same
system bandwidth.
Controlling DSCR under Linux
DSCR settings on Linux are controlled with the ppc64_cpu command. Controlling DSCR
settings for an application is generally considered advanced and specific tuning.
Currently, setting the DSCR value is a cross-LPAR setting.
Controlling DSCR under AIX
Under AIX, DSCR settings can be controlled both through a programming API and from the
command line, by using the following interfaces (see footnotes 62 and 63):
dscr_ctl() API
#include <sys/machine.h>
int dscr_ctl(int op, void *buf_p, int size);
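As an illustration of how this prototype might be used, the following minimal sketch performs a read-modify-write of the DPFD field. It assumes the DSCR_READ and DSCR_WRITE operation codes described in the dscr_ctl subroutine reference, and a buffer holding a 64-bit value; verify both against that reference before relying on them. The helper name `ex_set_default_prefetch_depth` is hypothetical, and the non-AIX branch is only a stand-in with placeholder operation values so that the sketch compiles on other systems.

```c
#include <assert.h>
#include <stddef.h>

#ifdef _AIX
#include <sys/machine.h>   /* real dscr_ctl(), DSCR_READ, DSCR_WRITE */
#else
/* Stand-in for systems other than AIX; NOT the real interface.
   The operation values are placeholders chosen for this sketch. */
#define DSCR_READ  1
#define DSCR_WRITE 2
static long long ex_fake_dscr;   /* simulated register contents */

static int dscr_ctl(int op, void *buf_p, int size)
{
    if (buf_p == NULL || size != (int)sizeof(long long))
        return -1;
    if (op == DSCR_READ) {
        *(long long *)buf_p = ex_fake_dscr;
        return 0;
    }
    if (op == DSCR_WRITE) {
        ex_fake_dscr = *(long long *)buf_p;
        return 0;
    }
    return -1;
}
#endif

/* Replace DPFD (bits 61:63, the low three bits) and keep all other
   fields unchanged. Returns 0 on success, -1 on failure. */
static int ex_set_default_prefetch_depth(unsigned depth)
{
    long long dscr;

    if (depth > 7)
        return -1;
    if (dscr_ctl(DSCR_READ, &dscr, (int)sizeof dscr) != 0)
        return -1;
    dscr = (dscr & ~0x7LL) | (long long)depth;
    return dscr_ctl(DSCR_WRITE, &dscr, (int)sizeof dscr);
}
```

Reading the current value first preserves the stream-enable and URG settings that are already in effect, which matters when only the depth is being varied during empirical testing.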
62 dscr_ctl subroutine, available at:
http://pic.dhe.ibm.com/infocenter/aix/v7r1/topic/com.ibm.aix.basetechref/doc/basetrf1/dscr_ctl.htm
63 dscrctl command, available at:
http://pic.dhe.ibm.com/infocenter/aix/v7r1/topic/com.ibm.aix.cmds/doc/aixcmds2/dscrctl.htm
POWER7 and POWER7+ Optimization and Tuning Guide
