IBM Power7 Optimization And Tuning Manual page 76

For example, if an SPLPAR is given a CPU entitlement of 2.0 cores and four virtual
processors in an uncapped mode, then the hypervisor can dispatch the virtual processors to
four physical cores concurrently if there are free cores available in the system. The SPLPAR
uses unused cores and the applications can scale up to four cores. However, if the system
does not have free cores, then the hypervisor dispatches four virtual processors on two cores
so that the concurrency is limited to two cores. In this situation, each virtual processor is
dispatched for a reduced time slice as two cores are shared across four virtual processors.
This situation can impact performance, so AIX operating system processor folding support
might reduce the number of virtual processors that are dispatched so that only two
or three virtual processors are dispatched across the two physical cores.
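The arithmetic in this example can be sketched in a few lines of Python. This is a simplified model, not the hypervisor's dispatch algorithm: the function name dispatch_model and its parameters are hypothetical, and it assumes an uncapped SPLPAR that can use its entitlement plus any free cores, up to one core per virtual processor.

```python
def dispatch_model(entitlement, virtual_procs, free_cores):
    """Simplified model of an uncapped SPLPAR (an assumption, not PowerVM code).

    Returns (cores_in_use, share_per_vp), where share_per_vp is the average
    fraction of a physical core that each virtual processor receives.
    """
    # An uncapped partition can grow past its entitlement when cores are free,
    # but it can never run on more cores than it has virtual processors.
    cores_in_use = min(virtual_procs, entitlement + free_cores)
    share_per_vp = cores_in_use / virtual_procs
    return cores_in_use, share_per_vp


# Entitlement 2.0, four virtual processors, two free cores in the pool:
print(dispatch_model(2.0, 4, 2))
# Same partition, but no free cores: four virtual processors share two cores.
print(dispatch_model(2.0, 4, 0))
# After folding down to two virtual processors, each one gets a full core.
print(dispatch_model(2.0, 2, 0))
```

In this model the contended case gives each virtual processor half a core, which is the reduced time slice that processor folding tries to avoid.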
Virtual processor management: Processor folding
The AIX operating system monitors the usage of each virtual processor and the aggregate usage
of an SPLPAR. If the aggregate usage goes below 49%, AIX starts folding down the
virtual CPUs so that fewer virtual CPUs are dispatched. The benefit is that each remaining virtual
CPU runs longer before it is preempted, which helps improve performance. If a virtual
CPU gets a shorter dispatch time slice, then more workloads are cut into time slices on the
processor core, which can cause more cache misses.
If the aggregate usage of an SPLPAR goes above 49%, AIX starts unfolding virtual CPUs so
that additional processor capacity can be given to the SPLPAR. Virtual processor
management dynamically adapts the number of virtual processors to match the load on an
SPLPAR. This threshold (vpm_fold_threshold) of 49% represents the SMT thread usage
starting with AIX V6.1 TL6; before that version, vpm_fold_threshold (which was set to 70%)
represents the core utilization.
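As a rough illustration of the fold and unfold decision described above, the following Python sketch compares aggregate usage against the 49% threshold. It is not the AIX scheduler: the function next_vp_count, its parameters, and the one-virtual-processor-per-step behavior are assumptions made for the example.

```python
FOLD_THRESHOLD_PCT = 49.0  # default vpm_fold_threshold value cited in the text


def next_vp_count(active_vps, max_vps, aggregate_usage_pct,
                  threshold=FOLD_THRESHOLD_PCT):
    """Return the number of unfolded virtual processors for the next interval.

    Assumption for illustration: one virtual processor is folded or unfolded
    per decision, and at least one virtual processor always stays unfolded.
    """
    if aggregate_usage_pct > threshold and active_vps < max_vps:
        return active_vps + 1  # unfold: give the SPLPAR more capacity
    if aggregate_usage_pct < threshold and active_vps > 1:
        return active_vps - 1  # fold: fewer, longer-running virtual CPUs
    return active_vps


# A partition with four virtual processors, two of them currently unfolded:
print(next_vp_count(active_vps=2, max_vps=4, aggregate_usage_pct=60.0))
print(next_vp_count(active_vps=2, max_vps=4, aggregate_usage_pct=30.0))
```

With usage at 60% the sketch unfolds a third virtual processor; at 30% it folds down to one.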
With a vpm_fold_threshold value of 49%, the primary thread of a core is used before
unfolding another virtual processor to consume another core from the shared pool on
POWER7 Systems. If free cores are available in the shared processor pool, then unfolding
another virtual processor results in the LPAR getting another core along with its associated
caches. Now the SPLPAR can run on two primary threads of two cores instead of two threads
(primary and secondary) on the same core. A workload that runs on two primary threads
of two cores can achieve higher performance than the same workload running on the primary
and secondary threads of one core, provided that the threads share little data. The AIX virtual processor
management default policy aims at using the primary thread of each virtual processor first;
therefore, it unfolds the next virtual processor without using the SMT threads of the first virtual
processor. After it unfolds all the virtual processors and consumes the primary thread of all
the virtual processors, it starts using the secondary and tertiary threads of the
virtual processors.
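The default ordering described above (primary threads of every virtual processor first, then the remaining SMT threads) can be pictured with a short Python sketch. The function thread_fill_order is hypothetical, and numbering the threads 0 to 3 with 0 as the primary thread is an assumption of the example, not AIX terminology.

```python
def thread_fill_order(virtual_procs, smt_threads=4):
    """List (virtual_processor, smt_thread) pairs in the order they are used.

    Thread 0 stands for the primary thread. The outer loop runs over SMT
    threads, so every virtual processor's primary thread is consumed before
    any secondary or tertiary thread is touched.
    """
    return [(vp, thread)
            for thread in range(smt_threads)
            for vp in range(virtual_procs)]


# Two virtual processors in SMT4 mode:
for vp, thread in thread_fill_order(2):
    print(f"virtual processor {vp}, SMT thread {thread}")
```

The first two entries are the primary threads of both virtual processors, which matches the policy of unfolding the next virtual processor before using the SMT threads of the first.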
If the system is highly used and there are no free cycles in the shared pool, and all the
SPLPARs in the system try to get more cores by unfolding more virtual processors and using
only the primary thread of each core, the hypervisor time-slices the physical
cores across multiple virtual processors. This action impacts the performance of all the
SPLPARs, as time slicing increases cache misses and context switch cost.
However, an alternative policy of making each virtual processor use all four threads
(SMT4 mode) of a physical core can be achieved by changing the values of a number of
restricted tunables. Do not make this change in normal conditions, as most systems do
not consistently run at high usage. Decide whether such a change is needed based on the
workloads and system usage levels. For example, a critical database SPLPAR needs more
cores even in a highly contended situation to achieve the best performance; however, the
production and test SPLPARs can be sacrificed by running on fewer virtual processors and
using all the SMT4 threads of a core.
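To see why the choice of policy matters under contention, the following Python sketch compares how many cores a partition asks for under the default policy and under the alternative policy that fills all SMT4 threads first. The function cores_consumed and its pack flag are hypothetical names for this example; the restricted tunables themselves are not modeled.

```python
import math


def cores_consumed(runnable_threads, virtual_procs, smt_threads=4, pack=False):
    """Estimate how many virtual processors (cores) are unfolded.

    pack=False models the default policy: one primary thread per core first.
    pack=True models the alternative policy: fill all SMT threads of a core
    before unfolding the next virtual processor.
    """
    if runnable_threads <= 0:
        return 0
    if pack:
        needed = math.ceil(runnable_threads / smt_threads)
    else:
        needed = runnable_threads
    # A partition can never unfold more virtual processors than it has.
    return min(needed, virtual_procs)


# Four runnable software threads on a partition with four virtual processors:
print(cores_consumed(4, 4))             # default policy
print(cores_consumed(4, 4, pack=True))  # alternative policy
```

Under these assumptions the default policy requests four cores for four threads, while the alternative policy requests one, leaving the other cores for partitions such as the critical database SPLPAR.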
POWER7 and POWER7+ Optimization and Tuning Guide