IBM Power7 Optimization And Tuning Manual page 46

All of these caches are effectively shared. The L2 cache has a longer access latency than L1,
and L3 has a longer access latency than L2. Each chip also has memory controllers, allowing
direct access to a portion of the memory DIMMs in the system. Thus, it takes longer for an
application thread to access data in cache or memory that is attached to a remote chip than
to access data in a local cache or memory. These types of characteristics are often referred to
as affinity performance effects (see "The POWER7 processor and affinity performance
effects" on page 14). In many cases, systems that are built around different processor models
have varying characteristics (for example, while L3 is supported, it might not be implemented
on some models).
Functionally, it does not matter which core in the system an application thread is running on,
or what memory the data it is accessing is on. However, this situation does affect the
performance of applications, because accessing a remote memory or cache takes more time
than accessing a local memory or cache. This situation becomes even more imperative with
the capability of modern systems to support massive scaling and the resulting possibility for
remote accesses to occur across a large processor interconnection complex.
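The cost of remote accesses can be illustrated with a simple weighted average. The following is a minimal sketch, not a model from this guide: the function name `average_access_time` and the latency values are hypothetical placeholders in arbitrary units, not measured POWER7 figures.

```python
def average_access_time(local_fraction, local_latency, remote_latency):
    """Return the weighted average latency for a mix of local and remote accesses."""
    if not 0.0 <= local_fraction <= 1.0:
        raise ValueError("local_fraction must be between 0 and 1")
    remote_fraction = 1.0 - local_fraction
    return local_fraction * local_latency + remote_fraction * remote_latency


# Placeholder latencies in arbitrary units (assumptions, NOT POWER7 measurements).
LOCAL_LATENCY = 1.0
REMOTE_LATENCY = 3.0

for fraction in (1.0, 0.9, 0.5):
    cost = average_access_time(fraction, LOCAL_LATENCY, REMOTE_LATENCY)
    print(f"local fraction {fraction:.1f}: average cost {cost:.2f}")
```

Under these assumed numbers, the average cost rises as the share of remote accesses grows, which is why placement matters more on larger systems where more of the memory is remote to any given core.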
The effect of these system properties can be observed by application threads, because they
often move, sometimes rather frequently, between processor cores. This situation can
happen for various reasons, such as a page fault or lock contention that results in the
application thread being preempted while it waits for a condition to be satisfied, and then
being resumed on a different core. After the move, any application data that was held in the
cache local to the original core is no longer in the thread's local cache, so a remote cache
access is required. Although modern operating systems, such as AIX,
attempt to ensure that cache and memory affinity is retained, this movement does occur, and
can result in a loss in performance. For an introduction to the concepts of cache and memory
affinity, see "The POWER7 processor and affinity performance effects" on page 14.
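One way an application can limit this movement is to restrict itself to a specific logical CPU. The following is a minimal sketch that assumes Linux, where Python exposes `os.sched_getaffinity` and `os.sched_setaffinity`; the helper name `pin_to_cpu` is hypothetical. AIX provides its own binding facilities, which are not shown here.

```python
import os


def pin_to_cpu(cpu=None):
    """Restrict the calling process to one logical CPU and return its id.

    Assumes Linux: os.sched_setaffinity is not available on every platform.
    """
    allowed = os.sched_getaffinity(0)  # CPUs this process may currently run on
    if cpu is None:
        cpu = min(allowed)             # arbitrary choice: lowest allowed CPU id
    elif cpu not in allowed:
        raise ValueError(f"CPU {cpu} is not in the allowed set {sorted(allowed)}")
    os.sched_setaffinity(0, {cpu})     # 0 means "the calling process"
    return cpu


original_mask = os.sched_getaffinity(0)
chosen = pin_to_cpu()
print("pinned to CPU", chosen, "mask is now", sorted(os.sched_getaffinity(0)))
os.sched_setaffinity(0, original_mask)  # restore the original mask
```

Pinning trades scheduling flexibility for locality: the thread keeps its cache contents warm, but it cannot be moved to an idle core, so this approach is worth measuring rather than applying by default.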
The IBM POWER Hypervisor is responsible for:
- Virtualization of processor cores and memory that is presented to the operating system
- Ensuring that the affinity between the processor cores and memory an LPAR is using is
  maintained as much as possible
However, it is important for application designers to consider affinity issues in the design of
applications, and to carefully assess the impact of application thread and data placement on
the cores and the memory that is assigned to the LPAR the application is running in.
Various techniques that are employed at the system level can alleviate the effect of cache
sharing. One example is to configure the LPAR so that the amount of memory that is
requested for the LPAR is satisfied by the memories that are locally available to processor
cores in the system (the memory DIMMs that are attached to the memory controllers for each
processor core). Here, it is more likely that the POWER Hypervisor is able to maintain affinity
between the processor cores and memory that is assigned to the partition, improving performance.
For more information about LPAR configuration and running the lssrad command to query
the affinity characteristics of a partition, see Chapter 3, "The POWER Hypervisor" on
page 55.
