ASO and DSO Optimizations - IBM POWER7 Optimization and Tuning Manual

ASO and DSO optimizations: ASO cache and memory affinity optimizations are bundled
with all AIX editions: AIX Express, AIX Standard, and AIX Enterprise. DSO large page and
memory prefetch optimizations are available as a separately chargeable premium feature
or bundled with AIX Enterprise.

4.2.2 ASO and DSO optimizations

The ASO framework allows multiple optimizations to be managed. Two optimizations are
included with the framework. Two more optimizations are added with the DSO package.

Cache and memory affinity optimization

Power Systems are continually increasing in processing capacity in terms of the number of
cores and the number of SMT threads per core. The latest Power Systems support up to 256
cores and four SMT threads per core, which allows a single logical partition (LPAR) to have
1024 logical CPUs (hardware threads). This increase in processing units has led to a
hierarchy of affinity domains.
Each core forms the smallest affinity domain. Multiple cores in a chip, multiple chips in a
package, and the system itself form progressively larger affinity domains. System performance is
close to optimal when the amount of data crossing between these domains is minimal. The need
for cross-domain interactions arises for either of two reasons:
- Software threads in different affinity domains must communicate.
- The data being accessed is in a memory bank that is not in the same affinity domain as
  the requesting software thread.
Apart from the general eligibility requirements that are listed in "System requirements" on
page 90, a workload must also be multi-threaded to be considered for cache and memory
affinity optimization.
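
The affinity-domain hierarchy described above can be illustrated with a small model. The following is a minimal sketch, not an AIX interface: it assumes a linear numbering of logical CPUs, four SMT threads per core as in the text, and a hypothetical eight cores per chip. For brevity it omits the package level. The names `locate`, `smallest_common_domain`, and `logical_cpu_count` are made up for this example.

```python
from dataclasses import dataclass

# Assumed topology, for illustration only; a real system reports its own layout.
THREADS_PER_CORE = 4  # SMT4, as described in the text
CORES_PER_CHIP = 8    # assumption: this value is not stated in the text


@dataclass(frozen=True)
class CpuLocation:
    chip: int    # chip index within the system
    core: int    # core index within the chip
    thread: int  # SMT thread index within the core


def logical_cpu_count(cores: int, smt_threads: int) -> int:
    """Number of logical CPUs (hardware threads) for a given core count."""
    return cores * smt_threads


def locate(logical_cpu: int) -> CpuLocation:
    """Map a logical CPU number to (chip, core, thread) under the assumed layout."""
    core_global, thread = divmod(logical_cpu, THREADS_PER_CORE)
    chip, core = divmod(core_global, CORES_PER_CHIP)
    return CpuLocation(chip, core, thread)


def smallest_common_domain(cpu_a: int, cpu_b: int) -> str:
    """Name the smallest affinity domain that contains both logical CPUs."""
    a, b = locate(cpu_a), locate(cpu_b)
    if a.chip != b.chip:
        return "system"
    if a.core != b.core:
        return "chip"
    return "core"
```

In this model, two threads that share data are cheapest to run when `smallest_common_domain` returns `"core"`, and most expensive when it returns `"system"`. `logical_cpu_count(256, 4)` reproduces the 1024 logical CPUs mentioned earlier.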

Cache affinity

ASO analyzes cache access patterns, using information from the kernel and the Performance
Monitoring Unit (PMU), to identify potential improvements in cache affinity from moving the
threads of a workload closer together. If such a benefit is predicted, ASO uses proprietary
algorithms to estimate the optimal size of the affinity domain for the workload, and it uses
kernel services (see "Affinity APIs" on page 75) to restrict the workload to that domain. After
acting, ASO continues to monitor the workload to ensure that it performs as predicted. If the
results are not as expected, ASO reverses its actions immediately.
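
The predict, act, monitor, and reverse cycle described above can be sketched generically. This is an illustrative control loop only. ASO's actual algorithms are proprietary, and every name here (`optimize_with_rollback`, `apply`, `revert`, `tolerance`) is hypothetical. The sketch assumes a throughput-style metric where higher is better, and it treats "performs as predicted" as staying within a tolerance of the predicted level.

```python
from typing import Callable, Sequence


def optimize_with_rollback(
    baseline: float,
    predicted_gain: float,
    apply: Callable[[], None],
    revert: Callable[[], None],
    samples_after: Sequence[float],
    tolerance: float = 0.05,
) -> str:
    """Apply an optimization only if a gain is predicted; undo it on a miss.

    baseline        metric measured before acting (higher is better)
    predicted_gain  fractional improvement expected, for example 0.10 for 10%
    samples_after   metric samples observed after acting, in time order
    tolerance       assumed slack allowed below the predicted level

    Returns "skipped", "kept", or "reverted".
    """
    if predicted_gain <= 0:
        return "skipped"  # no benefit predicted, so take no action

    apply()
    threshold = baseline * (1 + predicted_gain) * (1 - tolerance)
    for observed in samples_after:
        if observed < threshold:
            revert()  # results are not as expected: undo immediately
            return "reverted"
    return "kept"
```

A caller would pass `apply` and `revert` as functions that restrict the workload to an affinity domain and lift that restriction again. The key design point is that the action is always reversible, so a wrong prediction costs only the monitoring interval.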

Memory affinity

After a workload is identified and optimized for cache affinity, ASO begins monitoring the
memory access patterns of the workload's process private memory. If ASO finds that the
workload can benefit from moving process private memory closer to the current affinity
domain, then hot pages are identified and migrated closer using software instrumentation.
Single-threaded processes are not considered for this optimization, because their process
private data is already affinitized by the kernel when the thread is moved to a new affinity
domain. Also, in the current version, only workloads that fit within a single Scheduler
Resource Affinity Domain (SRAD, a chip/socket in POWER7) are considered.
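
The idea of identifying hot pages and moving them closer can be shown with a toy model. This is a sketch under stated assumptions, not ASO's method: it assumes that a page is "hot" when its access count in a sampling window reaches a threshold, and that each page has a single home domain. The functions `find_hot_remote_pages` and `migrate`, and the parameter `hot_threshold`, are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List


def find_hot_remote_pages(
    accesses: Iterable[int],
    page_home: Dict[int, int],
    current_domain: int,
    hot_threshold: int,
) -> List[int]:
    """Return frequently accessed pages whose home is outside current_domain.

    accesses        page numbers touched by the workload in a sampling window
    page_home       maps page number to the domain that holds its memory
    hot_threshold   assumed minimum access count for a page to count as hot
    """
    counts = Counter(accesses)
    return sorted(
        page
        for page, n in counts.items()
        if n >= hot_threshold
        and page in page_home  # pages with an unknown home are skipped
        and page_home[page] != current_domain
    )


def migrate(pages: Iterable[int], page_home: Dict[int, int], current_domain: int) -> None:
    """Model migration by re-homing each page to the workload's domain."""
    for page in pages:
        page_home[page] = current_domain
```

For example, with accesses `[1, 1, 1, 2, 3, 3, 3, 3, 4]`, homes `{1: 0, 2: 1, 3: 1, 4: 1}`, a workload in domain 0, and a threshold of 3, only page 3 is selected: page 1 is hot but already local, and pages 2 and 4 are remote but rarely used.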
POWER7 and POWER7+ Optimization and Tuning Guide