Lightweight Tuning And Optimization Guidelines - IBM Power7 Optimization And Tuning Manual

3. Deep performance optimization guidelines
Deep performance analysis covers performance tools and general strategies for
identifying and fixing application bottlenecks. This type of analysis requires more
familiarity with performance tooling and analysis techniques, sometimes a deeper
understanding of the application internals, and often a more dedicated and lengthy
effort. A simpler analysis is often enough to identify the serious bottlenecks in an
application, but an exhaustive search for every opportunity to increase performance
calls for a more detailed investigation.
Performance improvement: For a moderately serious performance effort, consider this
the last activity to undertake, and limit it to the simpler analysis steps. The more
complex iterative analysis is reserved for only the most performance-critical applications.
This section provides only minimal background on the guidance provided. Detailed material
about these topics is incorporated in the chapters that follow and in the appendixes. The
following chapters and appendixes also cover many other performance topics that are not
addressed here.
Guidance for POWER7 and POWER7+: The guidance that is provided in this book
specifically applies to POWER7 and POWER7+ processor chips and systems. The
POWER7+ processor is a superset of the POWER7 processor, so all optimizations
described for POWER7 apply equally to POWER7+. The guidance that is provided also
generally applies to previous generations of POWER processor chips and systems,
including POWER5 and POWER6. When our guidance is not applicable to all generations
of Power Systems, we note that for you.

1.5.1 Lightweight tuning and optimization guidelines

This section covers building and performance testing applications on POWER7, and gives a
brief introduction to the most important simple performance tuning opportunities that are
identified for POWER7. More details about these and other opportunities are presented in the
later chapters of this guide.
Performance test beds and workloads
In performance work, when you are tuning and optimizing an application for a particular
processor, you must run and measure performance levels on that processor. Although there
are some characteristics that are shared among processor chips in the same family, each
generation of processor chip has unique performance characteristics. Optimizing code for
POWER7 requires that you set up a test bed on a POWER7 system.
The POWER7+ processor has higher clock speeds and larger caches than POWER7, and
applications should see higher performance on that new chip. Aside from those differences,
the performance characteristics of the two chips are the same. A performance effort
specifically targeting POWER7+ should use a POWER7+ system, but otherwise a POWER7
system can be used.
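As a minimal sketch of enforcing this before a measurement run, the following Python helper checks which processor generation a test bed reports. It assumes a Linux system whose `/proc/cpuinfo` contains a `cpu` line naming the processor (for example `POWER7 (architected)`); the exact wording of that line can vary by kernel and firmware, and AIX systems need a different mechanism. The function names and the rule that a general POWER7 effort also accepts a POWER7+ system are illustrative assumptions, not part of this guide.

```python
import re
from typing import Optional


def power_generation(cpuinfo_text: str) -> Optional[str]:
    """Return a token such as 'POWER7' or 'POWER7+' from cpuinfo-style text.

    Assumes a line of the form 'cpu : POWER7 (architected), ...';
    returns None when no such line is found.
    """
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "cpu":
            match = re.search(r"POWER\d+\+?", value)
            if match:
                return match.group(0)
    return None


def is_valid_test_bed(cpuinfo_text: str, target: str = "POWER7") -> bool:
    """Decide whether this system is a suitable test bed for the target.

    Assumption: an effort that targets POWER7+ must run on POWER7+,
    while a general POWER7 effort may run on either chip.
    """
    found = power_generation(cpuinfo_text)
    if found is None:
        return False
    if target == "POWER7":
        return found in ("POWER7", "POWER7+")
    return found == target
```

A benchmark harness could read `/proc/cpuinfo`, pass its contents to `is_valid_test_bed`, and refuse to record results when the check fails, so that measurements from the wrong processor generation never enter the comparison.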
You want to see good performance across a range of newer systems, with a special emphasis
on optimizing for the latest design. For Power Systems, the previous POWER6 generation is
still commonly used. For this reason, it is best to have multiple test bed environments: a
POWER7 system for most optimization work and a POWER6 system for limited testing to
ensure that all tuning is beneficial on the previous generation of hardware.
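The cross-generation check described above can be sketched as a small comparison of baseline and tuned timings on each test bed. This is an illustration only: the function names, the 2% tolerance, the use of the median, and the timing values are all hypothetical and are not taken from this guide.

```python
from statistics import median
from typing import Dict, List, Tuple

# Per-system timings in seconds: (baseline runs, tuned runs).
Timings = Dict[str, Tuple[List[float], List[float]]]


def speedup(baseline_secs: List[float], tuned_secs: List[float]) -> float:
    """Ratio of median baseline time to median tuned time (>1 means faster)."""
    return median(baseline_secs) / median(tuned_secs)


def regressions(results: Timings, tolerance: float = 0.02) -> Dict[str, float]:
    """Return the systems where the tuned build is slower than the baseline
    by more than the given tolerance, mapped to their speedup ratio."""
    slow = {}
    for system, (baseline, tuned) in results.items():
        ratio = speedup(baseline, tuned)
        if ratio < 1.0 - tolerance:
            slow[system] = ratio
    return slow


# Hypothetical measurements: the change helps on POWER7 but hurts on POWER6.
example_results: Timings = {
    "POWER7": ([10.0, 10.2, 9.8], [8.0, 8.1, 7.9]),
    "POWER6": ([12.0, 12.1, 11.9], [13.0, 13.2, 12.8]),
}

if __name__ == "__main__":
    for system, ratio in regressions(example_results).items():
        print(f"{system}: tuned build is slower (speedup {ratio:.2f})")
```

Using the median of several runs reduces the influence of a single noisy measurement; a real effort would pick a run count and tolerance that match the variability of its own workload.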
POWER7 and POWER7+ Optimization and Tuning Guide
