IBM Power7 Optimization And Tuning Manual page 191

For example, the long double data type is supported on both Intel x86 and Power, but has a
different size, data range, and implementation on each. The x86 80-bit floating-point format
is implemented in hardware and is usually faster than the AIX long double, which is
implemented as an algorithm using two 64-bit doubles. Neither one is fully IEEE-compliant,
and both must be avoided in cross-platform application code and libraries.
Another example is a small Intel-specific optimization that uses inline x86 assembler and
conditionally provides a generic C implementation for other platforms. In most cases, GCC
provides an equivalent built-in function that generates the optimal code for each platform.
Replacing inline assembler with GCC built-in functions makes the application more portable
and provides equivalent or better performance on all platforms.
To use the MA tool, complete the following steps:
1. Import your project into the SDK.
2. Select Project properties.
3. Check the Linux/x86 to PowerLinux application Migration check box under C/C++
General/Code Analysis.
Hotspot profiling
IBM SDK for PowerLinux integrates the Linux Oprofile hardware event profiling with the
application source code view. This configuration is a convenient way to do hotspot analysis.
The integrated Linux Tools profiler focuses on an application that is selected from the current
SDK project.
After you run the application, the SDK opens an Oprofile tab in the console window. This
window shows a nested set of twisties, starting with the event (cycles by default), then
program/library, function, and source line (within function). The developer drills down by
opening the twisties in the profile window, opening the next level of detail. Items are ordered
by profile frequency with highest frequency first. Clicking the function or line number entries in
the profile window causes the source view to jump to the corresponding source file or
line number.
This process is a convenient way to do hotspot analysis, focusing only on the top three to five
items at each level in the profile. Examine the source code for algorithmic problems, excess
conversions, unneeded debug code, and so on, and make the appropriate source
code changes.
With your application code (or subset) imported into the SDK, it is easy to edit, compile, and
profile code changes and verify improvements. As the developer makes code improvements,
the hotspots in the profile change. Repeat this process until performance is satisfactory or all
the profile entries at the function level are in the low single digits.
To use the integrated profiler, right-click the project and select Profile As > Profile with
Oprofile. If your project contains multiple applications, or the application needs setup or
inputs to run the specific workload, create Profile Configurations as needed.
Detailed analysis with the Source Code Advisor
Hotspot analysis might not find all of the latent performance problems, especially coding style
and some machine-specific hazards. These problems tend to be diffused across the
application, and do not show up in hotspot analysis. Common examples of machine hazards
include address translation, cache misses, and branch mispredictions.
Appendix B. Performance tooling and empirical performance analysis
