Instrumentation stack
The instrumentation saves registers on the stack, dynamically allocating space at a default
offset below the current stack pointer. On AIX, this default offset is -10240, and on Linux it
is -1800. In some cases, especially in multi-threaded applications where the stack space is
divided between the threads, a deep calling sequence can leave the application quite close to
the end of the stack, and the instrumentation can then cause the application to fail.
To allocate the instrumentation save area closer to the current stack pointer, use the -iso option:
$ fdprpro -a instr my_prog -iso -300
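The risk described above comes down to simple arithmetic. The following is a minimal sketch, assuming a downward-growing stack and ignoring the size of the register save area itself (which the text does not state). The function name save_area_fits and the 1 KB headroom value are hypothetical and are not part of fdprpro.

```python
# Default offsets quoted in the text; everything else here is illustrative.
DEFAULT_OFFSET = {"AIX": -10240, "Linux": -1800}


def save_area_fits(headroom_bytes, offset):
    """Return True if (stack pointer + offset) still lies inside the stack.

    headroom_bytes: bytes left between the current stack pointer and the
                    end of the stack (a hypothetical measurement).
    offset:         negative offset from the stack pointer, as given to -iso.
    """
    return -offset <= headroom_bytes


# A thread with only 1 KB left below its stack pointer:
print(save_area_fits(1024, DEFAULT_OFFSET["Linux"]))  # False
print(save_area_fits(1024, -300))                     # True
```

With so little headroom, the default Linux offset would land beyond the end of the stack, while the -300 offset from the command above would not.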

6.3.6 Optimization

The optimization step is performed by running the following command:
$ fdprpro -a opt in [-o out] -f prof [opts...]
If out is not specified, the output file is in.fdpr. No profile is provided by default. If none is
specified or if the profile is empty, the resulting output binary file is not optimized.
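As an illustration of the command form and the default output name, here is a minimal sketch of a helper that assembles the argument list and reports where the output will be written. The helper name build_opt_command and the profile file name my_prog.prof are assumptions made for the example; only the -a opt, -o, and -f options and the in.fdpr naming rule come from the text.

```python
def build_opt_command(in_file, profile=None, out_file=None, extra_opts=()):
    """Return (argv, output_path) for the fdprpro optimization step."""
    argv = ["fdprpro", "-a", "opt", in_file]
    if out_file is not None:
        argv += ["-o", out_file]
    if profile is not None:
        argv += ["-f", profile]
    argv += list(extra_opts)
    # Without -o, the output file is the input name with ".fdpr" appended.
    output_path = out_file if out_file is not None else in_file + ".fdpr"
    return argv, output_path


argv, out = build_opt_command("my_prog", profile="my_prog.prof")
print(" ".join(argv))  # fdprpro -a opt my_prog -f my_prog.prof
print(out)             # my_prog.fdpr
```

The returned list can be passed to subprocess.run on a system where fdprpro is installed.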
Code reordering
Global code reordering works in two phases: making chains and reordering the chains.
The initial chains are sequentially ordered basic blocks, with branch conditions inverted where
necessary, so that branches between the basic blocks are mostly not taken. This
configuration makes instruction prefetching more efficient. Chains are terminated when the
heat (that is, execution count) goes below a certain threshold relative to the initial heat.
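The first phase can be pictured with a small greedy sketch. This is not the FDPR implementation: the seeding order, the tie-breaking, and the 0.1 threshold are assumptions chosen to illustrate the idea of following the hottest fall-through path until the heat drops too far below the heat of the chain's first block.

```python
def make_chains(block_heat, edge_heat, threshold=0.1):
    """Greedily group basic blocks into chains.

    block_heat: {block: execution count}
    edge_heat:  {(src, dst): times control flowed from src to dst}
    threshold:  a chain stops once the next block's heat is below
                threshold * heat of the chain's first block (assumed rule).
    """
    placed = set()
    chains = []
    # Seed chains from the hottest blocks first.
    for seed in sorted(block_heat, key=block_heat.get, reverse=True):
        if seed in placed:
            continue
        chain = [seed]
        placed.add(seed)
        current = seed
        while True:
            # Successors of the current block that are not yet in any chain.
            succs = [(count, dst) for (src, dst), count in edge_heat.items()
                     if src == current and dst not in placed]
            if not succs:
                break
            _, nxt = max(succs)  # hottest outgoing edge becomes the fall-through
            if block_heat[nxt] < threshold * block_heat[seed]:
                break
            chain.append(nxt)
            placed.add(nxt)
            current = nxt
        chains.append(chain)
    return chains


block_heat = {"A": 100, "B": 90, "C": 10, "D": 100}
edge_heat = {("A", "B"): 90, ("A", "C"): 10, ("B", "D"): 90, ("C", "D"): 10}
print(make_chains(block_heat, edge_heat))  # [['A', 'B', 'D'], ['C']]
```

In the resulting layout, the hot path A, B, D is sequential, so the conditional branch at the end of A is taken only on the rare path to C.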
The second phase orders chains by successively merging the two most strongly linked
chains, based on how frequent the calls between the chains are. Combining chains crosses
function boundaries. Thus, a function can be broken into multiple chunks, and pieces of
different functions are placed close together when calls, branches, and returns between
them are frequent. This approach improves code locality and thus i-cache and page
table efficiency.
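The merging phase can be sketched in the same spirit. This is an illustrative simplification, not the FDPR algorithm: link weights are given per pair of chains, the merged chain simply appends one chain after the other, and the chain and function names are hypothetical.

```python
def order_chains(chains, link_heat):
    """Merge chains pairwise, strongest link first, and return the layout.

    chains:    list of chains, each a list of block names
    link_heat: {(i, j): control transfers between chains[i] and chains[j]}
    """
    groups = {i: list(chain) for i, chain in enumerate(chains)}
    # Treat links as undirected and accumulate both directions.
    weights = {}
    for (i, j), w in link_heat.items():
        if i != j:
            key = frozenset((i, j))
            weights[key] = weights.get(key, 0) + w

    while weights:
        pair = max(weights, key=weights.get)  # most strongly linked pair
        a, b = sorted(pair)
        groups[a].extend(groups.pop(b))       # place b's blocks right after a's
        # Redirect every remaining link of b to the merged group a.
        merged = {}
        for key, w in weights.items():
            if key == pair:
                continue
            new_key = frozenset(a if x == b else x for x in key)
            if len(new_key) == 2:
                merged[new_key] = merged.get(new_key, 0) + w
        weights = merged

    layout = []
    for i in sorted(groups):                  # unlinked chains keep their order
        layout.extend(groups[i])
    return layout


chains = [["main_hot"], ["helper_hot"], ["main_cold"], ["logger"]]
link_heat = {(0, 1): 500, (0, 2): 3, (1, 3): 40}
print(order_chains(chains, link_heat))
# ['main_hot', 'helper_hot', 'logger', 'main_cold']
```

Here the hot part of main ends up next to the helper it calls most often, while its cold part is pushed to the end, which is the kind of cross-function placement the text describes.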
You use the following options for code reordering:
--reorder-code (-RC): This component does most of the work of global code
reordering. Use --rcaf to determine the aggressiveness level:
– 0: no change
– 1: standard (default)
– 2: most aggressive.
Use --rcctf to lower the threshold for terminating chains. Use -pp to preserve function
integrity and -pc to preserve CSECT integrity (AIX only). These two options limit global
code reordering and might be requested for ease of debugging.
--branch-folding (-bf) and --branch-prediction (-bp): These options control important
parts of the code reordering process. The -bf option folds a branch to a branch into a
single branch. The -bp option sets the static branch prediction bit when the taken or
not-taken statistics justify it.
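The two transformations named in the last item can be illustrated with a short sketch. It is a conceptual model only: the label names, the representation of jump-only blocks as a dictionary, and the 0.9 decision ratio are assumptions, and the text does not state what statistics fdprpro actually requires before setting a prediction bit.

```python
def fold_target(target, jump_only):
    """Follow jump-only blocks until reaching a block with real code.

    jump_only: {label: label that a block consisting only of a branch jumps to}
    """
    seen = set()
    while target in jump_only and target not in seen:  # guard against cycles
        seen.add(target)
        target = jump_only[target]
    return target


def predict_taken(taken, not_taken, min_ratio=0.9):
    """Return True or False for a static hint, or None if the profile is not decisive."""
    total = taken + not_taken
    if total == 0:
        return None
    if taken / total >= min_ratio:
        return True
    if not_taken / total >= min_ratio:
        return False
    return None


jump_only = {"L1": "L2", "L2": "L3"}
print(fold_target("L1", jump_only))  # L3
print(predict_taken(95, 5))          # True
print(predict_taken(50, 50))         # None
```

A branch that originally targeted L1 can be retargeted directly at L3, and a branch with an evenly split profile is left without a static hint.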
Chapter 6. Compilers and optimization tools for C, C++, and Fortran