IBM Power7 Optimization And Tuning Manual page 137

Architecture-specific optimizations

Here are some architecture-specific optimizations:

--machine tgt (-m tgt): FDPR optimizations include general optimizations that are based on a high-level representation of the program as control and data flow, in addition to peephole optimizations that rely on specific architecture features. These optimizations can perform better when tuned for a specific platform. The -m flag lets the user specify the target machine model when it is known and the program is not intended for use on multiple target platforms. The default target is POWER7.

--align-code code (-A code): Optimizing the alignment and the placement of the code is crucial to the performance of the program. Correct alignment can improve instruction fetching and dispatching. The alignment algorithm in FDPR uses different techniques that are based on the target platform. Some techniques are generic for the Power Architecture, and others follow the dispatch rules of the specific machine model. If code is 1 (the default), FDPR applies a standard alignment algorithm that is adapted for the selected target machine (see -m in the previous bullet point). If code is 2, FDPR applies a more advanced version, using dispatch rules and other heuristics to decide how the program code chunks are placed relative to i-cache sectors, again based on the selected target. A value of 0 disables the alignment algorithm.
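As a rough illustration of what aligning code to a boundary means, the following minimal sketch lays out a few code chunks and pads only the hot ones so that they start on a boundary. It is not the FDPR algorithm, whose heuristics are not described here. The names padding_to_align, place_blocks, and SECTOR_BYTES are hypothetical, and the 32-byte boundary is an assumed value chosen for illustration only.

```python
SECTOR_BYTES = 32  # assumed boundary for illustration, not a documented FDPR value


def padding_to_align(address: int, boundary: int) -> int:
    """Return the bytes of padding needed so `address` becomes a multiple of `boundary`."""
    if boundary <= 0 or boundary & (boundary - 1):
        raise ValueError("boundary must be a positive power of two")
    return (-address) & (boundary - 1)


def place_blocks(block_sizes, start, boundary, hot):
    """Lay out blocks sequentially, aligning only the blocks flagged as hot.

    Returns one (start_address, padding_inserted) pair per block.
    """
    layout = []
    address = start
    for size, is_hot in zip(block_sizes, hot):
        pad = padding_to_align(address, boundary) if is_hot else 0
        address += pad
        layout.append((address, pad))
        address += size
    return layout


# Three hypothetical code chunks (sizes in bytes); the first and last are hot.
layout = place_blocks([20, 16, 40], start=0x1000, boundary=SECTOR_BYTES,
                      hot=[True, False, True])
for block_start, pad in layout:
    print(hex(block_start), pad)
```

Padding costs code size, which is one reason a tool would align only selected chunks rather than every one.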

Function optimization

FDPR includes a number of function-level optimizations that are based on detailed data flow analysis (DFA). With DFA, optimizations can determine the data that is contained in each register at each point in the function, and whether that value is used later.

The function optimizations are:

--killed-regs (-kr): A register is considered killed at a point in the function if its value is not used on any ensuing path. FDPR uses the Power ABI convention that defines which registers are non-volatile (NV) across function calls. NV registers that are used inside a function are saved in its prolog and restored in its epilog. The -kr optimization analyzes called functions, looking for save and restore instructions of killed NV registers. If the register is killed at the calling site, then the save and restore instructions for this register are unnecessary for that call. The optimization must account for cases where the register might be alive when the function is called. When needed, the optimization might also reassign (rename) registers at the calling side to ensure that an NV register is indeed killed and can be optimized.

--hco-reschedule (-hr): The optimization analyzes the flow through hot basic blocks and looks for instructions that can be moved to dominating colder basic blocks (basic block b1 dominates b2 if all paths to b2 first go through b1). For example, an instruction that loads a constant to a register is a candidate for such motion.

--simplify-early-exit factor (-see factor): Sometimes a function starts with an early exit condition so that if the condition is met, the whole body of the function is ignored. If the condition is commonly taken, it makes sense to avoid saving the registers in the prolog and restoring them in the epilog. The -see optimization detects such a condition and provides a reduced epilog that restores only registers modified by computing the condition. If factor is 1, a more aggressive optimization is performed where the prolog is also optimized.
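The killed-register test that --killed-regs relies on can be illustrated with a small backward liveness analysis. This is a minimal sketch under stated assumptions, not FDPR's implementation: each instruction is modeled only as a pair of (registers defined, registers used), the control flow graph is a plain dictionary, and the names live_in_sets, killed_after, and exit_live are hypothetical.

```python
def _live_out(block, live_in, successors, exit_live):
    """Registers live on leaving `block`: union of the successors' live-in sets."""
    succs = successors.get(block, ())
    if not succs:
        return set(exit_live)  # assumed live set at function exit
    return set().union(*(live_in[s] for s in succs))


def live_in_sets(blocks, successors, exit_live=frozenset()):
    """Iterate backward liveness to a fixed point.

    blocks: block name -> ordered list of (defs, uses) tuples.
    Returns block name -> set of registers live at block entry.
    """
    live_in = {name: set() for name in blocks}
    changed = True
    while changed:
        changed = False
        for name, instrs in blocks.items():
            live = _live_out(name, live_in, successors, exit_live)
            # Walk the block backward: a definition ends liveness, a use starts it.
            for defs, uses in reversed(instrs):
                live = (live - set(defs)) | set(uses)
            if live != live_in[name]:
                live_in[name] = live
                changed = True
    return live_in


def killed_after(block, reg, blocks, successors, exit_live=frozenset()):
    """True if the value of `reg` is not used on any path leaving `block`."""
    live_in = live_in_sets(blocks, successors, exit_live)
    return reg not in _live_out(block, live_in, successors, exit_live)


# Hypothetical calling site followed by a two-way branch.
blocks = {
    "call_site": [((), ("r3",))],                        # the call, modeled as a use of r3
    "then": [(("r31",), ("r3",)), ((), ("r31",))],       # r31 is redefined before it is read
    "else": [(("r31",), ()), ((), ("r30", "r31"))],      # r30 is read without redefinition
}
successors = {"call_site": ["then", "else"]}

print(killed_after("call_site", "r31", blocks, successors))  # r31 is overwritten on both paths
print(killed_after("call_site", "r30", blocks, successors))  # r30 is still read on one path
```

In this model r31 is killed after the call because both paths overwrite it before reading it, so saving and restoring it in the callee would be wasted work for this call, while r30 is still needed on one path. Passing the non-volatile registers in exit_live models the caller's own obligation to preserve them at its exit.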

Chapter 6. Compilers and optimization tools for C, C++, and Fortran


This manual is also suitable for:

Power7+
