IBM A2 User Manual

Appendix D. Instruction Execution Performance and Code Optimizations

The instruction timing information and code optimization guidelines provided in this appendix can help
compiler developers and application programmers produce high-performance code and accurately analyze
instruction execution performance. This appendix does not identify every micro-architectural
characteristic that could affect instruction execution time within the A2 core. It does provide a
high-level overview of basic instruction operation and pipeline performance, which is sufficient to
analyze the performance of code sequences to a high degree of accuracy.
D.1 A2 Pipeline Overview
As described in Overview on page 45, the A2 core is an in-order processor core capable of issuing two
instructions from different threads per cycle: a single instruction to the fixed-point pipeline and a separate
instruction to the floating-point pipeline. Figure D-1 provides an illustration of the pipeline stages of the A2
core.
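
The issue constraint described above can be pictured with a small model. The following is only a minimal sketch, not the A2 issue logic: the names `issue_cycle` and `ready` are hypothetical, and the thread-selection policy (lowest thread ID first) is an assumption made for illustration, since this section does not describe how the core arbitrates between threads. Each thread offers only its oldest unissued instruction (in-order issue), and at most one instruction goes to each pipeline per cycle, so the two issued instructions always come from different threads.

```python
from typing import Dict, Optional, Tuple

XU = "XU"  # branch, fixed-point, load/store pipeline
FU = "FU"  # floating-point pipeline


def issue_cycle(ready: Dict[int, str]) -> Tuple[Optional[int], Optional[int]]:
    """Pick the threads that issue in one cycle.

    `ready` maps a thread ID to the pipeline (XU or FU) required by that
    thread's oldest unissued instruction. Returns (xu_thread, fu_thread);
    either entry is None when no thread has work for that pipeline.
    """
    xu_thread: Optional[int] = None
    fu_thread: Optional[int] = None
    # Assumed policy (not from the manual): scan threads in ascending ID order.
    for tid in sorted(ready):
        if ready[tid] == XU and xu_thread is None:
            xu_thread = tid
        elif ready[tid] == FU and fu_thread is None:
            fu_thread = tid
    return xu_thread, fu_thread


# Threads 0 and 1 both want the fixed-point pipeline; threads 2 and 3 want
# the floating-point pipeline. Only one instruction per pipeline can issue.
print(issue_cycle({0: XU, 1: XU, 2: FU, 3: FU}))  # (0, 2)
```

In this model, two threads competing for the same pipeline cannot both issue in the same cycle, which is why mixing fixed-point and floating-point work across threads can raise the achieved issue rate.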

Figure D-1. A2 Pipeline Structure

[Figure D-1 diagram not reproduced. Labels recoverable from the figure:
Units: Instruction Unit (IU); Branch, Fixed Point, Load/Store (XU); Floating Point (FU).
Stages: iu0 through iu6, rf0, rf1, ex1 through ex6.
Other labels: 4 threads, I$, I$dir, IBuffer, Dependency, ucode, Issue, GPRs, FPRs, CR, ERAT, D$, D$dir, Completion, To L2.]

A2 Processor User's Manual, Version 1.3, October 23, 2012. Page 833 of 864
