Z13S In-Order And Out-Of-Order Core Execution Improvements - IBM z13s Technical Manual

Figure 3-10 shows how OOO core execution can reduce the run time of a program.
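Figure 3-10 z13s in-order and out-of-order core execution improvements
[Figure: two panels, "In-order core execution" and "Out-of-order core execution", each plotting Instructions 1 - 7 against Time. Legend: Dependency, Execution, Storage access. Callouts: L1 miss, Faster millicode execution, Better Instruction Delivery.]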
The left side of Figure 3-10 shows in-order core execution, where each instruction waits until
the previous instruction finishes. Instruction 2 has a large delay because of an L1 cache miss,
and all subsequent instructions wait until Instruction 2 finishes.
Using OOO core execution, which is shown on the right side of Figure 3-10, Instruction 4 can
start its storage access and run while Instruction 2 is waiting for data. This overlap is
possible only if no dependencies exist between the two instructions. When the L1 cache miss is
resolved, Instruction 2 can also start its run while Instruction 4 is running. Instruction 5 might
need the same storage data that is required by Instruction 4. As soon as this data is in the L1
cache, Instruction 5 starts running at the same time. The z13 superscalar PU core can have
up to 10 instructions/operations running per cycle. This technology results in a shorter run
time.
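The timing difference that is described above can be illustrated with a toy model. The following
Python sketch is not a model of the z13s pipeline: the latencies, the dependency lists, and the
names Instr, in_order_finish, and out_of_order_finish are all hypothetical, and the out-of-order
case assumes unlimited execution units and no issue-width limit. Treating Instruction 5 as
dependent on Instruction 4 is a simplification of the shared storage data that is described in
the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    name: int          # instruction number, as in Figure 3-10
    latency: int       # cycles from start to finish (hypothetical values)
    deps: tuple = ()   # numbers of earlier instructions whose results are needed


# Hypothetical program: Instruction 2 suffers an L1 cache miss (long latency),
# Instruction 3 needs its result, and Instruction 5 waits for Instruction 4.
PROGRAM = [
    Instr(1, 1),
    Instr(2, 10),             # L1 miss
    Instr(3, 1, deps=(2,)),
    Instr(4, 3),              # independent storage access
    Instr(5, 1, deps=(4,)),
    Instr(6, 1),
    Instr(7, 1, deps=(3, 5)),
]


def in_order_finish(program):
    """Each instruction starts only after the previous one finishes."""
    clock = 0
    for ins in program:
        clock += ins.latency
    return clock


def out_of_order_finish(program):
    """Each instruction starts as soon as all of its dependencies finish.

    Assumes that dependencies always point to earlier instructions.
    """
    finish = {}
    for ins in program:
        start = max((finish[d] for d in ins.deps), default=0)
        finish[ins.name] = start + ins.latency
    return max(finish.values())


if __name__ == "__main__":
    print("in-order cycles:    ", in_order_finish(PROGRAM))
    print("out-of-order cycles:", out_of_order_finish(PROGRAM))
```

With these assumed latencies, the in-order run takes 18 cycles and the out-of-order run takes
12 cycles, because Instructions 4, 5, and 6 complete while Instruction 2 is still waiting for
its data.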
Branch prediction
If the branch prediction logic of the microprocessor makes the wrong prediction, all
instructions in the parallel pipelines might need to be removed. A wrong branch prediction is
expensive in a high-frequency processor design. Therefore, the branch prediction techniques
that are used are important for avoiding as many mispredicted branches as possible.
For this reason, various history-based branch prediction mechanisms are used, as shown on
the in-order part of the z13s PU core logical diagram in Figure 3-9 on page 93. The branch
target buffer (BTB) runs ahead of instruction cache pre-fetches to prevent branch misses in
an early stage. Furthermore, a branch history table (BHT), in combination with a pattern
history table (PHT) and tagged multi-target prediction technology, offers a high branch
prediction success rate.
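To give a feel for how a history-based predictor works, the following Python sketch implements
a generic textbook branch history table with 2-bit saturating counters. This is an assumption
for illustration only: the source does not describe the internal organization of the z13s BHT,
PHT, or BTB, and the class name TwoBitBHT, the table size, and the indexing scheme are
hypothetical.

```python
class TwoBitBHT:
    """Textbook branch history table with 2-bit saturating counters.

    Counter values 0 and 1 predict "not taken"; 2 and 3 predict "taken".
    This is a generic illustration, not the z13s design.
    """

    def __init__(self, entries=1024):
        self.entries = entries
        self.counters = [1] * entries  # start at "weakly not taken"

    def _index(self, address):
        # Hypothetical indexing: low-order bits of the branch address.
        return address % self.entries

    def predict(self, address):
        return self.counters[self._index(address)] >= 2

    def update(self, address, taken):
        i = self._index(address)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


def accuracy(predictor, address, outcomes):
    """Fraction of outcomes that the predictor guesses correctly."""
    hits = 0
    for taken in outcomes:
        if predictor.predict(address) == taken:
            hits += 1
        predictor.update(address, taken)
    return hits / len(outcomes)


if __name__ == "__main__":
    # A loop branch: taken nine times, then not taken once, repeated ten times.
    loop_outcomes = ([True] * 9 + [False]) * 10
    print(accuracy(TwoBitBHT(), 0x4000, loop_outcomes))
```

Because the counter must be wrong twice in a row before the prediction flips, a single loop
exit does not disturb the "taken" prediction for the next pass through the loop. Pattern-based
tables such as a PHT extend this idea by also considering the recent sequence of outcomes.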
The z13s microprocessor improves the branch prediction throughput by using the new branch
prediction and instruction fetch front end.
IBM z13s Technical Guide
