Timing Considerations; General Instruction Flow - IBM PowerPC 604 User Manual

6.4 Timing Considerations
A superscalar machine is one that can issue multiple instructions concurrently from a
conventional linear instruction stream. The 604 is a true superscalar implementation of the
PowerPC architecture since a maximum of four instructions can be issued to the execution
units during each clock cycle. Although a superscalar implementation complicates
instruction timing, these complications are transparent to the functionality of software.
While the 604 appears to the programmer to execute instructions in sequential order, the
604 provides increased performance by executing multiple instructions at a time, and by
using hardware to manage dependencies.
When an instruction is issued, the register file places the appropriate source data on the
appropriate source bus. The corresponding execution unit then reads the data from the bus.
The register files and source buses have sufficient bandwidth to allow the dispatching of
four instructions per clock.
If an operand is unavailable, the instruction is kept in a reservation station until the
operand becomes available.
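The holding behavior of a reservation station can be illustrated with a small software model. The following Python sketch is not taken from the 604 design; the class name `ReservationStation`, the helper `dispatch`, and the register names are hypothetical and chosen only for illustration. It shows an instruction waiting until every source operand has arrived.

```python
from dataclasses import dataclass, field


@dataclass
class ReservationStation:
    """Toy model: holds one instruction until all source operands are ready."""
    opcode: str
    sources: tuple                       # hypothetical source register names
    operands: dict = field(default_factory=dict)

    def capture(self, reg, value):
        """Record an operand that has just become available."""
        if reg in self.sources:
            self.operands[reg] = value

    def ready(self):
        """True once every source operand has been captured."""
        return all(reg in self.operands for reg in self.sources)


def dispatch(opcode, sources, register_file):
    """Dispatch an instruction, reading the operands that are already available."""
    rs = ReservationStation(opcode, tuple(sources))
    for reg in sources:
        if reg in register_file:         # operand is available at dispatch time
            rs.capture(reg, register_file[reg])
    return rs


# r3 is available; r4 is still being produced by an earlier instruction.
rs = dispatch("add", ["r3", "r4"], {"r3": 10})
print(rs.ready())      # the instruction is held in the reservation station
rs.capture("r4", 32)   # the missing operand arrives
print(rs.ready())      # the instruction can now execute
```

In this model an instruction with all operands present at dispatch is ready immediately, so the reservation station adds no delay in the common case.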
The 604 contains the following execution units that operate independently and in parallel:
• Branch processing unit (BPU)
• Two 32-bit single-cycle integer units (SCIU)
• One 32-bit multiple-cycle integer unit (MCIU)
• 64-bit floating-point unit (FPU)
• Load/store unit (LSU)
As shown in Figure 1-1, the BPU directs the program flow with the aid of a dynamic branch
prediction mechanism. The instruction unit determines to which of the five other execution
units an instruction is dispatched.
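The routing decision described above can be sketched in software. The following Python example is a simplified model under stated assumptions: the instruction classes, the unit labels `SCIU0` and `SCIU1`, the function `route`, and the rule that dispatch stops at the first instruction with no free unit are all hypothetical and are not specified in this section. Only the set of five units and the limit of four instructions per clock come from the text.

```python
# Illustrative mapping from instruction class to candidate execution units.
# The class names are assumptions made for this sketch.
UNITS_FOR_CLASS = {
    "int_simple": ("SCIU0", "SCIU1"),   # two single-cycle integer units
    "int_multi": ("MCIU",),             # one multiple-cycle integer unit
    "float": ("FPU",),
    "load_store": ("LSU",),
}


def route(instr_classes, max_per_cycle=4):
    """Assign instructions to free units for one clock, in program order."""
    busy = set()
    assigned = []
    for cls in instr_classes[:max_per_cycle]:
        free = [unit for unit in UNITS_FOR_CLASS[cls] if unit not in busy]
        if not free:
            break                       # assumed rule: stop at the first conflict
        busy.add(free[0])
        assigned.append((cls, free[0]))
    return assigned


# Four instructions that each find a free unit in the same clock.
for cls, unit in route(["int_simple", "int_simple", "float", "load_store"]):
    print(cls, "->", unit)
```

Two instructions that need the same single unit, such as two multiple-cycle integer operations, cannot both be assigned in one clock in this model; the second waits for a later cycle.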
6.4.1 General Instruction Flow
When the IU or FPU finishes executing an instruction, it places the resulting data, if any,
into one of the GPR, FPR, or condition register rename registers. The results are then stored
into the correct register file during the write-back stage.
If a subsequent instruction is waiting for this data, it is forwarded from the result buses
directly into the appropriate execution unit for the immediate execution of the waiting
instruction. This allows a data-dependent instruction to be executed without waiting for the
data to be written into the register file and then read back out again. This feature, known
as feed forwarding, significantly shortens the time the machine may stall on data dependencies.
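The effect of feed forwarding on a chain of dependent instructions can be estimated with a simple cycle count. The following Python sketch is a toy model only: the function names and the one-cycle write-back and register-read latencies are assumptions made for illustration, not 604 timing figures. Each instruction is assumed to execute in a single cycle.

```python
def consumer_start_cycle(producer_finish, forwarding,
                         writeback_latency=1, reg_read_latency=1):
    """Earliest cycle in which a dependent instruction can execute (toy model)."""
    if forwarding:
        # The result is taken from the result bus in the next cycle.
        return producer_finish + 1
    # Otherwise the result is written back and then read from the register file.
    return producer_finish + writeback_latency + reg_read_latency + 1


def chain_finish(length, forwarding):
    """Cycle in which the last of `length` dependent single-cycle instructions finishes."""
    finish = 1                           # the first instruction executes in cycle 1
    for _ in range(length - 1):
        finish = consumer_start_cycle(finish, forwarding)
    return finish


print(chain_finish(4, forwarding=True))    # 4
print(chain_finish(4, forwarding=False))   # 10
```

Under these assumed latencies, each dependent instruction saves two cycles when its operand is forwarded instead of being read back from the register file.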
Chapter 6. Instruction Timing
