Floating-Point Unit (Fpu); Load/Store Unit (Lsu) - IBM PowerPC 604 User Manual


Each SCIU consists of three single-cycle subunits: a fast adder/comparator, a subunit for
logical operations, and a subunit for performing rotates, shifts, and count-leading-zero
operations. These subunits handle all one-cycle arithmetic instructions; only one subunit
can execute an instruction at a time.
The MCIU consists of a 32-bit integer multiplier/divider and supports early exit on
16- x 32-bit multiplication operations. The MCIU executes mfspr and mtspr instructions,
which are used to read and write special-purpose registers. The MCIU can execute an
mtspr or mfspr instruction at the same time that it executes a multiply or divide instruction.
These instructions are allowed to complete out-of-order.
Note that the load and store instructions that update their address base register (specified by
the rA operand) pass the update results on the MCIU's result bus. Otherwise, the MCIU's
result bus is dedicated to MCIU operations.
1.2.2.2 Floating-Point Unit (FPU)
The FPU, shown in Figure 1-1 and Figure 1-2, is a single-pass, double-precision execution
unit; that is, both single- and double-precision operations require only a single pass, with a
latency of three cycles.
As the decode/dispatch unit issues instructions to the FPU's two reservation stations, source
operand data may be accessed from the FPRs, the floating-point rename buffers, or the
result buses. Results in turn are written to the floating-point rename buffers and to the
reservation stations and are made available to subsequent instructions. Instructions are
executed from each reservation station in dispatch order.
1.2.2.3 Load/Store Unit (LSU)
The LSU, shown in Figure 1-1 and Figure 1-2, transfers data between the data cache and
the result buses, which route data to other execution units. The LSU supports the address
generation and handles any alignment for transfers to and from system memory. The LSU
also supports cache control instructions and load/store multiple/string instructions. As
noted above, load and store instructions that update the base address register pass their
results on the MCIU's result bus. This is the only exception to the dedicated use of result
buses.
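The address generation and base-register update described above can be sketched in software. The following is a minimal behavioral model, not a description of the 604's hardware: it assumes the standard PowerPC D-form rule EA = (rA|0) + EXTS(d) with 32-bit wraparound, and the names `effective_address`, `load_word_update`, `gpr`, and `mem` are hypothetical, chosen only for illustration. Memory is modeled as a dictionary keyed by EA.

```python
MASK32 = 0xFFFFFFFF  # effective addresses are 32 bits wide


def sign_extend16(d):
    """Sign-extend a 16-bit displacement field."""
    d &= 0xFFFF
    return d - 0x10000 if d & 0x8000 else d


def effective_address(gpr, ra, d):
    """EA = (rA|0) + EXTS(d), wrapped to 32 bits.

    An rA field of 0 selects the literal value 0, not the contents of GPR0.
    """
    base = 0 if ra == 0 else gpr[ra]
    return (base + sign_extend16(d)) & MASK32


def load_word_update(gpr, mem, rd, ra, d):
    """Model an update-form word load: load from EA, then write EA back to rA.

    In this sketch rA = 0 and rA = rD are rejected, since those encodings
    are invalid forms for update-form loads.
    """
    if ra == 0 or ra == rd:
        raise ValueError("invalid form for an update-form load")
    ea = effective_address(gpr, ra, d)
    gpr[rd] = mem[ea]  # load result (returned on the LSU's result bus)
    gpr[ra] = ea       # base-register update (passed on the MCIU's result bus)
    return ea
```

The two register writes in `load_word_update` correspond to the two results an update-form load produces, which is why a second result bus is needed for these instructions.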
The LSU includes a 32-bit adder dedicated to EA calculation. Data alignment logic
manipulates data to support aligned or misaligned transfers with the data cache. The LSU's
load and store queues are used to buffer instructions that have been executed and are
waiting to be completed. The queues are used to monitor data dependencies generated by
data forwarding and out-of-order instruction execution, ensuring a sequential model.
The LSU allows load operations to precede pending store operations and resolves any
dependencies incurred when a pending store is to the same address as the load. If such a
dependency exists, the LSU delays the load operation until the correct data can be
forwarded. If only the low-order 12 bits of the EAs match, both addresses may be aliases
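
The address comparison described here can be illustrated with a short sketch. It is a simplified model under stated assumptions, not the LSU's actual logic: the function name `classify_load` and the labels it returns are hypothetical, access sizes and partial overlaps are ignored, and the action taken for a possible alias is not modeled because it is not stated in this excerpt. The low-order 12 bits correspond to the byte offset within a 4-Kbyte page, which address translation leaves unchanged.

```python
PAGE_OFFSET_MASK = 0xFFF  # low-order 12 bits of an effective address


def classify_load(load_ea, pending_store_eas):
    """Classify a load against the EAs of pending (uncompleted) stores.

    Returns one of three hypothetical labels:
      "wait_for_store": a pending store has the same EA, so the load is
                        delayed until the correct data can be forwarded
      "possible_alias": only the low-order 12 bits match some store EA
      "bypass":         no match, so the load may precede the pending stores
    """
    result = "bypass"
    for store_ea in pending_store_eas:
        if store_ea == load_ea:
            return "wait_for_store"
        if (store_ea & PAGE_OFFSET_MASK) == (load_ea & PAGE_OFFSET_MASK):
            result = "possible_alias"
    return result
```

For example, a load from 0x1000 with a pending store to 0x5000 is classified as a possible alias, because both EAs have a page offset of 0x000 even though the full addresses differ.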