Integer Divide Latency - Motorola MPC8240 User Manual

Integrated host processor with integrated PCI

• The fetch pipeline stage primarily involves retrieving instructions from the memory
system and determining the location of the next instruction fetch. Additionally, the
BPU decodes branches during the fetch stage and folds out branch instructions
before the dispatch stage if possible.
• The dispatch pipeline stage is responsible for decoding the instructions supplied by
the instruction fetch stage, and determining which of the instructions are eligible to
be dispatched in the current cycle. In addition, the source operands of the
instructions are read from the appropriate register file and dispatched with the
instruction to the execute pipeline stage. At the end of the dispatch pipeline stage,
the dispatched instructions and their operands are latched by the appropriate
execution unit.
• During the execute pipeline stage, each execution unit that has an executable
instruction executes the selected instruction (perhaps over multiple cycles), writes
the instruction's result into the appropriate rename register, and notifies the
completion stage that the instruction has finished execution. In the case of an internal
exception, the execution unit reports the exception to the completion/writeback
pipeline stage and discontinues instruction execution until the exception is handled.
The exception is not signaled until that instruction is the next to be completed.
Execution of most load/store instructions is also pipelined. The load/store unit has
two pipeline stages. The first stage is for effective address calculation and MMU
translation, and the second stage is for accessing the data in the cache.
• The complete/writeback pipeline stage maintains the correct architectural machine
state and transfers the contents of the rename registers to the GPRs and FPRs as
instructions are retired. If the completion logic detects an instruction causing an
exception, all following instructions are cancelled, their execution results in rename
registers are discarded, and instructions are fetched from the correct instruction
stream.
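
The interaction between the execute and complete/writeback stages described above can be illustrated with a small model. This is a minimal sketch, not the core's actual logic: the `Entry` class, the `retire` function, and the list-based queue are hypothetical names chosen for illustration, and the sketch ignores how many instructions may retire per cycle. It shows only that results wait in rename registers until they can be retired in program order, and that an exception cancels all following instructions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    """One in-flight instruction in a hypothetical completion queue."""
    dest: str                      # architectural register name, e.g. "r3"
    result: Optional[int] = None   # value held in the rename register
    finished: bool = False         # execution unit has reported completion
    exception: bool = False        # execution unit has reported an exception


def retire(queue, gprs):
    """Retire finished entries in program order.

    Copies rename-register results into `gprs` until the oldest entry is
    either unfinished (stop and wait) or has raised an exception (discard
    it and every following entry). Returns the number of entries retired
    before the excepting entry, or None if no exception was reached.
    """
    retired = 0
    while queue:
        head = queue[0]
        if head.exception:
            queue.clear()          # cancel following instructions, drop renames
            return retired
        if not head.finished:
            break                  # in order: younger results must wait
        gprs[head.dest] = head.result
        queue.pop(0)
        retired += 1
    return None


gprs = {"r3": 0, "r4": 0, "r5": 0}
queue = [
    Entry("r3", result=7, finished=True),
    Entry("r4", exception=True),
    Entry("r5", result=9, finished=True),   # finished, but follows the exception
]
faulting = retire(queue, gprs)
```

In this run only `r3` is updated. The result destined for `r5` is discarded even though its instruction finished executing, because it follows the instruction that caused the exception.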
The processor core provides support for single-cycle store operations and provides an
adder/comparator in the SRU that allows the dispatch and execution of multiple integer add
and compare instructions on each cycle.
Integer divide performance has been improved in the processor core. A divide
instruction takes half as many cycles to execute as described in the
MPC603e User's Manual. The new latency is shown in Table 5-9.
Table 5-9. Integer Divide Latency

Primary Opcode   Extended Opcode   Mnemonic      Form   Unit   Cycles
31               459               divwu[o][.]   xo     IU     20
31               491               divw[o][.]    xo     IU     20

Chapter 5. PowerPC Processor Core, Instruction Timing
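
The latencies in Table 5-9 can be used for rough cycle budgeting. The following is a minimal sketch under stated assumptions: the names `DIVIDE_LATENCY`, `divide_cycles`, and `cycles_to_microseconds` are hypothetical, the 200 MHz core clock is an assumed example value, and the estimate assumes each divide occupies the IU for its full latency with no other work overlapped.

```python
# Cycle counts copied from Table 5-9; keys are (primary opcode, extended opcode).
DIVIDE_LATENCY = {
    (31, 459): ("divwu[o][.]", 20),
    (31, 491): ("divw[o][.]", 20),
}


def divide_cycles(primary, extended, count=1):
    """Return the IU cycles spent on `count` back-to-back divides."""
    _mnemonic, cycles = DIVIDE_LATENCY[(primary, extended)]
    return cycles * count


def cycles_to_microseconds(cycles, core_mhz):
    """Convert a cycle count to microseconds at the given core clock in MHz."""
    return cycles / core_mhz


total = divide_cycles(31, 491, count=1000)                  # 1000 divw instructions
elapsed_us = cycles_to_microseconds(total, core_mhz=200.0)  # assumed 200 MHz clock
```

Under these assumptions, 1000 `divw` instructions cost 20,000 cycles, or 100 microseconds at the assumed clock. Real code would overlap other instructions with the divides, so this is an upper-bound style estimate for the divide work alone.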