Freescale Semiconductor MCF54455 Reference Manual page 70

The V4 ColdFire core pipeline stages include the following:
- Four-stage instruction fetch pipeline (IFP) (plus optional instruction buffer stage)
  - Instruction address generation (IAG): Calculates the next prefetch address
  - Instruction fetch cycle 1 (IC1): Initiates prefetch on the processor's local bus
  - Instruction fetch cycle 2 (IC2): Completes prefetch on the processor's local bus
  - Instruction early decode (IED): Generates time-critical decode signals needed for the OEP
  - Instruction buffer (IB): Optional buffer stage that minimizes fetch latency effects using a FIFO queue
- Five-stage operand execution pipeline (OEP) with two optional processor bus write cycles
  - Decode and select (DS/secDS): Decodes and selects two sequential instructions and selects operands for effective address calculation
  - Operand address generation (OAG): Generates the effective (logical) address
  - Operand fetch cycle 1 (OC1): Initiates memory operand fetch on the processor's local bus
  - Operand fetch cycle 2 (OC2): Completes memory operand fetch on the processor's local bus, as well as immediate and/or register operand fetches
  - Execute (EX): Performs prescribed operations on previously fetched data operands
  - Write data available (DA): Makes data available for operand write operations only
  - Store data (ST): Updates memory element for operand write operations only
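As a reading aid, the stage list above can be captured in a small lookup table. This is only a sketch for tooling or documentation purposes: the type names, the `optional` flag, and the `count_stages` helper are illustrative assumptions and do not correspond to any hardware register or vendor API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative model only: names and layout are assumptions. */
typedef enum { PIPE_IFP, PIPE_OEP } pipe_t;

typedef struct {
    const char *abbrev;   /* stage abbreviation used in the text */
    pipe_t      pipe;     /* pipeline the stage belongs to */
    bool        optional; /* true for stages the text describes as optional */
} stage_t;

static const stage_t v4_stages[] = {
    { "IAG", PIPE_IFP, false },
    { "IC1", PIPE_IFP, false },
    { "IC2", PIPE_IFP, false },
    { "IED", PIPE_IFP, false },
    { "IB",  PIPE_IFP, true  },  /* optional instruction buffer stage */
    { "DS",  PIPE_OEP, false },
    { "OAG", PIPE_OEP, false },
    { "OC1", PIPE_OEP, false },
    { "OC2", PIPE_OEP, false },
    { "EX",  PIPE_OEP, false },
    { "DA",  PIPE_OEP, true  },  /* optional write cycle */
    { "ST",  PIPE_OEP, true  },  /* optional write cycle */
};

#define V4_STAGE_COUNT (sizeof v4_stages / sizeof v4_stages[0])

/* Count the stages of one pipeline, with or without the optional ones. */
static size_t count_stages(pipe_t pipe, bool include_optional)
{
    size_t n = 0;
    for (size_t i = 0; i < V4_STAGE_COUNT; i++) {
        if (v4_stages[i].pipe == pipe &&
            (include_optional || !v4_stages[i].optional)) {
            n++;
        }
    }
    return n;
}
```

Counting only the mandatory stages reproduces the "four-stage" and "five-stage" figures quoted in the list.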
When the instruction buffer is empty, opcodes are loaded directly from the IED cycle into the operand
execution pipeline. If the buffer is not empty, the IFP stores the contents of the fetched instruction and its
early decode information in the IB until it is required by the OEP.
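The bypass behavior described above can be sketched as a small software FIFO. This is a behavioral illustration, not the hardware design: the buffer depth `IB_DEPTH`, the `oep_ready` input, the `early_decode` field width, and all function names are assumptions, since the text does not specify them.

```c
#include <stdbool.h>
#include <stdint.h>

#define IB_DEPTH 4u  /* assumed depth; the text does not state one */

typedef struct {
    uint16_t opcode;       /* fetched opcode word */
    uint8_t  early_decode; /* placeholder for the IED decode signals */
} ib_entry_t;

typedef struct {
    ib_entry_t slot[IB_DEPTH];
    unsigned   head;   /* index of the oldest entry */
    unsigned   count;  /* number of valid entries */
} ib_t;

typedef enum { IB_BYPASSED, IB_QUEUED, IB_FULL } ib_result_t;

/* Accept one entry leaving IED. An empty buffer forwards it straight to the
 * OEP (when the OEP can take it, an assumption); otherwise it is queued. */
static ib_result_t ib_accept(ib_t *ib, ib_entry_t in, bool oep_ready,
                             ib_entry_t *to_oep)
{
    if (ib->count == 0 && oep_ready) {
        *to_oep = in;               /* bypass: IED feeds the OEP directly */
        return IB_BYPASSED;
    }
    if (ib->count == IB_DEPTH) {
        return IB_FULL;             /* caller must stall the fetch pipeline */
    }
    ib->slot[(ib->head + ib->count) % IB_DEPTH] = in;
    ib->count++;
    return IB_QUEUED;
}

/* Hand the oldest queued entry to the OEP; returns false if none is queued. */
static bool ib_pop(ib_t *ib, ib_entry_t *out)
{
    if (ib->count == 0) {
        return false;
    }
    *out = ib->slot[ib->head];
    ib->head = (ib->head + 1) % IB_DEPTH;
    ib->count--;
    return true;
}
```

In this model an entry arriving while older entries are still queued is always stored behind them, which preserves program order.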
The five-stage operand execution pipeline structure is a key factor in the performance of the Version 4
ColdFire design. The pipeline structure is termed a limited superscalar design because certain
heavily used instruction constructs support multiple-instruction dispatch. In particular, folding two
consecutive instructions into a single pipeline issue effectively creates zero-cycle execution times for
certain instructions.
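To see why folding yields "zero-cycle" instructions, consider a toy issue counter. It is a simplified sketch under stated assumptions: the `folds_with_next` flags stand in for whatever pairing rules the hardware actually applies (not given here), and the function name is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Count pipeline issues for n instructions.
 * folds_with_next[i] is true when instruction i may be folded with
 * instruction i+1 (which pairs qualify is an assumption of this model). */
static size_t count_issues(const bool *folds_with_next, size_t n)
{
    size_t issues = 0;
    size_t i = 0;
    while (i < n) {
        issues++;                        /* one issue slot is consumed */
        if (i + 1 < n && folds_with_next[i]) {
            i += 2;                      /* the pair shares this slot */
        } else {
            i += 1;
        }
    }
    return issues;
}
```

Each folded pair consumes one issue slot instead of two, so the second instruction of the pair adds nothing to the issue count.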
With the increased performance, the bandwidth needed to support operand references requires a split bus
(or Harvard architecture) where there are separate instruction and operand memory connections. These
connections may be accessed concurrently to double the amount of available bandwidth to the processor's
pipelines.
The resulting pipeline and local bus structure allow the V4 ColdFire core to deliver sustained high
performance across a variety of demanding embedded applications.
3.1.1.1 Change-of-Flow Acceleration
To maximize the performance of conditional branch instructions, the IFP implements a sophisticated
two-level acceleration mechanism. The first level is an 8-entry, direct-mapped branch cache with 2 bits for
indicating four prediction states (strongly or weakly; taken or not-taken) for each entry. The branch cache
also provides the association between instruction addresses and the corresponding target address. In the
event of a branch cache hit, if the branch is predicted as taken, the branch cache sources the target address
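A software model can make the first-level mechanism concrete. The sketch below takes the 8-entry, direct-mapped organization and the four 2-bit prediction states from the text. Everything else is an assumption made for illustration: the index function, the use of the full address as a tag, the allocate-on-taken policy, and the saturating-counter state transitions.

```c
#include <stdbool.h>
#include <stdint.h>

#define BC_ENTRIES 8u  /* 8-entry, direct-mapped (from the text) */

typedef enum {         /* four prediction states encoded in 2 bits */
    BC_STRONG_NOT_TAKEN = 0,
    BC_WEAK_NOT_TAKEN   = 1,
    BC_WEAK_TAKEN       = 2,
    BC_STRONG_TAKEN     = 3
} bc_state_t;

typedef struct {
    bool       valid;
    uint32_t   branch_addr;  /* assumed tag: full instruction address */
    uint32_t   target_addr;  /* associated branch target */
    bc_state_t state;
} bc_entry_t;

typedef struct { bc_entry_t entry[BC_ENTRIES]; } branch_cache_t;

/* Assumed index: low address bits above bit 0 (opcodes are 16-bit aligned). */
static unsigned bc_index(uint32_t addr)
{
    return (addr >> 1) & (BC_ENTRIES - 1u);
}

/* On a hit predicted taken, supply the target address and return true. */
static bool bc_predict(const branch_cache_t *bc, uint32_t addr,
                       uint32_t *target)
{
    const bc_entry_t *e = &bc->entry[bc_index(addr)];
    if (e->valid && e->branch_addr == addr && e->state >= BC_WEAK_TAKEN) {
        *target = e->target_addr;
        return true;
    }
    return false;
}

/* Train the cache with a resolved branch (assumed saturating-counter policy). */
static void bc_update(branch_cache_t *bc, uint32_t addr, uint32_t target,
                      bool taken)
{
    bc_entry_t *e = &bc->entry[bc_index(addr)];
    if (!e->valid || e->branch_addr != addr) {
        if (!taken) {
            return;                  /* assumption: allocate only when taken */
        }
        e->valid = true;
        e->branch_addr = addr;
        e->target_addr = target;
        e->state = BC_WEAK_TAKEN;
        return;
    }
    if (taken) {
        if (e->state < BC_STRONG_TAKEN) {
            e->state = (bc_state_t)(e->state + 1);
        }
        e->target_addr = target;
    } else if (e->state > BC_STRONG_NOT_TAKEN) {
        e->state = (bc_state_t)(e->state - 1);
    }
}
```

With this policy a single misprediction moves a strongly predicted branch only to the weak state, so a loop branch that exits once is still predicted taken on the next pass.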
