Pipeline Architecture
MicroBlaze instruction execution is pipelined. For most instructions, each stage takes one clock
cycle to complete. Consequently, the number of clock cycles necessary for a specific instruction to
complete is equal to the number of pipeline stages, and one instruction is completed on every cycle.
A few instructions require multiple clock cycles in the execute stage to complete. This is achieved
by stalling the pipeline.
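The cycle counts described above can be checked with a small calculation. This is only a sketch of an idealized pipeline: it assumes single-cycle instruction fetches and no taken branches, and the function name total_cycles and its parameters are illustrative, not part of any MicroBlaze tool.

```python
def total_cycles(num_instructions, num_stages, stall_cycles=0):
    """Cycles needed to run a straight-line sequence through an ideal pipeline.

    The first instruction needs num_stages cycles to complete. Every later
    instruction completes one cycle after its predecessor, plus any cycles
    during which the pipeline is stalled.
    """
    if num_instructions == 0:
        return 0
    return num_stages + (num_instructions - 1) + stall_cycles


# Three single-cycle instructions.
print(total_cycles(3, num_stages=3))  # 5
print(total_cycles(3, num_stages=5))  # 7

# Same sequence, but one instruction occupies a stage for three cycles
# instead of one, which stalls the pipeline for two extra cycles.
print(total_cycles(3, num_stages=3, stall_cycles=2))  # 7
print(total_cycles(3, num_stages=5, stall_cycles=2))  # 9
```

For long sequences the num_stages term becomes negligible, which is the sense in which one instruction completes on every cycle.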
When executing from slower memory, instruction fetches may take multiple cycles. This additional
latency directly reduces the efficiency of the pipeline. MicroBlaze implements an instruction prefetch
buffer that reduces the impact of such multi-cycle instruction memory latency. While the pipeline is
stalled by a multi-cycle instruction in the execute stage, the prefetch buffer continues to load
sequential instructions. When the pipeline resumes execution, the fetch stage can load new
instructions directly from the prefetch buffer instead of waiting for the instruction memory access to
complete.
If instructions are modified during execution (for example, by self-modifying code), the prefetch
buffer should be emptied before the modified instructions are executed, to ensure that it does not
contain the old, unmodified instructions. The recommended way to do this is with an MBAR
instruction, although a synchronizing branch instruction, for example BRI 4, can also be used.
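The reason the prefetch buffer has to be emptied after instructions are modified can be illustrated with a toy software model. This is not a description of the MicroBlaze hardware: the class ToyFetchUnit, its buffer depth, and the mnemonics used as memory contents are all hypothetical, and flush() merely stands in for the effect of MBAR or a synchronizing branch.

```python
from collections import deque


class ToyFetchUnit:
    """Toy model: instruction memory plus a small sequential prefetch buffer."""

    def __init__(self, memory, depth=4):
        self.memory = memory      # list of instructions, indexed by address
        self.depth = depth        # hypothetical buffer depth
        self.buffer = deque()     # (address, instruction) pairs already fetched
        self.next_prefetch = 0    # next address the prefetcher will read

    def prefetch(self):
        """Load sequential instructions into the buffer (e.g. during a stall)."""
        while len(self.buffer) < self.depth and self.next_prefetch < len(self.memory):
            self.buffer.append((self.next_prefetch, self.memory[self.next_prefetch]))
            self.next_prefetch += 1

    def fetch(self):
        """Hand the next instruction to the pipeline, from the buffer if possible."""
        if not self.buffer:
            self.prefetch()
        return self.buffer.popleft()[1]

    def flush(self):
        """Discard buffered instructions so they are re-read from memory."""
        if self.buffer:
            self.next_prefetch = self.buffer[0][0]
            self.buffer.clear()


def run(flush_before_executing):
    memory = ["add", "mul", "old_op", "sub"]
    unit = ToyFetchUnit(memory)
    unit.prefetch()          # buffer now holds copies of all four instructions
    memory[2] = "new_op"     # a store overwrites an instruction already buffered
    if flush_before_executing:
        unit.flush()
    return [unit.fetch() for _ in range(3)]


print(run(False))  # ['add', 'mul', 'old_op']  (stale copy is executed)
print(run(True))   # ['add', 'mul', 'new_op']
```

Without the flush, the model executes the copy that was buffered before the store, which is the situation the text warns about.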
Three Stage Pipeline
With C_AREA_OPTIMIZED set to 1, the pipeline is divided into three stages to minimize hardware
cost: Fetch, Decode, and Execute.
Five Stage Pipeline
With C_AREA_OPTIMIZED set to 0, the pipeline is divided into five stages to maximize
performance: Fetch (IF), Decode (OF), Execute (EX), Access Memory (MEM), and Writeback
(WB).
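How a multi-cycle instruction holds up the instructions behind it in either configuration can be sketched with a small in-order pipeline model. The model is a simplification under stated assumptions: every stage holds at most one instruction, fetches take one cycle, and there are no branches. The function schedule and its arguments are illustrative names, and the choice of which stage takes three cycles is an example input, not a statement about any particular instruction.

```python
def schedule(stages, durations):
    """Return one row per instruction with the label shown in each cycle.

    stages    -- stage names in pipeline order
    durations -- one dict per instruction, {stage name: cycles needed};
                 stages not listed take one cycle
    """
    if not durations:
        return []
    n = len(stages)
    enter = []  # enter[k][s]: first cycle (0-based) instruction k spends in stage s
    dur = []
    for k, d in enumerate(durations):
        dur.append([d.get(name, 1) for name in stages])
        row = []
        for s in range(n):
            # Cycle at which this instruction has finished the previous stage.
            ready = row[s - 1] + dur[k][s - 1] if s > 0 else 0
            # Cycle at which the preceding instruction has vacated stage s.
            if k == 0:
                free = 0
            elif s + 1 < n:
                free = enter[k - 1][s + 1]
            else:
                free = enter[k - 1][s] + dur[k - 1][s]
            row.append(max(ready, free))
        enter.append(row)

    total = enter[-1][-1] + dur[-1][-1]
    rows = []
    for k in range(len(durations)):
        cells = [""] * total
        for s, name in enumerate(stages):
            done = enter[k][s] + dur[k][s]
            leave = enter[k][s + 1] if s + 1 < n else done
            for c in range(enter[k][s], leave):
                # Cycles spent waiting for the next stage are shown as stalls.
                cells[c] = name if c < done else "Stall"
        rows.append(cells)
    return rows


def show(rows):
    for i, cells in enumerate(rows, start=1):
        print(f"instruction {i}  " + "".join(f"{c:<9}" for c in cells).rstrip())


# C_AREA_OPTIMIZED = 1: second instruction needs three cycles in Execute.
three = schedule(["Fetch", "Decode", "Execute"], [{}, {"Execute": 3}, {}])
show(three)

# C_AREA_OPTIMIZED = 0: second instruction needs three cycles in MEM.
five = schedule(["IF", "OF", "EX", "MEM", "WB"], [{}, {"MEM": 3}, {}])
show(five)
```

In both runs the third instruction waits two cycles in the stage just before the busy one, so the three-stage sequence finishes in seven cycles and the five-stage sequence in nine.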
Branches
Normally, the instructions in the fetch and decode stages (as well as the prefetch buffer) are flushed
when executing a taken branch. The fetch pipeline stage is then reloaded with a new instruction from
the calculated branch address. A taken branch in MicroBlaze takes three clock cycles to execute,
two of which are required for refilling the pipeline. To reduce this latency overhead, MicroBlaze
supports branches with delay slots.
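The cost of taken branches can be estimated with a rough steady-state calculation. The two refill cycles per taken branch come from the paragraph above. The treatment of delay slots is an assumption of this sketch: it supposes that a delay slot filled with a useful instruction recovers one of the two refill cycles, which should be checked against the delay slot description. The function name and parameters are illustrative, and pipeline fill time and other stalls are ignored.

```python
REFILL_CYCLES = 2  # from the text: two of the three cycles refill the pipeline


def estimated_cycles(num_instructions, taken_branches, filled_delay_slots=0):
    """Rough cycle estimate for a run of executed instructions.

    num_instructions   -- executed instructions, branches included
    taken_branches     -- how many of them are taken branches
    filled_delay_slots -- taken branches whose delay slot does useful work
                          (assumed to recover one refill cycle each)
    """
    if not 0 <= filled_delay_slots <= taken_branches <= num_instructions:
        raise ValueError("inconsistent instruction and branch counts")
    return num_instructions + REFILL_CYCLES * taken_branches - filled_delay_slots


# 100 executed instructions, 20 of them taken branches.
print(estimated_cycles(100, 20))                         # 140
print(estimated_cycles(100, 20, filled_delay_slots=15))  # 125
```

Under these assumptions the first case averages 1.4 cycles per instruction and the second 1.25.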
MicroBlaze Processor Reference Guide
UG081 (v14.7)
Three stage pipeline (C_AREA_OPTIMIZED = 1):
               cycle1   cycle2   cycle3   cycle4   cycle5   cycle6   cycle7
instruction 1  Fetch    Decode   Execute
instruction 2           Fetch    Decode   Execute  Execute  Execute
instruction 3                    Fetch    Decode   Stall    Stall    Execute
Five stage pipeline (C_AREA_OPTIMIZED = 0):
               cycle1  cycle2  cycle3  cycle4  cycle5  cycle6  cycle7  cycle8  cycle9
instruction 1  IF      OF      EX      MEM     WB
instruction 2          IF      OF      EX      MEM     MEM     MEM     WB
instruction 3                  IF      OF      EX      Stall   Stall   MEM     WB