About Cycle Timings And Interlock Behavior - ARM ARM1176JZF-S Technical Reference Manual

16.1 About cycle timings and interlock behavior

ARM DDI 0301H ID012310

Complex instruction dependencies and memory system interactions make it impossible to
describe briefly the exact cycle timing behavior for all instructions in all circumstances. The
timings that this chapter describes are accurate in most cases. If precise timings are required, you
must use a cycle-accurate model of the processor.

Unless otherwise stated, cycle counts and result latencies that this chapter describes are best-case
numbers. They assume:
• no outstanding data dependencies between the current instruction and a previous instruction
• the instruction does not encounter any resource conflicts
• all data accesses hit in the MicroTLB and Data Cache, and do not cross protection region boundaries
• all instruction accesses hit in the Instruction Cache.

This section describes:
• Changes in instruction flow overview
• Instruction execution overview on page 16-3
• Conditional instructions on page 16-4
• Opposite condition code checks on page 16-4
• Definition of terms on page 16-5.

16.1.1 Changes in instruction flow overview

To minimize the number of cycles lost because of changes in instruction flow, the processor
includes a:
• dynamic branch predictor
• static branch predictor
• return stack.
The dynamic branch predictor is a 128-entry direct-mapped branch predictor that is indexed using
VA bits [9:3]. The prediction scheme uses a two-bit saturating counter with the following
prediction states:
• Strongly Not Taken
• Weakly Not Taken
• Weakly Taken
• Strongly Taken.
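
The indexing and counter scheme described above can be sketched in a few lines of Python. This is an illustrative model only, not the processor's implementation. The names `State`, `predictor_index`, and `TwoBitPredictor` are hypothetical, and the numeric state encoding, the initial state, and the update rule (step toward Taken on a taken branch, toward Not Taken otherwise) are assumptions that this section does not specify. The sketch also ignores branch targets and any tag matching.

```python
from enum import IntEnum


class State(IntEnum):
    # The four prediction states named in the text.
    # The numeric encoding 0..3 is an assumption made for this sketch.
    STRONGLY_NOT_TAKEN = 0
    WEAKLY_NOT_TAKEN = 1
    WEAKLY_TAKEN = 2
    STRONGLY_TAKEN = 3


NUM_ENTRIES = 128  # 128-entry direct-mapped table, as stated in the text


def predictor_index(va: int) -> int:
    """Select a table entry from VA bits [9:3] (7 bits, so 128 entries)."""
    return (va >> 3) & 0x7F


class TwoBitPredictor:
    """Hypothetical model of a direct-mapped two-bit saturating predictor."""

    def __init__(self, initial: State = State.WEAKLY_NOT_TAKEN) -> None:
        # The initial state is an assumption; the text does not give one.
        self.table = [initial] * NUM_ENTRIES

    def predict(self, va: int) -> bool:
        """Return True if the branch at this address is predicted taken."""
        return self.table[predictor_index(va)] >= State.WEAKLY_TAKEN

    def update(self, va: int, taken: bool) -> None:
        """Move the counter one step toward the actual outcome, saturating."""
        i = predictor_index(va)
        s = self.table[i]
        self.table[i] = State(min(s + 1, 3) if taken else max(s - 1, 0))
```

Because only bits [9:3] select the entry, two branches whose addresses differ by a multiple of 1KB share the same entry in this model, for example addresses 0x0 and 0x400.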
Only branches with a constant offset are predicted. Branches with a register-based offset are not
predicted. A dynamically predicted branch can be folded out of the instruction stream if the
following instruction arrives while the branch is within the prefetch instruction buffer. A
dynamically predicted branch takes one cycle, or zero cycles if it is folded out.
The static branch predictor operates on branches with a constant offset that are not predicted by
the dynamic branch predictor. Static predictions are issued from the Iss stage of the main
pipeline. Consequently, a statically predicted branch takes four cycles.
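
As a rough summary of the cycle counts quoted above, the following Python sketch maps a predicted branch to its best-case cost. The function name and its parameters are hypothetical and exist only for illustration. The sketch covers only the figures given in this section: it assumes the prediction is correct, and it returns `None` for branches with a register-based offset because this passage gives no figure for them.

```python
from typing import Optional

# Best-case cycle counts quoted in the text.
DYNAMIC_CYCLES = 1   # dynamically predicted branch
FOLDED_CYCLES = 0    # dynamically predicted branch folded out of the stream
STATIC_CYCLES = 4    # statically predicted branch, issued from the Iss stage


def predicted_branch_cycles(constant_offset: bool,
                            dynamic_hit: bool,
                            folded: bool = False) -> Optional[int]:
    """Best-case cycles for a correctly predicted branch (hypothetical helper).

    constant_offset: True if the branch offset is a constant.
    dynamic_hit:     True if the dynamic predictor predicts this branch.
    folded:          True if the branch is folded out in the prefetch buffer.
    """
    if not constant_offset:
        # Register-based offsets are not predicted; no figure is given here.
        return None
    if dynamic_hit:
        return FOLDED_CYCLES if folded else DYNAMIC_CYCLES
    # Constant-offset branches missed by the dynamic predictor fall
    # through to the static predictor.
    return STATIC_CYCLES
```

For precise timings, including misprediction cases, a cycle-accurate model of the processor is still required.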
The return stack has three entries and, as with static predictions, issues a prediction from
the Iss stage of the main pipeline. The return stack mispredicts if the value taken from the return
stack is not the value that is returned by the instruction. Only unconditional returns are
Copyright © 2004-2009 ARM Limited. All rights reserved.
Non-Confidential, Unrestricted Access
