ARM Cortex-M3 Technical Reference Manual page 359

R2p0

ARM DDI 0337G. Copyright © 2005-2008 ARM Limited. All rights reserved. Unrestricted Access.

Table 18-1 Instruction timings (continued)

Instruction type   Size   Cycles count   Description
----------------   ----   ------------   ----------------------------------------------
Combined Branch    16     1+P (a)        CBZ.
Extended           16     0-1 (d)        IT and NOP (includes YIELD).
Divide             32     2-12 (e)       SDIV and UDIV. 32/32 divides, both signed and
                                         unsigned, with a 32-bit quotient result (no
                                         remainder; it can be derived by subtraction).
                                         The divide terminates early when dividend and
                                         divisor are close in size.
Sleep              32     1+W (f)        WFI, WFE, and SEV are in the class of hinted
                                         NOP instructions that control sleep behavior.
Barriers           16     1+B (g)        ISB, DSB, and DMB are barrier instructions that
                                         ensure certain actions have taken place before
                                         the next instruction is executed.
Saturation         32     1              SSAT and USAT perform saturation on a register.
                                         They perform three tasks: they normalize the
                                         value using a shift, test for overflow from a
                                         selected bit position (the Q value), and set
                                         the xPSR Q bit. Saturation refers to the largest
                                         unsigned value or the largest/smallest signed
                                         value for the size selected.

Cycle count information:
P = pipeline reload
N = count of elements

a. Branches take one cycle for the instruction and then a pipeline reload for the target instruction. Non-taken branches are 1 cycle total. Taken branches with an immediate are normally 1 cycle of pipeline reload (2 cycles total). Taken branches with a register operand are normally 2 cycles of pipeline reload (3 cycles total). Pipeline reload is longer when branching to unaligned 32-bit instructions, in addition to accesses to slower memory. A branch hint is emitted to the code bus that permits a slower system to pre-load. This can reduce the branch target penalty for slower memory, but never to less than shown here.
b. Generally, load-store instructions take two cycles for the first access and one cycle for each additional access. Stores with immediate offsets take one cycle.
c. UMULL/SMULL/UMLAL/SMLAL use early termination depending on the size of the source values. These are interruptible (abandoned/restarted), with a worst case latency of one cycle. MLAL versions take four to seven cycles and MULL versions take three to five cycles. For MLAL, the signed version is one cycle longer than the unsigned.
d. IT instructions can be folded.
e. DIV timings depend on dividend and divisor. DIV is interruptible (abandoned/restarted), with a worst case latency of one cycle. When dividend and divisor are similar in size, the divide terminates quickly. Minimum time is for the cases of a divisor larger than the dividend and a divisor of zero. A divisor of zero returns zero (not a fault), although a debug trap is available to catch this case.
f. Sleep is one cycle for the instruction plus as many sleep cycles as appropriate. WFE only uses one cycle when the event has already passed. WFI is normally more than one cycle unless an interrupt happens to pend exactly when entering WFI.
g. ISB takes one cycle (acts as a branch). DMB and DSB take one cycle unless data is pending in the write buffer or LSU. If an interrupt comes in during a barrier, the barrier is abandoned/restarted.
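
The saturation behavior described for SSAT and USAT can be illustrated with a small software model in C. This is only a sketch under stated assumptions, not the processor's implementation: the names `ssat_model`, `usat_model`, and `q_flag` are hypothetical, the optional shift (the "normalize" step) is assumed to have been applied to the input already, and the bit-width ranges (1 to 32 for signed, 0 to 31 for unsigned) are taken from the instruction encodings rather than from this table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the sticky Q bit of the xPSR. */
static bool q_flag = false;

/* Sketch of SSAT: clamp x to the signed range of `bits` bits.
   Assumes 1 <= bits <= 32 and that any shift was applied beforehand. */
static int32_t ssat_model(int32_t x, unsigned bits)
{
    assert(bits >= 1 && bits <= 32);
    int64_t max = ((int64_t)1 << (bits - 1)) - 1;  /* largest signed value  */
    int64_t min = -((int64_t)1 << (bits - 1));     /* smallest signed value */
    if (x > max) { q_flag = true; return (int32_t)max; }
    if (x < min) { q_flag = true; return (int32_t)min; }
    return x;
}

/* Sketch of USAT: clamp a signed input to the unsigned range of `bits` bits.
   Assumes 0 <= bits <= 31. */
static uint32_t usat_model(int32_t x, unsigned bits)
{
    assert(bits <= 31);
    int64_t max = ((int64_t)1 << bits) - 1;        /* largest unsigned value */
    if (x > max) { q_flag = true; return (uint32_t)max; }
    if (x < 0)   { q_flag = true; return 0; }
    return (uint32_t)x;
}
```

In this model, `ssat_model(200, 8)` returns 127 and sets `q_flag`, while `ssat_model(5, 8)` returns 5 and leaves the flag unchanged. The flag is never cleared by the model itself, mirroring the sticky nature of the Q bit.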
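
The Divide row and footnote e state two results that are easy to model in C: a divisor of zero returns zero rather than faulting, and the remainder can be derived from the quotient by subtraction. The following is a rough sketch with hypothetical function names. It assumes the divide-by-zero trap is not enabled, and it handles the `INT32_MIN / -1` case explicitly because that division is undefined in C; the value returned there (0x80000000) follows the ARMv7-M architecture description, not this table.

```c
#include <stdint.h>

/* Sketch of UDIV: a zero divisor yields zero (no fault, trap assumed off). */
static uint32_t udiv_model(uint32_t n, uint32_t d)
{
    return (d == 0) ? 0 : n / d;
}

/* Sketch of SDIV: quotient rounds toward zero, as C division does. */
static int32_t sdiv_model(int32_t n, int32_t d)
{
    if (d == 0) {
        return 0;                 /* zero divisor returns zero */
    }
    if (n == INT32_MIN && d == -1) {
        return INT32_MIN;         /* avoid C undefined behavior; result wraps */
    }
    return n / d;
}

/* Remainder derived by subtraction: r = n - q * d. */
static uint32_t urem_model(uint32_t n, uint32_t d)
{
    uint32_t q = udiv_model(n, d);
    return n - q * d;
}

/* Signed remainder; arithmetic is done in unsigned to avoid signed overflow. */
static int32_t srem_model(int32_t n, int32_t d)
{
    int32_t q = sdiv_model(n, d);
    return (int32_t)((uint32_t)n - (uint32_t)q * (uint32_t)d);
}
```

With a zero divisor the quotient is zero, so the subtraction leaves the dividend unchanged: `urem_model(7, 0)` returns 7 in this model.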

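
The nominal branch costs in footnote a can be written as a small lookup, which may be handy when estimating cycle counts by hand. This is a sketch only: the type and function names are hypothetical, and the figures are the "normal" values from the footnote, so they assume an aligned branch target in zero-wait-state memory. Real pipeline reload can be longer.

```c
#include <stdbool.h>

/* Hypothetical classification of the branch operand. */
typedef enum {
    BRANCH_IMMEDIATE,  /* target encoded in the instruction */
    BRANCH_REGISTER    /* target taken from a register      */
} branch_operand;

/* Nominal cycle count for a branch, per footnote a:
   1 cycle for the instruction plus P cycles of pipeline reload if taken. */
static unsigned branch_cycles(bool taken, branch_operand op)
{
    if (!taken) {
        return 1;                                     /* non-taken: 1 cycle total */
    }
    unsigned reload = (op == BRANCH_IMMEDIATE) ? 1u : 2u;  /* P */
    return 1u + reload;
}
```

For example, a taken branch with a register operand gives `branch_cycles(true, BRANCH_REGISTER)`, which evaluates to 3 under these assumptions.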