Delay Slot And Functional Unit Latency Summary - Texas Instruments TMS320C6000 Series Reference Manual

4.4 Delay Slots
The execution of floating-point instructions can be defined in terms of delay
slots and functional unit latency. The number of delay slots is equivalent to the
number of additional cycles required after the source operands are read for the
result to be available for reading. For a single-cycle type instruction, operands
are read on cycle i and produce a result that can be read on cycle i + 1. For
a 4-cycle instruction, operands are read on cycle i and produce a result that
can be read on cycle i + 4. Table 4–9 shows the number of delay slots associated
with each type of instruction.

The double-precision floating-point addition, subtraction, multiplication,
compare, and the 32-bit integer multiply instructions also have a functional unit
latency that is greater than 1. The functional unit latency is equivalent to the
number of cycles that the instruction uses the functional unit read ports. For
example, the ADDDP instruction has a functional unit latency of 2. Operands
are read on cycle i and cycle i + 1. Therefore, a new instruction cannot begin
until cycle i + 2, rather than i + 1. ADDDP produces a result that can be read
on cycle i + 7, because it has six delay slots.

Delay slots are equivalent to an execution or result latency. All of the
instructions that are common to the 'C62x and 'C67x have a functional unit latency
of 1. This means that a new instruction can be started on the functional unit
each cycle. Single-cycle throughput is another term for single-cycle functional
unit latency.

Table 4–9. Delay Slot and Functional Unit Latency Summary

                   Delay   Functional
Instruction Type   Slots   Unit Latency   Read Cycles†             Write Cycles†
----------------   -----   ------------   ----------------------   -------------
Single cycle       0       1              i                        i
2-cycle DP         1       1              i                        i, i + 1
4-cycle            3       1              i                        i + 3
INTDP              4       1              i                        i + 3, i + 4
Load               4       1              i                        i, i + 4‡
DP compare         1       2              i, i + 1                 i + 1
ADDDP/SUBDP        6       2              i, i + 1                 i + 5, i + 6
MPYI               8       4              i, i + 1, i + 2, i + 3   i + 8
MPYID              9       4              i, i + 1, i + 2, i + 3   i + 8, i + 9
MPYDP              9       4              i, i + 1, i + 2, i + 3   i + 8, i + 9

† Cycle i is in the E1 pipeline phase.
‡ A write on cycle i + 4 uses a separate write port from other instructions on the .D unit.
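
The relationship between delay slots, functional unit latency, and the cycle numbers quoted in the text can be sketched as a small Python model. This is only an illustration, not TI tooling: the names TABLE_4_9, result_ready_cycle, next_issue_cycle, and issue_cycles are hypothetical, and the numbers are copied from Table 4–9. It assumes that a result becomes readable one cycle after the last delay slot, which is what the single-cycle, 4-cycle, and ADDDP examples in the text imply.

```python
# Illustrative model of Table 4-9. All names in this block are hypothetical.
# Each entry maps an instruction type to (delay slots, functional unit latency).
TABLE_4_9 = {
    "Single cycle": (0, 1),
    "2-cycle DP": (1, 1),
    "4-cycle": (3, 1),
    "INTDP": (4, 1),
    "Load": (4, 1),
    "DP compare": (1, 2),
    "ADDDP/SUBDP": (6, 2),
    "MPYI": (8, 4),
    "MPYID": (9, 4),
    "MPYDP": (9, 4),
}


def result_ready_cycle(kind, i):
    """First cycle on which the result can be read if operands are read on cycle i."""
    delay_slots, _ = TABLE_4_9[kind]
    return i + delay_slots + 1


def next_issue_cycle(kind, i):
    """Earliest cycle on which a new instruction can start on the same functional unit."""
    _, fu_latency = TABLE_4_9[kind]
    return i + fu_latency


def issue_cycles(kinds, start=0):
    """Issue cycles for instructions sent back to back to a single functional unit."""
    cycles = []
    cycle = start
    for kind in kinds:
        cycles.append(cycle)
        # The unit's read ports stay busy for the functional unit latency.
        cycle = next_issue_cycle(kind, cycle)
    return cycles
```

With operands read on cycle 0, result_ready_cycle("ADDDP/SUBDP", 0) gives 7 and next_issue_cycle("ADDDP/SUBDP", 0) gives 2, matching the ADDDP example in the text. The model deliberately ignores data dependencies, write-port conflicts, and which functional unit each instruction type actually runs on.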

This manual is also suitable for: TMS320C67 series, TMS320C62 series
