Multiply/Multiply Accumulate (Mac) Pipeline; Basic Optimizations; Conditional Instructions - Intel IXP45X Developer's Manual

3.10.2.5 Multiply/Multiply Accumulate (MAC) Pipeline

The Multiply-Accumulate (MAC) unit executes the multiply and multiply-accumulate
instructions supported by the IXP45X/IXP46X network processors. The MAC
implements the 40-bit accumulator register (acc0) on the IXP45X/IXP46X network
processors and handles the instructions that transfer its value to and from
general-purpose Intel StrongARM registers.
The following are important characteristics of the MAC:
• The MAC is not truly pipelined, as the processing of a single instruction may require
use of the same data path resources for several cycles before a new instruction can
be accepted. The type of instruction and the source arguments determine the number
of cycles required.
• No more than two instructions can occupy the MAC pipeline concurrently.
• When the MAC is processing an instruction, another instruction may not enter M1
unless the original instruction completes in the next cycle.
• The MAC unit can operate on 16-bit packed signed data. This reduces register
pressure and memory traffic. Two 16-bit data items can be loaded into a
register with one LDR.
• The MAC can achieve throughput of one multiply per cycle when performing a
16-by-32-bit multiply.
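The packed 16-bit data and the 40-bit accumulator described above can be modeled in portable C. The sketch below is illustrative only: the function names (pack16, unpack_lo, unpack_hi, wrap40, mac_packed) are hypothetical, wraparound on accumulator overflow is an assumption the text does not state, and the model covers arithmetic behavior only, not cycle timing or instruction encodings.

```c
#include <assert.h>
#include <stdint.h>

/* Pack two signed 16-bit items into one 32-bit word, the way a single
   LDR would bring both into one register. */
uint32_t pack16(int16_t hi, int16_t lo) {
    return ((uint32_t)(uint16_t)hi << 16) | (uint32_t)(uint16_t)lo;
}

/* Sign-extend the low 16 bits of `field` without relying on
   implementation-defined narrowing casts. */
static int32_t sext16(uint32_t field) {
    int32_t v = (int32_t)(field & 0xFFFFu);
    return (v >= 0x8000) ? v - 0x10000 : v;
}

int32_t unpack_lo(uint32_t w) { return sext16(w); }
int32_t unpack_hi(uint32_t w) { return sext16(w >> 16); }

/* Reduce a value to 40 bits and sign-extend it (ASSUMPTION: the
   accumulator wraps on overflow). */
int64_t wrap40(int64_t v) {
    uint64_t u = (uint64_t)v & 0xFFFFFFFFFFULL;
    return (int64_t)(u ^ 0x8000000000ULL) - (int64_t)0x8000000000LL;
}

/* Multiply the matching 16-bit halves of a and b and add both products
   to a 40-bit accumulator value. */
int64_t mac_packed(int64_t acc, uint32_t a, uint32_t b) {
    assert(acc >= -0x8000000000LL && acc <= 0x7FFFFFFFFFLL);
    int64_t lo = (int64_t)unpack_lo(a) * unpack_lo(b);
    int64_t hi = (int64_t)unpack_hi(a) * unpack_hi(b);
    return wrap40(acc + lo + hi);
}
```

For example, mac_packed(0, pack16(3, -2), pack16(4, 5)) accumulates 3*4 + (-2)*5. Holding two operands per register is what lets one load feed two multiplies.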
3.10.2.5.1 Behavioral Description
The execution of the MAC unit starts at the beginning of the M1 pipe stage, where it
receives two 32-bit source operands. Results are completed N cycles later (where N
depends on the operand size) and returned to the register file. For more information
on MAC instruction latencies, refer to "Instruction Latencies" on page 182.
An instruction that occupies the M1 or M2 pipe stage also occupies the X1 or X2
pipe stage, respectively. Each cycle, a MAC operation advances through the stages M1
to M5. A MAC operation may complete in any stage from M2 to M5. If a MAC operation
enters M3, M4, or M5, it is considered committed because it will modify architectural
state regardless of subsequent events.
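The stage progression and the commit rule can be pictured with a toy model. This is a simplified sketch, not a description of the hardware: the function names are hypothetical, and it assumes an operation advances exactly one M stage per cycle until it completes.

```c
#include <assert.h>
#include <stdbool.h>

/* Stage occupied by a MAC operation `cycles` cycles after it entered M1.
   Returns 1..5 for M1..M5, or 0 once the operation has completed.
   `last_stage` (2..5) is the stage in which this operation completes.
   ASSUMPTION: one stage per cycle. */
int mac_stage(int cycles, int last_stage) {
    assert(cycles >= 0);
    assert(last_stage >= 2 && last_stage <= 5);
    int stage = 1 + cycles;
    return (stage <= last_stage) ? stage : 0;
}

/* An operation that has entered M3, M4, or M5 is committed: it will
   modify architectural state regardless of subsequent events. */
bool mac_committed(int stage) {
    return stage >= 3 && stage <= 5;
}
```

Under this model, an operation that completes in M5 is still uncommitted one cycle after entering M1 (it is in M2) and becomes committed on the following cycle.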
3.10.3 Basic Optimizations

This section outlines optimizations specific to the Intel StrongARM architecture. These
optimizations have been modified to suit the IXP45X/IXP46X network processors where
needed.
3.10.3.1 Conditional Instructions

The IXP45X/IXP46X network processors' architecture provides the ability to execute
instructions conditionally. This feature, combined with the ability of the IXP45X/IXP46X
network processors' instructions to modify the condition codes, makes possible a wide
array of optimizations.
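As an illustration of conditional execution, a short if statement in C can often be compiled without any branch. The function below is a hypothetical example; the assembly in the comments is a hand-written sketch of what a compiler might emit, with an assumed register assignment (a in r0, b in r1), and actual compiler output may differ.

```c
/* Returns the larger of a and b.
 *
 * Branching form (sketch):
 *         cmp   r0, r1
 *         bge   done
 *         mov   r0, r1
 *     done:
 *
 * Conditional-execution form (sketch): the MOV executes only when the
 * compare found a < b, so no branch is needed.
 *         cmp   r0, r1
 *         movlt r0, r1
 */
int max_int(int a, int b) {
    if (a < b)
        a = b;
    return a;
}
```

The conditional form is shorter and avoids a taken branch on either path.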
3.10.3.1.1 Optimizing Condition Checks
The IXP45X/IXP46X network processors' instructions can selectively modify the state of
the condition codes. When generating code for if-else and loop conditions, it is often
beneficial to make use of this feature to set condition codes, thereby eliminating the
need for a subsequent compare instruction.
Consider the C code segment:
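A hypothetical segment of this kind (chosen for illustration, not necessarily the manual's own example) tests the result of an addition. The assembly in the comments is a hand-written sketch with an assumed register assignment (a in r0, b in r1); actual compiler output may differ.

```c
/* Returns 1 when a + b is nonzero, 0 otherwise.
 *
 * With a separate compare (sketch):
 *     add   r0, r0, r1
 *     cmp   r0, #0
 *     movne r0, #1
 *
 * With the S suffix, the ADD itself sets the condition codes, so the
 * compare is not needed (sketch):
 *     adds  r0, r0, r1
 *     movne r0, #1
 */
int sum_is_nonzero(int a, int b) {
    if (a + b)
        return 1;
    return 0;
}
```

In both sketches r0 already holds 0 when the sum is zero, so only the nonzero case needs the conditional MOV.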
August 2006
Order Number: 306262-004US
Intel® IXP45X and Intel® IXP46X Product Line of Network Processors
Developer's Manual
