Write Buffer; Multiply-Accumulate Coprocessor; Performance Monitoring Unit - Intel IXP45X Developer's Manual

2.2.9 Write Buffer

The write buffer (WB) holds data for storage to memory until the bus controller can act
on it. The WB is eight entries deep, where each entry holds 16 bytes. The WB is
constantly enabled and accepts data from the Intel XScale processor, D-cache, or mini-
data cache.
Coprocessor 15, Register 1 specifies whether WB coalescing is enabled or disabled.
When coalescing is disabled, stores to memory occur in program order regardless of
the attribute bits within the descriptors located in the DTLB.
When coalescing is enabled in CP15, R1, the attribute bits within the descriptors
located in the DTLB are examined to determine whether coalescing is enabled for the
destination region of memory. When coalescing is enabled in both CP15, R1 and the
DTLB, data entering the WB can coalesce with any of the eight entries (16 bytes each)
and be stored to the destination memory region, but possibly out of program order.
Stores to a memory region specified to be non-cacheable and non-bufferable by the
attribute bits within the descriptors located in the DTLB cause the Intel XScale
processor to stall until the store completes. A coprocessor register can specify draining
of the write buffer.
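The coalescing behavior can be illustrated with a small software model. This is only a sketch: the eight entries of 16 bytes come from the text above, but the matching rule (a store coalesces with a valid entry that covers the same 16-byte-aligned block) and every name in the code (`write_buffer`, `wb_store_byte`, `coalesce_enabled`) are assumptions made for illustration, not the hardware's documented implementation.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define WB_ENTRIES     8   /* entries in the write buffer (from the text) */
#define WB_ENTRY_BYTES 16  /* bytes per entry (from the text) */

typedef struct {
    bool     valid;
    uint32_t base;                 /* 16-byte-aligned address (assumed rule) */
    uint16_t byte_mask;            /* bit i set: data[i] holds a pending byte */
    uint8_t  data[WB_ENTRY_BYTES];
} wb_entry;

typedef struct {
    wb_entry entry[WB_ENTRIES];
    bool     coalesce_enabled;     /* models CP15 R1 and the DTLB both allowing it */
} write_buffer;

/* Queue a one-byte store. Returns the index of the entry used, or -1 when the
 * buffer is full (in this model the caller would have to wait for a drain). */
int wb_store_byte(write_buffer *wb, uint32_t addr, uint8_t value)
{
    uint32_t base = addr & ~(uint32_t)(WB_ENTRY_BYTES - 1);
    uint32_t off  = addr & (uint32_t)(WB_ENTRY_BYTES - 1);

    if (wb->coalesce_enabled) {
        /* Try to merge with any valid entry covering the same block. */
        for (int i = 0; i < WB_ENTRIES; i++) {
            wb_entry *e = &wb->entry[i];
            if (e->valid && e->base == base) {
                e->data[off] = value;
                e->byte_mask |= (uint16_t)(1u << off);
                return i;
            }
        }
    }

    /* Otherwise allocate the first free entry. */
    for (int i = 0; i < WB_ENTRIES; i++) {
        wb_entry *e = &wb->entry[i];
        if (!e->valid) {
            memset(e, 0, sizeof *e);
            e->valid = true;
            e->base = base;
            e->data[off] = value;
            e->byte_mask = (uint16_t)(1u << off);
            return i;
        }
    }
    return -1;
}

/* Count the entries that currently hold pending data. */
int wb_entries_in_use(const write_buffer *wb)
{
    int n = 0;
    for (int i = 0; i < WB_ENTRIES; i++)
        n += wb->entry[i].valid ? 1 : 0;
    return n;
}
```

In this model, two byte stores to the same 16-byte block share one entry when coalescing is on and occupy two entries when it is off, which is why coalesced data may reach memory in a different order than the program issued it.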
2.2.10 Multiply-Accumulate Coprocessor

For efficient processing of high-quality media and signal processing algorithms, the
Multiply-Accumulate Coprocessor (CP0) provides 40-bit accumulation of 16 x 16, dual
16 x 16 (SIMD), and 32 x 32 signed multiplies. Special MAR and MRA instructions are
implemented to move two Intel XScale processor general registers to the 40-bit
accumulator (MAR) and to move the 40-bit accumulator to two Intel XScale processor
general registers (MRA). The 40-bit accumulator can be stored to or loaded from the
D-cache, mini-data cache, or memory using two STC or LDC instructions.
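Because the accumulator is wider than a 32-bit register, moving it takes a register pair. The sketch below shows one plausible split, assuming the low register carries bits 31:0 and the high register carries bits 39:32 (sign-extended on the way out, low 8 bits used on the way in). That layout and the function names `regs_to_acc` and `acc_to_regs` are assumptions for illustration; the text above does not spell out the bit assignment.

```c
#include <stdint.h>

#define ACC_MASK 0xFFFFFFFFFFULL  /* low 40 bits */
#define ACC_SIGN 0x8000000000ULL  /* bit 39, the accumulator sign bit */

/* Keep the low 40 bits of v and sign-extend from bit 39. */
static int64_t sext40(uint64_t v)
{
    return (int64_t)((v & ACC_MASK) ^ ACC_SIGN) - (int64_t)ACC_SIGN;
}

/* Two 32-bit registers -> 40-bit accumulator (held in an int64_t). */
int64_t regs_to_acc(uint32_t rd_lo, uint32_t rd_hi)
{
    uint64_t raw = ((uint64_t)(rd_hi & 0xFFu) << 32) | rd_lo;
    return sext40(raw);
}

/* 40-bit accumulator -> two 32-bit registers. */
void acc_to_regs(int64_t acc, uint32_t *rd_lo, uint32_t *rd_hi)
{
    uint64_t raw = (uint64_t)sext40((uint64_t)acc);
    *rd_lo = (uint32_t)raw;          /* bits 31:0 */
    *rd_hi = (uint32_t)(raw >> 32);  /* bits 39:32, sign-extended to 32 bits */
}
```

Under these assumptions a value survives a round trip through `acc_to_regs` and `regs_to_acc` unchanged, which is the property a context-save routine would rely on.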
The 16 x 16 signed multiply-accumulates (MIAxy) multiply a selected 16-bit half
(high/high, low/low, high/low, or low/high) of a 32-bit core general register
(multiplier) and of another 32-bit core general register (multiplicand) to produce a
full 32-bit product that is sign-extended to 40 bits and added to the 40-bit accumulator.
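The arithmetic of a 16 x 16 multiply-accumulate can be modeled in portable C. This is a behavioral sketch, not the instruction's definition: the function name `mia_xy`, the flags `x_top` and `y_top` that pick the high or low half of each operand, and the choice to wrap (rather than saturate) on 40-bit overflow are assumptions.

```c
#include <stdint.h>

#define ACC_MASK 0xFFFFFFFFFFULL  /* low 40 bits */
#define ACC_SIGN 0x8000000000ULL  /* bit 39 */

/* Keep the low 40 bits of v and sign-extend from bit 39. */
static int64_t sext40(uint64_t v)
{
    return (int64_t)((v & ACC_MASK) ^ ACC_SIGN) - (int64_t)ACC_SIGN;
}

/* Extract the high (top != 0) or low half of reg as a signed 16-bit value. */
static int32_t half16(uint32_t reg, int top)
{
    int32_t h = (int32_t)((top ? reg >> 16 : reg) & 0xFFFFu);
    return h >= 0x8000 ? h - 0x10000 : h;
}

/* acc + (selected half of rm) * (selected half of rs), kept to 40 bits. */
int64_t mia_xy(int64_t acc, uint32_t rm, uint32_t rs, int x_top, int y_top)
{
    int32_t product = half16(rm, x_top) * half16(rs, y_top); /* fits in 32 bits */
    return sext40((uint64_t)acc + (uint64_t)(int64_t)product);
}
```

For example, with `rm = 0x00020003` and `rs = 0x00040005`, selecting both high halves adds 2 * 4 to the accumulator, and selecting both low halves adds 3 * 5.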
Dual signed 16 x 16 (SIMD) multiply-accumulates (MIAPH) multiply the high/high and
low/low 16 bits of a packed 32-bit core general register (multiplier) and another
packed 32-bit core general register (multiplicand) to produce two 32-bit products
that are both sign-extended to 40 bits and added to the 40-bit accumulator.
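The packed form can be sketched the same way. A signed 16 x 16 multiply can need up to 32 bits, so the model below computes each product in a 32-bit integer before accumulating. The function name `mia_ph` and the wrap-on-overflow behavior are assumptions for illustration.

```c
#include <stdint.h>

#define ACC_MASK 0xFFFFFFFFFFULL  /* low 40 bits */
#define ACC_SIGN 0x8000000000ULL  /* bit 39 */

/* Keep the low 40 bits of v and sign-extend from bit 39. */
static int64_t sext40(uint64_t v)
{
    return (int64_t)((v & ACC_MASK) ^ ACC_SIGN) - (int64_t)ACC_SIGN;
}

/* Interpret the low 16 bits of v as a signed value. */
static int32_t s16(uint32_t v)
{
    int32_t h = (int32_t)(v & 0xFFFFu);
    return h >= 0x8000 ? h - 0x10000 : h;
}

/* acc + hi(rm)*hi(rs) + lo(rm)*lo(rs), kept to 40 bits. */
int64_t mia_ph(int64_t acc, uint32_t rm, uint32_t rs)
{
    int64_t hi = (int64_t)(s16(rm >> 16) * s16(rs >> 16));
    int64_t lo = (int64_t)(s16(rm) * s16(rs));
    return sext40((uint64_t)acc + (uint64_t)hi + (uint64_t)lo);
}
```

This is the shape of a two-tap dot product per instruction: with 16-bit samples packed in one register and 16-bit coefficients in the other, each call accumulates two multiply results.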
The 32 x 32 signed multiply-accumulates (MIA) multiply a 32-bit core general register
(multiplier) and another 32-bit core general register (multiplicand) to produce a 64-bit
product, of which the 40 LSBs are added to the 40-bit accumulator. The 16 x 32 versions
of the 32 x 32 multiply-accumulate instructions complete in a single cycle.
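The truncation to the 40 least significant bits is the detail most likely to surprise, so a short model helps. As before, this is a sketch under assumptions: the name `mia` is hypothetical, and the accumulator is assumed to wrap modulo 2^40.

```c
#include <stdint.h>

#define ACC_MASK 0xFFFFFFFFFFULL  /* low 40 bits */
#define ACC_SIGN 0x8000000000ULL  /* bit 39 */

/* Keep the low 40 bits of v and sign-extend from bit 39. */
static int64_t sext40(uint64_t v)
{
    return (int64_t)((v & ACC_MASK) ^ ACC_SIGN) - (int64_t)ACC_SIGN;
}

/* acc + (40 LSBs of the 64-bit product rm * rs), kept to 40 bits. */
int64_t mia(int64_t acc, int32_t rm, int32_t rs)
{
    int64_t product = (int64_t)rm * (int64_t)rs;  /* full 64-bit product */
    /* Adding modulo 2^64 and then masking to 40 bits gives the same result
     * as masking the product first, because 2^40 divides 2^64. */
    return sext40((uint64_t)acc + (uint64_t)product);
}
```

A product of exactly 2^40, such as `(1 << 20) * (1 << 20)`, therefore contributes nothing to the accumulator in this model, and a product of 2^39 flips the accumulator sign bit.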
2.2.11 Performance Monitoring Unit

The performance monitoring unit (PMU) contains four 32-bit event counters and one
32-bit clock counter. The event counters can be programmed to monitor I-cache hit
rate, data cache hit rate, ITLB hit rate, DTLB hit rate, pipeline stalls, BTB prediction
hit rate, and instruction execution count.
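A hit rate is typically derived in software from two counter readings rather than read directly. The sketch below assumes one event counter has been programmed to count accesses and another to count misses, and that software takes snapshots of both; those event choices and the helper names `counter_delta` and `hit_rate` are assumptions, not part of the PMU description above. Only the 32-bit counter width comes from the text.

```c
#include <stdint.h>

/* Events counted between two snapshots of a free-running 32-bit counter.
 * Unsigned subtraction is modulo 2^32, so a single wraparound between the
 * snapshots is handled correctly. */
uint32_t counter_delta(uint32_t start, uint32_t end)
{
    return end - start;
}

/* Hit rate in the range 0.0 to 1.0 from access and miss counts.
 * Returns 0.0 when there were no accesses or the sample is inconsistent. */
double hit_rate(uint32_t accesses, uint32_t misses)
{
    if (accesses == 0 || misses > accesses)
        return 0.0;
    return (double)(accesses - misses) / (double)accesses;
}
```

With 32-bit counters, a busy event can wrap within seconds at typical clock rates, so sampling intervals need to be short enough that at most one wrap occurs between snapshots.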
Intel® IXP45X and Intel® IXP46X Product Line of Network Processors Developer's Manual
Functional Overview, August 2006
Reference Number: 306262-004US
