The Prefetch Instructions - Pentium 4 Processor Implementation; Prefetch And Load Instructions - Intel ARCHITECTURE IA-32 Reference Manual

IA-32 Intel® Architecture Optimization
The Prefetch Instructions – Pentium 4 Processor Implementation

Streaming SIMD Extensions include four flavors of prefetch
instructions, one non-temporal and three temporal. They correspond to
two types of operations, temporal and non-temporal.
The non-temporal instruction is
prefetchnta
The temporal instructions are
prefetcht0
prefetcht1
prefetcht2
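
In C, compilers that ship xmmintrin.h expose these instructions through the _mm_prefetch intrinsic, where the hint constants _MM_HINT_NTA, _MM_HINT_T0, _MM_HINT_T1 and _MM_HINT_T2 select prefetchnta, prefetcht0, prefetcht1 and prefetcht2 respectively. The following is a minimal sketch, not taken from the manual: the function names, the 64-element prefetch distance and the 16-float stride (which assumes 64-byte cache lines) are illustrative assumptions rather than tuned values.

```c
#include <assert.h>
#include <stddef.h>
#include <xmmintrin.h>  /* _mm_prefetch and the _MM_HINT_* constants */

/* Assumed values for illustration only; real code would tune these. */
#define PREFETCH_DISTANCE 64  /* elements to look ahead */
#define FLOATS_PER_LINE   16  /* assumes 64-byte cache lines */

/* Sum an array, issuing prefetcht0 (temporal hint) ahead of the reads. */
float sum_prefetch_t0(const float *data, size_t n)
{
    assert(data != NULL || n == 0);
    float total = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        /* One hint per assumed cache line, kept inside the array bounds. */
        if (i % FLOATS_PER_LINE == 0 && i + PREFETCH_DISTANCE < n)
            _mm_prefetch((const char *)&data[i + PREFETCH_DISTANCE], _MM_HINT_T0);
        total += data[i];
    }
    return total;
}

/* Same loop with prefetchnta (non-temporal hint), for data read only once. */
float sum_prefetch_nta(const float *data, size_t n)
{
    assert(data != NULL || n == 0);
    float total = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        if (i % FLOATS_PER_LINE == 0 && i + PREFETCH_DISTANCE < n)
            _mm_prefetch((const char *)&data[i + PREFETCH_DISTANCE], _MM_HINT_NTA);
        total += data[i];
    }
    return total;
}
```

Both functions return the same result as a plain summation loop. The hint only influences cache behavior, so any benefit has to be measured on the target processor.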

Prefetch and Load Instructions

The Pentium 4 processor has a decoupled execution and memory
architecture that allows instructions to be executed independently of
memory accesses if there are no data and resource dependencies.
Programs or compilers can use dummy load instructions to imitate
prefetch functionality, but preloading is not completely equivalent to
using prefetch instructions. Prefetch instructions provide greater
performance than preloading.
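
To illustrate the distinction, the sketch below contrasts a dummy load (preload) with a prefetch hint in C. It is not from the manual: preload, prefetch, sum_preload, sum_prefetch and the LOOKAHEAD distance are hypothetical names and values, and _mm_prefetch is assumed to be available through xmmintrin.h.

```c
#include <assert.h>
#include <stddef.h>
#include <xmmintrin.h>  /* _mm_prefetch, _MM_HINT_T0 */

#define LOOKAHEAD 16  /* hypothetical look-ahead distance, in elements */

/* Preloading: a dummy load whose value is discarded. The volatile cast
   keeps the compiler from deleting the otherwise unused read. */
static void preload(const int *p)
{
    (void)*(const volatile int *)p;
}

/* Prefetching: a cache hint that has no destination register. */
static void prefetch(const int *p)
{
    _mm_prefetch((const char *)p, _MM_HINT_T0);
}

/* Sum the array, touching future elements with a dummy load. */
long sum_preload(const int *data, size_t n)
{
    assert(data != NULL || n == 0);
    long total = 0;
    for (size_t i = 0; i < n; ++i) {
        if (i + LOOKAHEAD < n)
            preload(&data[i + LOOKAHEAD]);
        total += data[i];
    }
    return total;
}

/* Sum the array, requesting future elements with a prefetch hint. */
long sum_prefetch(const int *data, size_t n)
{
    assert(data != NULL || n == 0);
    long total = 0;
    for (size_t i = 0; i < n; ++i) {
        if (i + LOOKAHEAD < n)
            prefetch(&data[i + LOOKAHEAD]);
        total += data[i];
    }
    return total;
}
```

Both loops compute the same sum. The dummy load is an ordinary load, so it needs a valid address and can fault on an invalid one, whereas the prefetch is only a hint and does not fault.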
NOTE. At the time of prefetch, if the data is already found in a
cache level that is closer to the processor than the cache level
specified by the instruction, no data movement occurs.

On the Pentium 4 processor, the four prefetch instructions behave as
follows:
prefetchnta: Fetch the data into the second-level cache, minimizing
cache pollution.
prefetcht0: Fetch the data into all cache levels, that is, to the
second-level cache for the Pentium 4 processor.
prefetcht1: Identical to prefetcht0.
prefetcht2: Identical to prefetcht0.
