The Clflush Instruction - Intel ARCHITECTURE IA-32 Reference Manual

The clflush Instruction

The cache line associated with the linear address specified by the value
of byte address is invalidated from all levels of the processor cache
hierarchy (data and instruction). The invalidation is broadcast
throughout the coherence domain. If, at any level of the cache hierarchy,
the line is inconsistent with memory (dirty) it is written to memory
before invalidation. Other characteristics include:
- The data size affected is the cache coherency size, which is 64 bytes
  on the Pentium 4 processor.
- The memory attribute of the page containing the affected line has no
  effect on the behavior of this instruction.
- The clflush instruction can be used at all privilege levels and is
  subject to all permission checking and faults associated with a byte
  load.

clflush is an unordered operation with respect to other memory traffic,
including other clflush instructions. Software should use a memory fence
(mfence) for cases where ordering is a concern.

As an example, consider a video usage model, wherein a video capture
device is using non-coherent AGP accesses to write a capture stream
directly to system memory. Since these non-coherent writes are not
broadcast on the processor bus, they will not flush any copies of the
same locations that reside in the processor caches. As a result, before the
processor re-reads the capture buffer, it should use clflush to ensure
that any stale copies of the capture buffer are flushed from the processor
caches. Due to speculative reads that may be generated by the processor,
it is important to observe appropriate fencing, using mfence.
Example 6-1 illustrates the pseudo-code for the recommended usage of
clflush.
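
The flush-and-fence sequence described for the capture buffer can be sketched in C with the SSE2 compiler intrinsics _mm_clflush and _mm_mfence from <emmintrin.h>. This is a minimal sketch under stated assumptions, not the manual's Example 6-1: the helper name flush_buffer and the CACHE_LINE_SIZE constant are hypothetical, and the 64-byte line size is taken from the Pentium 4 figure quoted above. On targets without SSE2 the intrinsics are compiled out, so the function only counts lines.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>          /* _mm_clflush, _mm_mfence */
#define HAVE_CLFLUSH 1
#else
#define HAVE_CLFLUSH 0          /* no-op fallback on targets without SSE2 */
#endif

/* Assumption: 64-byte cache lines, as stated for the Pentium 4. */
#define CACHE_LINE_SIZE 64u

/*
 * Hypothetical helper: flush every cache line that overlaps
 * [buf, buf + len), with a fence before and after the flushes.
 * Returns the number of cache lines visited.
 */
static size_t flush_buffer(const void *buf, size_t len)
{
    if (len == 0)
        return 0;

    /* Round the start address down to a cache-line boundary. */
    uintptr_t start = (uintptr_t)buf & ~(uintptr_t)(CACHE_LINE_SIZE - 1);
    uintptr_t end   = (uintptr_t)buf + len;
    size_t lines = 0;

#if HAVE_CLFLUSH
    _mm_mfence();               /* order earlier memory traffic before the flushes */
#endif
    for (uintptr_t p = start; p < end; p += CACHE_LINE_SIZE) {
#if HAVE_CLFLUSH
        _mm_clflush((const void *)p);   /* write back if dirty, then invalidate */
#endif
        lines++;
    }
#if HAVE_CLFLUSH
    _mm_mfence();               /* flushes complete before later loads of the buffer */
#endif
    return lines;
}
```

A caller would invoke flush_buffer(capture_buffer, capture_size) after the device signals that the capture is complete and before reading the buffer; both names are placeholders. Because clflush is unordered with respect to other memory traffic, the two fences bracket the loop instead of relying on program order.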
