Intel ITANIUM ARCHITECTURE - SOFTWARE DEVELOPERS MANUAL VOLUME 1 REV 2.3 Manual page 758

To support existing IA-32 atomic read-modify-write operations that require the LOCK
pin, an Itanium architecture-based operating system can use the DCR.lc bit to intercept
all external IA-32 read-modify-write operations. Then, the IA_32_Intercept(Lock)
handler can emulate these operations by first acquiring a cacheable virtualized LOCK
variable, then performing the required memory operations non-atomically, and then
releasing the virtualized LOCK variable. This emulation allows the read-modify-write
sequence to appear atomic to other processors that use the semaphore.
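The acquire, operate, release sequence described above can be sketched in portable C11. This is a minimal illustration and not the handler an actual operating system would install: the names virtual_lock and emulated_lock_xadd are hypothetical, the choice of an exchange-and-add operation is an assumption made for the example, and decoding of the intercepted IA-32 instruction is omitted.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the virtualized LOCK variable: a single
   cacheable flag shared by every emulated locked operation. */
static atomic_flag virtual_lock = ATOMIC_FLAG_INIT;

static void virtual_lock_acquire(void) {
    /* Spin until the flag was previously clear. Acquire ordering keeps the
       emulated accesses from moving above the lock acquisition. */
    while (atomic_flag_test_and_set_explicit(&virtual_lock,
                                             memory_order_acquire)) {
        /* busy-wait */
    }
}

static void virtual_lock_release(void) {
    /* Release ordering keeps the emulated accesses from moving below. */
    atomic_flag_clear_explicit(&virtual_lock, memory_order_release);
}

/* Emulates an exchange-and-add: adds delta to *target and returns the old
   value. The read and the write are separate, non-atomic accesses; they
   appear atomic only to code that also takes virtual_lock. */
uint32_t emulated_lock_xadd(uint32_t *target, uint32_t delta) {
    assert(target != NULL);
    virtual_lock_acquire();
    uint32_t old = *target;   /* read   */
    *target = old + delta;    /* modify and write */
    virtual_lock_release();
    return old;
}
```

As in the text, the protection is cooperative: a processor that updates the same location without taking the virtualized LOCK variable can still observe or disturb the intermediate state.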
2.1.4 Memory Fences
The memory fence instruction (mf) is the only instruction in the Itanium instruction set
with fence semantics. This instruction serializes the set of memory accesses before the
memory fence in program order with respect to the set of memory accesses that follow
the fence in program order.
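A common case that needs a full fence is a store followed by a load from a different address, as in a Dekker-style entry protocol. The sketch below uses the C11 atomic_thread_fence(memory_order_seq_cst) call as a portable analog of mf; that a compiler targeting the Itanium architecture would emit mf for it is an assumption, and the names flag, try_enter, and leave are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* One "I want to enter" flag per participant (two participants: 0 and 1). */
static atomic_int flag[2];

/* Returns true if the caller may enter the critical section. If both
   participants race, both may be refused, but never both admitted. */
bool try_enter(int self) {
    assert(self == 0 || self == 1);
    int other = 1 - self;

    atomic_store_explicit(&flag[self], 1, memory_order_relaxed);

    /* Full fence: the store above is ordered before the load below.
       Without it, each participant could read the other's flag as 0. */
    atomic_thread_fence(memory_order_seq_cst);

    if (atomic_load_explicit(&flag[other], memory_order_acquire) != 0) {
        /* Contention: withdraw the request and report failure. */
        atomic_store_explicit(&flag[self], 0, memory_order_relaxed);
        return false;
    }
    return true;
}

void leave(int self) {
    assert(self == 0 || self == 1);
    /* Release store: accesses made in the critical section are ordered
       before the flag is cleared. */
    atomic_store_explicit(&flag[self], 0, memory_order_release);
}
```

The fence sits between the store and the load because acquire and release semantics alone do not order an earlier store against a later load to a different address.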
2.2 Memory Ordering in the Intel® Itanium® Architecture
Understanding a system's memory ordering model is key to writing user- or
system-level multiprocessor software that uses shared memory to communicate
between processes and that executes correctly on a shared-memory
multiprocessor system. For a general introduction to memory ordering models, see
Adve and Gharachorloo [AG95].
Four factors determine how a processor or system based on the Itanium architecture
orders a group of memory operations with respect to each other:
• Data dependencies define the relationship between operations from the same
processor that have register or memory dependencies on the same address¹. This
relationship need only be honored by the local processor (i.e., the processor that
executes the operations).
• The memory ordering semantics define the relationship between memory
operations from a particular processor that reference different addresses. For
cacheable references, this relationship is honored by all observers in the coherence
domain.
• Aligned release stores and semaphore operations (both acquire and release forms)
become visible to all observers in the coherence domain in a single total order,
except that each processor may observe its own release stores (via loads or acquire
loads) prior to their being observed globally².
• Non-programmer-visible state, such as store buffers, processor caches, or any
logically-equivalent structure, may satisfy read requests from loads or acquire loads
on the local processor before the data in the structure is made globally visible to
other observers.
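The role of ordering semantics on accesses to different addresses can be illustrated with a producer that writes a payload and then sets a flag. The sketch below is written in C11, whose release store and acquire load are used here as analogs of the release store and acquire load forms named in the list above; the names payload, ready, publish, and try_consume are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

static int payload;        /* ordinary data, written before the flag */
static atomic_bool ready;  /* flag, written with release semantics   */

/* Producer: the release store orders the payload write before the flag
   write as seen by any observer that reads the flag with acquire. */
void publish(int value) {
    payload = value;
    atomic_store_explicit(&ready, true, memory_order_release);
}

/* Consumer: the acquire load orders the flag read before the payload
   read, so a true flag implies the payload is visible. */
bool try_consume(int *out) {
    assert(out != NULL);
    if (!atomic_load_explicit(&ready, memory_order_acquire)) {
        return false;      /* nothing published yet */
    }
    *out = payload;
    return true;
}
```

If the flag were written and read with unordered accesses, nothing would relate the two addresses, and a consumer on another processor could see the flag set while still reading a stale payload.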
1. That is, A precedes B in program order and A produces a value that B consumes. This relationship is
transitive.
2. Consequently, each such operation appears to become visible to each observer in the coherence
domain at the same time, with the exception that a release store can become visible to the storing
processor before others.
Volume 2, Part 2: MP Coherence and Synchronization