Intel ITANIUM ARCHITECTURE - SOFTWARE DEVELOPERS MANUAL VOLUME 1 REV 2.3 Manual page 333

accesses of different sizes but with overlapping memory references appear to complete
non-atomically. To ensure that a memory write is globally observed prior to a memory
read, software must place an explicit fence operation between the two operations.
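The write-then-read requirement above can be illustrated outside of assembly with a small sketch in Rust. It assumes that a sequentially consistent fence (`fence(Ordering::SeqCst)`) plays the role of the explicit fence operation; the function name `store_buffering_round` and the variables `x` and `y` are illustrative and do not come from the manual. Each thread writes its own location, fences, then reads the other location. With both fences in place, at least one thread should observe the other's write.

```rust
use std::sync::atomic::{fence, AtomicU32, Ordering};
use std::sync::Arc;
use std::thread;

/// Runs one round of the store-buffering pattern and returns what each
/// thread read from the other thread's location.
pub fn store_buffering_round() -> (u32, u32) {
    let x = Arc::new(AtomicU32::new(0));
    let y = Arc::new(AtomicU32::new(0));

    let (x0, y0) = (Arc::clone(&x), Arc::clone(&y));
    let t0 = thread::spawn(move || {
        x0.store(1, Ordering::Relaxed); // memory write
        fence(Ordering::SeqCst);        // explicit fence between write and read
        y0.load(Ordering::Relaxed)      // memory read
    });

    let (x1, y1) = (Arc::clone(&x), Arc::clone(&y));
    let t1 = thread::spawn(move || {
        y1.store(1, Ordering::Relaxed); // memory write
        fence(Ordering::SeqCst);        // explicit fence between write and read
        x1.load(Ordering::Relaxed)      // memory read
    });

    (t0.join().unwrap(), t1.join().unwrap())
}
```

Without the two fences, the language memory model would allow both threads to read 0, which corresponds to the case where neither write has been globally observed before the following read.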
Aligned st.rel and semaphore operations¹ from multiple processors to cacheable
write-back memory become visible to all observers in a single total order (i.e., in a
particular interleaving; if it becomes visible to any observer, then it is visible to all
observers), except that for st.rel each processor may observe (via ld or ld.acq) its
own update prior to it being observed globally.
The Itanium architecture ensures this single total order only for aligned st.rel and
semaphore operations to cacheable write-back memory. Other memory operations²
from multiple processors are not required to become visible in any particular order,
unless they are constrained w.r.t. each other by the ordering rules defined in
Table 4-16.
Ordering of loads is further constrained by data dependency. That is, if one load reads a
value written by an earlier load by the same processor (either directly or transitively,
through either registers or memory), then the two loads become visible in program
order.
For example, when this sequence is executed on a processor:
st [a] = data
st.rel [b] = a
and a second processor executes this sequence:
ld x = [b]
ld y = [x]
if the second processor observes the store to [b], it will also observe the store to [a].
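A rough analogue of this pointer-publication pattern can be sketched in Rust. This is an illustration under stated assumptions, not the manual's mechanism: portable Rust does not expose data-dependency ordering, so the reader uses an `Acquire` load, which is stronger than the plain `ld` in the sequence above. The function name `publish_pointer` and the use of a heap-allocated `u64` for `[a]` are hypothetical.

```rust
use std::ptr;
use std::sync::atomic::{AtomicPtr, Ordering};
use std::sync::Arc;
use std::thread;

/// Publishes a pointer to `data` from one thread and reads through it
/// from another. Returns the value the reader saw.
pub fn publish_pointer(data: u64) -> u64 {
    // `b` holds the published address; null means "not yet published".
    let b: Arc<AtomicPtr<u64>> = Arc::new(AtomicPtr::new(ptr::null_mut()));

    let b_writer = Arc::clone(&b);
    let writer = thread::spawn(move || {
        let a = Box::into_raw(Box::new(data)); // st [a] = data
        b_writer.store(a, Ordering::Release);  // st.rel [b] = a
    });

    let b_reader = Arc::clone(&b);
    let reader = thread::spawn(move || loop {
        let x = b_reader.load(Ordering::Acquire); // ld x = [b]
        if !x.is_null() {
            // SAFETY: `x` was produced by Box::into_raw in the writer and
            // is not freed until both threads have been joined.
            return unsafe { *x }; // ld y = [x]
        }
        std::hint::spin_loop();
    });

    writer.join().unwrap();
    let y = reader.join().unwrap();

    // Reclaim the allocation once no thread can still read it.
    let a = b.load(Ordering::Acquire);
    // SAFETY: `a` is the non-null pointer stored by the writer.
    unsafe { drop(Box::from_raw(a)) };
    y
}
```

Once the reader sees a non-null pointer, the value it reads through that pointer is the one written before the release store, which mirrors the guarantee stated for the two-processor sequence.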
Also for example, when this sequence is executed on a processor:
st [a]
st.rel [b] = 'new'
and a second processor executes this sequence:
ld x = [b]
cmp.eq p1 = x, 'new'
(p1) ld y = [a]
if the second processor observes the store to [b], it will also observe the store to [a].
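The flag-based variant can be sketched the same way. Again this is only an approximation: the sequence above relies on a predicated plain `ld`, whereas this Rust sketch assumes an `Acquire` load of the flag, since the language offers no weaker portable way to order the second load. The names `flag_handoff` and `NEW` are illustrative.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::Arc;
use std::thread;

/// Flag value standing in for 'new' in the sequence above.
const NEW: u32 = 1;

/// One thread writes a payload to `a` and then sets flag `b`; another
/// waits for the flag and then reads `a`. Returns the value read from `a`.
pub fn flag_handoff(payload: u32) -> u32 {
    let a = Arc::new(AtomicU32::new(0));
    let b = Arc::new(AtomicU32::new(0));

    let (a_w, b_w) = (Arc::clone(&a), Arc::clone(&b));
    let writer = thread::spawn(move || {
        a_w.store(payload, Ordering::Relaxed); // st [a]
        b_w.store(NEW, Ordering::Release);     // st.rel [b] = 'new'
    });

    let (a_r, b_r) = (Arc::clone(&a), Arc::clone(&b));
    let reader = thread::spawn(move || loop {
        let x = b_r.load(Ordering::Acquire);   // ld x = [b]
        if x == NEW {                          // cmp.eq p1 = x, 'new'
            return a_r.load(Ordering::Relaxed); // (p1) ld y = [a]
        }
        std::hint::spin_loop();
    });

    writer.join().unwrap();
    reader.join().unwrap()
}
```

The reader only loads `a` after it has seen the flag, so the value it returns is the payload stored before the release store to the flag.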
And for example, when this sequence is executed on a processor:
st [a]
st.rel [b] = 'new'
and a second processor executes this sequence:

1. Both acquire and release semaphore forms.
2. E.g., unordered stores, loads, ld.acq, or memory operations to pages with attributes other than write-back cacheable.

Volume 2, Part 1: Addressing and Protection 2:85
