2.2.3 Understanding Other Ordering Models: Sequential Consistency and IA-32
To provide a point of reference, it is helpful to understand other memory ordering
models. These ordering models affect not only the programmer's view of the system,
but also the overall system performance and design. Processors with relaxed memory
ordering models may achieve higher performance than those with strict ordering
models.
The most intuitive memory ordering model is "sequential consistency" (SC), which
Lamport formally defines in [L79]. In sequential consistency, all processors see the
memory references from a given processor in program order, and, in addition, all
processors see the same system-wide interleaving of memory references from each
processor.
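The definition above can be illustrated with a small enumeration. The following is a minimal sketch, not part of the architecture definition: it assumes a hypothetical two-processor test (the names P0, P1, x, y, r1, and r2 are invented for illustration) and lists every system-wide interleaving that keeps each processor's references in program order.

```python
from itertools import combinations

# Hypothetical two-processor test, not taken from the manual.
# Each operation is (kind, address, value-to-store or destination register).
P0 = [("st", "x", 1), ("ld", "y", "r1")]
P1 = [("st", "y", 1), ("ld", "x", "r2")]


def interleavings(a, b):
    """Yield every merge of a and b that keeps each list in program order."""
    n = len(a) + len(b)
    for slots in combinations(range(n), len(a)):
        ia, ib = iter(a), iter(b)
        yield [next(ia) if i in slots else next(ib) for i in range(n)]


def run(order):
    """Apply one system-wide order to a single shared memory."""
    mem, regs = {"x": 0, "y": 0}, {}
    for kind, addr, arg in order:
        if kind == "st":
            mem[addr] = arg
        else:
            regs[arg] = mem[addr]
    return regs["r1"], regs["r2"]


# Every outcome (r1, r2) that sequential consistency permits for this test.
sc_outcomes = {run(order) for order in interleavings(P0, P1)}
print(sorted(sc_outcomes))  # [(0, 1), (1, 0), (1, 1)]
```

In this sketch the outcome r1 = 0, r2 = 0 never appears, because any single interleaving must place at least one store ahead of the load that reads the same location.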
The SC model precludes many common optimizations made in modern microprocessors
to enhance performance. For example, in an SC system, a load may not pass a prior
store until that store becomes globally visible (because all memory operations must
become visible in program order). This requirement prevents the SC system from using
a store buffer to hide the latency of store traffic by allowing loads that hit the cache to
be serviced under a prior store that misses the cache.
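The effect of a store buffer on ordering can be sketched with a toy model. The code below is an illustration under stated assumptions, not a description of any real implementation: each store is split into a hypothetical "issue" event (the store enters the local buffer) and a "drain" event (the store becomes globally visible), and the test program and all names are invented for this example.

```python
from itertools import combinations, product


def merges(a, b):
    """Yield every merge of a and b that keeps each list in its own order."""
    n = len(a) + len(b)
    for slots in combinations(range(n), len(a)):
        ia, ib = iter(a), iter(b)
        yield [next(ia) if i in slots else next(ib) for i in range(n)]


def variants(proc, st_addr, ld_addr, reg, store_buffer):
    """Event orders one processor may produce for 'store; load'."""
    issue = (proc, "issue", st_addr, None)
    drain = (proc, "drain", st_addr, None)
    load = (proc, "load", ld_addr, reg)
    orders = [[issue, drain, load]]          # store visible before the load
    if store_buffer:
        orders.append([issue, load, drain])  # load serviced under the store
    return orders


def run(order):
    mem = {"x": 0, "y": 0}
    buf = {0: {}, 1: {}}  # per-processor store buffer: address -> value
    regs = {}
    for proc, kind, addr, reg in order:
        if kind == "issue":
            buf[proc][addr] = 1
        elif kind == "drain":
            mem[addr] = buf[proc].pop(addr)
        else:
            # A load checks the local buffer first, then shared memory.
            regs[reg] = buf[proc].get(addr, mem[addr])
    return regs["r1"], regs["r2"]


def outcomes(store_buffer):
    """Outcomes (r1, r2) of  P0: x = 1; r1 = y   P1: y = 1; r2 = x."""
    p0 = variants(0, "x", "y", "r1", store_buffer)
    p1 = variants(1, "y", "x", "r2", store_buffer)
    return {run(o) for a, b in product(p0, p1) for o in merges(a, b)}


strict = outcomes(store_buffer=False)
relaxed = outcomes(store_buffer=True)
print(sorted(relaxed - strict))  # [(0, 0)]
```

In this model the only additional outcome the store buffer introduces is r1 = 0, r2 = 0, where both loads complete while both stores are still buffered. That is the behavior an SC system must rule out.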
To address such performance issues, many memory ordering models have been
developed that relax the constraints of sequential consistency. Adve categorizes these
memory models by noting how they relax the ordering requirements between reads
and writes and whether they allow writes to be read early [AG95]. The Itanium architecture
allows for relaxed ordering between reads and writes and also allows writes to be read
early under certain circumstances.
Aside from disallowing any relaxation of memory references, sequential consistency has
two other subtle differences from the Itanium memory ordering model. First, it requires
a total order of operations whereas the Itanium memory ordering model only requires a
total order for release stores and semaphores. Second, remote processors must always
honor data dependencies, since the local processor does not have the option of
re-ordering such accesses, as can occur under the Itanium memory ordering model.
The IA-32 memory ordering model relaxes write-to-read ordering and allows a processor to
read its own writes before they are globally visible. Further, IA-32 allows each
processor in the coherence domain to interleave the reference streams from other
processors in the coherence domain in a different order. The per-processor orders must
meet some additional constraints to ensure they are consistent with each other
(enumerating and explaining these constraints is beyond the scope of this document).
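The first two relaxations can be explored with a small state-space search. The sketch below assumes a simplified store-buffer machine: each processor has a FIFO store buffer, loads are forwarded from the local buffer, and a separate drain step makes a store globally visible. This model is an assumption made for illustration. It does not capture the full IA-32 ordering rules or the per-processor interleaving behavior described above. The test program and the names explore, P0, P1, and r1 through r4 are hypothetical.

```python
def explore(programs, init):
    """Return every final register state of a simplified store-buffer machine.

    Each processor executes its program in order. A store enters that
    processor's FIFO buffer; a later 'drain' step copies the oldest buffered
    store to shared memory. A load returns the newest matching entry in the
    local buffer if one exists, otherwise the value in shared memory.
    """
    results = set()

    def step(pcs, bufs, mem, regs):
        done = True
        for p, prog in enumerate(programs):
            if pcs[p] < len(prog):           # execute the next instruction
                done = False
                kind, addr, arg = prog[pcs[p]]
                pcs2 = pcs[:p] + [pcs[p] + 1] + pcs[p + 1:]
                if kind == "st":
                    bufs2 = bufs[:p] + [bufs[p] + [(addr, arg)]] + bufs[p + 1:]
                    step(pcs2, bufs2, mem, regs)
                else:
                    hits = [v for a, v in bufs[p] if a == addr]
                    value = hits[-1] if hits else mem[addr]
                    step(pcs2, bufs, mem, {**regs, arg: value})
            if bufs[p]:                      # drain the oldest buffered store
                done = False
                addr, value = bufs[p][0]
                bufs2 = bufs[:p] + [bufs[p][1:]] + bufs[p + 1:]
                step(pcs, bufs2, {**mem, addr: value}, regs)
        if done:
            results.add(tuple(sorted(regs.items())))

    step([0] * len(programs), [[] for _ in programs], dict(init), {})
    return results


# Each processor writes one location, reads it back, then reads the other.
P0 = [("st", "x", 1), ("ld", "x", "r1"), ("ld", "y", "r2")]
P1 = [("st", "y", 1), ("ld", "y", "r3"), ("ld", "x", "r4")]

results = explore([P0, P1], {"x": 0, "y": 0})
early_read = (("r1", 1), ("r2", 0), ("r3", 1), ("r4", 0))
print(early_read in results)  # True
```

In the outcome checked here, each processor reads its own write (r1 = 1, r3 = 1) while the other processor still observes the old value (r2 = 0, r4 = 0). A sequentially consistent system would not permit r2 = 0 together with r4 = 0.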
For more information on the IA-32 ordering model see Section 6.2.3.2, "IA-32
Segmentation" on page 1:131.
Volume 2, Part 2: MP Coherence and Synchronization 2:525