3.3 External Cache Subsystem
Intel-compatible processors support multiprocessing on the processor bus and on a memory
bus, in each case with or without secondary cache units. Due to the high bandwidth demands of
multiprocessor systems, external caches are often employed to improve performance. The
existence and implementation details of external caches are not a part of this specification.
However, when external caches are used, they must conform to certain requirements with regard to
the following design issues:
Maintaining cache coherency—When one processor accesses data cached in another
processor's cache, it must not receive incorrect data. If it modifies data, all other processors
that access that data also must not receive stale data. External caches must maintain coherency
among themselves, and with the main memory, internal caches, and other bus master DMA devices.
Cache flushing—The processor can generate special flush and write-back bus cycles that must
be used by external caches in a manner that maintains cache coherency. The actual responses
are implementation-specific and may vary from design to design. A program can initiate
hardware cache flushing by executing a WBINVD instruction. This instruction is only
guaranteed to flush the caches of the local processor. See Appendix B for system-wide
flushing mechanisms. Given that cache coherency is maintained by hardware, there is no need
for software to issue cache flush instructions under normal circumstances.
Reliable communication—All processors must be able to communicate with each other in a
way that eliminates interference when more than one processor accesses the same area in
memory simultaneously. The processor uses the LOCK# signal for this purpose. External
caches must ensure that all locked operations are visible to other processors.
Write ordering—In some circumstances, it is important that memory writes be observed
externally in precisely the same order as programmed. External write buffers must maintain
the write ordering of the processor.
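The following is a minimal sketch, in C11, of why software depends on the write-ordering requirement above. It is an illustration under stated assumptions, not part of this specification: the names payload, ready, and run_message_passing are hypothetical, and POSIX threads are assumed to be available. A producer writes data and then sets a flag; the consumer may safely read the data only if the two writes become visible in program order.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static int payload;        /* ordinary data, written first (hypothetical name) */
static atomic_int ready;   /* flag, written second (hypothetical name) */

static void *producer(void *arg)
{
    (void)arg;
    payload = 42;
    /* Release store: the payload write must be visible before the flag. */
    atomic_store_explicit(&ready, 1, memory_order_release);
    return NULL;
}

/* Starts the producer, waits for the flag, and returns the payload seen. */
int run_message_passing(void)
{
    pthread_t thread;
    int rc;
    int seen;

    payload = 0;
    atomic_store(&ready, 0);

    rc = pthread_create(&thread, NULL, producer, NULL);
    assert(rc == 0);
    (void)rc;

    /* Acquire load: once the flag is seen, the payload write is visible. */
    while (atomic_load_explicit(&ready, memory_order_acquire) == 0) {
        /* spin until the producer publishes the flag */
    }
    seen = payload;

    pthread_join(thread, NULL);
    return seen;
}
```

If an external write buffer were allowed to reorder the two stores, the consumer could observe the flag before the data, which is the failure this requirement rules out.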
To protect the integrity of certain critical memory operations, Intel-compatible processors provide
an output signal called LOCK#. For any given memory access, LOCK# is asserted once, but may
remain asserted for as many memory bus cycles as required to complete the memory operation. It
is the responsibility of the system hardware designers to use this signal to control memory accesses
among processors.
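From the software side, locked operations are usually reached through atomic read-modify-write primitives. The sketch below is illustrative only and assumes a C11 compiler with POSIX threads; the names counter, worker, and run_locked_increments are hypothetical. On x86 targets, compilers typically implement atomic_fetch_add with a LOCK-prefixed instruction, although the exact instruction chosen is up to the compiler.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define NUM_THREADS 4
#define INCREMENTS  100000

static atomic_long counter;   /* shared counter (hypothetical name) */

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++) {
        /* Atomic read-modify-write; no increment may be lost. */
        atomic_fetch_add(&counter, 1);
    }
    return NULL;
}

/* Runs NUM_THREADS workers concurrently and returns the final count. */
long run_locked_increments(void)
{
    pthread_t threads[NUM_THREADS];

    atomic_store(&counter, 0);
    for (int i = 0; i < NUM_THREADS; i++) {
        int rc = pthread_create(&threads[i], NULL, worker, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < NUM_THREADS; i++) {
        pthread_join(threads[i], NULL);
    }
    return atomic_load(&counter);
}
```

The final count equals NUM_THREADS * INCREMENTS only if every increment is atomic with respect to the other processors, which is the guarantee that the system hardware must provide for locked cycles.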
A compliant system in multiprocessor mode must guarantee atomicity of locked, aligned memory
operations; however, this specification does not define the implementation. A compliant system
must lock at least the area of memory defined by the destination operand. A specific
implementation may lock a broader area, up to and including the entire bus. Software must
therefore not assume that only the destination operand is locked.
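Because atomicity is guaranteed only for aligned locked operations, software can keep the operands of locked operations aligned by construction. The sketch below is one way to do this in C11. It assumes that "aligned" means natural alignment (an address that is a multiple of the operand size); the names is_naturally_aligned, shared_state, and lock_word are hypothetical.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True when an operand of `size` bytes at `addr` is naturally aligned.
 * `size` must be a nonzero power of two. */
bool is_naturally_aligned(const void *addr, size_t size)
{
    return ((uintptr_t)addr & (size - 1)) == 0;
}

/* Hypothetical shared structure whose lock word is forced onto an
 * 8-byte boundary, so that a locked access to it is never misaligned. */
struct shared_state {
    uint8_t tag;
    alignas(8) uint64_t lock_word;
};

/* Compile-time check that the layout keeps the lock word aligned. */
static_assert(offsetof(struct shared_state, lock_word) % 8 == 0,
              "lock_word must be 8-byte aligned");
```

A runtime check such as is_naturally_aligned(&state.lock_word, sizeof state.lock_word) can guard pointers whose origin is not known at compile time.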
To guarantee AT compatibility, locking of misaligned memory operations over other
AT-compatible buses in the compliant system must be strictly implemented in accordance with the
bus specifications. A compliant system may not be required to support the misaligned memory