Take Advantage Of Write Combining; Avoid Placing Code And Data In The Same 64-Byte Cache Line - AMD Athlon Processor x86 Optimization Manual


AMD Athlon™ Processor x86 Code Optimization

Take Advantage of Write Combining

Operating system and device driver programmers should take
advantage of the write-combining capabilities of the
AMD Athlon processor. The AMD Athlon processor has a very
aggressive write-combining algorithm, which improves
performance significantly.
See Appendix C, "Implementation of Write Combining" on
page 155 for more details.

Avoid Placing Code and Data in the Same 64-Byte Cache Line

Sharing code and data in the same 64-byte cache line may cause
the L1 caches to thrash (unnecessary castout of code/data) in
order to maintain coherency between the separate instruction
and data caches. The AMD Athlon processor has a cache-line
size of 64 bytes, which is twice that of previous processors.
Programmers must be aware that code and data should not
share this larger cache line, especially if the data
is modified.
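
As a rough illustration of the 64-byte granularity described above, the following C sketch checks whether two byte ranges (for example, a block of code and a data item) touch a common cache line. The helper names cache_line_index and shares_cache_line are hypothetical and do not come from the manual, and both ranges are assumed to have a nonzero length.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Cache-line size stated in the text for the AMD Athlon processor. */
#define CACHE_LINE_SIZE 64

static_assert((CACHE_LINE_SIZE & (CACHE_LINE_SIZE - 1)) == 0,
              "cache-line size must be a power of two");

/* Hypothetical helper: index of the 64-byte line that contains addr. */
uintptr_t cache_line_index(uintptr_t addr)
{
    return addr / CACHE_LINE_SIZE;
}

/*
 * Hypothetical helper: returns true if the byte ranges [a, a + a_len) and
 * [b, b + b_len) touch at least one common 64-byte line.
 * Assumption: a_len and b_len are both nonzero.
 */
bool shares_cache_line(uintptr_t a, size_t a_len, uintptr_t b, size_t b_len)
{
    uintptr_t a_first = cache_line_index(a);
    uintptr_t a_last  = cache_line_index(a + a_len - 1);
    uintptr_t b_first = cache_line_index(b);
    uintptr_t b_last  = cache_line_index(b + b_len - 1);

    /* Two closed intervals of line indices overlap when each one
       starts no later than the other one ends. */
    return a_first <= b_last && b_first <= a_last;
}
```

A check of this kind could be applied to addresses taken from a linker map to spot code and data that land in the same line. The ranges do not need to overlap byte for byte, they only need to fall within the same 64-byte block.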
For example, programmers should consider that a memory
indirect JMP instruction may have the data for the jump table
residing in the same 64-byte cache line as the JMP instruction,
which would result in lower performance.
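
One way to keep a jump table out of the line that holds the dispatching code is to give the table its own 64-byte-aligned storage. The following is a minimal C sketch under stated assumptions, not the manual's own method: the handler functions, the dispatch_table type, and the dispatch function are hypothetical, and the example relies on a C11 compiler placing the table in a data section rather than next to the code. In hand-written assembly the same idea would be expressed with the assembler's alignment and section directives.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Cache-line size stated in the text for the AMD Athlon processor. */
#define CACHE_LINE_SIZE 64

typedef int (*handler_fn)(int);

/* Hypothetical handlers, present only to give the table some targets. */
static int op_negate(int x) { return -x; }
static int op_double(int x) { return 2 * x; }
static int op_square(int x) { return x * x; }

/*
 * Aligning the first member to 64 bytes raises the alignment of the whole
 * struct to 64 and pads its size to a multiple of 64, so the table starts
 * on a line boundary and no other object shares its last line.
 */
struct dispatch_table {
    alignas(CACHE_LINE_SIZE) handler_fn entry[3];
};

static_assert(sizeof(struct dispatch_table) % CACHE_LINE_SIZE == 0,
              "table must occupy whole cache lines");

static const struct dispatch_table handlers = {
    { op_negate, op_double, op_square }
};

/* Indirect call through the table; out-of-range opcodes return 0. */
int dispatch(unsigned op, int x)
{
    if (op >= sizeof handlers.entry / sizeof handlers.entry[0])
        return 0;
    return handlers.entry[op](x);
}
```

The padding costs a few bytes per table. In exchange, the cache lines occupied by the table contain nothing but table data.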
Although such cases are rare, do not place critical code at the
border between 32-byte aligned code segments and data segments.
The code at the start or end of your data segment should be
executed as rarely as possible, or the border should simply be
padded with garbage.
In general, the following should be avoided:
- Self-modifying code
- Storing data in code segments
22007E/0—November 1999