Align Data Where Possible; Use The 3Dnow! Prefetch And Prefetchw Instructions - AMD Athlon Processor x86 Optimization Manual

AMD Athlon™ Processor x86 Code Optimization

Align Data Where Possible

In general, avoid misaligned data references. All data whose
size is a power of 2 is considered aligned if it is naturally
aligned. For example:
QWORD accesses are aligned if they access an address divisible by 8.
DWORD accesses are aligned if they access an address divisible by 4.
WORD accesses are aligned if they access an address divisible by 2.
TBYTE accesses are aligned if they access an address divisible by 8.
A misaligned store or load operation suffers a minimum
one-cycle penalty in the AMD Athlon processor load/store
pipeline. In addition, using misaligned loads and stores
increases the likelihood of encountering a store-to-load
forwarding pitfall. For a more detailed discussion of
store-to-load forwarding issues, see "Store-to-Load Forwarding
Restrictions" on page 51.
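
The alignment rules above can be checked at run time with a simple mask test. The following is a minimal sketch in C, not code from this manual: the helper name is_aligned and the record type struct sample are hypothetical, the pointer-to-integer conversion is assumed to yield the linear address (as it does on mainstream x86 compilers), and _Alignas assumes a C11 compiler.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns true if addr is a multiple of alignment.
   alignment must be a nonzero power of 2 (2 for WORD, 4 for DWORD,
   8 for QWORD and TBYTE, following the rules in the text). */
bool is_aligned(const void *addr, size_t alignment)
{
    /* For a power-of-2 alignment, the low bits of an aligned
       address are all zero, so a mask test is sufficient. */
    return ((uintptr_t)addr & (alignment - 1)) == 0;
}

/* Hypothetical record laid out so every member is naturally aligned.
   _Alignas(8) (C11) requests an 8-byte boundary for the first member,
   which also forces the whole struct onto an 8-byte boundary. */
struct sample {
    _Alignas(8) uint64_t qword; /* offset 0, divisible by 8 */
    uint32_t dword;             /* offset 8, divisible by 4 */
    uint16_t word;              /* offset 12, divisible by 2 */
};
```

Ordering members from largest to smallest, as in struct sample, is one common way to keep each member naturally aligned without relying on compiler-inserted padding between them.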

Use the 3DNow!™ PREFETCH and PREFETCHW Instructions

For code that can take advantage of prefetching, use the
3DNow! PREFETCH and PREFETCHW instructions to
increase the effective bandwidth to the AMD Athlon processor.
The PREFETCH and PREFETCHW instructions take
advantage of the AMD Athlon processor's high bus bandwidth
to hide long latencies when fetching data from system memory.
The prefetch instructions are essentially integer instructions
and can be used anywhere, in any type of code (integer, x87,
3DNow!, MMX, etc.).
Large data sets typically require unit-stride access to ensure
that all data pulled in by PREFETCH or PREFETCHW is
actually used. If necessary, algorithms or data structures should
be reorganized to allow unit-stride access.
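
As an illustration of prefetching in a unit-stride loop, the sketch below sums an array while requesting data a fixed distance ahead. It is not code from this manual. It assumes GCC or Clang, whose __builtin_prefetch hint stands in for the prefetch instruction here; the compiler chooses the actual instruction, which may not be the 3DNow! PREFETCH. The function name, the 64-byte line size, and the prefetch distance are likewise assumptions that would need tuning on real hardware.

```c
#include <stddef.h>

#define CACHE_LINE_BYTES 64      /* assumed cache line size */
#define FLOATS_PER_LINE (CACHE_LINE_BYTES / sizeof(float))
#define PREFETCH_LINES_AHEAD 4   /* assumed tuning value */

/* Hypothetical example: sums n floats with unit-stride access,
   issuing one prefetch hint per cache line of input. */
float sum_with_prefetch(const float *data, size_t n)
{
    float total = 0.0f;
    size_t i = 0;

    while (i < n) {
        /* Hint at the line a few lines ahead of the current one.
           Arguments: address, 0 = read access, 3 = high locality. */
        size_t ahead = i + PREFETCH_LINES_AHEAD * FLOATS_PER_LINE;
        if (ahead < n)
            __builtin_prefetch(&data[ahead], 0, 3);

        /* Consume every element of the current line, so that all
           data pulled in by the prefetch is actually used. */
        size_t end = i + FLOATS_PER_LINE;
        if (end > n)
            end = n;
        for (; i < end; ++i)
            total += data[i];
    }
    return total;
}
```

Because the loop touches every element in order, each prefetched line is fully consumed. For data that is about to be modified, passing 1 as the second argument of __builtin_prefetch signals write intent, which is the role PREFETCHW plays.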
22007E/0—November 1999