Recommendations For The Amd Athlon Processor - AMD Athlon Processor x86 Optimization Manual

X86 code optimization
Table of Contents

Advertisement

AMD Athlon™ Processor x86 Code Optimization
Recommendations for the AMD Athlon™ Processor
40
For code that is optimized specifically for the AMD Athlon
processor, the optimal code fillers are NOP instructions (opcode
0x90) with up to two REP prefixes (0xF3). In the AMD Athlon
processor, a NOP with up to two REP prefixes can be handled
by a single decoder with no overhead. As the REP prefixes are
redundant and meaningless, they get discarded, and NOPs are
handled without using any execution resources. The three
decoders of AMD Athlon processor can handle up to three
NOPs, each with up to two REP prefixes each, in a single cycle,
for a neutral code filler of up to nine bytes.
Note: When used as a filler instruction, REP/REPNE prefixes can
be used in conjunction only with NOPs. REP/REPNE has
undefined behavior when used with instructions other than
a NOP.
I f a l a rg e r a m o u n t o f c o d e p a dd i n g i s re q u i re d , i t i s
recommended to use a JMP instruction to jump across the
padding region. The following assembly language macros show
this:
NOP1_ATHLON TEXTEQU <DB 090h>
NOP2_ATHLON TEXTEQU <DB 0F3h, 090h>
NOP3_ATHLON TEXTEQU <DB 0F3h, 0F3h, 090h>
NOP4_ATHLON TEXTEQU <DB 0F3h, 0F3h, 090h, 090h>
NOP5_ATHLON TEXTEQU <DB 0F3h, 0F3h, 090h, 0F3h, 090h>
NOP6_ATHLON TEXTEQU <DB 0F3h, 0F3h, 090h, 0F3h, 0F3h, 090h>
NOP7_ATHLON TEXTEQU <DB 0F3h, 0F3h, 090h, 0F3h, 0F3h, 090h,
NOP8_ATHLON TEXTEQU <DB 0F3h, 0F3h, 090h, 0F3h, 0F3h, 090h,
NOP9_ATHLON TEXTEQU <DB 0F3h, 0F3h, 090h, 0F3h, 0F3h, 090h,
NOP10_ATHLONTEXTEQU <DB 0EBh, 008h, 90h, 90h, 90h, 90h,
090h>
0F3h, 090h>
0F3h, 0F3h, 090h>
90h,
90h, 90h, 90h>
Code Padding Using Neutral Code Fillers
22007E/0—November 1999

Advertisement

Table of Contents
loading

Table of Contents