Instruction Decoding Optimizations; Overview - AMD Athlon Processor x86 Optimization Manual

22007E/0—November 1999

Instruction Decoding Optimizations

Overview
This chapter discusses ways to maximize the number of
instructions decoded by the instruction decoders in the
AMD Athlon™ processor. Guidelines are listed in order of
importance.
The AMD Athlon processor instruction fetcher reads 16-byte
aligned code windows from the instruction cache. The
instruction bytes are then merged into a 24-byte instruction
queue. On each cycle, the in-order front-end engine selects for
decode up to three x86 instructions from the instruction-byte
queue.
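
As a rough illustration of what a 16-byte aligned window means for code placement, the following sketch computes which windows a given instruction's bytes fall into. The function name fetch_windows and the example addresses are hypothetical, chosen only for illustration. The sketch does only the alignment arithmetic and makes no claim about how the fetcher schedules its reads.

```python
WINDOW_BYTES = 16  # fetch window size stated in the text


def fetch_windows(start_address, length):
    """Return the base addresses of the 16-byte aligned windows touched by
    the byte range [start_address, start_address + length).

    Illustrative helper only (hypothetical name); it performs alignment
    arithmetic and does not model the instruction fetcher itself.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    first = start_address // WINDOW_BYTES
    last = (start_address + length - 1) // WINDOW_BYTES
    return [w * WINDOW_BYTES for w in range(first, last + 1)]


# A 3-byte instruction at 0x1004 lies entirely inside one window.
print([hex(a) for a in fetch_windows(0x1004, 3)])

# The same 3-byte instruction at 0x100E straddles two windows.
print([hex(a) for a in fetch_windows(0x100E, 3)])
```

An instruction whose bytes cross a 16-byte boundary needs bytes from two windows before it is complete in the queue, which is one reason code alignment matters for decode throughput.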
All instructions (x86, x87, 3DNow!™, and MMX™) are
classified into two types of decodes: DirectPath and
VectorPath (see "DirectPath Decoder" and "VectorPath
Decoder" on page 133 for more information). DirectPath
instructions are common instructions that are decoded directly
in hardware. VectorPath instructions are more complex
instructions that require the use of a sequence of multiple
operations issued from an on-chip ROM.
Up to three DirectPath instructions can be selected for decode
per cycle. Only one VectorPath instruction can be selected for
decode per cycle. DirectPath and VectorPath instructions cannot
be decoded in the same cycle.
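
The effect of these rules on decode throughput can be estimated with a small model. The sketch below is a deliberate simplification, not the processor's actual selection logic. It assumes instructions are taken strictly in program order, that each cycle decodes either one VectorPath instruction or up to three consecutive DirectPath instructions, and that fetch bandwidth and the 24-byte queue never limit decode. The names DIRECT, VECTOR, and decode_cycles are hypothetical.

```python
DIRECT = "DirectPath"
VECTOR = "VectorPath"


def decode_cycles(decode_types):
    """Estimate decode cycles for a sequence of decode types.

    Simplified model (an assumption, not the real hardware algorithm):
    each cycle takes either one VectorPath instruction alone or up to
    three consecutive DirectPath instructions, in program order.
    """
    if any(t not in (DIRECT, VECTOR) for t in decode_types):
        raise ValueError("each entry must be DIRECT or VECTOR")

    cycles = 0
    i = 0
    n = len(decode_types)
    while i < n:
        if decode_types[i] == VECTOR:
            i += 1  # a VectorPath instruction decodes alone
        else:
            taken = 0  # group up to three DirectPath instructions
            while i < n and taken < 3 and decode_types[i] == DIRECT:
                i += 1
                taken += 1
        cycles += 1
    return cycles


all_direct = [DIRECT] * 6
mixed = [DIRECT, VECTOR, DIRECT, DIRECT, VECTOR, DIRECT]

print(decode_cycles(all_direct))  # 2
print(decode_cycles(mixed))       # 5
```

Under this model, six DirectPath instructions decode in two cycles, while the same number of instructions with two interleaved VectorPath instructions takes five, because each VectorPath instruction occupies a cycle by itself and also breaks up the DirectPath groups around it.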