Instruction Cache; Figure 1. AMD Athlon™ Processor Block Diagram - AMD Athlon Processor x86 Optimization Manual

22007E/0—November 1999
AMD Athlon™ Processor Microarchitecture

Figure 1. AMD Athlon™ Processor Block Diagram
[Figure 1 block labels: 2-Way, 64-Kbyte Instruction Cache;
24-Entry L1 TLB/256-Entry L2 TLB; Fetch/Decode Control;
3-Way x86 Instruction Decoders; Instruction Control Unit (72-Entry);
Integer Scheduler (18-Entry); IEU0, AGU0, IEU1, AGU1, IEU2, AGU2;
Load / Store Queue Unit; 2-Way, 64-Kbyte Data Cache;
32-Entry L1 TLB/256-Entry L2 TLB; Bus Interface Unit;
System Interface]

Instruction Cache

The out-of-order execute engine of the AMD Athlon processor
contains a very large 64-Kbyte L1 instruction cache. The L1
instruction cache is organized as a 64-Kbyte, two-way,
set-associative array. Each line in the instruction array is 64
bytes long. Functions associated with the L1 instruction cache
are instruction loads, instruction prefetching, instruction
predecoding, and branch prediction. Requests that miss in the
L1 instruction cache are fetched from the backside L2 cache or,
subsequently, from the local memory using the bus interface
unit (BIU).

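
As a rough illustration of the geometry described above, the following C sketch derives the number of sets from the stated size, associativity, and line length, and splits an address into offset, set index, and tag. The helper names, the 32-bit address width, and the choice of index bits (the bits directly above the line offset) are assumptions made for illustration; the text here does not say how the hardware indexes or tags the array.

```c
#include <stdint.h>

/* Parameters taken from the text: 64-Kbyte, two-way, 64-byte lines. */
enum {
    ICACHE_BYTES = 64 * 1024,
    ICACHE_WAYS  = 2,
    ICACHE_LINE  = 64,
    /* Sets = total size / (ways * line size) = 512. */
    ICACHE_SETS  = ICACHE_BYTES / (ICACHE_WAYS * ICACHE_LINE)
};

/* Hypothetical helper: byte offset within a 64-byte line. */
static inline uint32_t icache_offset(uint32_t addr)
{
    return addr % ICACHE_LINE;
}

/* Hypothetical helper: set index, ASSUMING the index comes from the
 * address bits directly above the line offset. */
static inline uint32_t icache_set(uint32_t addr)
{
    return (addr / ICACHE_LINE) % ICACHE_SETS;
}

/* Hypothetical helper: the remaining upper address bits, used as a tag
 * in a textbook set-associative cache. */
static inline uint32_t icache_tag(uint32_t addr)
{
    return addr / (ICACHE_LINE * ICACHE_SETS);
}
```

Under these assumptions the array has 512 sets of two lines each, so code addresses that are a multiple of 32 Kbytes apart would compete for the same two ways.
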
The instruction cache generates fetches on the naturally
aligned 64 bytes containing the instructions and the next
sequential line of 64 bytes (a prefetch). The principle of
program spatial locality makes data prefetching very effective
and avoids or reduces execution stalls due to the amount of
time wasted reading the necessary data. Cache line
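
The fetch and prefetch addresses described in this paragraph can be sketched in C as follows. The function names are hypothetical and 32-bit addresses are assumed; the sketch shows only the address arithmetic, not how the hardware issues or orders the two requests.

```c
#include <stdint.h>

#define LINE_BYTES 64u  /* line length stated in the text */

/* Base address of the naturally aligned 64-byte line containing addr:
 * clear the low six address bits. */
static inline uint32_t fetch_line(uint32_t addr)
{
    return addr & ~(LINE_BYTES - 1u);
}

/* Base address of the next sequential 64-byte line (the prefetch).
 * Wraps to 0 at the top of the 32-bit address space. */
static inline uint32_t prefetch_line(uint32_t addr)
{
    return fetch_line(addr) + LINE_BYTES;
}
```

For example, an instruction at address 0x00401A37 lies in the line starting at 0x00401A00, and the accompanying prefetch covers the line starting at 0x00401A40.
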

[Figure 1 block labels, continued: Predecode Cache;
Branch Prediction Table; FPU Stack Map / Rename;
FPU Scheduler (36-Entry); FPU Register File (88-Entry);
FMUL; FADD; MMX™; 3DNow!™; FSTORE; L2 Cache Controller; L2 SRAMs]
