DCU Performance; Pipeline Stalls - IBM PowerPC 405GP User Manual

DCU tag information is placed into the GPR as shown:
Bits     Field   Description
0:19     TAG     Cache Tag
20:25            Reserved
26       D       Cache Line Dirty
                   0  Not dirty
                   1  Dirty
27       V       Cache Line Valid
                   0  Not valid
                   1  Valid
28:30            Reserved
31       LRU     Least Recently Used (LRU)
                   0  A-way LRU
                   1  B-way LRU
Note: A "dirty" cache line is one which has been accessed by a store instruction after it was
established, and can be inconsistent with external memory.
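The field layout above can be unpacked in software with a few shifts and masks. The following is a minimal sketch in C, assuming the 32-bit GPR value has already been copied into a variable. The type and function names (`dcu_tag_info`, `dcu_decode_tag`) are illustrative and do not come from the manual. Note that PowerPC numbers bits from the most significant end, so bit 0 is the MSB and bit 31 is the LSB.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decoded view of the DCU tag word. Names are illustrative, not from the manual. */
typedef struct {
    uint32_t tag;   /* bits 0:19, cache tag */
    bool dirty;     /* bit 26 (D): true = dirty */
    bool valid;     /* bit 27 (V): true = valid */
    bool lru_b;     /* bit 31 (LRU): false = A-way LRU, true = B-way LRU */
} dcu_tag_info;

/* PowerPC bit n of a 32-bit register corresponds to a right shift of (31 - n). */
static dcu_tag_info dcu_decode_tag(uint32_t gpr)
{
    dcu_tag_info info;
    info.tag   = gpr >> 12;               /* bits 0:19 are the upper 20 bits */
    info.dirty = ((gpr >> 5) & 1u) != 0;  /* bit 26 -> shift by 31 - 26 = 5 */
    info.valid = ((gpr >> 4) & 1u) != 0;  /* bit 27 -> shift by 31 - 27 = 4 */
    info.lru_b = (gpr & 1u) != 0;         /* bit 31 is the least significant bit */
    return info;
}
```

For example, a GPR value of 0xABCDE031 would decode to tag 0xABCDE with the D, V, and LRU bits all set. Reserved bits (20:25 and 28:30) are ignored by this sketch.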
4.5 DCU Performance
DCU performance depends upon the application, but, in general, cache hits complete in one cycle
without stalling the CPU pipeline. Under certain conditions and limitations of the DCU, the pipeline
stalls (stops executing instructions) until the DCU completes current operations.
Several factors affect DCU performance, including:
• Pipeline stalls
• DCU priority
• Simultaneous cache operations
• Sequential cache operations
4.5.1 Pipeline Stalls
The CPU issues commands for cache operations to the DCU. If the DCU can immediately perform the
requested cache operation, no pipeline stall occurs. In some cases, however, the DCU cannot
immediately perform the requested cache operation, and the pipeline stalls until the DCU can perform
the pending cache operation.
In general, the DCU, when hitting in the cache array, can execute a load/store every cycle. If a cache
miss occurs, the DCU must retrieve the line from main memory. For cache misses, the DCU stores
the cache line in a line buffer until the entire cache line is received. The DCU can accept new DCU
commands while the fill progresses. If the instruction causing the line fill is a load, the target word is
bypassed to the GPR during the cycle after it becomes available in the fill buffer. When the fill buffer is
full, it must be moved into the tag and data arrays. During this time, the DCU cannot begin a new
cache operation and stalls the pipeline if new DCU commands are presented. Moving a line from the line
buffer into the tag and data arrays takes 3 cycles, unless the line being replaced has been modified. In
that case, the operation takes 4 cycles.
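The cycle counts above can be turned into a rough stall estimate. The sketch below is a simplified model, not a description of the actual DCU logic. It assumes that the 3-cycle and 4-cycle figures are the time during which the DCU cannot begin a new operation, and that a command presented partway through that window stalls for the remaining cycles. The names `dcu_commit_cycles` and `dcu_stall_cycles` are hypothetical.

```c
#include <stdbool.h>

/* Cycle counts taken from the text: 3 cycles normally,
 * 4 cycles when the line being replaced has been modified. */
enum {
    DCU_COMMIT_CYCLES_CLEAN = 3,
    DCU_COMMIT_CYCLES_DIRTY = 4
};

/* Cycles the DCU is busy while a full line buffer is written into the arrays. */
static int dcu_commit_cycles(bool victim_dirty)
{
    return victim_dirty ? DCU_COMMIT_CYCLES_DIRTY : DCU_COMMIT_CYCLES_CLEAN;
}

/* Assumed model: a new DCU command presented `cycles_into_commit` cycles
 * after the transfer began stalls until the transfer finishes. */
static int dcu_stall_cycles(bool victim_dirty, int cycles_into_commit)
{
    int remaining = dcu_commit_cycles(victim_dirty) - cycles_into_commit;
    return remaining > 0 ? remaining : 0;
}
```

Under this model, a command presented one cycle into a transfer that replaces a modified line would stall for 3 cycles, and a command presented after the transfer has finished would not stall at all.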