Example Of Latency Hiding With S/W Prefetch Instruction; Figure 6-1 Effective Latency Reduction As A Function Of Access Stride - Intel ARCHITECTURE IA-32 Reference Manual

IA-32 Intel® Architecture Optimization

Figure 6-1 Effective Latency Reduction as a Function of Access Stride

[Plot omitted. Y-axis: Upper Bound of Pointer-Chasing Latency Reduction, 0% to 120%.]

Example of Latency Hiding with S/W Prefetch Instruction

Achieving the highest level of memory optimization using prefetch
instructions requires an understanding of the microarchitecture and
system architecture of a given machine. This section translates the key
architectural implications into several simple guidelines for
programmers to use.
Figure 6-2 and Figure 6-3 show two scenarios of a simplified 3D
geometry pipeline as an example. A 3D-geometry pipeline typically
fetches one vertex record at a time and then performs transformation
and lighting functions on it. Both figures show two separate pipelines:
an execution pipeline and a memory pipeline (front-side bus).
Because the Pentium 4 processor, like the Pentium II and
Pentium III processors, completely decouples execution from
memory access, these two pipelines can function
concurrently. Figure 6-2 shows "bubbles" in both the execution and
memory pipelines. When loads are issued for accessing vertex data, the

[Figure 6-1 plot, continued. X-axis: Stride (Bytes). Series: Fam. 15, Model 3, 4; Fam. 15, Model 0, 1, 2; Fam. 6, Model 13; Fam. 6, Model 14; Fam. 15, Model 6.]
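
As a rough illustration of the latency-hiding idea in this section, the following C sketch processes vertex records one at a time while requesting a later record with a software prefetch, so that the memory pipeline can work while the execution pipeline is busy. The `Vertex` layout, the `PREFETCH_DISTANCE` value, the `PREFETCH_T0` macro, and the `transform_vertices` function are assumptions made for illustration and do not come from the manual. The scaling step is only a stand-in for real transformation and lighting work.

```c
#include <stddef.h>

#if defined(__SSE__) || defined(_M_X64) || defined(_M_IX86)
#include <xmmintrin.h>
/* _mm_prefetch is the SSE intrinsic for the software prefetch instructions. */
#define PREFETCH_T0(p) _mm_prefetch((const char *)(p), _MM_HINT_T0)
#elif defined(__GNUC__)
/* Fallback for GCC/Clang on targets without SSE. */
#define PREFETCH_T0(p) __builtin_prefetch((p), 0, 3)
#else
#define PREFETCH_T0(p) ((void)0)
#endif

/* Hypothetical vertex record; the layout is an assumption. */
typedef struct {
    float x, y, z, w;     /* position */
    float nx, ny, nz, pad; /* normal plus padding */
} Vertex;

/* Hypothetical lookahead in vertices; it would need tuning per machine. */
enum { PREFETCH_DISTANCE = 4 };

/* Stand-in for transformation and lighting: scales each position by s
   and returns the sum of the scaled x components. */
static float transform_vertices(Vertex *v, size_t n, float s)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        /* Request a later vertex now so that its load can overlap
           with the computation on vertex i. */
        if (i + PREFETCH_DISTANCE < n)
            PREFETCH_T0(&v[i + PREFETCH_DISTANCE]);

        v[i].x *= s;
        v[i].y *= s;
        v[i].z *= s;
        sum += v[i].x;
    }
    return sum;
}
```

A caller would pass an array of records, for example `transform_vertices(vertices, count, 2.0f)`. The prefetch is only a hint: the results are the same with or without it, and whether it helps depends on the lookahead distance, the record size, and the amount of computation per vertex.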
