Microarchitecture Pipeline And Hyper-Threading Technology; Front End Pipeline - Intel ARCHITECTURE IA-32 Reference Manual

IA-32 Intel® Architecture Optimization

Microarchitecture Pipeline and Hyper-Threading Technology

This section describes the HT Technology microarchitecture and how
instructions from the two logical processors are handled between the
front end and the back end of the pipeline.

Although instructions originating from two programs or two threads
execute simultaneously, and not necessarily in program order, in the
execution core and memory hierarchy, the front end and back end
contain several selection points that choose between instructions from the
two logical processors. All selection points alternate between the two
logical processors unless one logical processor cannot make use of a
pipeline stage. In that case, the other logical processor has full use of
every cycle of the pipeline stage. Reasons why a logical processor may
not use a pipeline stage include cache misses, branch mispredictions,
and instruction dependencies.
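
The alternation rule described above can be sketched as a small software model. This is only an illustration and not a description of the hardware logic: the class name `SelectionPoint`, the `pick` method, and the `ready` flags are hypothetical names introduced here, and the choice of which logical processor goes first is an arbitrary assumption.

```python
from typing import Optional, Sequence


class SelectionPoint:
    """Toy model of one pipeline selection point shared by two logical processors."""

    def __init__(self) -> None:
        # Assumption: start as if logical processor 1 went last,
        # so logical processor 0 is preferred on the first cycle.
        self.last = 1

    def pick(self, ready: Sequence[bool]) -> Optional[int]:
        """Return the logical processor (0 or 1) that gets this cycle, or None.

        ready[i] is True when logical processor i can use the stage, that is,
        it is not held up by a cache miss, a branch misprediction, or an
        instruction dependency.
        """
        preferred = 1 - self.last  # alternate when both can use the stage
        fallback = self.last       # otherwise the same one may go again
        for lp in (preferred, fallback):
            if ready[lp]:
                self.last = lp
                return lp
        return None                # neither can use the stage this cycle


point = SelectionPoint()
both_ready = [point.pick((True, True)) for _ in range(4)]    # [0, 1, 0, 1]
lp1_stalled = [point.pick((True, False)) for _ in range(3)]  # [0, 0, 0]
```

With both logical processors ready, the model hands out cycles alternately. Once logical processor 1 cannot use the stage, logical processor 0 receives every cycle.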

Front End Pipeline

The execution trace cache is shared between the two logical processors.
Access to the execution trace cache is arbitrated between the two logical
processors every clock cycle. If a cache line is fetched for one logical
processor in one clock cycle, a line is fetched for the other logical
processor in the next clock cycle, provided that both logical processors
are requesting access to the trace cache.

If one logical processor is stalled or is unable to use the execution trace
cache, the other logical processor can use the full bandwidth of the trace
cache until the initial logical processor's instruction fetches return from
the L2 cache.

After the instructions are fetched and traces of µops are built, the µops
are placed in a queue. This queue decouples the execution trace cache from
the register rename pipeline stage. As described earlier, if both logical
processors are active, the queue is partitioned so that both logical
processors can make independent forward progress.
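
The effect of partitioning the queue can be illustrated with a minimal sketch. It models only the case where both logical processors are active, and it assumes an even split of the entries. The class name `PartitionedUopQueue`, the capacity of 8 entries, and the µop labels are all hypothetical and do not reflect the real queue size or the hardware implementation.

```python
from collections import deque


class PartitionedUopQueue:
    """Toy µop queue between the trace cache and the register rename stage."""

    def __init__(self, capacity: int = 8) -> None:  # capacity is a made-up figure
        # Assumption: each logical processor owns half of the entries.
        self.share = capacity // 2
        self.entries = (deque(), deque())  # one FIFO per logical processor

    def push(self, lp: int, uop: str) -> bool:
        """Enqueue a µop for logical processor lp; False means its share is full."""
        if len(self.entries[lp]) >= self.share:
            return False
        self.entries[lp].append(uop)
        return True

    def pop(self, lp: int):
        """Hand the oldest µop of lp to rename, or None if it has none queued."""
        return self.entries[lp].popleft() if self.entries[lp] else None


queue = PartitionedUopQueue(capacity=8)
# Logical processor 0 fills its own share; the fifth µop is refused.
accepted_lp0 = [queue.push(0, f"uop{i}") for i in range(5)]
# Logical processor 1 can still enqueue, so it keeps making progress.
accepted_lp1 = queue.push(1, "uop_a")
```

Because each logical processor can fill only its own share, a logical processor whose µops are not draining cannot consume the entries that the other one needs.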