L4 Cache And Memory Buffer - IBM Power Systems S822LC Technical Overview And Introduction

The on-chip L3 cache is organized into separate areas with differing latency characteristics.
Each processor core is associated with a fast 8 MB local region of L3 cache (FLR-L3), but
also has access to other L3 cache regions as shared L3 cache. Additionally, each core can
negotiate to use the FLR-L3 cache that is associated with another core, depending on
reference patterns. Data can also be cloned to be stored in more than one core's FLR-L3
cache, again depending on reference patterns. This Intelligent Cache management enables
the POWER8 processor to optimize the access to L3 cache lines and minimize overall cache
latencies.
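The split between the fast local region and the shared remainder can be illustrated with a little arithmetic. The sketch below is only a toy model: it assumes that every core contributes exactly one 8 MB FLR-L3 region and that the total L3 is the sum of those regions. The function name `l3_view_mb` and the core count of 10 are illustrative choices, not values taken from this section.

```python
FLR_L3_MB = 8  # fast local L3 region per core (from the text)


def l3_view_mb(core_count: int) -> dict:
    """Return the L3 capacity as seen from one core, in MB.

    Assumption: each core contributes one FLR-L3 region and the
    regions of all other cores are reachable as shared L3.
    """
    if core_count < 1:
        raise ValueError("core_count must be at least 1")
    total = core_count * FLR_L3_MB
    return {
        "local_mb": FLR_L3_MB,            # the core's own fast region
        "shared_mb": total - FLR_L3_MB,   # regions associated with other cores
        "total_mb": total,
    }


# Hypothetical 10-core chip, used only as an example input.
print(l3_view_mb(10))
```

The model ignores cloning and the negotiation between cores that the text describes; it only shows how much of the L3 is local versus shared from one core's point of view.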
Figure 1-5 on page 9 shows the on-chip L3 cache and highlights the fast 8 MB L3 region that
is closest to a processor core.
The innovation of using eDRAM on the POWER8 processor die is significant for several
reasons:
Latency improvement: Moving the L3 cache on-chip yields a six-to-one latency improvement
compared with L3 accesses on an external (on-ceramic) application-specific integrated
circuit (ASIC).
Bandwidth improvement: The on-chip interconnect yields a 2x bandwidth improvement.
Frequency and bus sizes are increased to and from each core.
No off-chip drivers or receivers: Removing drivers and receivers from the L3 access path
lowers interface requirements, conserves energy, and lowers latency.
Small physical footprint: The performance of eDRAM when implemented on-chip is similar to
that of conventional SRAM, but eDRAM requires far less physical space. IBM on-chip eDRAM
uses only a third of the components of conventional SRAM, which needs a minimum of six
transistors to implement a 1-bit memory cell.
Low energy consumption: The on-chip eDRAM uses only 20% of the standby power of SRAM.
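The footprint and standby-power figures above reduce to simple arithmetic. The following is a minimal sketch that uses only the ratios stated in the text. The function names and the 5.0 W SRAM standby figure are hypothetical, chosen purely for illustration.

```python
from fractions import Fraction

SRAM_TRANSISTORS_PER_BIT = 6            # minimum for a 1-bit SRAM cell (from the text)
EDRAM_COMPONENT_RATIO = Fraction(1, 3)  # eDRAM uses a third of the components (from the text)
EDRAM_STANDBY_POWER_PERCENT = 20        # eDRAM standby power relative to SRAM (from the text)


def edram_components_per_bit() -> Fraction:
    """Components per 1-bit eDRAM cell implied by the one-third ratio."""
    return SRAM_TRANSISTORS_PER_BIT * EDRAM_COMPONENT_RATIO


def edram_standby_power(sram_standby_watts: float) -> float:
    """Scale a given SRAM standby power by the 20% figure."""
    if sram_standby_watts < 0:
        raise ValueError("standby power cannot be negative")
    return sram_standby_watts * EDRAM_STANDBY_POWER_PERCENT / 100


print(edram_components_per_bit())
# 5.0 W is a made-up input, not a measured SRAM value.
print(edram_standby_power(5.0))
```

A third of six transistors works out to two components per bit cell, which is where the space saving comes from.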

1.7.6 L4 cache and memory buffer

POWER8 processor-based systems introduce an additional level in the memory hierarchy. The
L4 cache is implemented together with the memory buffer on the memory riser cards. Each
memory buffer contains 16 MB of L4 cache. On a Power S822LC server, you can have up to
128 MB of L4 cache by populating all eight memory riser cards.
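The 128 MB maximum follows directly from the per-buffer capacity. As a quick sketch, assuming one memory buffer per riser card (which the 8 x 16 MB = 128 MB total implies but the text does not state outright), the L4 capacity for a given number of installed riser cards can be computed as follows. The function name `l4_cache_total_mb` is hypothetical.

```python
L4_PER_BUFFER_MB = 16  # L4 cache per memory buffer (from the text)
MAX_RISER_CARDS = 8    # maximum memory riser cards on a Power S822LC (from the text)


def l4_cache_total_mb(riser_cards: int) -> int:
    """Total L4 cache in MB for the given number of installed riser cards.

    Assumption: each riser card carries exactly one memory buffer.
    """
    if not 0 <= riser_cards <= MAX_RISER_CARDS:
        raise ValueError(f"riser_cards must be between 0 and {MAX_RISER_CARDS}")
    return riser_cards * L4_PER_BUFFER_MB


for cards in (2, 4, 8):
    print(cards, "riser cards:", l4_cache_total_mb(cards), "MB of L4 cache")
```

Under that assumption, a partially populated server has proportionally less L4 cache, because the cache lives on the riser cards and not on the processor.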
IBM Power Systems S822LC for High Performance Computing
