z13s and zBC12 cache level comparison

Although large caches mean increased access latency, the new technology of CMOS 14S0
(22 nm chip lithography) and the lower cycle time allow z13s servers to increase the size of
cache levels (L1, L2, and L3) within the PU chip by using denser packaging. This design
reduces traffic to and from the shared L4 cache, which is on another chip (SC chip). Only
when a cache miss in L1, L2, or L3 occurs is a request sent to L4. L4 is the coherence
manager, which means that all memory fetches must be in the L4 cache before that data can
be used by the processor.
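This miss path can be pictured as a simple fall-through lookup. The following Python sketch is a toy software model of that reading, with invented names (Level, fetch) and arbitrary capacities; it assumes fully inclusive behavior at L4, which the next paragraph refines, and it is not the hardware logic itself.

# Toy model of the z13s cache hierarchy miss path.
# Illustrative only: real lookups happen in hardware, not software.

class Level:
    def __init__(self, name, capacity_lines):
        self.name = name
        self.capacity = capacity_lines
        self.lines = set()          # cache line addresses currently held

    def lookup(self, line):
        return line in self.lines

    def install(self, line):
        if len(self.lines) >= self.capacity:
            self.lines.pop()        # crude eviction stand-in (no real LRU)
        self.lines.add(line)

def fetch(line, l1, l2, l3, l4):
    """Walk the hierarchy; only an L1/L2/L3 miss reaches the shared L4."""
    for level in (l1, l2, l3):
        if level.lookup(line):
            return level.name       # hit on the PU chip, no L4 traffic
    # L4 is the coherence manager: data is staged through L4
    # before the processor can use it.
    if not l4.lookup(line):
        l4.install(line)            # stands in for a memory fetch into L4
    for level in (l3, l2, l1):      # fill the upper levels on the way back
        level.install(line)
    return l4.name

l1, l2, l3, l4 = (Level(n, c) for n, c in
                  [("L1", 4), ("L2", 16), ("L3", 64), ("L4", 256)])
print(fetch(0x100, l1, l2, l3, l4))   # first access: satisfied through L4
print(fetch(0x100, l1, l2, l3, l4))   # repeat access: hits in L1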
However, in the z13s cache design, some lines of the L3 cache are not included in the L4
cache. Instead, the L4 cache has a non-data inclusive coherent (NIC) directory whose entries
point to those non-inclusive L3 lines. This design ensures that locally owned L3 lines (on the
same node) can be accessed over the X-bus by using the intra-node snoop interface without
being included in L4, while inter-node snoop traffic to L4 can still be handled effectively.
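As a rough illustration of this idea, the following sketch models an L4 that holds data for inclusive lines but only directory entries for non-inclusive ones. The class and method names (L4Node, snoop) are invented, and the real coherence protocol in the SC chip hardware is far more involved.

# Toy model of the L4 NIC (non-data inclusive coherent) directory.
# Names and structure are invented for illustration.

class L4Node:
    def __init__(self):
        self.data = {}       # inclusive lines: address -> cached data
        self.nic = {}        # non-inclusive lines: address -> owning L3 id

    def snoop(self, address):
        """Resolve a request that missed the requester's local caches."""
        if address in self.data:
            return ("served from L4", None)
        if address in self.nic:
            # L4 holds no copy, but the NIC directory names the local L3
            # that owns the line, so it can be fetched over the X-bus.
            return ("forwarded to owning L3", self.nic[address])
        return ("miss, fetch from memory", None)

l4 = L4Node()
l4.data[0x200] = b"..."      # an inclusive line held in L4 itself
l4.nic[0x300] = "L3-chip-1"  # a non-inclusive line owned by an L3

print(l4.snoop(0x200))   # ('served from L4', None)
print(l4.snoop(0x300))   # ('forwarded to owning L3', 'L3-chip-1')
print(l4.snoop(0x400))   # ('miss, fetch from memory', None)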
Another approach is available for avoiding L4 cache access delays (latency). The L4 cache
straddles up to two CPC drawers and up to four nodes, so relatively long distances exist
between the higher-level caches in the processors and the L4 cache content. To overcome
the delays that are inherent in the SMP CPC drawer design and to save the cycles that are
needed to access remote L4 content, keep instructions and data as close to the processors
as possible. You can do so by directing as much of a particular LPAR's workload as possible
to processors in the same CPC drawer as the L4 cache content. This placement is achieved
by having the IBM Processor Resource/Systems Manager (PR/SM) scheduler and the z/OS
Workload Manager (WLM) and dispatcher work together to keep as much work as possible
within the boundaries of as few processors and as little L4 cache space as possible (ideally
within one node of a CPC drawer), without affecting throughput and response times.
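This placement idea can be sketched as a preference-ordered dispatch. The following toy model only illustrates the heuristic; the names (Work, pick_processor, home_node) and the processor counts are invented, and PR/SM and WLM implement far richer policies in the real system.

# Toy illustration of topology-aware dispatch: prefer a processor in
# the node that likely holds the work unit's cache context, and fall
# back to remote nodes only when the home node is saturated.

from dataclasses import dataclass

@dataclass
class Work:
    name: str
    home_node: int     # node whose L4 likely holds this work's data

def pick_processor(work, busy, processors_per_node=6, nodes=4):
    """Return (node, cpu) for a work unit, preferring its home node."""
    order = [work.home_node] + [n for n in range(nodes) if n != work.home_node]
    for node in order:
        for cpu in range(processors_per_node):
            if (node, cpu) not in busy:
                return node, cpu
    return None   # everything is busy

busy = {(0, cpu) for cpu in range(6)}                    # node 0 fully busy
print(pick_processor(Work("txn-a", home_node=1), busy))  # (1, 0): home node
print(pick_processor(Work("txn-b", home_node=0), busy))  # (1, 0): forced remote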
Figure 3-3 compares the cache structures of z13s servers with the previous generation of
z Systems, zBC12 servers.
Figure 3-3 z13s and zBC12 cache level comparison

The figure shows, for the zBC12 (per processor drawer), two 192 MB shared eDRAM L4
caches above six 24 MB shared eDRAM L3 blocks, each L3 serving a group of cores with
private L2 and L1 caches. For the z13s (per node, which is one half of a CPC drawer), it
shows a 480 MB shared eDRAM L4 cache with its NIC directory, connected through the
intra-node snoop interface (X-bus) to 64 MB shared eDRAM L3 blocks, each again serving a
group of cores with private L2 and L1 caches. The cache characteristics compare as follows:

zBC12 (per processor drawer):
- L1: 64 KB I-cache + 96 KB D-cache per core; 4-way set associative I-L1, 6-way set associative D-L1; 256-byte cache line size
- L2: Private 1 MB inclusive of D-L1 and private 1 MB inclusive of I-L1
- L3: Shared 48 MB, inclusive of L2s; 12-way set associative; 256-byte cache line size
- L4: 384 MB (two 192 MB caches), inclusive; 24-way set associative; 256-byte cache line size

z13s (per node, one half of a CPC drawer):
- L1: 96 KB I-cache + 128 KB D-cache per core; 6-way set associative I-L1, 8-way set associative D-L1; 256-byte cache line size
- L2: Private 2 MB inclusive of D-L1 and private 2 MB inclusive of I-L1
- L3: Shared 64 MB, inclusive of L2s; 16-way set associative; 256-byte cache line size
- L4: 480 MB plus a 224 MB non-data inclusive coherent (NIC) directory; 30-way set associative; 256-byte cache line size
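As a quick cross-check of the figure's numbers, the following sketch tabulates the per-level capacities side by side. The aggregation and growth ratios are this summary's own arithmetic, not part of the figure, and note that the zBC12 values are per drawer while the z13s values are per node.

# Capacities from Figure 3-3, expressed in KB.
# zBC12 values are per processor drawer as labeled in the figure;
# z13s values are per node (half a CPC drawer), so the shared-cache
# figures are not directly comparable drawer-for-drawer.
zbc12 = {"L1 (per core)": 64 + 96,          # 64 KB I + 96 KB D
         "L2 (per core)": 1024 + 1024,      # 1 MB I + 1 MB D
         "L3 (shared)":   48 * 1024,        # 48 MB
         "L4 (shared)":   384 * 1024}       # 2 x 192 MB
z13s = {"L1 (per core)": 96 + 128,          # 96 KB I + 128 KB D
        "L2 (per core)": 2048 + 2048,       # 2 MB I + 2 MB D
        "L3 (shared)":   64 * 1024,         # 64 MB
        "L4 (shared)":   480 * 1024}        # excludes the 224 MB NIC directory
for level in zbc12:
    print(f"{level}: {zbc12[level]} KB -> {z13s[level]} KB "
          f"({z13s[level] / zbc12[level]:.2f}x)")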
