Main Performance Improvement Drivers With Z13S - IBM z13s Technical Manual

remember both the workload characteristics and LPAR configuration. For these reasons,
when you plan capacity, using zPCR and involving IBM technical support are recommended.

12.7.1 Main performance improvement drivers with z13s

The z13s is designed to deliver new levels of performance and capacity for large-scale
consolidation and growth. The following attributes and design points of the z13s contribute to
overall performance and throughput improvements as compared to the zBC12.

The z/Architecture implementation has the following enhancements:
• Transactional Execution (TX) designed for z/OS, Java, DB2, and other users
• Runtime Instrumentation (RI) provides dynamic and self-tuning online recompilation
  capability for Java workloads
• Enhanced DAT-2 for supporting 2-GB pages for DB2 buffer pools, Java heap size, and
  other large structures
• Software directives implementation to improve hardware performance
• Decimal format conversions for COBOL programs

The z13s microprocessor design has the following enhancements:
• Six or seven active processor cores per chip
• Improved out-of-order (OOO) execution design
• Improved pipeline balance, with up to six instructions that can be decoded per cycle, and
  up to 10 instructions/operations that can be initiated to run per clock cycle
• Simultaneous multithreading
• Single-instruction multiple-data (SIMD) unit and 139 new instructions for vector operations
• Enhanced branch prediction latency and instruction fetch throughput
• Improvements in execution bandwidth and throughput: 10 execution units and two
  load/store units, which are divided into two symmetric pipelines:
  – Four fixed-point units (FXUs) (integer)
  – Two load/store units (LSUs)
  – Two binary floating-point units (BFUs)
  – Two binary coded decimal floating-point units (DFUs)
  – Two vector execution units (VXUs)
• Redesigned cache structure:
  – Increased L1I and L1D caches (96 KB instruction and 128 KB data per core)
  – Increased 2 MB + 2 MB eDRAM split (instruction and data) private L2 cache per core
  – On-chip 64 MB eDRAM L3 cache, shared by all cores (six or seven); 256 MB per CPC
    drawer (two nodes)
  – New inclusive L4 design: 480 MB L4 with 224 MB NIC directory (960 MB L4 per CPC
    drawer, two nodes)
• One redesigned cryptographic/compression coprocessor per core
• CPACF (hardware) runs additional UTF conversion operations: UTF8 to UTF32, UTF8 to
  UTF16, UTF32 to UTF8, and UTF32 to UTF16
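
The unit counts and cache sizes quoted in this list can be cross-checked with simple arithmetic. The following Python sketch is illustrative only: the constants are copied from the list, while the function names, the even split of units across the two pipelines, and the number of PU chips per drawer derived from the L3 figures are assumptions that are made for this example and are not values stated in this section.

```python
# Figures quoted in the list above (cache sizes in MB).
L3_PER_CHIP_MB = 64
L3_PER_DRAWER_MB = 256
L4_PER_NODE_MB = 480
L4_PER_DRAWER_MB = 960
NODES_PER_DRAWER = 2

# Execution units per core, as listed above. LSUs are counted separately.
EXECUTION_UNITS = {"FXU": 4, "BFU": 2, "DFU": 2, "VXU": 2}
LOAD_STORE_UNITS = 2
SYMMETRIC_PIPELINES = 2


def total_execution_units():
    """Sum of the non-load/store execution units."""
    return sum(EXECUTION_UNITS.values())


def units_per_pipeline():
    """Units in each pipeline, ASSUMING an even split across both pipelines."""
    all_units = dict(EXECUTION_UNITS, LSU=LOAD_STORE_UNITS)
    return {name: count // SYMMETRIC_PIPELINES for name, count in all_units.items()}


def implied_pu_chips_per_drawer():
    """PU chips per drawer implied by the L3 sizes (an inference, not a quoted value)."""
    return L3_PER_DRAWER_MB // L3_PER_CHIP_MB


def l4_per_drawer_mb():
    """L4 capacity of a two-node drawer, derived from the per-node size."""
    return L4_PER_NODE_MB * NODES_PER_DRAWER


if __name__ == "__main__":
    print("Execution units:", total_execution_units())
    print("Units per pipeline:", units_per_pipeline())
    print("Implied PU chips per drawer:", implied_pu_chips_per_drawer())
    print("L4 per drawer (MB):", l4_per_drawer_mb())
```

The check confirms that the four unit types other than the LSUs add up to the 10 execution units that are quoted, and that the per-drawer L4 size is twice the 480 MB figure.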
IBM z13s Technical Guide
