The RAS Strategy - IBM z13s Technical Manual


9.1 The RAS strategy

The RAS strategy is to manage change by learning from previous generations and investing
in new RAS function to eliminate or minimize all sources of outages. Enhancements to
z Systems RAS designs are implemented on the z13s system through the introduction of new
technology, structure, and requirements. Continuous improvements in RAS are associated
with new features and functions to ensure that z Systems servers deliver exceptional value to
clients.

As described throughout this book, the z13s server introduced several changes from prior
z Systems generations. Although the RAS design objective has not changed, many new RAS
functions were introduced to mitigate changes in the server design. The RAS design on z13s
servers is based on continuous improvements to address changes in technology, structure,
complexity, and touches.

9.2 Availability characteristics

The following functions provide the availability characteristics of z13s servers:
Concurrent LIC memory upgrade
Memory can be upgraded concurrently by using LIC configuration code (LICCC) update if
physical memory is available in the CPC drawers. If the physical memory cards must be
changed, the z13s server needs to be powered down. To help ensure that the appropriate
level of memory is available in a configuration, consider the plan-ahead memory feature.
The plan-ahead memory feature that is available with z13s servers provides the ability to
plan for nondisruptive memory upgrades by having the system pre-plugged with dual inline
memory modules (DIMMs) based on a target configuration. Pre-plugged memory is
enabled when you place an order through LICCC.
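
The rule described above (an upgrade is concurrent only while enough physical memory is already installed) can be sketched as a small decision function. This is an illustrative model only, not IBM code: the `MemoryConfig` class, the `upgrade_path` function, and the capacities used are hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class MemoryConfig:
    """Hypothetical model of CPC drawer memory; all values in GB."""
    installed_gb: int  # physical memory plugged in the CPC drawers
    enabled_gb: int    # memory currently enabled through LICCC


def upgrade_path(config: MemoryConfig, target_gb: int) -> str:
    """Classify a requested memory size as described in the text."""
    if target_gb <= config.enabled_gb:
        return "no upgrade needed"
    if target_gb <= config.installed_gb:
        # Enough physical memory is present: an LICCC update enables it.
        return "concurrent (LICCC update)"
    # Physical memory cards must be changed, which requires a power down.
    return "disruptive (power down to change memory cards)"


# Assumed example: a system pre-plugged to 256 GB with 128 GB enabled.
planned = MemoryConfig(installed_gb=256, enabled_gb=128)
print(upgrade_path(planned, 192))
print(upgrade_path(planned, 512))
```

In this model, plan-ahead memory simply raises `installed_gb` in advance so that later targets fall into the concurrent case.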
Enhanced Driver Maintenance (EDM)
One of the greatest contributors to downtime during planned outages is LIC driver updates
performed in support of new features and functions. z13s servers are designed to support
activating a selected new driver level concurrently.
Concurrent fanout addition or replacement
A Peripheral Component Interconnect Express (PCIe) fanout or a host channel adapter
(HCA2) fanout card provides the path for data between memory and I/O by using PCIe or
InfiniBand cables. There is also a PCIe Integrated Coupling Adapter (ICA SR) fanout for
external coupling connections. With z13s servers, a hot-pluggable and concurrently
upgradeable fanout card is available.
Up to eight fanout cards are available per CPC drawer for a total of 16 fanout cards when
both CPC drawers are installed. During an outage, a fanout card that is used for internal
I/O can be concurrently repaired while redundant I/O interconnect ensures that no I/O
connectivity is lost.
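
The fanout limits and the effect of redundant I/O interconnect can be illustrated with a short sketch. The limit of eight fanouts per CPC drawer comes from the text; everything else is an assumption made for the example, including the I/O domain names, the fanout labels, and the simplified idea that each domain lists the fanouts that can reach it.

```python
FANOUTS_PER_DRAWER = 8  # from the text: up to eight fanout cards per CPC drawer


def max_fanouts(cpc_drawers):
    """Maximum number of fanout cards for one or two CPC drawers."""
    if cpc_drawers not in (1, 2):
        raise ValueError("a z13s server has one or two CPC drawers")
    return FANOUTS_PER_DRAWER * cpc_drawers


def domains_losing_connectivity(paths, fanout):
    """Return the I/O domains that have no path left if `fanout` is removed.

    `paths` maps a (hypothetical) I/O domain name to the set of fanouts
    that can reach it.
    """
    return sorted(d for d, fanouts in paths.items() if not (fanouts - {fanout}))


# Hypothetical configuration: every domain is reachable through two fanouts.
paths = {
    "domain0": {"fanout-A", "fanout-B"},
    "domain1": {"fanout-A", "fanout-B"},
}
print(max_fanouts(2))
print(domains_losing_connectivity(paths, "fanout-A"))
```

With two paths per domain, removing one fanout for repair leaves an empty list, which corresponds to the statement that no I/O connectivity is lost.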
Dynamic oscillator switchover
z13s servers have two oscillator cards: a primary and a backup. If the primary card
fails, the backup card is designed to transparently detect the failure, switch over, and
provide the clock signal to the system.
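
The primary and backup behavior described above follows a common failover pattern, which can be sketched as follows. This is a conceptual model under stated assumptions, not a description of the actual hardware or firmware logic: the `OscillatorPair` class and its methods are hypothetical.

```python
class OscillatorPair:
    """Hypothetical model of a primary/backup oscillator pair."""

    def __init__(self):
        self.healthy = {"primary": True, "backup": True}
        self.active = "primary"  # card currently providing the clock signal

    def report_failure(self, card):
        """Mark `card` as failed and switch over if it was the active card."""
        self.healthy[card] = False
        if card == self.active:
            standby = "backup" if card == "primary" else "primary"
            if not self.healthy[standby]:
                raise RuntimeError("no healthy oscillator card available")
            self.active = standby
        return self.active


pair = OscillatorPair()
print(pair.report_failure("primary"))  # the backup card takes over
```

A failure of the standby card leaves the active card unchanged; only the loss of both cards leaves the model without a clock source.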
IBM zAware
IBM z Systems Advanced Workload Analysis Reporter (IBM zAware) is an availability
feature that is designed to use near real-time continuous learning algorithms, providing a
IBM z13s Technical Guide
