IBM Power 750 Technical Overview And Introduction page 169

Advanced memory buffer chips are exclusive to IBM and help to increase performance, acting
as read/write buffers. The Power 750 and the Power 760 use one memory controller.
Advanced memory buffer chips are on the memory cards and support four DIMMs each.
Memory page deallocation
Although coincident cell errors in separate memory chips are statistically rare, IBM
POWER7+ processor-based systems can contain these errors using a memory page
deallocation scheme for partitions running IBM AIX and IBM i operating systems, and also for
memory pages owned by the POWER Hypervisor. If a memory address experiences an
uncorrectable or repeated correctable single cell error, the service processor sends the
memory page address to the POWER Hypervisor to be marked for deallocation.
Pages used by the POWER Hypervisor are deallocated as soon as the page is released.
In other cases, the POWER Hypervisor notifies the owning partition that the page must be
deallocated. Where possible, the operating system moves any data currently contained in that
memory area to another memory area and removes the pages associated with this error from
its memory map, no longer addressing these pages. The operating system performs memory
page deallocation without any user intervention, and the process is transparent to users
and applications.
The POWER Hypervisor maintains a list of pages marked for deallocation during the current
platform initial program load (IPL). During a partition IPL, the partition receives a list of all the
bad pages in its address space. In addition, if memory is dynamically added to a partition
(through a dynamic LPAR operation), the POWER Hypervisor warns the operating system
when memory pages are included that need to be deallocated.
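The bookkeeping described above can be illustrated with a small model. This is a minimal sketch under stated assumptions, not the POWER Hypervisor's actual implementation: the class `HypervisorModel`, its method names, the 4 KB page size, and the owner labels are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field

PAGE_SIZE = 4096  # assumed page size, for illustration only


@dataclass
class HypervisorModel:
    """Toy model of the page-deallocation bookkeeping (hypothetical names)."""
    # page number -> owner ("hypervisor" or a partition name)
    owners: dict = field(default_factory=dict)
    # pages marked for deallocation during the current platform IPL
    marked: set = field(default_factory=set)
    # pages that have actually been taken out of use
    deallocated: set = field(default_factory=set)
    # (partition, page) notifications sent to owning partitions
    notifications: list = field(default_factory=list)

    def report_error(self, address: int) -> None:
        """Service processor reports a failing address; mark its page."""
        page = address // PAGE_SIZE
        self.marked.add(page)
        owner = self.owners.get(page)
        if owner is not None and owner != "hypervisor":
            # Partition-owned page: the owning OS is told to move its data
            # and drop the page from its memory map.
            self.notifications.append((owner, page))
        # Hypervisor-owned pages simply wait until they are released.

    def release(self, page: int) -> None:
        """Owner stops using a page; a marked page is retired, not reused."""
        self.owners.pop(page, None)
        if page in self.marked:
            self.deallocated.add(page)

    def bad_pages_for(self, pages) -> list:
        """Bad-page list given to a partition at IPL or on a dynamic LPAR add."""
        return sorted(p for p in pages if p in self.marked)


hv = HypervisorModel(owners={0: "hypervisor", 1: "lpar1", 2: "lpar1"})
hv.report_error(0 * PAGE_SIZE + 16)  # error in a hypervisor-owned page
hv.report_error(1 * PAGE_SIZE + 8)   # error in a partition-owned page
hv.release(0)                        # hypervisor releases page 0
print(hv.deallocated)                # {0}
print(hv.notifications)              # [('lpar1', 1)]
print(hv.bad_pages_for(range(3)))    # [0, 1]
```

The same `bad_pages_for` lookup stands in for both the partition IPL case and the dynamic LPAR case, because in both the partition only needs to learn which pages in the given range are already marked.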
Finally, if an uncorrectable error in memory is discovered, the logical memory block
associated with the address with the uncorrectable error is marked for deallocation by the
POWER Hypervisor. This deallocation will take effect on a partition reboot if the logical
memory block is assigned to an active partition at the time of the fault.
In addition, the system will deallocate the entire memory group associated with the error on
all subsequent system reboots until the memory is repaired. This precaution is intended to
guard against future uncorrectable errors while waiting for parts replacement.
Memory persistent deallocation
Defective memory discovered at boot time is automatically switched off. If the service
processor detects a memory fault at boot time, it marks the affected memory as bad so that it
is not used on subsequent reboots.
If the service processor identifies faulty memory in a server that includes CoD memory, the
POWER Hypervisor attempts to replace the faulty memory with available CoD memory. Faulty
resources are marked as deallocated, and working resources are included in the active
memory space. Because these activities reduce the amount of CoD memory available for
future use, repair of the faulty memory must be scheduled as soon as convenient.
Upon reboot, if not enough memory is available to meet minimum partition requirements, the
POWER Hypervisor will reduce the capacity of one or more partitions.
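The arithmetic behind the CoD substitution can be sketched as follows. This is an illustrative model only: the function `replace_faulty_memory`, its parameters, and the use of whole gigabytes are assumptions, and the source does not describe how the POWER Hypervisor chooses which partitions to shrink when a shortfall remains.

```python
def replace_faulty_memory(active_gb: int, faulty_gb: int, cod_spare_gb: int):
    """Return (active_gb, cod_spare_gb, shortfall_gb) after a memory fault.

    Faulty memory is deallocated and, where possible, replaced by
    inactive CoD memory. Any shortfall is memory that could not be replaced.
    """
    used_from_cod = min(faulty_gb, cod_spare_gb)
    new_active = active_gb - faulty_gb + used_from_cod
    return new_active, cod_spare_gb - used_from_cod, faulty_gb - used_from_cod


# Enough spare CoD memory: active capacity is preserved, spare pool shrinks.
print(replace_faulty_memory(64, 8, 16))  # (64, 8, 0)
# Not enough spare CoD memory: active capacity drops by the shortfall.
print(replace_faulty_memory(64, 8, 4))   # (60, 0, 4)
```

The second case shows why repair should not be deferred indefinitely: once the spare CoD pool is exhausted, further faults reduce active memory directly, which can leave partitions below their minimum requirements at the next reboot.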
Depending on the configuration of the system, the HMC Service Focal Point, OS
Service Focal Point, or service processor receives a notification of the failed component
and triggers a service call.
Chapter 4. Continuous availability and manageability
