Archiving The Update Files - IBM TotalStorage NAS Gateway 500 Service Manual

Archiving the update files

In the event that it becomes necessary to restore the server to a certain firmware level, you should identify
and archive the materials for each update you install. If the download process produced diskettes, label
and store them in a safe place. If the download process produced files, archive and identify the files for
convenient retrieval.
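As a minimal sketch of one way to archive and identify downloaded update files, the following Python copies them into a directory named for the update and records a SHA-256 checksum for each file. The function name, the directory layout, the MANIFEST.json file, and the level label are all assumptions made for illustration; the manual does not prescribe a tool or format.

```python
import hashlib
import json
import shutil
from pathlib import Path


def archive_update(files, archive_root, level_label):
    """Copy update files into archive_root/level_label and write a manifest.

    level_label is a hypothetical, user-chosen identifier for the update
    (for example, the firmware level the files belong to).
    """
    dest = Path(archive_root) / level_label
    dest.mkdir(parents=True, exist_ok=True)

    manifest = {"level": level_label, "files": {}}
    for src in map(Path, files):
        copied = dest / src.name
        shutil.copy2(src, copied)  # copy2 also preserves timestamps
        # Record a checksum so the archived copy can be verified later.
        manifest["files"][src.name] = hashlib.sha256(copied.read_bytes()).hexdigest()

    (dest / "MANIFEST.json").write_text(json.dumps(manifest, indent=2, sort_keys=True))
    return dest
```

Keeping one directory per update, with a manifest alongside the files, makes it easier to find the exact materials for a given level if the server ever has to be restored to it.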

Configuring and deconfiguring processors or memory

All failures that crash the system with a machine check or check stop, even if intermittent, are reported as
a diagnostic callout for service repair. To prevent the recurrence of intermittent problems and improve the
availability of the system until a scheduled maintenance window, processors and DIMMs with a failure
history are marked "bad" to prevent them from being configured on subsequent boots. This function is
called repeat gard.
A processor or DIMM is marked "bad" under the following circumstances:
• A processor or DIMM fails built-in self-test (BIST) or power-on self-test (POST) testing during boot (as
  determined by the service processor).
• A processor or DIMM causes a machine check or check stop during runtime, and the failure can be
  isolated specifically to that processor or DIMM (as determined by the processor runtime diagnostics in
  the service processor).
• A processor or DIMM reaches a threshold of recovered failures that results in a predictive callout (as
  determined by the processor runtime diagnostics in the service processor).
During boot time, the service processor does not configure processors or DIMMs that are marked "bad."
If a processor or DIMM is deconfigured, it remains offline for subsequent reboots until it is replaced or
the applicable repeat gard function (CPU or memory) is disabled. The repeat gard function also lets you
manually deconfigure a processor or DIMM, or re-enable a previously deconfigured processor or DIMM.
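The repeat gard rules above can be modeled with a short sketch. This is an illustrative model only, not the service processor firmware: the class and function names, the "cpu"/"dimm" kinds, and the dictionary of enable flags are assumptions introduced here.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Failure(Enum):
    BOOT_SELF_TEST = auto()          # failed BIST or POST during boot
    ISOLATED_RUNTIME_CHECK = auto()  # machine check or check stop isolated to this part
    PREDICTIVE_THRESHOLD = auto()    # recovered-failure threshold reached


@dataclass
class Component:
    name: str
    kind: str                 # "cpu" or "dimm" (hypothetical labels)
    marked_bad: bool = False


def record_failure(component: Component, failure: Failure) -> None:
    """Any of the three listed failure conditions marks the part "bad"."""
    component.marked_bad = True


def replace_part(component: Component) -> None:
    """A replaced part starts with a clean failure history."""
    component.marked_bad = False


def configured_at_boot(component: Component, repeat_gard_enabled: dict) -> bool:
    """A part is skipped at boot only if it is marked bad and the
    repeat gard function for its kind is enabled."""
    return not (component.marked_bad and repeat_gard_enabled[component.kind])
```

For example, after `record_failure(cpu, Failure.BOOT_SELF_TEST)`, `configured_at_boot(cpu, {"cpu": True, "dimm": True})` returns `False`, and it returns `True` again once the part is replaced or CPU repeat gard is disabled.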
For information about configuring or deconfiguring a processor, see the Processor
Configuration/Deconfiguration Menu on page 269. For information on configuring or deconfiguring a DIMM,
see the Memory Configuration/Deconfiguration Menu on page 270. Both of these menus are submenus
under the System Information Menu. You can enable or disable CPU Repeat Gard or Memory Repeat
Gard using the Processor Configuration/Deconfiguration Menu.

Run-time CPU deconfiguration (CPU repeat gard)

L1 instruction cache recoverable errors, L1 data cache correctable errors, and L2 cache correctable errors
are monitored by the processor runtime diagnostics (PRD) code running in the service processor. When a
predefined error threshold is met, an error log entry with warning severity and threshold-exceeded status is
generated. The NAS Gateway 500 then attempts to migrate all resources associated with that processor to
another processor and then stops the defective processor.
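The threshold behavior can be illustrated with a small counter. This is a sketch under stated assumptions: the manual does not give the threshold value or say whether errors are counted per cache type or per processor, so the per-processor count, the threshold parameter, and all names below are hypothetical.

```python
from collections import Counter

# The three monitored error sources named in the text (labels are hypothetical).
MONITORED_SOURCES = {"L1_ICACHE", "L1_DCACHE", "L2_CACHE"}


class ErrorThresholdMonitor:
    def __init__(self, threshold: int):
        self.threshold = threshold  # assumed value; not given in the manual
        self.counts = Counter()     # recoverable errors seen per processor
        self.stopped = set()        # processors already taken out of use
        self.error_log = []

    def record(self, cpu: int, source: str) -> bool:
        """Count one recoverable error; return True if this error
        caused the processor to be deconfigured."""
        if source not in MONITORED_SOURCES:
            raise ValueError(f"unmonitored error source: {source}")
        if cpu in self.stopped:
            return False
        self.counts[cpu] += 1
        if self.counts[cpu] < self.threshold:
            return False
        # Threshold met: log a warning, then take the processor out of use.
        self.error_log.append(
            {"cpu": cpu, "severity": "warning", "status": "threshold exceeded"}
        )
        self.stopped.add(cpu)
        return True
```

In this model, migrating the processor's resources is reduced to adding it to `stopped`; on the real system, the migration is attempted before the processor is stopped.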

Service processor system monitoring - surveillance

Surveillance is a function in which the service processor monitors the system, and the system monitors the
service processor. This monitoring is accomplished by periodic samplings called heartbeats.
Surveillance is available during the following phases:
• System firmware bring-up (automatic)
• Operating system runtime (optional)
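Heartbeat-based surveillance in both directions can be sketched as follows. This is a generic illustration of the technique, not the service processor's implementation: the class, the timeout value, and the use of caller-supplied timestamps are assumptions.

```python
class HeartbeatWatcher:
    """Tracks the most recent heartbeat received from a monitored peer."""

    def __init__(self, timeout_s: float, now: float):
        self.timeout_s = timeout_s  # assumed interval; not given in the manual
        self.last_seen = now

    def beat(self, now: float) -> None:
        """Record a heartbeat received at time `now` (in seconds)."""
        self.last_seen = now

    def is_alive(self, now: float) -> bool:
        """The peer counts as alive if a heartbeat arrived within the timeout."""
        return (now - self.last_seen) <= self.timeout_s


def surveillance_status(system_watch, sp_watch, now):
    """Mutual surveillance: the service processor watches the system's
    heartbeats (system_watch) and the system watches the service
    processor's heartbeats (sp_watch)."""
    return {
        "system_alive": system_watch.is_alive(now),
        "service_processor_alive": sp_watch.is_alive(now),
    }
```

Passing the current time in explicitly keeps the sketch deterministic; a real monitor would read a clock and act when a peer is reported as no longer alive.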
Chapter 9. Using the service processor
