Introduction; Overview; Operating Environment - NEC Express5800/A1040b User Manual

Machine check monitoring service

1. Introduction

1.1 Overview

Machine Check Monitoring Service helps identify faulty hardware components by sending logs of
correctable errors that occur on the CPUs and memory of a Linux server to the firmware in the
server.
If the number of correctable errors exceeds a threshold value, Machine Check Monitoring Service
performs Core Offline (taking a CPU offline) or Page Offline (taking a memory page offline) to
prevent the system from going down due to an uncorrectable error. If the OS supports the Core
Online feature and the system has a spare CPU, Machine Check Monitoring Service automatically
adds the spare CPU (Core Online) after Core Offline completes. The Offline and Online operations
are performed in cooperation with the kernel on the Linux server.
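
The threshold behavior described above can be illustrated with a minimal Python sketch. It is not the mcemonitor implementation: the threshold value, the class name CorrectableErrorTracker, and the choice to reset a counter after an action are assumptions made for illustration. The two sysfs files are the standard Linux kernel interfaces for taking a CPU offline and soft-offlining a memory page; whether the service uses exactly these files is also an assumption.

```python
from collections import Counter

# Hypothetical threshold; the manual does not state the real value.
CE_THRESHOLD = 10

# Standard Linux sysfs file for soft-offlining a page by physical address.
SOFT_OFFLINE_PAGE = "/sys/devices/system/memory/soft_offline_page"


def cpu_online_path(cpu):
    """Return the standard sysfs file that controls whether a CPU is online."""
    return f"/sys/devices/system/cpu/cpu{cpu}/online"


class CorrectableErrorTracker:
    """Counts correctable errors per component and proposes an offline action."""

    def __init__(self, threshold=CE_THRESHOLD):
        self.threshold = threshold
        self.counts = Counter()

    def record(self, kind, ident):
        """Record one correctable error.

        kind is "cpu" (ident = CPU number) or "page" (ident = physical address).
        Returns a (sysfs_path, value) pair once the count exceeds the
        threshold, otherwise None.
        """
        if kind not in ("cpu", "page"):
            raise ValueError(f"unknown component kind: {kind!r}")
        key = (kind, ident)
        self.counts[key] += 1
        if self.counts[key] <= self.threshold:
            return None
        # Assumption: start counting again once an action has been proposed.
        del self.counts[key]
        if kind == "cpu":
            return (cpu_online_path(ident), "0")   # Core Offline
        return (SOFT_OFFLINE_PAGE, hex(ident))     # Page Offline


def apply_action(action):
    """Write the value to the sysfs file. Requires root on a real system."""
    path, value = action
    with open(path, "w") as f:
        f.write(value)
```

Separating the decision (record) from the privileged write (apply_action) keeps the counting logic usable without root privileges; only apply_action touches the system.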
Machine Check Monitoring Service consists of firmware and software on the Linux server. The
software includes mcemonitor (Machine Check Monitoring Service) and capmonitor (Capacity
Monitoring Service).
Note
Refer to "Capacity Optimization (COPT) User's Guide" for details of the Core Online feature.
Core Offline, Core Online, and Page Offline are not supported on Express5800/A1040b.

1.2 Operating Environment

Machine Check Monitoring Service requires the operating environment shown below:

Table 1-1 Operating Environment
Hardware    Express5800/A1040b
            Express5800/A2010b
            Express5800/A2020b
            Express5800/A2040b
OS          Red Hat Enterprise Linux 6.6
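
As an illustration only, the requirements in Table 1-1 and the Note can be expressed as a small Python check. The function names and the returned dictionary keys are hypothetical, and the sample release string follows the usual format of /etc/redhat-release on Red Hat Enterprise Linux; the service's own checks may work differently.

```python
import re

# Hardware and OS taken from Table 1-1.
SUPPORTED_HARDWARE = {
    "Express5800/A1040b",
    "Express5800/A2010b",
    "Express5800/A2020b",
    "Express5800/A2040b",
}
SUPPORTED_OS = ("Red Hat Enterprise Linux", "6.6")

# Per the Note: Core Offline, Core Online, and Page Offline are not
# supported on this model.
NO_OFFLINE_ONLINE = {"Express5800/A1040b"}


def parse_redhat_release(text):
    """Extract (product, version) from release-file text, or None."""
    m = re.search(r"^(Red Hat Enterprise Linux).*?release (\d+\.\d+)", text.strip())
    return (m.group(1), m.group(2)) if m else None


def check_environment(model, release_text):
    """Compare a model name and release string against Table 1-1."""
    os_info = parse_redhat_release(release_text)
    return {
        "supported": model in SUPPORTED_HARDWARE and os_info == SUPPORTED_OS,
        "offline_online_excluded": model in NO_OFFLINE_ONLINE,
    }
```

For example, check_environment("Express5800/A1040b", "Red Hat Enterprise Linux Server release 6.6 (Santiago)") reports a supported environment in which the Offline and Online operations are excluded.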
