Detection Of Server Fault; Prevention Of Server Fault; Management Of Server Operation Status - NEC NovaScale R630 User Manual

Detection of Server Fault

NEC ESMPRO Manager and NEC ESMPRO Agent detect errors that can lead to faults at an early stage and notify
administrators of fault information in real time.
Early detection of errors
If a fault occurs, NEC ESMPRO Agent detects it and reports it to NEC ESMPRO Manager (alert report).
NEC ESMPRO Manager displays the received alert in the alert viewer and also changes the status colors of the
server and of the server component in which the fault occurred. This allows you to identify the fault at a
glance. By checking the content of the fault and the recommended countermeasures, you can then take
appropriate action as soon as possible.
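The alert flow described above can be pictured with a short sketch. It is an illustration only and is not part of NEC ESMPRO: the class names, the severity levels, and the color names are all assumptions, since this manual does not specify them.

```python
from dataclasses import dataclass, field

# Hypothetical severity levels and colors; the manual does not name them.
SEVERITY_ORDER = ["normal", "warning", "abnormal"]
STATUS_COLORS = {"normal": "green", "warning": "yellow", "abnormal": "red"}


@dataclass
class Alert:
    """One alert report sent from an agent to the manager."""
    server: str
    component: str
    severity: str
    message: str


@dataclass
class Manager:
    """Stand-in for the manager side: keeps an alert list and per-component status."""
    alert_log: list = field(default_factory=list)          # plays the role of the alert viewer
    component_status: dict = field(default_factory=dict)   # (server, component) -> severity

    def receive(self, alert: Alert) -> None:
        # Record the alert and update the status of the affected component.
        self.alert_log.append(alert)
        self.component_status[(alert.server, alert.component)] = alert.severity

    def server_status(self, server: str) -> str:
        # A server takes the worst status found among its components.
        worst = "normal"
        for (name, _component), severity in self.component_status.items():
            if name == server and SEVERITY_ORDER.index(severity) > SEVERITY_ORDER.index(worst):
                worst = severity
        return worst

    def server_color(self, server: str) -> str:
        return STATUS_COLORS[self.server_status(server)]
```

In this sketch, calling `receive()` with an `Alert` for the fan of a server changes both the status of that component and the color returned by `server_color()` for the server, which mirrors the "at a glance" identification described above.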
Types of reported faults
The table below lists the typical faults reported by NEC ESMPRO Agent.
Component       Reported information
CPU             CPU load is over the threshold
                CPU degrading, etc.
Memory          ECC 1-bit error detection, etc.
Power supply    Voltage lowering
                Power failure, etc.
Temperature     Temperature increase in chassis, etc.
Fan             Fan failure (decrease in the number of revolutions), etc.
Storage         File system usage rate, etc.
LAN             Line fault threshold over
                Send retry or send abort threshold over, etc.

Prevention of Server Fault

NEC ESMPRO Agent includes a preventive maintenance function that forecasts the occurrence of faults so that
they can be prevented before they occur.
NEC ESMPRO Manager and NEC ESMPRO Agent can set thresholds for items such as the CPU usage rate and the free
capacity of a file system in the server. If a monitored value exceeds its threshold, NEC ESMPRO Agent reports an
alert to NEC ESMPRO Manager.
The preventive maintenance function can be set for a variety of monitoring items, including the CPU usage rate.

Management of Server Operation Status

NEC ESMPRO Agent manages and monitors a variety of components installed in the server. You can view the
information managed and monitored by NEC ESMPRO Agent in the data viewer of NEC ESMPRO Manager.
NEC ESMPRO Agent also manages and monitors the components and conditions required to keep server reliability
at a high level, such as hard disks, CPUs, fans, power supplies, and temperature.
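The threshold check described under Prevention of Server Fault can be pictured with a minimal sketch. This is not the NEC ESMPRO implementation or its configuration format: the `Threshold` type, the `check_thresholds` function, the item names, and the limit values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Threshold:
    """Upper limit for one monitored item, in percent (hypothetical structure)."""
    item: str
    limit: float


def check_thresholds(readings: dict, thresholds: list) -> list:
    """Return one alert message for each reading that exceeds its threshold."""
    alerts = []
    for threshold in thresholds:
        value = readings.get(threshold.item)
        # Items without a current reading are skipped rather than reported.
        if value is not None and value > threshold.limit:
            alerts.append(
                f"{threshold.item} is over the threshold: "
                f"{value:.1f}% > {threshold.limit:.1f}%"
            )
    return alerts


# Example limits; real values depend on the site and are set by the administrator.
EXAMPLE_THRESHOLDS = [
    Threshold("CPU usage rate", 90.0),
    Threshold("File system usage rate", 80.0),
]
```

Calling `check_thresholds({"CPU usage rate": 92.5, "File system usage rate": 40.0}, EXAMPLE_THRESHOLDS)` yields a single message for the CPU item. In the product, the corresponding alert would be reported by NEC ESMPRO Agent to NEC ESMPRO Manager.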
