Dell PowerScale OneFS Reference Manual page 158

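Node type           LED             Power status
X-Series, S-Series  Blinking amber  A power supply failure has occurred
All nodes           No light        Insufficient or no A/C power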
3. If only one node reports the issue, determine the cause of the problem by performing the following steps.
CAUTION:
Do not move the power cable to another power supply in the same node as this will cause the node
to lose power.
● Locate the electrical outlet to which the problematic power supply is connected, and then determine if the outlet is
functioning properly by plugging the power cable into a different electrical outlet.
● If the issue is not resolved by using a different electrical outlet, move the power cable from the power supply that reports
the failure to the power supply of a node that does not report a failure. If the cable is the issue, replace the cable.
4. If the issue persists, take one power supply out of a different working node and attach the power supply to the affected
node.
CAUTION:
Do not switch power supplies in the same node as this will cause the node to lose power.
● If the issue follows the power supply, the power supply must be replaced.
5. If multiple nodes report power supply issues, it is likely that the issue is environmental. Check each of the following items to
confirm the health of the power subsystem:
● Power Distribution Unit (PDU) functionality and status of any circuit breakers in the power path
● Power quality such as voltage, frequency values, and stability
● Uninterruptible Power Supply (UPS) health
6. If the issue is not constant and is limited to one node, move the power to another circuit. Next, one at a time, move both
power supplies.
If the above steps do not resolve the issue, gather logs, and then contact Technical Support for additional troubleshooting. For
instructions on how to gather cluster logs, see Gathering cluster logs.
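
The decision flow in steps 3 through 6 can be summarized as a small lookup. The following is only an illustrative Python sketch, not part of OneFS: the PowerFindings fields and the next_action function are hypothetical names, and the returned strings paraphrase the steps above.

```python
from dataclasses import dataclass


@dataclass
class PowerFindings:
    """Observations collected while working through the steps (hypothetical fields)."""
    nodes_reporting: int                  # number of nodes that report a power supply issue
    fixed_by_other_outlet: bool = False   # step 3: issue clears on a different electrical outlet
    failure_follows_cable: bool = False   # step 3: issue moves with the power cable
    follows_power_supply: bool = False    # step 4: issue moves with the power supply
    intermittent: bool = False            # step 6: issue is not constant


def next_action(f: PowerFindings) -> str:
    """Map the findings to the action that the procedure suggests."""
    if f.nodes_reporting > 1:
        # Step 5: several nodes affected, so suspect the environment.
        return "environmental: check PDU and circuit breakers, power quality, and UPS health"
    if f.fixed_by_other_outlet:
        return "the electrical outlet is the likely cause"
    if f.failure_follows_cable:
        return "replace the power cable"
    if f.follows_power_supply:
        return "replace the power supply"
    if f.intermittent:
        # Step 6: intermittent issue limited to one node.
        return "move the power to another circuit, then move the power supplies one at a time"
    return "gather logs and contact Technical Support"
```

For example, next_action(PowerFindings(nodes_reporting=1, follows_power_supply=True)) returns "replace the power supply". The order of the checks mirrors the order of the steps, so earlier findings take precedence.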
900080026
The internal or ambient temperature around a node has exceeded the allowable threshold for the CPU.
Description
Ambient temperature is only measured by front panel sensors. If you receive an event that indicates that the front panel is out
of specification, the temperature in your data center might need to be adjusted.
If a node is subjected to high temperatures for an extended period of time, the CPU is throttled and the node goes into
read-only mode to help prevent potential data loss due to component failure. If the node temperature reaches critical levels,
the node might shut down entirely.
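
To tell a sustained high temperature from a brief spike, you can look for consecutive readings above a limit. The following Python sketch assumes that you have already collected sensor readings as numbers. The function names, the threshold, and the sample count are hypothetical and are not OneFS values.

```python
from typing import Sequence


def longest_run_above(readings: Sequence[float], threshold: float) -> int:
    """Return the length of the longest run of consecutive readings above threshold."""
    longest = 0
    current = 0
    for value in readings:
        if value > threshold:
            current += 1
            longest = max(longest, current)
        else:
            current = 0  # the run is broken by a reading at or below the threshold
    return longest


def is_sustained(readings: Sequence[float], threshold: float, min_samples: int) -> bool:
    """True if at least min_samples consecutive readings exceed threshold."""
    return longest_run_above(readings, threshold) >= min_samples
```

With the readings [30, 41, 42, 43, 35, 44] and a threshold of 40, the longest run is 3, so is_sustained returns True for min_samples=3 and False for min_samples=4. Choose the threshold and sample count to match your own environment.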
Administrator action
Perform the following steps in the order listed. If the issue resolves after a step, there is no need to complete the subsequent
steps.
● (HD400 only) Make sure that the drive drawer is properly shut by sliding it out and re-closing it firmly but carefully.
● Review the temperature statistics for the affected sensor, which are included in the event. If the temperature is consistently
elevated, the problem is likely a high ambient temperature in the data center. Address any changes in the cluster
environment such as air conditioning outages.
● Verify that air flow within the rack, and through the front and rear panel vents of the node, is not obstructed in any way.
● Make sure that the faceplate on the affected node is installed, properly seated, and undamaged. In some cases, removing
and re-seating the faceplate will resolve this issue.
● Run the isi_hw_status command. Review the output to determine whether there is a slow or failed fan that was not
otherwise reported.
● Check for high CPU and disk usage in the node. High usage can contribute to high temperatures within the node.
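
When you review the isi_hw_status output for fan problems, it can help to narrow the output to the fan-related lines first. The following Python sketch is an assumption-laden helper, not a OneFS tool: the output format of isi_hw_status is not described here, so the filter simply keeps every line that contains the word "fan", and both function names are hypothetical.

```python
import subprocess
from typing import List


def fan_lines(output: str) -> List[str]:
    """Return the lines of command output that mention a fan (case-insensitive)."""
    return [line.strip() for line in output.splitlines() if "fan" in line.lower()]


def run_isi_hw_status() -> str:
    """Run isi_hw_status on the node with no arguments and return its standard output."""
    result = subprocess.run(
        ["isi_hw_status"], capture_output=True, text=True, check=True
    )
    return result.stdout


# Usage on a node, assuming Python is available there:
#     for line in fan_lines(run_isi_hw_status()):
#         print(line)
```

The filter does not interpret fan speeds or states. Review the matching lines manually to decide whether a fan is slow or has failed.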
