‣ Use the NVSM CLI as follows.
$ sudo nvsm show psus
The output shows information for each PSU. Look for any that do not report Status_Health=OK.
‣ View the PSU status from the BMC.
Click Sensor from the left side menu and inspect the PSU information from the Normal
Sensors section.
‣ Use ipmitool.
$ sudo ipmitool sdr | grep -i psu
Look for power supplies with no temperature reading or an output reading close to or
equal to zero.
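The two command-line checks above can be partly automated. The following is a minimal sketch, not an NVIDIA-provided tool: the function names flag_unhealthy_psus and flag_suspect_psu_sensors are hypothetical, and the filters assume that NVSM prints one Status_Health line per PSU and that ipmitool sdr prints pipe-separated "name | reading | status" rows in which a missing value appears as "no reading". The second filter catches only readings that are exactly zero or missing. Values that are merely close to zero still need a manual look.

```shell
#!/bin/sh
# Hypothetical helpers. Both read command output on stdin so they can be
# tried against saved output as well as a live system.

# Print every Status_Health line whose value is not OK.
# Assumption: NVSM output contains lines of the form "Status_Health=OK",
# possibly with spaces around the equals sign.
flag_unhealthy_psus() {
    grep -i 'Status_Health' |
        grep -Eiv 'Status_Health[[:space:]]*=[[:space:]]*OK[[:space:]]*$'
}

# Print PSU sensor rows that have no reading or a reading of exactly zero.
# Assumption: rows look like "name | reading | status".
flag_suspect_psu_sensors() {
    awk -F'|' 'tolower($1) ~ /psu/ {
        reading = $2
        gsub(/^[ \t]+|[ \t]+$/, "", reading)   # trim surrounding blanks
        if (tolower(reading) ~ /^no reading/ || reading ~ /^0(\.0+)?( |$)/)
            print
    }'
}

# Example use on a live system:
#   sudo nvsm show psus | flag_unhealthy_psus
#   sudo ipmitool sdr   | flag_suspect_psu_sensors
```

Each function prints nothing when no PSU matches, so empty output from both suggests that no supply was flagged by these particular checks.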
Both NVSM and the BMC identify each power supply as PSUx, where x is from 0 to 5. The
following diagram shows the physical location of each PSU.
NVIDIA DGX A100 System. Power Supply Replacement. DU-10044-001 _v01