Chapter 7. U.2 NVMe Cache Drive Replacement
7.1. Overview
This is a high-level overview of the procedure to replace a cache Non-Volatile Memory Express
(NVMe) drive.
1. Identify the failed U.2 NVMe drive.
2. Order a replacement from NVIDIA Enterprise Support.
3. Use nvsm to prepare the drive for removal - look for the white LED.
4. Replace the failed NVMe drive.
5. Rebuild the RAID volume and remount the /raid partition.
6. Confirm the system is healthy by running nvsm show health.
7. Ship the failed unit back to NVIDIA Enterprise Support using the provided packaging.
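The post-replacement checks in steps 5 and 6 can be scripted. The following is a minimal sketch, not part of the NVIDIA procedure: the helper names is_mounted and run_health_check are hypothetical, and it assumes a Linux host where mount information is available in the /proc/mounts format and where nvsm, if installed, is on the PATH. It does not interpret the nvsm output, and nvsm may need to be run with elevated privileges.

```python
import shutil
import subprocess


def is_mounted(mount_point, mounts_text):
    """Return True if mount_point is a mount target in /proc/mounts-style text."""
    for line in mounts_text.splitlines():
        fields = line.split()
        # Each line has the form: device mount_point fstype options dump pass
        if len(fields) >= 2 and fields[1] == mount_point:
            return True
    return False


def run_health_check():
    """Run `nvsm show health` and return its text output, or None if nvsm is absent.

    Hypothetical helper: how to read the report is left to the operator.
    """
    if shutil.which("nvsm") is None:
        return None
    result = subprocess.run(
        ["nvsm", "show", "health"],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout


# Illustrative sample only; on a real system, read the text from /proc/mounts.
sample_mounts = "/dev/md0 / ext4 rw 0 0\n/dev/md1 /raid ext4 rw 0 0\n"
raid_ok = is_mounted("/raid", sample_mounts)
```

On the system itself, pass the contents of /proc/mounts to is_mounted("/raid", ...) after step 5, then call run_health_check() and review the report for step 6.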
7.2. Identifying the Failed U.2 NVMe
Identifying the Failed NVMe from the Front
If physical access to the system is available, you can identify a failed drive by the illuminated
amber LED.