Chapter 7. U.2 NVMe Cache Drive Replacement; NVMe Cache Drive Replacement Overview; Identifying the Failed U.2 NVMe - NVIDIA DGX A100 Service Manual

Chapter 7. U.2 NVMe Cache Drive Replacement
7.1. Overview
This is a high-level overview of the procedure to replace a cache Non-Volatile Memory Express
(NVMe) drive.
1. Identify the failed U.2 NVMe drive.
2. Order a replacement from NVIDIA Enterprise Support.
3. Use nvsm to prepare the drive for removal - look for the white LED.
4. Replace the failed NVMe drive.
5. Rebuild the RAID volume and remount the /raid partition.
6. Confirm the system is healthy by running nvsm show health.
7. Ship the failed unit back to NVIDIA Enterprise Support using the provided packaging.
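
Steps 5 and 6 lend themselves to a quick scripted check once the new drive is installed. The following is a minimal sketch, not part of the NVIDIA procedure: the function names is_mounted and post_replacement_check are hypothetical, the mount test relies on the standard mountpoint utility from util-linux, and the only nvsm invocation used is the nvsm show health command named in step 6. Depending on the system configuration, nvsm may need to be run with root privileges.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not NVIDIA tooling.

# Return 0 if the given path is currently a mount point.
is_mounted() {
    mountpoint -q "$1"
}

# Check that the cache partition is mounted (step 5), then run the
# health check from step 6. The path defaults to /raid.
post_replacement_check() {
    raid_path="${1:-/raid}"

    if ! is_mounted "$raid_path"; then
        echo "$raid_path is not mounted; rebuild the RAID volume and remount it first" >&2
        return 1
    fi

    if ! command -v nvsm >/dev/null 2>&1; then
        echo "nvsm not found on PATH; run this check on the DGX system itself" >&2
        return 2
    fi

    # May require root privileges on the DGX system.
    nvsm show health
}
```

After sourcing the file, calling post_replacement_check with no argument checks /raid. A return status of 1 means the partition is not mounted, 2 means nvsm is unavailable, and any other status is whatever nvsm show health itself returned.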
7.2. Identifying the Failed U.2 NVMe

Identifying the Failed NVMe from the Front
If physical access to the system is available, you can identify a failed drive by the illuminated
amber LED.
NVIDIA DGX A100 System DU-10044-001_v01
