Chapter 5. U.2 NVMe Cache Drive Post-Installation Tasks; Recreating the Cache RAID 0 Volume; Confirming the Volume is Ready - NVIDIA DGX-2 System Service Manual


Chapter 5. U.2 NVMe Cache Drive Post-Installation Tasks
This chapter describes the tasks that are typically needed after replacing a U.2 NVMe drive or
upgrading from 8 to 16 drives.

5.1. Recreating the Cache RAID 0 Volume

1. Stop cachefilesd.
$ sudo systemctl stop cachefilesd
2. Unmount /raid and stop the RAID 0 array.
$ sudo umount -f /raid
$ sudo mdadm --stop /dev/md1
3. Run the script to rebuild the RAID volume.
$ sudo /usr/bin/configure_raid_array.py -c -f
Answer Y to any prompts.
4. When completed, confirm that the /raid volume is mounted.
$ df -hl /raid
The /raid filesystem should be mounted on /dev/md1 with size 28 TB or 56 TB, depending on whether 8 or 16 drives are installed.

5.2. Confirming the Volume is Ready

1. Confirm the storage devices and volumes in the system are healthy using the following command.
$ sudo nvsm show systems/localhost/storage/volumes/md1
2. Verify Status_Health=OK and that the number of drives listed in Drives = is as expected.
3. Confirm that the drives are now available.
$ sudo mdadm -D /dev/md1
If the drive manufacturer is Micron, perform the steps in Enabling the Temperature Sensor.

DU-09224-001 _v09
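The procedures in sections 5.1 and 5.2 can be sketched as a small shell helper. This is a minimal sketch, not part of the DGX-2 tooling: the names DRY_RUN, run_cmd, recreate_cache_raid, raid_device_count and check_raid_devices are hypothetical and introduced only for illustration. It assumes that `mdadm -D` prints a "Raid Devices : N" line, as current mdadm releases do. By default it only prints the commands from section 5.1 rather than running them.

```bash
#!/usr/bin/env bash
# Sketch only. All function and variable names here are hypothetical helpers,
# not part of the DGX-2 software.

# With DRY_RUN=1 (the default) commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

run_cmd() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Section 5.1, steps 1 to 4, in order. Stops at the first failing command.
# The rebuild script may ask questions interactively; answer Y as described.
recreate_cache_raid() {
    run_cmd sudo systemctl stop cachefilesd &&
    run_cmd sudo umount -f /raid &&
    run_cmd sudo mdadm --stop /dev/md1 &&
    run_cmd sudo /usr/bin/configure_raid_array.py -c -f &&
    run_cmd df -hl /raid
}

# Read `mdadm -D` output on stdin and print the value of "Raid Devices".
raid_device_count() {
    awk -F':' '/^[ \t]*Raid Devices[ \t]*:/ { gsub(/[ \t]/, "", $2); print $2; exit }'
}

# Succeed only if the array on stdin reports the expected number of drives
# (8 or 16, depending on the installed configuration).
check_raid_devices() {
    local expected="$1" actual
    actual="$(raid_device_count)"
    [ "$actual" = "$expected" ]
}
```

After sourcing the file, a possible use is `recreate_cache_raid` to preview the command sequence, `DRY_RUN=0 recreate_cache_raid` to run it, and `sudo mdadm -D /dev/md1 | check_raid_devices 16` to check the drive count on a 16-drive system. Keeping the parser on stdin means it can be tried against saved `mdadm -D` output without touching the array.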
