Rebuilding the Boot Drive RAID 1 Volume - NVIDIA DGX-2 System Service Manual

6.4. Rebuilding the Boot Drive RAID 1 Volume
After replacing a faulty M.2 OS drive, you must rebuild the RAID 1 array.
1. Turn the DGX-2 System on.
The rebuilding process should begin automatically upon system boot.
2. Log in and then confirm that the RAID 1 array is being rebuilt.
   $ sudo mdadm -D /dev/md0
If the RAID 1 array is still in the process of being rebuilt, the output will include the
following line.
   Rebuild Status : XX% complete
If the RAID 1 array rebuilding process is completed, the output will show both drives in
the 'active sync' state and you can skip the remaining steps.
3. If the rebuilding process did not start automatically, then rebuild the array manually.
In the following steps, replace X with the number that corresponds to the replaced drive,
and Y with the number that corresponds to the drive that was not replaced (the surviving
drive). If you did not note this information when identifying the failed drive, then follow the
instructions in the first step of Identifying the Failed M.2 Drive.
a). Start an NVSM CLI interactive session and switch to the storage target.
   $ sudo nvsm
   nvsm-> cd /systems/localhost/storage
b). Start the rebuilding process and be ready to enter the device name of the replaced
drive.
   nvsm(/systems/localhost/storage)-> start volumes/md0/rebuild
   PROMPT: In order to rebuild this volume, a spare drive
   is required. Please specify the spare drive to
   use to rebuild md0.
   Name of spare drive for md0 rebuild (CTRL-C to cancel): nvmeXn1
   WARNING: Once the volume rebuild process is started, the
   process cannot be stopped.
   Start RAID-1 rebuild on md0? [y/n] y
After entering y at the prompt to start the RAID 1 rebuild, the "Initiating RAID-1 rebuild ..."
message appears.
   /systems/localhost/storage/volumes/md0/rebuild started at 2018-10-12
   15:27:26.525187
   Initiating RAID-1 rebuild on volume md0...
   0.0% [\                          ]
After about 30 seconds, the "Rebuilding RAID-1 ..." message should appear.
   /systems/localhost/storage/volumes/md0/rebuild started at 2018-10-12
   15:27:26.525187
   Rebuilding RAID-1 rebuild on volume md0...
   31.0% [=============/            ]
If this message remains at "Initiating RAID-1 rebuild" for more than 30 seconds, then
there is a problem with the rebuild process. In this case, make sure the name of the
replacement drive is correct and try again.

M.2 NVMe Boot Drive Replacement
DGX-2 System DU-09224-001_v09   |   19
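
The status check in step 2 can also be done programmatically. The following is a minimal Python sketch, not part of the service manual, that classifies the text printed by "sudo mdadm -D /dev/md0" according to the two cases described above: a "Rebuild Status" line while the rebuild is running, or two members in the "active sync" state once it has finished. The function names and both sample outputs are hypothetical illustrations; the samples were written by hand for this example and were not captured from a DGX-2 System.

```python
import re
import subprocess

# Hand-written, illustrative excerpts of `mdadm -D` output (assumption:
# not captured from real hardware; device names are placeholders).
SAMPLE_REBUILDING = """\
/dev/md0:
    Rebuild Status : 31% complete
    Number   Major   Minor   RaidDevice State
       0     259        2        0      active sync   /dev/nvme0n1p2
       2     259        5        1      spare rebuilding   /dev/nvme1n1p2
"""

SAMPLE_SYNCED = """\
/dev/md0:
    Number   Major   Minor   RaidDevice State
       0     259        2        0      active sync   /dev/nvme0n1p2
       2     259        5        1      active sync   /dev/nvme1n1p2
"""


def summarize_md_detail(detail_text):
    """Return (state, percent) for the text printed by `mdadm -D`.

    state is 'rebuilding' when a "Rebuild Status" line is present,
    'synced' when at least two members report "active sync",
    and 'unknown' otherwise.
    """
    match = re.search(r"Rebuild Status\s*:\s*(\d+)% complete", detail_text)
    if match:
        return ("rebuilding", int(match.group(1)))
    active_members = len(re.findall(r"\bactive sync\b", detail_text))
    if active_members >= 2:
        return ("synced", 100)
    return ("unknown", None)


def read_md_detail(device="/dev/md0"):
    """Run `sudo mdadm -D <device>` and return its standard output."""
    result = subprocess.run(
        ["sudo", "mdadm", "-D", device],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

On a live system, summarize_md_detail(read_md_detail()) returning 'unknown' would correspond to the case in step 3, where the rebuild did not start automatically and must be started manually through NVSM.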
