Replacing the DIMM - NVIDIA DGX A100 Service Manual

Inspect the component_id = line to determine the DIMM ID. The following example
shows a DIMM ID of A1.
Properties:
system_name = ....
component_id = CPU1_DIMM_A1
...
The output provides other information about the alert that can be provided to NVIDIA
Enterprise Support.
3. Determine the DIMM manufacturer.
$ sudo nvsm show memory
4. Request the replacement DIMM from NVIDIA Enterprise Support, specifying the
manufacturer.
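As a rough illustration of reading the DIMM ID from the component_id line, the following is a minimal sketch rather than an NVIDIA-provided tool. The function name extract_dimm_id is hypothetical, the alert properties are assumed to be available as text on standard input, and the rule that the ID is whatever follows "_DIMM_" is inferred only from the single CPU1_DIMM_A1 example above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of NVSM): reads alert properties on stdin and
# prints the DIMM ID taken from the first "component_id = ..." line.
# Assumption: the ID is the text after "_DIMM_" in the value, as in the
# CPU1_DIMM_A1 example from the manual.
extract_dimm_id() {
  sed -n 's/^[[:space:]]*component_id[[:space:]]*=[[:space:]]*.*_DIMM_\([A-Za-z0-9]*\)[[:space:]]*$/\1/p' | head -n 1
}

# Sample input copied from the example output in the manual.
sample='Properties:
system_name = ....
component_id = CPU1_DIMM_A1
...'

printf '%s\n' "$sample" | extract_dimm_id
```

Run against the sample text, the sketch prints A1. If the real output uses a different component_id layout, the pattern would need adjusting, so confirm the ID by reading the output directly before ordering a replacement.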
11.3.  Replacing the DIMM
Before attempting to replace any of the dual inline memory modules (DIMMs), be sure to have
performed the following:
Determined the location ID of the faulty DIMM needing replacement as explained in
Identifying the Failed DIMM. The location ID is an alphanumeric designator, such as A0,
A1, B0, B1, etc.
Obtained the replacement DIMM and have saved the packaging for use when returning the
faulty DIMM.
CAUTION: Static Sensitive Devices: Be sure to observe best practices for electrostatic
discharge (ESD) protection. This includes making sure personnel and equipment are
connected to a common ground, such as by wearing a wrist strap connected to the
chassis ground, and placing components on static-free work surfaces.
1. Power down the system.
2. Label all cables connected to the motherboard tray for easy identification when
reconnecting.
3. Remove the motherboard tray.
Refer to the instructions in the section Removing the Motherboard Tray.
4. Using the diagram label on the lid as a guide, locate the faulty DIMM to be replaced.
NVIDIA DGX A100 System   |   DIMM Replacement   |   DU-10044-001 _v01   |   44
