Nvidia DGX B200 Service Manual
NVIDIA DGX B200 Service Manual
NVIDIA Corporation
May 28, 2025


Summary of Contents for Nvidia DGX B200

  • Page 1 NVIDIA DGX B200 Service Manual NVIDIA Corporation May 28, 2025...
  • Page 3 Contents: 1 Introduction; Customer-replaceable Components; Recommended Tools; Customer Support ...
  • Page 4 Finalize Motherboard Closing (p. 46); 8 U.2 NVMe Cache Drive Replacement; U.2 NVMe Cache Drive Replacement Overview ...
  • Page 5 15.6 Install the BlueField-3 Card (p. 87); 15.7 Install the I/O Card and Close the System ...
  • Page 6 20.10.3 Battery Replacement (p. 127); 20.10.4 Cooling and Airflow ...
  • Page 7 NVIDIA DGX B200 Service Manual The NVIDIA DGX B200 Service Manual is also available as a PDF. Contents...
  • Page 9 This topic contains instructions for replacing the NVIDIA DGX™ B200 system components. Make sure to familiarize yourself with the NVIDIA Terms and Conditions documents before attempting to perform any modification or repair to the DGX B200 system. These Terms and Conditions for the DGX B200 system can be found through the NVIDIA DGX Systems Support page.
  • Page 10 1.3. Customer Support Contact NVIDIA Enterprise Support for assistance in reporting, troubleshooting, or diagnosing problems with your DGX B200 system. Also contact NVIDIA Enterprise Support for assistance in installing or moving the DGX B200 system. For details on how to obtain support, visit the NVIDIA Enterprise Support web site (https://www.nvidia.
  • Page 11 1.4. Running the Pre-flight Test Instructions for running the DGX stress test. NVIDIA recommends running the pre-flight stress test before putting a system into a production environment or after servicing. You can specify running the test on the GPUs, CPU, memory, and storage, and also specify the duration of the tests.
  • Page 13 Chapter 2. Removing and Attaching the Bezel 2.1. Bezel Removal Grab the bezel on both sides by the side handles. Pull the bezel away from the system with a horizontal motion to release it from the magnets that keep it in place.
  • Page 14 NVIDIA DGX B200 Service Manual 2.2. Bezel Installation Align the pins on the bezel to the notches on the system fascia. Attach the bezel to the system, ensuring that the pins fit in the notches and that the magnetic latch holds the bezel securely in place.
  • Page 17 Chapter 3. Power Supply Replacement 3.1. Power Supply Replacement Overview This is a high-level overview of the procedure to replace a power supply on the NVIDIA DGX™ B200 system. Identify the broken power supply by the amber color LED or the power supply number.
  • Page 18 Access the rear of the system and view the status LEDs while the system is powered on. If the PSU is good, both LEDs should be solid green. If either of the LEDs is not green or blinks, contact NVIDIA Enterprise Support to troubleshoot the issue. Chapter 3. Power Supply Replacement...
  • Page 19 NVIDIA DGX B200 Service Manual Running the show psus Command ▶ Run the following command to display information about the PSUs: sudo nvsm show psus The output shows information for each PSU. Look for any that do not report Status_Health=OK.
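The health check on this page can be scripted. The following is a minimal sketch, not an NVIDIA-provided tool: it assumes that the `nvsm show psus` output contains one `Status_Health = <value>` line per PSU (the manual shows `Status_Health=OK`; the spacing and the surrounding lines are assumptions), and the function name `unhealthy_psu_lines` is hypothetical.

```shell
#!/bin/sh
# Print every Status_Health line whose value is not OK.
# Reads nvsm-style text on stdin; prints nothing when every PSU reports OK.
# Assumption: each PSU contributes a line of the form "Status_Health = <value>".
unhealthy_psu_lines() {
    awk '/Status_Health[ \t]*=/ {
        value = $0
        # Strip everything up to and including the "=" sign, then trailing blanks.
        sub(/^.*Status_Health[ \t]*=[ \t]*/, "", value)
        sub(/[ \t]+$/, "", value)
        if (value != "OK") print
    }'
}
```

On a DGX system this could be used as `sudo nvsm show psus | unhealthy_psu_lines`; empty output would mean that no PSU reported a health value other than OK.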
  • Page 20 NVIDIA DGX B200 Service Manual ▶ Confirm the PSU temperature readings: Run the ipmitool command to view information about the PSUs: sudo ipmitool sdr | grep -i psu Look for power supplies with no temperature or output reading close to or equal to zero.
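The `ipmitool sdr` check on this page can be narrowed to the suspicious rows. This is a sketch under stated assumptions: `ipmitool sdr` prints rows of the form `name | reading | status`, a missing value appears as `no reading`, and the PSU sensor names contain the string "psu" in any case (the actual sensor names on the DGX B200 are not shown on this page). The function name `suspect_psu_sensors` is hypothetical, and the filter only catches readings that are exactly zero, so values that are merely close to zero still need a manual look.

```shell
#!/bin/sh
# Print PSU sensors whose reading is missing or numerically zero.
# Reads "name | reading | status" rows on stdin, as printed by ipmitool sdr.
suspect_psu_sensors() {
    awk -F'|' 'tolower($1) ~ /psu/ {
        name = $1; reading = $2
        # Trim the column padding around both fields.
        gsub(/^[ \t]+|[ \t]+$/, "", name)
        gsub(/^[ \t]+|[ \t]+$/, "", reading)
        # The first word of the reading is the numeric value, if there is one.
        split(reading, parts, " ")
        if (reading == "no reading" || (parts[1] ~ /^[0-9.]+$/ && parts[1] + 0 == 0))
            print name ": " reading
    }'
}
```

A possible invocation is `sudo ipmitool sdr | suspect_psu_sensors`, which replaces the `grep -i psu` step with a filter that prints only the rows worth investigating.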
  • Page 21 Status_State = Present Targets: Verbs: show Obtain the replacement PSU (of the same manufacturer) from NVIDIA Enterprise Support. 3.3. Preparing the Power Supply for Replacement After the new power supply arrives, look at the system and identify which needs to be replaced.
  • Page 22 Unplug the power cord from the failed power supply, following the instructions described in Locking Power Cords. Before replacing the power supply, remove the locking power cable. 3.4. Replacing the Power Supply Remove the power supply by pressing the green tab to unlock the unit, and then pull on the black handle.
  • Page 23 NVIDIA DGX B200 Service Manual Caution Once the power supply is out of the chassis, replace it with the new power supply in less than 30 seconds to avoid airflow disruptions in the system - especially if it is up and running.
  • Page 24 NVIDIA DGX B200 Service Manual errors. After the replacement is complete, return the failed power supply to NVIDIA Enterprise Support using the provided packaging. 3.5. Locking Power Cords To use the twisting locking power cords that ship with the system: ▶...
  • Page 25 Replacement 4.1. Front Fan Module Replacement Overview This is a high-level overview of the procedure to replace the front fan modules on the NVIDIA DGX™ B200 system. Identify the failed front fan module through BMC or with the fan module LED and submit a service ticket.
  • Page 26 NVIDIA DGX B200 Service Manual Identify the failed fan using the fan module fault LED, as shown in the following figure. Look for the fault LED lit in the upper right corner of the faulty fan module, as shown in the following figure.
  • Page 27 NVIDIA DGX B200 Service Manual Running the nvsm command ▶ From the operating system, run: sudo nvsm show fans View the command output for alerts, failures, or an unhealthy status. Viewing Fan Modules from the BMC Web User Interface Identify the faulty fan module using the BMC dashboard.
  • Page 28 NVIDIA DGX B200 Service Manual The fan module has two fans, identified by SPD_FAN_SYSn_F and SPD_FAN_SYSn_R, where n is the module ID. If either fan fails, the entire module must be replaced. Use the nvsm command to confirm the fan issue.
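Because a failure of either fan means replacing the whole module, it can help to map a failing sensor name back to its module ID. The sketch below relies only on the naming scheme stated on this page (`SPD_FAN_SYSn_F` and `SPD_FAN_SYSn_R`); it additionally assumes that `n` is a decimal number, and the function name `fan_module_id` is hypothetical.

```shell
#!/bin/sh
# Extract the module ID n from a sensor name of the form
# SPD_FAN_SYSn_F or SPD_FAN_SYSn_R. Prints the ID on success;
# returns 1 without output when the name does not follow that scheme.
fan_module_id() {
    case "$1" in
        SPD_FAN_SYS*_F|SPD_FAN_SYS*_R) ;;
        *) return 1 ;;
    esac
    id=${1#SPD_FAN_SYS}   # drop the common prefix
    id=${id%_[FR]}        # drop the _F or _R suffix
    case "$id" in
        ''|*[!0-9]*) return 1 ;;   # assumption: the ID is all digits
    esac
    printf '%s\n' "$id"
}
```

For example, `fan_module_id SPD_FAN_SYS3_R` prints `3`, so both `SPD_FAN_SYS3_F` and `SPD_FAN_SYS3_R` point at the same module to replace.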
  • Page 29 NVIDIA DGX B200 Service Manual Replace the failed fan module with the new one. Important Replace the old fan with the new one within 30 seconds to prevent overheating. Confirm that the fan module is healthy and working correctly by performing the following tasks: ▶...
  • Page 30 Run the sudo nvsm show fans command. Install the bezel as described in Removing and Attaching the Bezel. Return the failed fan module to NVIDIA Enterprise Support using the packaging from the new fan module.
  • Page 31 Chapter 5. Front Console Board Replacement 5.1. Front Console Board Replacement Overview This is a high-level overview of the procedure to replace the front console board on the NVIDIA DGX™ B200 system. Unpack the new front console board. Shut down the system.
  • Page 32 NVIDIA DGX B200 Service Manual Caution Static Sensitive Devices: Be sure to observe the best practices for electrostatic discharge (ESD) protection. This includes ensuring personnel and equipment are connected to a common ground, such as by wearing a wrist strap connected to the chassis ground and placing components on static-free work surfaces.
  • Page 33 Run sudo nvsm show health to verify that the temperature sensor is working properly. Replace the bezel as described in Removing and Attaching the Bezel. Send the failed unit to NVIDIA Enterprise Support using the packaging provided.
  • Page 35 Chapter 6. Motherboard Tray - Opening and Closing the I/O Door You will need to completely remove the motherboard tray from the server to service the following components. If this is the case, refer to the section that describes the procedure for removing the motherboard.
  • Page 36 NVIDIA DGX B200 Service Manual 6.2. Release the Motherboard Unlock the motherboard by loosening the captive screws that hold the ejection levers in place: Pull the ejection levers to disengage the midplane connectors: Chapter 6. Motherboard Tray - Opening and Closing the I/O Door...
  • Page 37 NVIDIA DGX B200 Service Manual 6.3. Pull the Motherboard from the Chassis Pull the motherboard out until the locking mechanism in the lid engages and prevents further movement. Unscrew the thumbscrews indicated by the green arrows in the following figure to release the lid...
  • Page 38 6.4. Open the Motherboard I/O Door Fold the lid I/O opening section as shown in the following figure: Secure the folding section until it stays in place so you can work on the I/O section of the motherboard:
  • Page 39 NVIDIA DGX B200 Service Manual 6.5. Close the Motherboard I/O Door Before closing the lid, ensure all components are correctly installed and nothing is blocking the lid. Slide the lid as shown in the following figure to close the motherboard I/O section:...
  • Page 40 NVIDIA DGX B200 Service Manual 6.6. Lock the Motherboard Lid Close the lid so that you can lock it in place: Use the thumbscrews indicated in the following figure to secure the lid to the motherboard tray. Open the tray levers: Push the motherboard tray into the system chassis until the levers on both sides engage with the sides.
  • Page 41 NVIDIA DGX B200 Service Manual 6.7. Insert the Motherboard Use the levers to engage the midplane connectors: After the levers are fully closed, tighten the green thumbscrews to hold the ejection levers in place: 6.7. Insert the Motherboard...
  • Page 42 NVIDIA DGX B200 Service Manual 6.8. Finalize Motherboard Closing Use the labels on the cables to reconnect them to the correct ports. Install all power cords. Power on the system. Chapter 6. Motherboard Tray - Opening and Closing the I/O Door...
  • Page 43 Chapter 7. Motherboard Tray - Removal and Installation You will need to completely remove the motherboard tray from the server to service the following components. ▶ DIMMs (either adding or replacing) ▶ Trusted Platform Module (TPM) 7.1. Preparing the Motherboard for Service Before pulling the motherboard out of the system, the system must be shut down and cables must be removed from the system.
  • Page 44 NVIDIA DGX B200 Service Manual 7.2. Release the Motherboard Unlock the motherboard by loosening the captive screws that hold the ejection levers in place: Pull the ejection levers to disengage the midplane connectors: Chapter 7. Motherboard Tray - Removal and Installation...
  • Page 45 NVIDIA DGX B200 Service Manual 7.3. Pull the Motherboard from the Chassis Ensure that you have a solid, flat surface to rest the motherboard tray. Pull the motherboard tray out until the locking mechanism in the lid engages and prevents further movement.
  • Page 46 NVIDIA DGX B200 Service Manual ▶ Do not hold the motherboard tray by the ejection handles. The handles can bend or break. ▶ Be careful with the connectors at the back of the module to prevent damage. Place the motherboard tray on a solid, flat surface.
  • Page 47 NVIDIA DGX B200 Service Manual To remove the tray lid, perform the following steps: Lift on the connector side of the tray lid so that you can push it forward to release it from the tray. After the triangular markers align, lift the tray lid to remove it.
  • Page 48 NVIDIA DGX B200 Service Manual 7.5. Close the Motherboard Tray Lid Before you perform the following steps, ensure that all components are installed correctly so that they do not interfere with the air baffles or tray lid. Insert the motherboard tray baffles and then place the tray lid over the motherboard tray.
  • Page 49 Tighten the two lid screws on the connector side of the motherboard tray, as shown in the following figure: 7.6. Insert the Motherboard Tray into the Chassis Insert the motherboard tray into the chassis partially. Open the ejection levers before you insert the motherboard tray into the chassis:
  • Page 50 NVIDIA DGX B200 Service Manual Push the motherboard tray into the chassis until the levers on both sides engage with the sides: Chapter 7. Motherboard Tray - Removal and Installation...
  • Page 51 NVIDIA DGX B200 Service Manual 7.7. Insert the Motherboard Use the levers to engage the midplane connectors: After the levers are fully closed, tighten the green thumbscrews to hold the ejection levers in place: 7.7. Insert the Motherboard...
  • Page 52 NVIDIA DGX B200 Service Manual 7.8. Finalize Motherboard Closing Use the labels on the cables to reconnect them to the correct ports. Install all power cords. Power on the system. Chapter 7. Motherboard Tray - Removal and Installation...
  • Page 53 Insert the new SSD. Power on the system. Rebuild the RAID volume and mount the filesystem. Return the failed unit to NVIDIA Enterprise Support using the packaging provided. 8.2. Identifying the Failed U.2 NVMe SSD Identifying the Failed NVMe from the Front If physical access to the system is available, you can identify a failed drive by the illuminated amber LED.
  • Page 54 NVIDIA Enterprise Support, specifying this information. 8.4. Replacing the U.2 NVMe Drive Ensure that you requested and obtained the replacement drive from NVIDIA Enterprise Support. Back up any critical data to a network shared volume or other backup means. Power off the system using the power button.
  • Page 55 NVIDIA DGX B200 Service Manual After the system powers off, use the following figure to identify the drive to replace in the chassis. The figures in the following procedures show replacing drive number 7 at PCI address ae. Remove the NVMe drive.
  • Page 56 NVIDIA DGX B200 Service Manual Use the handle on the drive to secure it in place: Confirm that the drive is flush with the system: Install the bezel after the drive replacement is complete. 8.6. Next Steps ▶ U.2 NVMe Cache Drive Post-Installation Tasks.
  • Page 57 Chapter 9. U.2 NVMe Cache Drive Post-Installation Tasks This section describes the tasks that you typically need to perform after replacing a U.2 NVMe drive. 9.1. Re-creating the RAID Arrays Power on the system and log in. Confirm that all installed drives are visible from the OS by using the nvme command: sudo nvme list The output can indicate two boot drives and eight cache drives, depending on how many are installed in the system.
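The visibility check on this page can be reduced to a count. This is a minimal sketch that assumes the usual `nvme list` layout, in which each drive occupies one row beginning with its `/dev/nvme...` node name; the function name `count_nvme_devices` is hypothetical.

```shell
#!/bin/sh
# Count the drives reported by nvme list.
# Reads the command output on stdin and counts rows whose first
# column is a /dev/nvme... node; header and separator rows are ignored.
count_nvme_devices() {
    awk '$1 ~ /^\/dev\/nvme/ { n++ } END { print n + 0 }'
}
```

On a fully populated system, `sudo nvme list | count_nvme_devices` would be expected to print 10 (two boot drives plus eight cache drives, per the text above); a lower number after a replacement suggests that the new drive is not yet visible to the OS.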
  • Page 58 DGX OS 7.0 User Guide. Confirm the volume is healthy: sudo nvsm show volumes Send the old drive to NVIDIA Enterprise Support using the packaging from the new drive. Chapter 9. U.2 NVMe Cache Drive Post-Installation Tasks...
  • Page 59 Chapter 10. M.2 NVMe Boot Drive Replacement This topic describes how to replace the boot drive in the NVIDIA DGX™ B200 system. Caution Static Sensitive Devices: Be sure to observe best practices for electrostatic discharge (ESD) protection. This includes ensuring personnel and equipment are connected to a common ground, such as by wearing a wrist strap connected to the chassis ground and placing components on static-free work surfaces.
  • Page 60 10.2. Identify the Failed M.2 Drive The NVIDIA DGX™ B200 system automatically sets the failed M.2 drive offline when it detects the failure. The boot drives are mirrored, so the mdadm command-line utility can identify the drive to replace.
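As a quick complement to `mdadm`, the kernel's `/proc/mdstat` file also shows whether a mirror has lost a member. The sketch below is not the procedure from the manual: it is a swapped-in check that assumes the standard `/proc/mdstat` layout, where an array header line such as `mdN : active raid1 ...` is followed by a status line ending in a member map like `[UU]`, and an underscore in that map marks a missing member. The function name `degraded_md_arrays` is hypothetical.

```shell
#!/bin/sh
# Print the name of every md array whose member map contains "_",
# for example [U_] or [_U], which marks a missing or failed member.
# Reads /proc/mdstat-formatted text on stdin.
degraded_md_arrays() {
    awk '
        # Array header line: remember which array the next lines describe.
        /^md[0-9]+ :/ { array = $1 }
        # Status line with a member map that has at least one underscore.
        /\[U*_[U_]*\]/ && array != "" { print array; array = "" }
    '
}
```

Running `degraded_md_arrays < /proc/mdstat` prints nothing when every array is complete; any array name it does print can then be inspected with `mdadm` to find the specific drive to replace.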
  • Page 61 NVIDIA DGX B200 Service Manual Rotate the locking mechanism for the PCI carrier out of the way: Loosen the captive screw on the support bracket of the M.2 riser card: 10.3. Remove the M.2 Boot Drive Carrier...
  • Page 62 NVIDIA DGX B200 Service Manual Pull the M.2 riser card from the slot: Lift the M.2 riser card to remove it from the system: Chapter 10. M.2 NVMe Boot Drive Replacement...
  • Page 63 NVIDIA DGX B200 Service Manual 10.4. Remove the M.2 Drive Before attempting to remove one of the M.2 NVMe drives, perform the following prerequisites: ▶ Determine the location ID of the faulty M.2 drive. ▶ Obtain the replacement M.2 drive and save the packaging for returning the faulty drive.
  • Page 64 NVIDIA DGX B200 Service Manual Pull the left end of the M.2 drive up about 30˚: Release the M.2 drive from the connector: Chapter 10. M.2 NVMe Boot Drive Replacement...
  • Page 65 NVIDIA DGX B200 Service Manual 10.5. Replace the M.2 Drive To insert the M.2 drive, set it at an angle and insert it into the connector: Lower the M.2 drive and align it with the screw post: Install and tighten the screw to secure the drive to the riser:...
  • Page 66 NVIDIA DGX B200 Service Manual 10.6. Install the M.2 Boot Drive Carrier and Close the System Lower the M.2 riser card into the slot: Install the M.2 carrier card into the PCI riser by aligning it with the slot and then pressing it against the PCI slot riser: Tighten the captive screw on the support bracket of the M.2 PCI riser card:...
  • Page 67 NVIDIA DGX B200 Service Manual Close the latch to secure the M.2 carrier card and secure it in place: Tighten the thumbscrew to ensure the locking mechanism stays in place: 10.6. Install the M.2 Boot Drive Carrier and Close the System...
  • Page 68 In this case, ensure the name of the replacement drive is correct and try again. Use the packaging from the new drive to send the failed drive to NVIDIA Enterprise Support. Note If your organization purchased a media retention policy, you might be able to keep the failed drives for destruction.
  • Page 69 If your organization purchased a media retention policy, you might be able to keep the failed drives for destruction. Check with NVIDIA Enterprise Support on the status of the policy for specifics. Get a replacement M.2 boot drive assembly from NVIDIA Enterprise Support.
  • Page 70 This failure is hard to diagnose because the system does not boot as both boot drives are unavailable. After the replacement part arrives from NVIDIA, shut down the system and proceed by opening the I/O door of the motherboard. Refer to...
  • Page 71 NVIDIA DGX B200 Service Manual Loosen the captive screw on the support bracket of the M.2 riser card: Pull the M.2 riser card from the slot: Lift the M.2 riser card to remove it from the system: 11.3. Remove the M.2 Boot Drive Carrier...
  • Page 72 NVIDIA DGX B200 Service Manual 11.4. Install the M.2 Boot Drive Carrier and Close the System Lower the M.2 riser card into the slot: Install the M.2 carrier card into the PCI riser by aligning it with the slot and then pressing it against the PCI slot riser: Chapter 11.
  • Page 73 NVIDIA DGX B200 Service Manual Tighten the captive screw on the support bracket of the M.2 PCI riser card: Close the latch to secure the M.2 carrier card and secure it in place: 11.4. Install the M.2 Boot Drive Carrier and Close the System...
  • Page 74 Reinstall the system following the instructions in the DGX OS User Guide. Confirm the system is in working order by running: sudo nvsm show health Use the packaging from the new component to send the failed unit to NVIDIA Enterprise Support. Chapter 11. M.2 Boot Drive Assembly Replacement...
  • Page 75 Chapter 12. ConnectX-7 I/O Replacement This topic describes how to replace the ConnectX-7 I/O card in the NVIDIA DGX™ B200 system. 12.1. ConnectX-7 I/O Card Replacement Overview This is a high-level overview of the procedure to replace a ConnectX-7 I/O card.
  • Page 76 Identify which I/O card to replace. Use the nvsm command or network tools to determine which card failed. After you have this information, contact NVIDIA Enterprise Support to get a replacement. When the new card arrives, power off the system.
  • Page 77 NVIDIA DGX B200 Service Manual Before you pull the card too far, remove the white and black IPEX cables from the card. The white cable connects to the top of the card and the black cable connects to the bottom...
  • Page 78 NVIDIA DGX B200 Service Manual Lift the locking door: Push the cable away from the connector: 12.6. Install ConnectX-7 Card Attach the IPEX cables following the instructions in the figure: The white cable connects to the top of the card and the black cable connects to the bottom (heatsink) of the card.
  • Page 79 NVIDIA DGX B200 Service Manual 12.7. Insert an IPEX Cable Repeat this process for both white and black cables. Align the IPEX cable to the connector: Press the cable into the connector: Confirm the cable is in the connector: 12.7. Insert an IPEX Cable...
  • Page 80 Close the latching mechanism: Make sure the cable is locked to the connector on the board: 12.8. Install the I/O Card above the ConnectX-7 Card Reinstall the I/O card that is above the ConnectX-7 card. Refer to one of the two following procedures: ▶...
  • Page 81 Updating the ConnectX-7 Firmware. Use the nvsm command to confirm that the system is operating correctly: sudo nvsm show health Send the failed unit to NVIDIA Enterprise Support using the packaging provided. 12.9. Power on the System and Confirm the Replacement...
  • Page 83 Chapter 13. Network Interface Card Replacement 13.1. Network Card Replacement Overview This is a high-level overview of the procedure to replace one or more network cards on the NVIDIA DGX™ B200 system. Identify the failed card. Get a replacement Ethernet card from NVIDIA Enterprise Support.
  • Page 84 NVIDIA DGX B200 Service Manual Power off the system. Open the motherboard tray I/O door to access the rear section of the motherboard. Refer to Motherboard Tray - Opening and Closing the I/O Door for more information. 13.3. Remove the Non-Functional Card...
  • Page 85 NVIDIA DGX B200 Service Manual Remove the card from the system: 13.4. Install the New Card and Close the Lock Insert the new card into the upper PCI slot: Turn the locking mechanism to secure the PCI cards: 13.4. Install the New Card and Close the Lock...
  • Page 86 Check for network connectivity on the replacement card. Confirm that the system is operating correctly by running the nvsm command: sudo nvsm show health Send the failed unit to NVIDIA Enterprise Support using the packaging provided. Chapter 13. Network Interface Card Replacement...
  • Page 87 Firmware After replacing or installing the ConnectX-7 cards, make sure the firmware on the cards is up to date. Refer to the NVIDIA DGX B200 Firmware Update Guide to find the most recent firmware version. Download the firmware from https://network.nvidia.com/support/firmware/connectx7ib/.
  • Page 88 $ cat /sys/class/infiniband/mlx5_*/fw_ver
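The `cat` command on this page prints the versions without saying which card each one belongs to. The following sketch labels each version with its device name. It uses the same sysfs path as the manual; the function name `list_mlx5_fw_versions` and its optional directory argument (useful for testing against a copy of the tree) are hypothetical additions.

```shell
#!/bin/sh
# Print "<device>: <firmware version>" for each mlx5 device.
# $1 is the directory to scan; it defaults to /sys/class/infiniband.
list_mlx5_fw_versions() {
    base=${1:-/sys/class/infiniband}
    for dev in "$base"/mlx5_*; do
        # Skip entries without a readable fw_ver file; this also covers
        # the case where the glob matched nothing.
        [ -r "$dev/fw_ver" ] || continue
        printf '%s: %s\n' "${dev##*/}" "$(cat "$dev/fw_ver")"
    done
}
```

Calling `list_mlx5_fw_versions` with no argument after a firmware update makes it easy to spot a card that still reports a different version from the others.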
  • Page 89 Chapter 15. BlueField-3 I/O Card Replacement This topic describes how to replace the NVIDIA® BlueField®-3 card in the NVIDIA DGX™ B200 system. 15.1. BlueField-3 I/O Card Replacement Overview Identify the failed BlueField-3 I/O card. Get a replacement BlueField-3 I/O card from NVIDIA Enterprise Support.
  • Page 90 15.2. Prepare the System for Replacement Identify which I/O card to replace. Use the nvsm command or network tools to determine the failed card, and then contact NVIDIA Enterprise Support to get a replacement card. When you receive the replacement card, power off the system.
  • Page 91 Before pulling the card too far, be sure to unplug the white and black IPEX cables from the card, following the instructions in Remove an IPEX Cable. The white cable connects to the top of the card and the black cable connects to the bottom...
  • Page 92 NVIDIA DGX B200 Service Manual 15.5. Remove an IPEX Cable Repeat this procedure for both the white and black cables. The following image shows the IPEX cable attached to the connector: Lift the locking door: Push the cable away from the connector:...
  • Page 93 NVIDIA DGX B200 Service Manual 15.6. Install the BlueField-3 Card After you connect the IPEX cables, install the new BlueField-3 card in the bottom slot in the PCI riser: Attach the IPEX and power cables as shown in Insert an IPEX Cable.
  • Page 94 NVIDIA BlueField documentation. Confirm that the system is working correctly by using the nvsm command: sudo nvsm show health Use the packaging from the new card to send the failed card to NVIDIA Enterprise Support. Chapter 15. BlueField-3 I/O Card Replacement...
  • Page 95 16.1. DIMM Upgrade Procedure To upgrade DIMMs, contact NVIDIA to obtain the complete upgrade kit. Replace all DIMMs following the instructions in the DIMM Replacement section. 16.2. DIMM Replacement Overview This is a high-level overview of the procedure to replace a dual inline memory module (DIMM) on the NVIDIA DGX™...
  • Page 96 NVIDIA DGX B200 Service Manual Power on the system. Verify that all DIMMs are now healthy with the nvsm health command. Send the failed unit to NVIDIA Enterprise Support using the packaging provided. Note You should observe the following DIMM population guidelines: ▶...
  • Page 97 NVIDIA DGX B200 Service Manual To remove the failed DIMM, press down on the ejection levers to eject the DIMM out of the socket. To insert the new DIMM, position it in the socket and press down until the levers close and the DIMM clicks into place.
  • Page 98 Plug in all cables. Install all power cords. Power on the system. Log in and use the nvsm command to confirm the system is healthy: sudo nvsm show health Send the failed DIMM to NVIDIA Enterprise Support.
  • Page 99 17.1. Motherboard Tray Battery Replacement Overview This is a high-level overview of the procedure to replace the motherboard tray battery of the NVIDIA DGX™ B200 system. Purchase a CR2032 battery. Shut down the system.
  • Page 100 Call NVIDIA Enterprise Support to confirm that the battery is the right component to replace. Note NVIDIA does not provide the CR2032 battery, which can be found at a convenience store. After you purchase a battery, perform the following procedures.
  • Page 101 NVIDIA DGX B200 Service Manual Pull the PCI Ethernet card from the slot in the riser: Remove the card and prepare the ConnectX-7 card by identifying the IPEX cables that should be removed: 17.4. Remove the PCI Ethernet Card...
  • Page 102 17.5. Remove the BlueField-3 Card Remove the power cable from the BlueField-3 card side only: Do not unplug the power cable from the motherboard side. Before pulling the card too far, be sure to unplug the white and black IPEX cables from the card...
  • Page 103 NVIDIA DGX B200 Service Manual 17.6. Remove an IPEX Cable Repeat this procedure for both the white and black cables. The following image shows the IPEX cable attached to the connector: Lift the locking door: Push the cable away from the connector:...
  • Page 104 NVIDIA DGX B200 Service Manual 17.7. Replace the Battery Use a thin tool to lift the battery from the battery holder gently: Rotate the battery as shown in the following figure: Replace the battery with a new CR2032, installing it in the battery holder. Make sure the positive side is on top: Chapter 17.
  • Page 105 NVIDIA DGX B200 Service Manual 17.8. Install the BlueField-3 Card After you connect the IPEX cables, install the new BlueField-3 card in the bottom slot in the PCI riser: Attach the IPEX and power cables as shown in Insert an IPEX Cable.
  • Page 106 NVIDIA DGX B200 Service Manual Connect one end of the power cable to the BlueField-3 card and the other end of the power cable to the motherboard. Insert the BlueField-3 card in the bottom PCI slot: 17.9. Insert an IPEX Cable...
  • Page 107 NVIDIA DGX B200 Service Manual Close the latching mechanism: Ensure the cable is locked to the connector on the board: 17.10. Install the PCI Ethernet Card Position the card in the system: 17.10. Install the PCI Ethernet Card...
  • Page 108 NVIDIA DGX B200 Service Manual Push the card into the PCI slot: Close the latch to lock the PCI cards in place: Chapter 17. Motherboard Tray Battery Replacement...
  • Page 109 NVIDIA DGX B200 Service Manual Secure the locking mechanism by tightening the black thumbscrew: 17.11. Power On the System and Confirm the Replacement Close the motherboard tray I/O door and insert the motherboard tray. Refer to Motherboard Tray - Opening and Closing the I/O Door for more information.
  • Page 110 NVIDIA DGX B200 Service Manual To restore the date on the system, manually set the date using NTP: sudo date [MMDDhhmm[[CC]YY][.ss]] Sync the date and time to the hardware real-time clock: sudo hwclock -w Reset the BMC: sudo ipmitool mc reset cold...
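The `[MMDDhhmm[[CC]YY][.ss]]` argument is easy to get wrong by hand, so it can be generated instead. This is a sketch that assumes GNU `date` (its `-d` option is not available in every `date` implementation); the function name `date_set_arg` is hypothetical, and the output is formatted in UTC.

```shell
#!/bin/sh
# Format a timestamp as MMDDhhmmCCYY.ss, the layout that the date
# command accepts when setting the clock.
# $1 is any time string that GNU date -d understands, such as
# "2025-05-28 14:30:15 UTC" or "@0". The result is expressed in UTC.
date_set_arg() {
    date -u -d "$1" +%m%d%H%M%Y.%S
}
```

For example, `date_set_arg '2025-05-28 14:30:15 UTC'` prints `052814302025.15`. Because the value is in UTC, it would be passed as `sudo date -u "$(date_set_arg '2025-05-28 14:30:15 UTC')"`, followed by `sudo hwclock -w` as described above.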
  • Page 111 18.1. Trusted Platform Module Replacement Overview This is a high-level overview of the procedure to replace the trusted platform module (TPM) on the NVIDIA DGX™ B200 system. If enabled, disable drive encryption. Shut down the system. Label all motherboard tray cables and unplug them.
  • Page 112 NVIDIA DGX B200 Service Manual 18.2. Prepare the System for Replacement Obtain a new TPM from NVIDIA. If data drives are encrypted, the tpm2 OS package is installed, and the TPM is enabled in SBIOS, disable encryption: sudo nv-disk-encrypt disable Shut down the system.
  • Page 113 NVIDIA DGX B200 Service Manual Rotate the OSFP carrier module to access the TPM, as shown in the following diagram: Replace the TPM. Ensure that you position the TPM in the same direction as the original. 18.3. Replace the TPM Module...
  • Page 114 NVIDIA DGX B200 Service Manual 18.4. Install the OSFP Carrier Module Rotate the OSFP carrier module to return it to its original position. While you rotate the module, pull the module toward the DIMMs so that the ports do not interfere with the motherboard tray...
  • Page 115 NVIDIA DGX B200 Service Manual If data drives were encrypted, the tpm2 OS package was installed, and the TPM was enabled in SBIOS before the replacement, enable encryption: sudo nv-disk-encrypt init -g -r -k <your vault password> Use the nvsm command to confirm the system is healthy: sudo nvsm show health 18.5.
  • Page 117 Chapter 19. Rack Mount Kit Replacement This is a high-level overview of the procedure to replace a rack mount kit on the NVIDIA DGX™ B200 system. Remove the two front screws and washers. Remove the two rear screws. Use the clips to release the front and rear from each side of the kit.
  • Page 118 On the lower part, the lip labeled 1 holds the system in place like a shelf when the kit is installed in a rack. On either end, the spring-loaded prongs, labeled 2 on the diagram, fit into the rack’s holes (either square or round)...
  • Page 119 Remove the rail from the front post and hold it in place while the rear is released. Remove all cage nuts from the rack posts so they can be used during installation. 19.3. Remove Rack Mount Kit - Rear To release the rear of the rack mount kit, remove the round head screw and keep it next to the other screws and washers.
  • Page 120 Pull on the metal clip and slide the rail away from the post so the prongs are free from the rack.
  • Page 121 NVIDIA DGX B200 Service Manual 19.4. Confirm Necessary Screws and Washers These items are in the rack mount kit box with the rack mount kit. All these components should have been removed from the previous installation. Note Front screws are different from the screws used for the back of the rack mount kit. If the correct screws are not used in the front, the server will not be flush when pushed against the rack and it will be difficult to secure the other eight captive screws.
  • Page 122 NVIDIA DGX B200 Service Manual 19.5. Install Cage Nuts Using Template A printed copy of this template is included as part of the rack kit, and it should be used to align the desired location of the system to where the included cage nuts should be installed.
  • Page 123 NVIDIA DGX B200 Service Manual being installed in the frontmost post. Use a third pair of cage nuts so the bottom system screws have something to engage with. 19.6. Install Rack Mount Kit - Front You can start with either side to install the rack mount kit on the rack. The following instructions describe the installation of the left side.
  • Page 124 NVIDIA DGX B200 Service Manual to secure the rack mount kit to the post. 19.7. Install Rack Mount Kit - Rear To install the rear section of the rack mount kit, follow the same steps to align the bottom lip to the bottom of where the system should be.
  • Page 125 NVIDIA DGX B200 Service Manual Install the round head screw in the rack mount kit to secure it to the post. Repeat the procedure for the right side of the rack mount kit. 19.7. Install Rack Mount Kit - Rear...
  • Page 127 Chapter 20. Safety This section provides information about how to safely use the NVIDIA DGX™ B200 system. 20.1. Safety Information To reduce the risk of bodily injury, electrical shock, fire, and equipment damage, read this document and observe all warnings and precautions in this guide before installing or maintaining your server product.
  • Page 128 NVIDIA DGX B200 Service Manual Symbol Description Indicates potential hazard if indicated information is ignored. Indicates shock hazards that result in serious injury or death if safety instructions are not followed. Indicates hot components or surfaces. Indicates that fan blades must not be touched; contact may result in injury.
  • Page 129 ITE application, may require further evaluation. 20.4. Site Selection Here is some information about how to select the correct site for the DGX B200 system. Choose a site that is: 20.3. Intended Application Uses...
  • Page 130 NVIDIA DGX B200 Service Manual ▶ Clean, dry, and free of airborne particles (other than normal room dust). ▶ Well-ventilated and away from sources of heat including direct sunlight and radiators. ▶ Away from sources of vibration or physical shock.
  • Page 131 NVIDIA DGX B200 Service Manual 20.6.2. Power Cord Warnings Caution To avoid electrical shock or fire, check the power cord(s) that will be used with the product as follows: ▶ Do not attempt to modify or use the AC power cord(s) if they are not the exact type required to fit into the grounded electrical outlets.
  • Page 132 NVIDIA DGX B200 Service Manual Caution To avoid injury, do not contact moving fan blades. Your system is supplied with a guard over the fan; do not operate the system without the fan guard in place. 20.8. Rack Mount Warnings The following installation guidelines are required by UL to maintain safety compliance when installing your system into a rack.
  • Page 133 Special handling may apply. See www.dtsc.ca.gov/hazardouswaste/perchlorate. 20.10.2. NICKEL NVIDIA Bezel. The bezel’s decorative metal foam contains some nickel. The metal foam is not intended for direct and prolonged skin contact. Please use the handles to remove, attach or carry the bezel.
  • Page 134 NVIDIA DGX B200 Service Manual 20.10.4. Cooling and Airflow Caution Carefully route cables as directed to minimize airflow blockage and cooling problems. For proper cooling and airflow, operate the system only with the chassis covers installed. Operating the system without the covers in place can damage system parts. To install the covers: ▶...
  • Page 135 Chapter 21. Compliance The NVIDIA DGX™ B200 Server is compliant with the regulations listed in this section. 21.1. United States Federal Communications Commission (FCC) FCC Marking (Class A) This device complies with part 15 of the FCC Rules. Operation is subject to the following two conditions: (1) this device may not cause harmful interference, and (2) this device must accept any interference received, including any interference that may cause undesired operation of the device.
  • Page 136 ▶ Energy-related Products Directive (ErP). For the full text of EU declaration of conformity, refer to http://www.nvidia.com/support. A copy of the Declaration of Conformity to the essential requirements may be obtained directly from NVIDIA GmbH (Bavaria Towers – Blue Tower, Einsteinstrasse 172, D-81677 Munich, Germany).
  • Page 137 A Japanese regulatory requirement, defined by specification JIS C 0950, 2008, mandates that manufacturers provide Material Content Declarations for certain categories of electronic products offered for sale after July 1, 2006. To view the JIS C 0950 material declaration for this product, visit www.nvidia.com. 21.6. Brazil...
  • Page 138 NVIDIA DGX B200 Service Manual Japan RoHS Material Content Declaration Chapter 21. Compliance...
  • Page 139 NVIDIA DGX B200 Service Manual 21.8. South Korea Korea Certification (KC) 21.9. China China Compulsory Certificate No certification is needed for China. The NVIDIA DGX B200 is a server with a rated current greater than 6 A. 21.8. South Korea...
  • Page 140 NVIDIA DGX B200 Service Manual China RoHS Material Content Declaration 21.10. Taiwan Chapter 21. Compliance...
  • Page 141 NVIDIA DGX B200 Service Manual Bureau of Standards, Metrology & Inspection (BSMI) Taiwan RoHS Material Content Declaration 21.10. Taiwan...
  • Page 142 NVIDIA DGX B200 Service Manual 21.11. Russia/Kazakhstan/Belarus Customs Union Technical Regulations (CU TR) This device complies with the technical regulations of the Customs Union (CU TR) ТЕХНИЧЕСКИЙ РЕГЛАМЕНТ ТАМОЖЕННОГО СОЮЗА О безопасности низковольтного оборудования (ТР ТС 004/2011) ТЕХНИЧЕСКИЙ РЕГЛАМЕНТ ТАМОЖЕННОГО...
  • Page 143 NVIDIA DGX B200 Service Manual Bureau of Indian Standards (BIS) Authenticity may be verified by visiting the Bureau of Indian Standards website at http://www.bis.gov. India RoHS Compliance Statement This product, as well as its related consumables and spares, complies with the reduction in hazardous substances provisions of the “India E-waste (Management and Handling) Rule 2016”.
  • Page 144 SI 2012/3032: The Restriction of the Use of Certain Hazardous Substances in Electrical and Electronic Equipment (As Amended) A copy of the Declaration of Conformity to the essential requirements may be obtained directly from NVIDIA Ltd. (100 Brook Drive, 3rd Floor Green Park, Reading RG2 6UJ, United Kingdom) Chapter 21. Compliance...
  • Page 145 Chapter 22. Third-Party License Notices This NVIDIA product contains third-party software that is being made available to you under their respective open source software licenses. Some of those licenses also require specific legal information to be included in the product. This section provides such information.
  • Page 146 NVIDIA DGX B200 Service Manual HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. Because some jurisdictions prohibit the exclusion or limitation of liability for consequential or incidental damages, the above limitation may not apply to you. TERMINATION OF THIS LICENSE: MTI may terminate this license at any time if you are in breach of any of the terms of this Agreement.
  • Page 147 NVIDIA accepts no liability related to any default, damage, costs, or problem which may be based on or attributable to: (i) the use of the NVIDIA product in any manner that is contrary to this document or (ii) customer product designs.
  • Page 148 OTHERWISE WITH RESPECT TO THE MATERIALS, AND EXPRESSLY DISCLAIMS ALL IMPLIED WARRANTIES OF NONINFRINGEMENT, MERCHANTABILITY, AND FITNESS FOR A PARTICULAR PURPOSE. TO THE EXTENT NOT PROHIBITED BY LAW, IN NO EVENT WILL NVIDIA BE LIABLE FOR ANY DAMAGES, INCLUDING WITHOUT LIMITATION ANY DIRECT, INDIRECT, SPECIAL, INCIDENTAL, PUNITIVE, OR CON-...