NVIDIA DGX B200 User Guide
NVIDIA Corporation
Jan 24, 2025

Summary of Contents for Nvidia DGX B200

  • Page 1 NVIDIA DGX B200 User Guide NVIDIA Corporation Jan 24, 2025...
  • Page 3: Table Of Contents

    DGX B200 System Topology...
  • Page 4 System Security Measures (p. 59); 8.2.1 Secure Flash of DGX B200 Firmware (p. 60); 8.2.2 Encryption...
  • Page 5 9 Redfish APIs Support; Supported Redfish Features (p. 63); Connectivity Between the Host and BMC...
  • Page 6 11.12 Israel (p. 105); 11.13 India...
  • Page 7 NVIDIA DGX B200 User Guide The NVIDIA DGX B200 System User Guide is also available as a PDF. Contents...
  • Page 9: Chapter 1. Introduction To Nvidia Dgx B200 Systems

    Chapter 1. Introduction to NVIDIA DGX B200 Systems The NVIDIA DGX™ B200 System is the universal system purpose-built for all AI infrastructure and workloads from analytics to training to inference. The system is built on eight NVIDIA B200 Tensor Core GPUs. 1.1. Hardware Overview 1.1.1.
  • Page 10: Mechanical Specifications

    Table 1: Component Description
    Component | Description
    GPU | 8 x NVIDIA B200 GPUs that provide 1,440 GB total GPU memory
    CPU | 2 x Intel Xeon 8570 PCIe Gen5 CPUs with 56 cores each, 2.1 GHz/4 GHz (Base/Max boost)
    NVSwitch | 2 x 5th generation NVLink switches that provide 14.4 TB/s ag-...
  • Page 11: Power Specifications

    Do not operate the system with PSUs depopulated. 1.1.4. DGX B200 Locking Power Cord Specification The DGX B200 system is shipped with a set of six (6) locking power cords that have been qualified for use with the DGX B200 system to ensure regulatory compliance.
  • Page 12: Using The Locking Power Cords

    Locking/Unlocking the PSU Side (Cords with Twist-Lock Mechanism) Power Supply (System) side - Twist locking ▶ To INSERT or REMOVE, ensure the cable is UNLOCKED and push/pull into/out of the socket. Chapter 1. Introduction to NVIDIA DGX B200 Systems...
  • Page 13: Environmental Specifications

    Heat Output 48,794 BTU/hr 1.1.7. Front Panel Connections and Controls This section provides information about the front panel, connections, and controls of the DGX B200 system. 1.1.7.1 With a Bezel Here is an image of the DGX B200 system with a bezel.
  • Page 14: With The Bezel Removed

    Fault LED Amber On: System or component faulted 1.1.7.2 With the Bezel Removed Here is an image of the DGX B200 system without a bezel. Important Refer to the section First Boot Setup for instructions on how to properly turn the system on or off.
  • Page 15: Rear Panel Modules

    NVIDIA DGX B200 User Guide 1.1.8. Rear Panel Modules Here is an image that shows the actual panel modules on DGX B200. 1.1.9. Motherboard Connections and Controls The following image shows the motherboard connections and controls in a DGX B200 system. 1.1. Hardware Overview...
  • Page 16: Motherboard Tray Components

    1.1.10. Motherboard Tray Components The following image shows the motherboard tray components in the DGX B200 system. 1.1.11. GPU Tray Components Here is an image of the GPU tray components in the DGX B200 system. Chapter 1. Introduction to NVIDIA DGX B200 Systems...
  • Page 17: Network Connections, Cables, And Adaptors

    NVIDIA DGX B200 User Guide 1.2. Network Connections, Cables, and Adaptors This section provides information about network connections, cables, and adaptors. 1.2.1. Network Ports Here is an image that shows the network ports on a DGX B200 system. 1.2. Network Connections, Cables, and Adaptors...
  • Page 18: Compute And Storage Networking

    Slot2 P1 | 29:00.0 | ibp41s0f0 | enp41s0f0np0 | mlx5_1
    Slot2 P2 | 29:00.1 | enp41s0f1np1 | ibp41s0f1np1 | mlx5_2
    Slot3 P1 | 82:00.0 | ens6f0 | irdma0
    Slot3 P2 | 82:00.1 | ens6f1 | irdma1
    On-board | 0b:00.0 | eno3
    1.2.2. Compute and Storage Networking Chapter 1. Introduction to NVIDIA DGX B200 Systems...
  • Page 19: Network Modules

    ▶ Consolidates four ConnectX-7 networking cards into a single device The DGX B200 system has eight ConnectX-7 network cards on two network module trays. Internal DensiLink cables connect the dual-port OSFP interface to the individual ConnectX-7 network card. Table 6: Network Modules...
  • Page 20: Bmc Port Leds

    The LED on the right is green and flashes to indicate activity. 1.2.5. Supported Network Cables and Adaptors The DGX B200 system is not shipped with network cables or adaptors. You will need to purchase supported cables or adaptors for your network.
  • Page 21: Dgx Os Software

    (daemon for managing cache data storage) 1.5. Customer Support Contact NVIDIA Enterprise Support for assistance in reporting, troubleshooting, or diagnosing problems with your DGX B200 system. You can also contact NVIDIA Enterprise Support for help in moving the DGX B200 system. ▶...
  • Page 23: Chapter 2. Connecting To Dgx B200

    Connect to the DGX B200 console using either a direct connection or a remote connection through the BMC. Important Connect directly to the DGX B200 console if the NVIDIA DGX™ B200 system is connected to a 172.17.xx.xx subnet. DGX OS Server software installs Docker Engine, which uses the 172.17.xx.xx subnet by default for Docker containers.
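
The warning above concerns addresses in the 172.17.xx.xx range, which Docker Engine uses by default for its bridge network. As a minimal sketch, the check can be expressed as a small shell helper. The function name `in_docker_default_subnet` and the sample address are hypothetical and not part of the guide.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when an IPv4 address falls in 172.17.0.0/16,
# the default Docker bridge range mentioned above.
in_docker_default_subnet() {
    case "$1" in
        172.17.*.*) return 0 ;;
        *)          return 1 ;;
    esac
}

# Example: classify an address before choosing a connection method.
addr="172.17.4.20"   # placeholder address, not taken from the guide
if in_docker_default_subnet "$addr"; then
    echo "$addr overlaps the default Docker subnet; use a direct console connection"
else
    echo "$addr does not overlap the default Docker subnet"
fi
```

The helper only matches the textual prefix, which is enough for a /16 boundary; it does not validate that the argument is a well-formed address.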
  • Page 24: Remote Connection Through The Bmc

    NVIDIA DGX B200 User Guide 2.1.2. Remote Connection through the BMC Here is some information about remotely connecting to DGX B200 through the BMC. NVIDIA recommends that customers follow best security practices for BMC management (IPMI port). These include, but are not limited to, such measures as: ▶...
  • Page 25 Username: <administrator-username> ▶ Password: <bmc-password> Ensure you have connected the BMC port on the DGX B200 system to your LAN. Open a browser within your LAN and go to https://<bmc-ip-address>/. Ensure popups are allowed for the BMC address. Log in.
  • Page 26: Ssh Connection To The Os

    NVIDIA DGX B200 User Guide 2.2. SSH Connection to the OS After configuring the system, you can establish an SSH connection to the DGX B200 OS through the network port. Refer to Network Ports to identify the port to use.
  • Page 27: Chapter 3. First Boot Setup

    3.1. System Setup These instructions describe the setup process that occurs the first time the DGX B200 system is powered on after delivery or after the server is re-imaged. Be prepared to accept all End User License Agreements (EULAs) and to set up your username and password.
  • Page 28 NVIDIA DGX B200 User Guide ▶ Using the Remote BMC Refer to First Boot Process for DGX Servers in the NVIDIA DGX OS 7 User Guide for information about the following topics: ▶ Optionally encrypt the root file system. ▶...
  • Page 29: Post Setup Tasks

    3.2.2. Enabling the SRP Daemon The NVIDIA networking drivers provide the SRP daemon software. The daemon is disabled by default. Enabling the daemon is required if you want to use RDMA over InfiniBand. You can enable the daemon...
  • Page 31: Chapter 4. Quickstart And Basic Operation

    NVIDIA DGX Platform page. 4.1. Installation and Configuration Before you install DGX B200, ensure you have given all relevant site information to your Installation Partner. Important Your DGX B200 system must be installed by NVIDIA partner network personnel or NVIDIA field service engineers.
  • Page 32: Obtaining An Ngc Account

    Observe the following startup and shutdown instructions. 4.4.1. Startup Considerations To keep your DGX B200 running smoothly, allow up to a minute of idle time after reaching the login prompt. This ensures that all components can complete their initialization.
  • Page 33: Running The Pre-Flight Test

    The preceding command pulls the nvidia/cuda container image layer by layer, then runs the nvidia-smi command. When complete, the output shows the NVIDIA Driver version and a description of each installed GPU. For more information, refer to the Containers For Deep Learning Frameworks User Guide.
  • Page 34: Running Ngc Containers With Gpu Support

    20 minutes. sudo nvsm stress-test --force 4.7. Running NGC Containers with GPU Support To obtain the best performance when running NGC containers on the DGX B200 system, the following methods of providing GPU support for Docker containers are available: ▶...
  • Page 35: Using The Nvidia Container Runtime For Docker

    GPU-accelerated containers using this command and the new runtime will be used. ▶ Use docker run with nvidia as the default runtime. You can set nvidia as the default runtime, for example, by adding the following line to the /etc/docker/daemon.json configuration file as the first entry. "default-runtime": "nvidia", Here is an example of how the added line appears in the JSON file.
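
After editing the file, it can help to confirm that the setting is actually present. The following is a minimal sketch, assuming the configuration file path named above; the function name `has_nvidia_default_runtime` is hypothetical, and the check is a text match rather than a full JSON parse.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when the given Docker daemon config names
# "nvidia" as its default runtime. The path argument defaults to the file
# named in the text above.
has_nvidia_default_runtime() {
    local config="${1:-/etc/docker/daemon.json}"
    [ -r "$config" ] || return 1
    grep -Eq '"default-runtime"[[:space:]]*:[[:space:]]*"nvidia"' "$config"
}

if has_nvidia_default_runtime; then
    echo "nvidia is the default Docker runtime"
else
    echo "nvidia is not set as the default Docker runtime"
fi
```

Docker reads this file at daemon startup, so a change normally takes effect only after the Docker service is restarted.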
  • Page 36: Managing Cpu Mitigations

    NVIDIA DGX B200 User Guide 4.8. Managing CPU Mitigations DGX OS Server includes security updates to mitigate CPU speculative side-channel vulnerabilities. These mitigations can decrease the performance of deep learning and machine learning workloads. If your installation of DGX systems incorporates other measures to mitigate these vulnerabilities, such as measures at the cluster level, you can disable the CPU mitigations for individual DGX nodes and thereby increase performance.
  • Page 37: Disabling Cpu Mitigations

    NVIDIA DGX B200 User Guide 4.8.2. Disabling CPU Mitigations Caution Performing the following instructions will disable the CPU mitigations provided by the DGX OS Server software. Install the nv-mitigations-off package. sudo apt install nv-mitigations-off -y Reboot the system. Verify CPU mitigations are disabled.
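
The guide's own verification command is not included in this excerpt. As a sketch under that caveat, the Linux kernel reports the status of each known CPU vulnerability under the standard sysfs directory `/sys/devices/system/cpu/vulnerabilities`, and entries typically read `Vulnerable` when a mitigation is switched off. The function names below are hypothetical.

```shell
#!/usr/bin/env bash
# Print one "name: status" line per kernel vulnerability file. The sysfs
# directory is the standard Linux location; the optional argument exists only
# so the helper can be pointed at another directory.
report_cpu_mitigations() {
    local dir="${1:-/sys/devices/system/cpu/vulnerabilities}"
    local f
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        printf '%s: %s\n' "$(basename "$f")" "$(cat "$f")"
    done
}

# Count entries whose status begins with "Vulnerable".
count_vulnerable() {
    report_cpu_mitigations "$1" | grep -c ': Vulnerable' || true
}

report_cpu_mitigations
```

Comparing the report before and after the reboot gives a quick view of which mitigations changed state.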
  • Page 39: Chapter 5. Sbios Settings

    Instructions for these use cases are provided in this section. Important Do not change settings in the SBIOS other than those described in this or other DGX B200 user documents. Contact NVIDIA Enterprise Services before making other changes. 5.1. Accessing the SBIOS Setup Here is information about how you can access the SBIOS setup.
  • Page 40: Configuring The Boot Order

    The following instructions describe how to set the boot order at boot time. You can also set the boot order from the SBIOS setup > Boot screen. Access the DGX B200 console from a locally connected keyboard and mouse or through the BMC remote console.
  • Page 41 NVIDIA DGX B200 User Guide Select the boot device. The following figure shows the virtual media selected. 5.2. Configuring the Boot Order...
  • Page 42: Configuring The Local Terminal

    Connect to the BMC web interface and click power on/reboot. From an operating system command line, run sudo reboot. ▶ Connect to the DGX B200 SOL console: ▶ Using SSH Create a new user using the BMC Web UI, for example, userA.
  • Page 43 NVIDIA DGX B200 User Guide Press the Del or F2 key when the system is booting. The system confirms your choice and shows the BIOS configuration screen. 5.4. Power on or Reboot the System...
  • Page 45: Chapter 6. Using The Baseboard Management Controller (Bmc)

    6.1. Connecting to the BMC Here are the steps to connect to the BMC on a DGX B200 system. Before you begin, ensure that you connect the BMC network interface controller port on the DGX system to your LAN.
  • Page 46: Overview Of Bmc Controls

    NVIDIA DGX B200 User Guide 6.2. Overview of BMC Controls The left-side navigation menu bar on the BMC main page contains the primary controls. Chapter 6. Using the Baseboard Management Controller (BMC)
  • Page 48 NVIDIA DGX B200 User Guide Table 1: BMC Main Controls
    Control | Description
    Quick Links | Provide quick access to several tasks.
    Dashboard | Display the overall information about the status of the device.
    Sensor | Provide status and readings for system sensors, such as SSD, PSUs, voltages, CPU temperatures, DIMM temperatures, and fan speeds.
  • Page 49: Open Ports

    NVIDIA DGX B200 User Guide 6.3. Open Ports Ensure that the ports listed in the following table are open and available on your firewall to the DGX B200 system. Table 2: Open Ports Port Protocol Function HTTPS Web User Interface...
  • Page 50: Configuring A Bmc Static Address By Using Ipmitool

    This section describes how to set a static IP address for the BMC from the Ubuntu command line. Note If you cannot access the DGX B200 system remotely, connect a display (1440x900 or lower resolution) and keyboard directly to the DGX B200 system.
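
The exact commands from the guide are not part of this excerpt. As a hedged sketch, a static BMC address is commonly set with the `ipmitool lan set` subcommands shown below. The helper only prints the commands so they can be reviewed first; the function name, the LAN channel number 1, and the addresses are assumptions, not values from the guide.

```shell
#!/usr/bin/env bash
# Print (not run) the ipmitool commands for a static BMC address.
# Channel 1 is an assumption; confirm the LAN channel for your system.
bmc_static_ip_commands() {
    local ip="$1" netmask="$2" gateway="$3" channel="${4:-1}"
    printf 'sudo ipmitool lan set %s ipsrc static\n' "$channel"
    printf 'sudo ipmitool lan set %s ipaddr %s\n' "$channel" "$ip"
    printf 'sudo ipmitool lan set %s netmask %s\n' "$channel" "$netmask"
    printf 'sudo ipmitool lan set %s defgw ipaddr %s\n' "$channel" "$gateway"
    printf 'sudo ipmitool lan print %s\n' "$channel"
}

# Placeholder addresses from the documentation range 192.0.2.0/24.
bmc_static_ip_commands 192.0.2.10 255.255.255.0 192.0.2.1
```

The final `lan print` line is there to read the settings back after the change.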
  • Page 51: Changing The Bmc Login Credentials

    NVIDIA DGX B200 User Guide Scroll to the specific item and press Enter. Enter the appropriate information in the dialog, and then press Enter. When you finish making the changes, press F10 to save and exit. 6.5. Changing the BMC Login Credentials 6.5.1.
  • Page 52: Using The Remote Console

    NVIDIA DGX B200 User Guide 6.6. Using the Remote Console To use the remote console, perform the following steps: Click Remote Control from the left-side navigation menu. Click Launch KVM to start the remote KVM and access the DGX system console.
  • Page 53: Uploading Or Generating Ssl Certificates

    NVIDIA DGX B200 User Guide The Event Filters page shows all configured event filters and available slots. You can modify or add a new event filter entry on this page. ▶ To view available configured and unconfigured slots, click All in the upper-left corner of the page.
  • Page 54: Viewing The Ssl Certificate

    NVIDIA DGX B200 User Guide 6.9.1. Viewing the SSL Certificate To view the SSL certificate, on the SSL Setting page, click View SSL Certificate. The View SSL Certificate page displays the following basic information about the uploaded SSL certificate: ▶...
  • Page 55: Uploading The Ssl Certificate

    NVIDIA DGX B200 User Guide Table 3: SSL Certificate Items Description and Requirements Common Name (CN) The common name for which the certificate is to be generated. ▶ Maximum length of 64 alphanumeric characters. ▶ Special characters ‘#’ and ‘$’ are not allowed.
  • Page 56: Updating The Sbios Certificate

    NVIDIA DGX B200 User Guide On the SSL Setting page, click Upload SSL Certificate. Click the New Certificate folder icon, browse to locate the appropriate file, and select it. Click the New Private Key folder icon, browse and locate the appropriate file, and select it.
  • Page 57 NVIDIA DGX B200 User Guide Select Server CA Configuration. Select Enroll Cert. 6.9. Uploading or Generating SSL Certificates...
  • Page 58 NVIDIA DGX B200 User Guide Select Enroll Cert Using File. Select the device where you stored the certificate. Navigate the file structure and select the certificate. Chapter 6. Using the Baseboard Management Controller (BMC)
  • Page 61: Chapter 7. Managing Power Capping

    The GPU has three sources of power limits:
    ▶ VBIOS: defines the maximum possible TGP (Total Graphics Power) value.
    ▶ The nvidia-smi tool: lets users set the power limit of the GPU from the host.
    ▶ SMBPBI: sets the power limit of the GPU via an out-of-band channel.
  • Page 62: Managing N+N Configuration (Ipmi)

    NVIDIA DGX B200 User Guide 7.2. Managing N+N Configuration (IPMI) By default, a system will boot with three power supplies. To achieve the safe operation of an N+N configuration, you need to enable the power capping feature to limit the power consumed by the system.
  • Page 63: Managing Power Capping Using Redfish Api

    NVIDIA DGX B200 User Guide 7.3. Managing Power Capping Using Redfish API To manage a system’s maximum power consumption through power capping using the Redfish API, refer to Querying GPU Power Limit and Power Capping. 7.3. Managing Power Capping Using Redfish API...
  • Page 65: Chapter 8. Security

    This section provides information about security measures in the NVIDIA DGX™ B200 system. 8.1. User Security Measures The NVIDIA DGX B200 system is a specialized server designed to be deployed in a data center. It must be configured to protect the hardware from unauthorized access and unapproved use. The DGX B200 system is designed with a dedicated BMC Management Port and multiple Ethernet network ports.
  • Page 66: Secure Flash Of Dgx B200 Firmware

    8.3. Secure Data Deletion This section explains how to securely delete data from the DGX B200 system SSDs to destroy all the stored data permanently. This process performs a more secure SSD data deletion than merely deleting files or reformatting the SSDs.
  • Page 67: Procedure

    NVIDIA DGX B200 User Guide 8.3.2. Procedure Here are the instructions to securely delete data from the DGX B200 system SSDs. Boot the system from the ISO image, either remotely or from a bootable USB key. At the GRUB menu, select: ▶...
  • Page 69: Chapter 9. Redfish Apis Support

    The DGX System firmware supports Redfish APIs. Redfish is DMTF’s standard set of APIs for managing and monitoring a platform. Redfish support is enabled by default in the DGX B200 BMC and the BIOS. Using the Redfish interface, administrator-privileged users can browse physical resources at the chassis and system level through the REST API.
  • Page 70: Connectivity Between The Host And Bmc

    NVIDIA DGX B200 User Guide ▶ System/Chassis power operations ▶ Get health event log/advanced system event log ▶ Logging Service, which provides critical/informational severity events ▶ Event Services (SSE) ▶ Querying GPU power limit ▶ Power capping Refer to the following documentation for more information: ▶...
  • Page 71: Firmware Update

    NVIDIA DGX B200 User Guide
    ▶ Reset BMC
    The following curl command forces a reset of the DGX B200 BMC.
    curl -k -u <bmc-user>:<password> --request POST --location 'https://<bmc-ip-address>/redfish/v1/Managers/BMC/Actions/Manager.Reset' --header 'Content-Type: application/json' --data '{"ResetType": "ForceRestart"}'
    ▶ Reset BMC to factory defaults
    The following curl command resets the BMC to factory defaults.
  • Page 72 "Version": "0.2.0.7" // ...
    ▶ Update GPU tray components
    To update the GPU tray components in your DGX B200 system, specify HGX_0 as the target regardless of the GPU tray component that you want to update.
    echo "{\"Targets\":[\"/redfish/v1/UpdateService/FirmwareInventory/HGX_0\"]}" > parameters.json
    ...
  • Page 73: Bios Settings

    ▶ Forced Update The DGX B200 system component firmware is only updated if the incoming firmware version is newer than the existing version. To override this behavior and flash the component, specify the ForceUpdate field and set it to true.
  • Page 74: Modifying The Boot Order Using Redfish

    BIOS update, an additional power cycle is needed to apply the changes. 9.3.4. Modifying the Boot Order Using Redfish To modify the boot order on DGX B200 systems using Redfish APIs, follow the steps described in this procedure.
  • Page 75
    "@odata.etag": "\"1696896625\"",
    "DisplayName": "UEFI: PXE IPv4 Intel(R) Ethernet Controller X550",
    "Name": "Boot0004",
    "UefiDevicePath": "PciRoot(0x0)/Pci(0x10,0x0)/Pci(0x0,0x0)/MAC(5CFF35FBDA09,0x1)/IPv4(0.0.0.0,0x0,DHCP,0.0.0.0,0.0.0.0,0.0.0.0)"

    "@odata.etag": "\"1696896625\"",
    "DisplayName": "UEFI: PXE IPv4 Nvidia Network Adapter - B8:3F:D2:E7:B1:6C",
    "Name": "Boot0005",
    "UefiDevicePath": "PciRoot(0x20)/Pci(0x1,0x0)/Pci(0x0,0x0)/Pci(0x0,0x0)/Pci(0x0,0x0)/Pci(0x0,0x0)/Pci(0x0,0x0)/MAC(B83FD2E7B16C,0x1)/IPv4(0.0.0.0,0x0,DHCP,0.0.0.0,0.0.0.0,0.0.0.0)"

    "@odata.etag": "\"1696896625\"",
    "DisplayName": "UEFI: PXE IPv4 Nvidia Network Adapter - B8:3F:D2:E7:B1:6D",
    "Name": "Boot0006",...
  • Page 76 NVIDIA DGX B200 User Guide (continued from previous page)
    "@odata.etag": "\"1696896625\"",
    "DisplayName": "ubuntu",
    "Name": "Boot000F",
    "UefiDevicePath": "HD(1,GPT,1E0EFF2A-2BF3-4DC6-8757-4075B1E5343D,0x800,0x100000)/\\EFI\\UBUNTU\\SHIMX64.EFI"

    "@odata.etag": "\"1696896625\"",
    "DisplayName": "UEFI: PXE IPv4 American Megatrends Inc.",
    "Name": "Boot0010",
    "UefiDevicePath": "PciRoot(0x0)/Pci(0x14,0x0)/USB(0xA,0x0)/USB(0x2,0x1)/MAC(4E2A712C2451,0x0)/IPv4(0.0.0.0,0x0,DHCP,0.0.0.0,0.0.0.0,0.0.0.0)"

    Where the DisplayName string is the name of the drive or network adapter.
  • Page 77: Telemetry

    NVIDIA DGX B200 User Guide (continued from previous page) "Boot0010" Upon reboot, the system should attempt to boot from the network using the correct network interface: This boot order change will remain until the next boot order update, which can be done by resetting the SBIOS or running this procedure again.
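
The request that applies the new order is not fully shown in this excerpt. As a minimal sketch, the standard Redfish ComputerSystem schema carries the order in the `Boot.BootOrder` array, so a PATCH body can be assembled from boot option names such as `Boot0010`. The function name `boot_order_body` is hypothetical, and the target URI is deliberately left out because this excerpt does not show it.

```shell
#!/usr/bin/env bash
# Build a Redfish PATCH body that sets Boot.BootOrder from boot option names.
boot_order_body() {
    local body='{"Boot":{"BootOrder":[' sep='' name
    for name in "$@"; do
        body="${body}${sep}\"${name}\""
        sep=','
    done
    printf '%s]}}\n' "$body"
}

# Put the PXE entry Boot0010 first, then the Ubuntu entry Boot000F.
boot_order_body Boot0010 Boot000F
```

The names passed in should be taken from the `Name` fields of the boot options reported by the system, in the order the system should try them.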
  • Page 78: Sel Logs

    NVIDIA DGX B200 User Guide
    ▶ Chassis Graceful Restart (IPMI chassis soft off, IPMI chassis power on)
    curl -k -u <bmc-user>:<password> --request POST --location 'https://<bmc-ip-address>/redfish/v1/Systems/DGX/Actions/ComputerSystem.Reset' --header 'Content-Type: application/json' --data '{"ResetType": "GracefulRestart"}'
    ▶ Chassis Off (IPMI chassis power off)
    curl -k -u <bmc-user>:<password>...
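
Because the power operations differ only in the `ResetType` value, the request can be sketched as one parameterized helper. The version below prints the curl command instead of running it and rejects names that are not in the DMTF Redfish `ResetType` enumeration. The function name is hypothetical, and a given BMC may support only a subset of the listed values.

```shell
#!/usr/bin/env bash
# Print (not run) a ComputerSystem.Reset request for the given ResetType.
# The accepted names are DMTF Redfish ResetType values; a given BMC may
# support only some of them.
system_reset_command() {
    local bmc="$1" reset_type="$2"
    case "$reset_type" in
        On|ForceOff|GracefulShutdown|GracefulRestart|ForceRestart|ForceOn|PowerCycle|PushPowerButton|Nmi) ;;
        *) echo "unknown ResetType: $reset_type" >&2; return 1 ;;
    esac
    printf "curl -k -u <bmc-user>:<password> --request POST --location 'https://%s/redfish/v1/Systems/DGX/Actions/ComputerSystem.Reset' --header 'Content-Type: application/json' --data '{\"ResetType\": \"%s\"}'\n" "$bmc" "$reset_type"
}

system_reset_command '<bmc-ip-address>' GracefulRestart
```

Printing first makes it easy to review the exact request before pasting it into a terminal with real credentials.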
  • Page 79: Backing Up And Restoring Bmc Configurations

    NVIDIA DGX B200 User Guide
    curl -k -u <bmc-user>:<password> --request POST --location 'https://<bmc-ip-address>/redfish/v1/Managers/Self/VirtualMedia/CD_1/Actions/VirtualMedia.InsertMedia' --data-raw '{"Image" : "//<serverip>/home/nvidia/images/ubuntu-20.04.2-live-server-amd64.iso","TransferProtocolType" : "NFS"}'
    9.3.9. Backing Up and Restoring BMC Configurations
    In addition to using the Web UI to back up and restore the BMC configuration, you can use the Redfish API with the following approach: Install a security AES key in the BMC.
  • Page 80: Collecting Bmc Debug Data

    NVIDIA DGX B200 User Guide
    A successful command returns a 204 HTTP status code.
    Restore the BMC configuration from a backup file, for example, bmc-config.bak.
    curl -s -k -u <username>:<password> --location --request POST 'https://<bmcip>/redfish/v1/Managers/BMC/Actions/Oem/NvidiaManager.RestoreConfig' --form 'conf_file=@"bmc-config.bak"' | jq
    ...
  • Page 81: Clear Bios And Reset To Factory Defaults

    NVIDIA DGX B200 User Guide (continued from previous page)
    "Resolution": "None",
    "Severity": "Warning"

    "@odata.type": "#Message.v1_0_8.Message",
    "Message": "Task /redfish/v1/Managers/BMC/LogServices/DiagnosticLog/Actions/LogService.CollectDiagnosticData has completed.",
    "MessageArgs": [
    "/redfish/v1/Managers/BMC/LogServices/DiagnosticLog/Actions/LogService.CollectDiagnosticData"
    "MessageId": "Task.1.0.Completed",
    "Resolution": "None",
    "Severity": "OK"

    "Name": "Manager CollectDiagnosticData",
    "PercentComplete": 100,
    "StartTime": "2024-08-13T16:13:20+00:00",
    "TaskState": "Completed",...
  • Page 82: Power Capping

    NVIDIA DGX B200 User Guide
    As shown in the following example output, the Reading field indicates the current power usage, and the SetPoint field indicates the current GPU power limit.
    "PowerLimitWatts": {
    "AllowableMax": 700,
    "AllowableMin": 200,
    "ControlMode": "Automatic",
    "DefaultSetPoint": 700,
    "Reading": 64.388,...
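
To compare current usage against the limit in a script, the two fields can be pulled out of the response. The following is a minimal sketch that uses only `sed`; the function name `json_number` is hypothetical, and the `SetPoint` value in the sample is a placeholder because the example output above is cut off before that field.

```shell
#!/usr/bin/env bash
# Print the first numeric value of the named field from JSON read on stdin.
# A line-oriented sketch, not a JSON parser; jq is the better tool when installed.
json_number() {
    local key="$1"
    sed -n 's/.*"'"$key"'"[[:space:]]*:[[:space:]]*\(-\{0,1\}[0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Illustrative response fragment. The SetPoint value is a placeholder,
# because the example output above is cut off before that field.
sample='"PowerLimitWatts": {
  "AllowableMax": 700,
  "AllowableMin": 200,
  "ControlMode": "Automatic",
  "DefaultSetPoint": 700,
  "Reading": 64.388,
  "SetPoint": 700
}'

printf 'usage: %s W, limit: %s W\n' \
    "$(printf '%s\n' "$sample" | json_number Reading)" \
    "$(printf '%s\n' "$sample" | json_number SetPoint)"
```

Because the key is matched together with its opening quote, asking for `SetPoint` does not pick up `DefaultSetPoint`.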
  • Page 83 NVIDIA DGX B200 User Guide
    curl -k -u <bmc-user>:<password> https://<bmcip>/redfish/v1/Managers/BMC/NodeManager/Domains
    Example response:
    "@odata.context": "/redfish/v1/$Metadata#NvidiaNmDomainCollection.NvidiaNmDomainCollection",
    "@odata.id": "/redfish/v1/Managers/BMC/NvidiaNmDomainCollection",
    "@odata.type": "#NvidiaNmDomainCollection.NvidiaNmDomainCollection",
    "Members": [
    "@odata.id": "/redfish/v1/Managers/BMC/NodeManager/Domains/0"
    "@odata.id": "/redfish/v1/Managers/BMC/NodeManager/Domains/1"
    "@odata.id": "/redfish/v1/Managers/BMC/NodeManager/Domains/4"
    "@odata.id": "/redfish/v1/Managers/BMC/NodeManager/Domains/2"
    "@odata.id": "/redfish/v1/Managers/BMC/NodeManager/Domains/3"
    "@odata.id": "/redfish/v1/Managers/BMC/NodeManager/Domains/5"
    "@odata.id": "/redfish/v1/Managers/BMC/NodeManager/Domains/10"
    "@odata.id": "/redfish/v1/Managers/BMC/NodeManager/Domains/11"...
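
The members of the collection are not returned in numeric order, so a script that walks the domains may want a sorted list of IDs. This is a minimal sketch, assuming the response uses ordinary forward slashes in its `@odata.id` paths; the function name `domain_ids` and the shortened sample are illustrative.

```shell
#!/usr/bin/env bash
# Print the numeric domain IDs found in a Domains collection response on stdin,
# sorted numerically with duplicates removed.
domain_ids() {
    grep -o 'NodeManager/Domains/[0-9][0-9]*' | sed 's#.*/##' | sort -n -u
}

# Shortened sample in the same shape as the example response.
sample_members='"@odata.id": "/redfish/v1/Managers/BMC/NodeManager/Domains/0"
"@odata.id": "/redfish/v1/Managers/BMC/NodeManager/Domains/1"
"@odata.id": "/redfish/v1/Managers/BMC/NodeManager/Domains/4"
"@odata.id": "/redfish/v1/Managers/BMC/NodeManager/Domains/10"'

printf '%s\n' "$sample_members" | domain_ids
```

Each printed ID can then be substituted for `<DomainID>` in a per-domain query.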
  • Page 84 NVIDIA DGX B200 User Guide
    curl -k -u <bmc-user>:<password> https://<bmcip>/redfish/v1/Managers/BMC/NodeManager/Domains/<DomainID>
    For example, to view policies in domain 0:
    curl -k -u <bmc-user>:<password> https://<bmcip>/redfish/v1/Managers/BMC/NodeManager/Domains/0
    Example response:
    "@odata.context": "/redfish/v1/$Metadata#NvidiaNmDomain.NvidiaNmDomain",
    "@odata.id": "/redfish/v1/Managers/BMC/NodeManager/Domains/0",
    "@odata.type": "#NvidiaNmDomain.v1_4_0.NvidiaNmDomain",
    "Capabilities": {
    "Max": 16500,
    "MaxCorrectionTimeInMs": 2000,
    "MaxStatisticsReportingPeriod": "2000",...
  • Page 85: Custom Policies

    NVIDIA DGX B200 User Guide
    ▶ PercentageofDomainBudget: How much of the budget can be allocated.
    ▶ Status: Whether the policy is in effect. This is determined by the Domain Policy State.
    In general, the algorithm always uses PercentageofDomainBudget.
    curl -k -u <bmc-user>:<password> https://<bmcip>/redfish/v1/Managers/BMC/...
  • Page 86 NVIDIA DGX B200 User Guide (continued from previous page)
    "Min": 4000.0000
    "Id": "0",
    "Name": "custom4",
    "Status": {
    "State": "Enabled"
    "Policies": {
    "@odata.context": "/redfish/v1/$Metadata#NvidiaNmPolicyCollection.NvidiaNmPolicyCollection",
    "@odata.type": "#NvidiaNmPolicyCollection.NvidiaNmPolicyCollection",
    "Members": [
    "@odata.context": "/redfish/v1/$Metadata#NvidiaNmPolicy.NvidiaNmPolicy",
    "@odata.id": "/redfish/v1/Managers/BMC/NodeManager/Domains/0/Policies/0",
    "@odata.type": "#NvidiaNmPolicy.v1_2_0.NvidiaNmPolicy",
    "AssociatedDomainID": {
    "@odata.id": "/redfish/v1/Managers/BMC/NodeManager/Domains/0"
    "ComponentId": "COMP_CPU",
    "Id": "0",...
  • Page 87 NVIDIA DGX B200 User Guide
    curl -k -u <bmc-user>:<password> -X POST https://<BMC>/redfish/v1/Managers/BMC/NodeManager/Domains --data @<pathtojsonfile>
    Example response:
    "@odata.context": "/redfish/v1/$Metadata#NvidiaNmDomain.NvidiaNmDomain",
    "@odata.id": "/redfish/v1/Managers/BMC/NodeManager/Domains/14",
    "@odata.type": "#NvidiaNmDomain.v1_4_0.NvidiaNmDomain",
    "Capabilities": {
    "Max": 9000,
    "MaxCorrectionTimeInMs": 0,
    "MaxStatisticsReportingPeriod": "0",
    "Min": 6000,
    "MinCorrectionTimeInMs": 0,
    "MinStatisticsReportingPeriod": "0"
    "Id": "14",
    "Name": "custom4",...
  • Page 88: Psu Policies

    NVIDIA DGX B200 User Guide 9.3.13.4 PSU Policies Power supply unit (PSU) policies are read-only. PSU Policies set the overall available power budget for the system based on the number of active power supplies. The PSU Policy in effect enforces how the Domain Policies are selected.
  • Page 89 NVIDIA DGX B200 User Guide (continued from previous page) "MinPSU": 2, "Name": "Limp", "Status": { "State": "Disabled" PSU policy 0 defines the number of PSUs and the power that will be allocated to the system with a maximum of two PSUs.
  • Page 90 NVIDIA DGX B200 User Guide (continued from previous page) "Name": "NvidiaNMMetrics_0"
    Table 1: Definitions of Metrics
    MetricId | Definition | Example Metric Value
    PSU_Active_Policy | current active policy | corresponding /redfish/v1/Managers/BMC/NodeManager/PSUPolicies/<PSU_Active_Policy>
    Domain_Policy_Active | current active domain policy | corresponding /redfish/v1/Managers/BMC/NodeManager/Domains/<Domain_Policy_Active>
    dcPlatformPower_avg | Total DC Power for the Platform | 2181.00...
  • Page 91 NVIDIA DGX B200 User Guide Table 1 – continued from previous page
    MetricId | Definition | Example Metric Value
    cpuEnergy_0 | Energy for CPU 0 | 196.00
    coreEfficiency_0 | Core Efficiency for CPU 0 | 61671.00
    cpuPackagePowerCapabilitiesMin_0 | Power Capabilities MIN CPU 0 |
    cpuPackagePowerCapabil- | Power Capabilities MAX CPU 0...
  • Page 92 NVIDIA DGX B200 User Guide Table 1 – continued from previous page
    MetricId | Definition | Example Metric Value
    turboRatioCapabilitiesMin_1 | Turbo Ratio Min Capabilities CPU 1 (Min Frequency) |
    turboRatioCapabilitiesMax_1 | Turbo Ratio Max Capabilities CPU 1 (Max Frequency) | 3800...
  • Page 93 NVIDIA DGX B200 User Guide Table 1 – continued from previous page
    MetricId | Definition | Example Metric Value
    gpuPowerCapabilitiesMax_2 | GPU 2 Max Power Limit | 700.00
    gpuPower_avg_3 | GPU 3 Average Power | 63.00
    gpuPowerLimit_3 | GPU 3 Power Limit | 700.00
    gpuPowerCapabili- | GPU 3 Min Power Limit | 200.00...
  • Page 95: Chapter 10. Safety

    Chapter 10. Safety This section provides information about how to safely use the NVIDIA DGX™ B200 system. 10.1. Safety Information To reduce the risk of bodily injury, electrical shock, fire, and equipment damage, read this document and observe all warnings and precautions in this guide before installing or maintaining your server product.
  • Page 96 NVIDIA DGX B200 User Guide Symbol Description Indicates potential hazard if indicated information is ignored. Indicates shock hazards that result in serious injury or death if safety instructions are not followed. Indicates hot components or surfaces. Indicates do not touch fan blades, may result in injury.
  • Page 97: Intended Application Uses

    NVIDIA DGX B200 User Guide 10.3. Intended Application Uses This product was evaluated as Information Technology Equipment (ITE), which may be installed in offices, schools, computer rooms, and similar commercial type locations. The suitability of this product for other product categories and environments (such as medical, industrial, residential, alarm systems, and test equipment), other than an ITE application, may require further evaluation.
  • Page 98: Site Selection

    NVIDIA DGX B200 User Guide 10.4. Site Selection Here is some information about how to select the correct site for the DGX B200 system. Choose a site that is: ▶ Clean, dry, and free of airborne particles (other than normal room dust).
  • Page 99: Power Cord Warnings

    NVIDIA DGX B200 User Guide When replacing a hot-plug power supply, unplug the power cord to the power supply being replaced before removing it from the server. To avoid risk of electric shock, turn off the server and disconnect the power cords, telecommunications systems, networks, and modems attached to the server before opening it.
  • Page 100: Rack Mount Warnings

    NVIDIA DGX B200 User Guide Caution If the server has been running, any installed processor(s) and heat sink(s) may be hot. Unless you are adding or removing a hot-plug component, allow the system to cool before opening the covers. To avoid the possibility of coming into contact with hot component(s) during a hot-plug installation, be careful when removing or installing the hot-plug component(s).
  • Page 101: Electrostatic Discharge

    Special handling may apply. See www.dtsc.ca.gov/hazardouswaste/perchlorate. 10.10.2. NICKEL NVIDIA Bezel. The bezel’s decorative metal foam contains some nickel. The metal foam is not intended for direct and prolonged skin contact. Please use the handles to remove, attach or carry the bezel.
  • Page 102: Battery Replacement

    NVIDIA DGX B200 User Guide 10.10.3. Battery Replacement Caution There is the danger of explosion if the battery is incorrectly replaced. When replacing the battery, use only the battery recommended by the equipment manufacturer. Dispose of batteries according to local ordinances and regulations. Do not attempt to recharge a battery.
  • Page 103: Chapter 11. Compliance

    Chapter 11. Compliance The NVIDIA DGX™ B200 Server is compliant with the regulations listed in this section. 11.1. United States Federal Communications Commission (FCC) FCC Marking (Class A) This device complies with part 15 of the FCC Rules. Operation is subject to the following two conditions: (1) this device may not cause harmful interference, and (2) this device must accept any interference received, including any interference that may cause undesired operation of the device.
  • Page 104: Canada

    ▶ Energy-related Products Directive (ErP). For the full text of the EU declaration of conformity, refer to http://www.nvidia.com/support. A copy of the Declaration of Conformity to the essential requirements may be obtained directly from NVIDIA GmbH (Bavaria Towers – Blue Tower, Einsteinstrasse 172, D-81677 Munich, Germany).
  • Page 105: Australia And New Zealand

    NVIDIA DGX B200 User Guide 11.5. Australia and New Zealand Australian Communications and Media Authority This product meets the applicable EMC requirements for Class A, I.T.E equipment. 11.6. Brazil INMETRO 11.7. Japan Voluntary Control Council for Interference (VCCI) This is a Class A product.
  • Page 106 A Japanese regulatory requirement, defined by specification JIS C 0950, 2008, mandates that manufacturers provide Material Content Declarations for certain categories of electronic products offered for sale after July 1, 2006. To view the JIS C 0950 material declaration for this product, visit www.nvidia.com. Japan RoHS Material Content Declaration Chapter 11. Compliance...
  • Page 107: South Korea

    NVIDIA DGX B200 User Guide 11.8. South Korea Korea Certification (KC) 11.8. South Korea...
  • Page 108: China

    NVIDIA DGX B200 User Guide 11.9. China China Compulsory Certificate No certification is needed for China. The NVIDIA DGX B200 is a server with a rated current greater than 6 A. China RoHS Material Content Declaration Chapter 11. Compliance...
  • Page 109: Taiwan

    NVIDIA DGX B200 User Guide 11.10. Taiwan Bureau of Standards, Metrology & Inspection (BSMI) 11.10. Taiwan...
  • Page 110: Russia/Kazakhstan/Belarus

    NVIDIA DGX B200 User Guide Taiwan RoHS Material Content Declaration 11.11. Russia/Kazakhstan/Belarus Customs Union Technical Regulations (CU TR) This device complies with the technical regulations of the Customs Union (CU TR) ТЕХНИЧЕСКИЙ РЕГЛАМЕНТ ТАМОЖЕННОГО СОЮЗА О безопасности низковольтного оборудования (ТР ТС 004/2011)
  • Page 111: Israel

    NVIDIA DGX B200 User Guide ТЕХНИЧЕСКИЙ РЕГЛАМЕНТ ТАМОЖЕННОГО СОЮЗА Электромагнитная совместимость технических средств (ТР ТС 020/2011) Технический регламент Евразийского экономического союза “Об ограничении применения опасных веществ в изделиях электротехники и радиоэлектроники” (ТР ЕАЭС 037/2016) Federal Agency of Communication (FAC) This device complies with the rules set forth by the Federal Agency of Communications and the Ministry of Communications and Mass Media.
  • Page 112: South Africa

    SI 2012/3032: The Restriction of the Use of Certain Hazardous Substances in Electrical and Electronic Equipment (As Amended) A copy of the Declaration of Conformity to the essential requirements may be obtained directly from NVIDIA Ltd. (100 Brook Drive, 3rd Floor Green Park, Reading RG2 6UJ, United Kingdom) Chapter 11. Compliance...
  • Page 113: Chapter 12. Third-Party License Notices

    Chapter 12. Third-Party License Notices This NVIDIA product contains third party software that is being made available to you under their respective open source software licenses. Some of those licenses also require specific legal information to be included in the product. This section provides such information.
  • Page 114: Mellanox (Ofed)

    NVIDIA DGX B200 User Guide INFORMATION) ARISING OUT OF YOUR USE OF OR INABILITY TO USE THE SOFTWARE, EVEN IF MTI HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. Because some jurisdictions prohibit the exclusion or limitation of liability for consequential or incidental damages, the above limitation may not apply to you.
  • Page 115: Chapter 13. Notices

    NVIDIA accepts no liability related to any default, damage, costs, or problem which may be based on or attributable to: (i) the use of the NVIDIA product in any manner that is contrary to this document or (ii) customer product designs.
  • Page 116: Trademarks

    OTHERWISE WITH RESPECT TO THE MATERIALS, AND EXPRESSLY DISCLAIMS ALL IMPLIED WARRANTIES OF NONINFRINGEMENT, MERCHANTABILITY, AND FITNESS FOR A PARTICULAR PURPOSE. TO THE EXTENT NOT PROHIBITED BY LAW, IN NO EVENT WILL NVIDIA BE LIABLE FOR ANY DAMAGES, INCLUDING WITHOUT LIMITATION ANY DIRECT, INDIRECT, SPECIAL, INCIDENTAL, PUNITIVE, OR CON-...