DGX Station A100
User Guide
DU-10189-001 _v5.0.2 | March 2021

Summary of Contents for NVIDIA DGX Station A100

  • Page 2: Table Of Contents

    1.4. DGX Station A100 Hardware Summary (p. 3)
    Chapter 2. Getting Started with DGX Station A100 (p. 4)
    2.1. Connecting and Powering on the DGX Station A100 (p. 4)
    2.2. Using DGX Station A100 as a Server Without a Monitor (p. 8)
    2.3. Running Workloads on Systems with Mixed Types of GPUs (p. 10)
    2.3.1. Running with Docker Containers (p. 10)
    2.3.2. Running on Bare Metal (p. 11)
    ...
  • Page 3
    5.9.2. Clearing the TPM (p. 35)
    5.10. Changing Disk Passwords, Adding Disks, or Replacing Disks (p. 36)
    5.11. Recovering a Lost Key (p. 36)
    Chapter 6. Unpacking and Repacking the DGX Station A100 (p. 37)
    6.1. Unpacking the DGX Station A100 (p. 37)
    6.2. Repacking the DGX Station A100 for Shipment (p. 40)
    Appendix A. Security (p. 44)
    Appendix ...
  • Page 4
    D.14. United Kingdom (p. 60)
    D.15. United States (p. 61)
    D.16. United States/Canada (p. 62)
    Appendix E. DGX Station A100 Hardware Specifications (p. 63)
    E.1. Environmental Conditions (p. 63)
    E.2. Component Specifications (p. 63)
    E.3. Mechanical Specifications (p. 64)
    E.4. Power Specifications (p. 64)
    Appendix F. Customer Support for the NVIDIA DGX Station™ A100 (p. 65)
  • Page 5: List of Tables

    Table 1. BMC Navigation Controls (p. 18)
    Table 2. Fields to Generate an SSL Certificate (p. 25)
  • Page 6: About This Guide

    Note: The instructions in this guide for software administration apply only to the DGX OS. They do not apply if the DGX OS software that is supplied with the DGX Station A100 has been replaced with the DGX software for Red Hat Enterprise Linux or CentOS.
  • Page 7: Chapter 1. Introduction to the NVIDIA DGX Station™ A100

    AI appliance that you can place anywhere.

    1.1. Registering Your DGX Station A100
    To obtain support for your DGX Station A100, follow the instructions for registration in the Entitlement Certification email that was sent as part of the purchase.
  • Page 8: What's In The Box

    1.3.  DGX OS Software Summary The DGX OS software that is supplied with the DGX Station A100 includes the software that you need for downloading and running containers for deep learning frameworks. The software is already installed on the DGX Station A100, except where licensing requirements mandate that the software be supplied separately.
  • Page 9: DGX Station A100 Hardware Summary

    DGX Station A100 Hardware Summary Processors Component Description Single AMD 7742, 64 cores, and 2.25 GHz (base)–3.4 GHz (max boost) NVIDIA A100 with 80 GB per GPU (320 GB total) of GPU memory System Memory and Storage Unit Total Component Capacity Capacity...
  • Page 10: Chapter 2. Getting Started with DGX Station A100

    Chapter 2. Getting Started with DGX Station A100 This section provides information about how to connect and power on the DGX Station A100. 2.1.  Connecting and Powering on the DGX Station A100 To complete this task you need the following items, which are not supplied with the DGX Station A100: ‣...
  • Page 11 Ubuntu OS configuration, you can configure the DGX Station A100 to use multiple displays. Refer to the NVIDIA DGX OS 5 User Guide for more information. 2. Use either of the two Ethernet ports to connect the DGX Station A100 to your LAN with Internet connectivity.
  • Page 12 Note: Remember the following information: ‣ Connect only one Ethernet port on the DGX Station A100 to the Internet unless you plan to configure the ports manually and disable DHCP on at least one of the ports. ‣ By default, both Ethernet ports on the DGX Station A100 are configured for DHCP. If both ports are connected simultaneously, each port will get its own IP address.
  • Page 13 RESTRICTION: The power source for DGX Station A100 must be 100V and cannot fall below 90V. CAUTION: Remember the following information: ‣ Use only the supplied power cable and do not use this power cable with any other products or for any other purpose.
  • Page 14: Using DGX Station A100 as a Server Without a Monitor

    7. Push the Power button on the front of the unit to power on the DGX Station A100.

    2.2. Using DGX Station A100 as a Server Without a Monitor
    By default, DGX Station A100 is shipped with the DP port automatically selected in the display.
  • Page 15 ‣ If you plan to use DGX Station A100 as a desktop system, use the information in this user guide to get started. You do not need to make changes to the SBIOS. ‣...
  • Page 16: Running Workloads on Systems with Mixed Types of GPUs

    GPU 3: DGX Display (UUID: GPU-91b9d8c8-e2b9-6264-99e0-b47351964c52) GPU 4: Graphics Device (UUID: GPU-e32263f2-ae07-f1db-37dc-17d1169b09bf) A total of five GPUs are listed by nvidia-smi. This is because nvidia-smi is including the DGX Display GPU that is used to drive the monitor and high-quality graphics output.
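
    Because nvidia-smi lists the DGX Display GPU alongside the compute GPUs, a script that should use only the compute GPUs has to skip that entry. The following is a minimal sketch, not a procedure from this guide: the helper name compute_gpu_uuids is hypothetical, and it assumes the listing format shown above, in which the display adapter's name contains "DGX Display". CUDA_VISIBLE_DEVICES accepts GPU UUIDs, so the result can be exported directly.

```shell
# Hypothetical helper: read `nvidia-smi -L` style lines on stdin and print
# the UUIDs of every GPU except the DGX Display adapter, comma-separated.
compute_gpu_uuids() {
  grep -v 'DGX Display' |
    sed -n 's/.*(UUID: \(GPU-[0-9a-fA-F-]*\)).*/\1/p' |
    paste -sd, -
}

# Usage on the system itself (not executed here):
#   CUDA_VISIBLE_DEVICES=$(nvidia-smi -L | compute_gpu_uuids)
#   export CUDA_VISIBLE_DEVICES
```

    Selecting by UUID rather than by index keeps the choice independent of how CUDA orders the devices.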
  • Page 17: Running On Bare Metal

    CUDA_DEVICE_ORDER PCI_BUS_ID overridden. In the following example, a CUDA application that comes with CUDA samples is run. In the output, GPU 0 is the fastest in a DGX Station A100, and GPU 4 (DGX Display GPU) is the slowest:
    sudo apt install cuda-samples-11-2
    lab@ro-dvt-058-80gb:~$ cd /usr/local/cuda-11.2/samples/1_Utilities/p2pBandwidthLatencyTest...
  • Page 18
    lab@ro-dvt-058-80gb:/usr/local/cuda-11.2/samples/bin/x86_64/linux/release $ p2pBandwidthLatencyTest
    [P2P (Peer-to-Peer) GPU Bandwidth Latency Test]
    Device: 0, Graphics Device, pciBusID: 1, pciDeviceID: 0, pciDomainID:0
    Device: 1, Graphics Device, pciBusID: 47, pciDeviceID: 0, pciDomainID:0
    Device: 2, Graphics Device, pciBusID: 81, pciDeviceID: 0, pciDomainID:0...
  • Page 19
    3 185.55 185.32 184.86 1589.52 15.71
    16.26 16.28 16.16 15.69 139.43
    P2P=Disabled Latency Matrix (us)
    3.53 21.60 22.22 21.38 12.46
    21.61 2.62 21.55 21.65 12.34
    21.57 21.54 2.61 21.55 12.40
    21.57 21.54 21.58 2.51 13.00...
  • Page 20
    ***NOTE: In case a device doesn't have P2P access to other one, it falls back to normal memcopy procedure. So you can see lesser Bandwidth (GB/s) and unstable Latency (us) in those cases.
    P2P Connectivity Matrix
    Unidirectional P2P=Disabled Bandwidth Matrix (GB/s)
    0 1324.15...
  • Page 21: Using Multi-Instance GPUs

    2.3.3.  Using Multi-Instance GPUs Multi-Instance GPUs (MIG) is a technology that is available on NVIDIA A100 GPUs. If MIG is enabled on the GPUs and if the GPUs have been partitioned already, then applications can be limited to run on these devices.
  • Page 22: Completing the Initial Ubuntu OS Configuration

    Completing the Initial Ubuntu OS Configuration When you power on the DGX Station A100 for the first time, you are prompted to accept end user license agreements for NVIDIA software. You are then guided through the process for completing the initial Ubuntu OS configuration.
  • Page 23: Chapter 3. Using the BMC

        The DGX Station A100 features two display devices, the DGX Display Adapter and the BMC Display Adapter. Depending on the BIOS settings you can direct the OS X display to either adapter. These settings can be adjusted to use one of the three modes listed before from the BIOS.
  • Page 24: Understanding the BMC Controls

    Configures the X-windowing system to exclusively use the on-board BMC Display Adapter. If for any reason BMC Display Adapter is not present, it will fall back to use NVIDIA DGX Display Adapter (if present), otherwise the service will exit without any setup.
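
    The fallback order described above can be sketched as a small decision function. This only illustrates the stated logic and is not the service's actual code; the function name and its yes/no arguments are hypothetical.

```shell
# Illustration only: pick the display adapter for the BMC-only mode.
# $1 = "yes" if the BMC Display Adapter is present,
# $2 = "yes" if the NVIDIA DGX Display Adapter is present.
select_display_adapter() {
  if [ "$1" = "yes" ]; then
    echo "bmc"
  elif [ "$2" = "yes" ]; then
    echo "dgx-display"   # fallback when the BMC adapter is missing
  else
    echo "none"          # the service exits without any setup
  fi
}

select_display_adapter no yes
```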
  • Page 25: Configuring a Static IP Address for the BMC

    Here is some information about how to set a static IP address for the BMC from the Ubuntu command line. Note: If you cannot access the DGX Station A100 remotely, connect a display (1440x900 or lower resolution) and keyboard directly to the DGX Station A100.
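
    The exact commands for this procedure are cut off in this summary. As a hedged sketch, static BMC addressing is commonly configured with ipmitool's "lan set" subcommands; the helper below only prints the commands so that they can be reviewed before anything is run. The function name, the LAN channel number 1, and the example addresses are assumptions, not values from this guide.

```shell
# Hypothetical dry-run helper: print (do not run) the ipmitool commands that
# would assign a static address to the BMC on the given LAN channel.
# Usage: bmc_static_ip_cmds CHANNEL IP NETMASK GATEWAY
bmc_static_ip_cmds() {
  echo "sudo ipmitool lan set $1 ipsrc static"
  echo "sudo ipmitool lan set $1 ipaddr $2"
  echo "sudo ipmitool lan set $1 netmask $3"
  echo "sudo ipmitool lan set $1 defgw ipaddr $4"
}

# Example with placeholder addresses (channel 1 is an assumption):
bmc_static_ip_cmds 1 192.0.2.10 255.255.255.0 192.0.2.1
```

    Running "sudo ipmitool lan print 1" afterwards shows the settings the BMC actually holds for that channel.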
  • Page 26: Configuring a BMC Static IP Address Using the System BIOS

    This section describes how to set a static IP address for the BMC when you cannot remotely access the DGX Station A100. This process involves setting the BMC IP address during system boot. 1. Connect a keyboard and display (1440 x 900 maximum resolution) to the DGX Station A100 and power it on.
  • Page 27: Changing Your Default BMC Password

    Administrator users revert to the following default passwords:
    ‣ admin (for the admin role)
    ‣ superuser (for the Administrator role)
    All other users on the system will be deleted.
  • Page 28: Logging In After Entering An Incorrect Password

    Configuring the BMC Login Credentials
    To add or remove users from the BMC:
    1. Log in to the BMC.
    2. In the left navigation pane, click Settings.
    3. Click the User Management card.
  • Page 29: Using The Remote Control

    To set up Active Directory or LDAP/E-Directory in the BMC:
    1. In the left navigation pane, click Settings.
    2. Click External User Services.
    3. Click one of the following options and follow the instructions:
  • Page 30: Configuring Platform Event Filters

    On the SSL Setting page, click View SSL Certificate. The View SSL Certificate page displays the following basic information about the uploaded SSL certificate:
    ‣ Certificate Version, Serial Number, Algorithm, and Public Key
    ‣ Issuer information
  • Page 31: 3.7.5.2.  Generating the SSL Certificate

    City or Locality (L) (Mandatory) City or Locality of the organization
    ‣ Maximum length of 64 alpha-numeric characters.
    ‣ The special characters, are not allowed.
    ‣ Maximum length of 64 alpha-numeric characters.
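
    The length rule for a mandatory field such as City or Locality can be checked before the form is submitted. A minimal sketch: the function name is hypothetical, and because the list of disallowed special characters is missing from this summary, the check covers only "not empty" and "at most 64 characters".

```shell
# Hypothetical pre-check for a mandatory certificate field.
# Succeeds (exit status 0) when the value is non-empty and at most 64
# characters long; it does not validate which characters are used.
valid_cert_field() {
  [ -n "$1" ] && [ "${#1}" -le 64 ]
}

if valid_cert_field "Munich"; then
  echo "field length OK"
fi
```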
  • Page 32: 3.7.5.3.  Uploading the SSL Certificate

    2. Click the New Certificate folder icon, browse to locate the appropriate file, and select it.
    3. Click the New Private Key folder icon, browse to locate the appropriate file, and select it.
    4. Click Save.
  • Page 33: Chapter 4. Enable MIG Mode in DGX Station A100

    Station A100 Here is some information about how you can enable the Multi-Instance GPU (MIG) mode. 1. By default, MIG mode is not enabled on the DGX Station A100. For example, when you run , the output shows that MIG mode is disabled:...
  • Page 34 00000000:07:00.0 is currently being used by one or more other processes (e.g. CUDA application or a monitoring application such as another instance of nvidia-smi). Please first kill all processes using the device and retry the command or reboot the system to make MIG mode effective.
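
    Whether MIG mode has actually taken effect can be checked per GPU. The sketch below is not from this guide: it assumes the CSV output of "nvidia-smi --query-gpu=index,mig.mode.current --format=csv,noheader", and the helper name is hypothetical. It prints the indices of GPUs that do not report Enabled, which also includes GPUs that do not support MIG.

```shell
# Hypothetical helper: read "index, mig.mode.current" CSV rows on stdin and
# print the index of each GPU whose MIG mode is not "Enabled".
gpus_without_mig() {
  awk -F', *' '$2 != "Enabled" { print $1 }'
}

# Usage on the system itself (not executed here):
#   nvidia-smi --query-gpu=index,mig.mode.current --format=csv,noheader | gpus_without_mig
```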
  • Page 35: Chapter 5. Managing Self-Encrypting Drives on DGX Station A100

    The DGX OS software supports the ability to manage self-encrypting drives (SEDs), including setting an Authentication Key to lock and unlock DGX Station A100 system drives. You can manage only SED data drives, and the software cannot be used to manage OS drives, even if the drives are SED-capable.
  • Page 36: Installing the nv-disk-encrypt Package

    ‣ generates random salt values (stored in /etc/nv-disk-encrypt/.dgxenc.salt) for each drive password. NVIDIA strongly recommends using this option for best security; otherwise, the software will use a default salt value instead of a randomly generated one.
  • Page 37: Enabling Drive Locking

    This avoids the need to create a JSON file or to enter passwords one by one during the initialization.

    5.4. Enabling Drive Locking
    Here is some information about how to enable drive locking.
  • Page 38: 5.5.1.2.  Creating the Drive/Password Mapping JSON File

    The following example output shows drives that cannot be used for encryption. SED capable = and Boot disk = , or SED capable = Disk(s) that cannot be used for encryption +------+------+------+------------------------------------------------------------- | Name...
  • Page 39: Example 2: Generating Random Passwords

    2. Initialize the system and enable locking. The following commands assume you have placed the JSON file in the /tmp directory:
    sudo nv-disk-encrypt init -f /tmp/<your-file>.json -g
    sudo nv-disk-encrypt lock
    3. When prompted, enter a password for the vault.
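
    Because initialization depends on the mapping file, it can help to confirm that the file is in place before running the commands in the step above. This is a minimal sketch under stated assumptions: the wrapper name sed_init_cmds is hypothetical, the file path is a placeholder, and the function only prints the two commands from this guide for review rather than running them.

```shell
# Hypothetical wrapper: verify that the mapping file exists and is not empty,
# then print the two commands from the step above (nothing is executed).
sed_init_cmds() {
  if [ ! -s "$1" ]; then
    echo "mapping file missing or empty: $1" >&2
    return 1
  fi
  echo "sudo nv-disk-encrypt init -f $1 -g"
  echo "sudo nv-disk-encrypt lock"
}

# Usage (the path is a placeholder):
#   sed_init_cmds /tmp/<your-file>.json
```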
  • Page 40: Exporting The Vault

    RAID array cachefilesd CAUTION: When you complete this task, all data will be lost. On DGX Station A100 systems, these drives generally form a RAID 0 array, which will also be destroyed when you perform an erase.
  • Page 41: Configuring Trusted Computing

    5.9.  Configuring Trusted Computing This section provides information about how to configure trusted computing. The DGX Station A100 system BIOS provides setup controls to configure the following Trusted Computing (TC) features: ‣ Trusted Platform Module DGX Station A100 incorporates Trusted Platform Module 2.0 (TPM 2.0), which can be...
  • Page 42: Changing Disk Passwords, Adding Disks, Or Replacing Disks

    5.11.  Recovering a Lost Key NVIDIA recommends that you back up your keys and store the keys in a secure location. If you lose the key that was used to initialize and lock your drives, you cannot unlock the drive.
  • Page 43: Chapter 6. Unpacking and Repacking the DGX Station A100

    After you receive your DGX Station A100, carefully unpack it. CAUTION: The DGX Station A100 weighs 91 lbs (43.1 kg). Do not attempt to lift the DGX Station A100. Instead, move it into position by rolling it on its fitted casters.
  • Page 44 2. Disengage and remove the packing clasps from the cutouts in the shipping carton. To prevent the clasps from becoming jammed inside the shipping carton, do not use excessive force when removing them.
  • Page 45 3. Raise the top cover of the shipping carton.     4. Fold down the ramp at the front of the bottom tray of the DGX Station A100 shipping carton and remove the packing material from the top.    ...
  • Page 46: Repacking the DGX Station A100 for Shipment

    6. Using the ramp, carefully roll the DGX Station A100 off of the packaging.

    6.2. Repacking the DGX Station A100 for...
  • Page 47 2. Swing the door closed. 3. Fold up the ramp at the front of the bottom tray of the DGX Station A100 shipping carton and place the packing material on top.
  • Page 48 4. Lower the top cover of the shipping carton. 5. Re-engage and insert the packing clasps into the cutouts in the shipping carton. To prevent the clasps from becoming jammed inside the shipping carton, do not use excessive force when inserting them.
  • Page 49 6. Close the top flap and place the Accessory box and the power cord box.
  • Page 50: Appendix A. Security

    After you complete the initial configuration of the system, and the BMC username and password have been entered, complete the following steps: 1. Remove the yellow sticker from the rear IO panel of the DGX Station A100. 2. Remove the dust cover to expose the Ethernet RJ45 port.
  • Page 51: Appendix B. Safety

    To reduce the risk of bodily injury, electrical shock, fire, and equipment damage, read this document and observe all warnings and precautions in this guide before installing or maintaining your product. NVIDIA products are designed to operate safely when installed and used according to the product instructions and general safety practices. The guidelines included in this document explain the potential risks associated with computer operation and provide important safety practices designed to minimize these risks.
  • Page 52: B.2. General Precautions

    Follow all cautions and instructions marked on the equipment. Do not attempt to defeat safety interlocks (where provided). ‣ Operate the DGX Station A100 in a place where the temperature is always in the range 10°C to 35°C (50°F to 95°F). B.3. ...
  • Page 53: B.4. Communications Cable Precautions

    Do not connect communications cables during an electrical storm. There may be a risk of electric shock from lightning. ‣ Do not connect or use communications cables in a wet location. ‣ Disconnect the communications cables before opening a product enclosure, or touching or installing internal components.
  • Page 54: B.5.  Other Hazards

    Nickel The decorative metal foam on the DGX Station A100 casework contains some nickel. The metal foam is not intended for direct and prolonged skin contact. While nickel exposure is unlikely to be a problem, you should be aware of the possibility in case you’re susceptible to nickel-related reactions.
  • Page 55: Appendix C. Connections, Controls, And Indicators

    Appendix C. Connections, Controls, and Indicators
    C.1. Front-Panel Connections and Controls
    Power Button: Press to turn the DGX Station A100 on or off.
  • Page 56: C.2. Rear-Panel Connections and Controls

    AC Input: Power supply input
    Reset Button: Press to reboot the system without turning off the system power
    DisplayPort: Ports for connecting up to 3 displays
    Power Supply Switch: Turn the power supply on and off
  • Page 57: C.3. LAN Port Indicators

    C.3.  LAN Port Indicators LEDs on each Ethernet LAN port indicate the connection status as illustrated in the following figure and described in the following tables.
  • Page 58
    Speed LED (Status: Description)
    (status not shown): 100 Mbps connection
    Orange: 1 Gbps connection
    Green: 10 Gbps connection
    Activity/Link LED (Status: Description)
    (status not shown): No link
    Green: Linked
    Green (blinking): Data activity
  • Page 59: Appendix D. Compliance

    Appendix D. Compliance
    The NVIDIA DGX Station™ A100 is compliant with the regulations listed in this section.
    D.1. DGX Station A100 Model Number
    Model: P3487
    D.2. Australia/New Zealand
    Australian Communications and Media Authority (RCM) This product meets the applicable EMC requirements for Class A, I.T.E equipment
    D.3. ...
  • Page 60: D.4.  Canada

    The Class A digital apparatus meets all requirements of the Canadian Interference-Causing Equipment Regulation. Cet appareil numérique de la classe A respecte toutes les exigences du Règlement sur le matériel brouilleur du Canada. D.5.  China RoHS Material Content
  • Page 61: D.6.  European Union

    ‣ ErP Directive (2009/125/EC) for European Ecodesign. A copy of the Declaration of Conformity to the essential requirements may be obtained directly from NVIDIA GmbH ("Bavaria Towers – Blue Tower, Einsteinstrasse 172, D-81677 Munich Germany")
  • Page 62: D.7.  India

    It does not contain lead, mercury, hexavalent chromium, polybrominated biphenyls or polybrominated diphenyl ethers in concentrations exceeding 0.1 weight % and 0.01 weight % for cadmium, except for where allowed pursuant to the exemptions set in Schedule 2 of the Rule.”
  • Page 63: D.8.  Japan

    D.8.  Japan Voluntary Control Council for Interference (VCCI) Japan RoHS Material Content Declaration
  • Page 64: D.9.  Mexico

    South African Bureau of Standards (SABS) This device complies with the following SABS Standards: SANS 2332: 2017/CISPR 32:2015 SANS 2335:2018/ CISPR 35:2016 National Regulator for Compulsory Specifications (NRCS) This device complies with the following standard under VC 8055
  • Page 65: D.12.  South Korea

    SANS IEC 60950-1 D.12.  South Korea Korean Certification (KC) D.13.  Taiwan Bureau of Standards, Metrology & Inspection (BSMI)
  • Page 66: D.14.  United Kingdom

    D.14.  United Kingdom UK Conformity Assessed (UKCA)
  • Page 67: D.15.  United States

    Operation of this equipment in a residential area is likely to cause harmful interference in which case the user will be required to correct the interference at his own expense.
  • Page 68: D.16. United States/Canada

    D.16.  United States/Canada TUV Rheinland Energy Star Energy Star qualified server
  • Page 69: Appendix E. DGX Station A100 Hardware Specifications

    Appendix E. DGX Station A100 Hardware Specifications
    E.1. Environmental Conditions
    Condition | Operating Range | Nonoperating Range
    Ambient temperature | 10°C to 35°C (50°F to 95°F) | 5°C to 40°C (41°F to 104°F)
    Relative humidity | 10% to 80% (non-condensing) | 10% to 80% (non-condensing)
    E.2. Component Specifications...
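
    The Celsius and Fahrenheit figures in the environmental conditions under E.1 follow the usual conversion F = C × 9/5 + 32. The sketch below reproduces that arithmetic and checks a reading against the stated operating range of 10°C to 35°C. The function names are hypothetical, and the helpers use whole degrees only because shell arithmetic is integer-based.

```shell
# Convert whole degrees Celsius to Fahrenheit using integer arithmetic.
# Exact for multiples of 5; other inputs are truncated toward zero.
c_to_f() {
  echo $(( $1 * 9 / 5 + 32 ))
}

# Succeeds when a whole-degree Celsius reading lies inside the operating
# range stated in E.1 (10 to 35 degrees C, inclusive).
in_operating_range() {
  [ "$1" -ge 10 ] && [ "$1" -le 35 ]
}

c_to_f 35   # prints 95
```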
  • Page 70: E.3. Mechanical Specifications

    20.4” (518 mm) Gross weight 91 lbs (43.1 kg)
    E.4. Power Specifications
    RESTRICTION: The power source for DGX Station A100 must be 100V and cannot fall below 90V.
    Input: 100-115VAC/15A, 115-120VAC/12A, ...
    Comments: The DGX Station A100 power consumption can reach 1,500 W (ambient temperature 30°C) with all system resources under a heavy load.
  • Page 71: Appendix F. Customer Support for the NVIDIA DGX Station™ A100

    Contact NVIDIA Enterprise Support for assistance in reporting, troubleshooting, or diagnosing problems with your DGX Station A100 system. Also contact NVIDIA Enterprise Support for assistance in installing or moving the DGX Station A100 system. For details on how to obtain support, visit the NVIDIA Enterprise Support web site (https://www.nvidia.com/en-us/support/enterprise/).
  • Page 72 NVIDIA accepts no liability related to any default, damage, costs, or problem which may be based on or attributable to: (i) the use of the NVIDIA product in any manner that is contrary to this document or (ii) customer product designs.
