Nvidia DGX Station User Manual

DGX STATION
DU-08255-001 _v2.1 | May 2018
User Guide

Summary of Contents for Nvidia DGX Station

  • Page 1 DGX STATION DU-08255-001 _v2.1 | May 2018 User Guide...
  • Page 2: Table Of Contents

    2.8. Enabling Multiple Users to Access the DGX Station Remotely (p. 15)
    2.9. Preparing the DGX Station for Use with Docker (p. 15)
    2.9.1. Enabling Users To Run Docker Containers (p. 15)
    2.9.2. Preventing IP Address Conflicts Between Docker and the DGX Station (p. 16)
    Chapter 3. Updating DGX Station Software (p. 18)
    3.1. Updating DGX Station Software from the Details Window (p. 18)
    3.2. Updating DGX Station Software from the Command Line...
  • Page 3

    4.7.3. Verifying the Bootable Installation Medium (p. 45)
    4.7.3.1. Verifying a Bootable USB Flash Drive (p. 45)
    4.7.3.2. Verifying a Bootable DVD-ROM (p. 46)
    4.7.4. Installing the DGX Station Software Image from a USB Flash Drive or DVD-ROM (p. 47)
    4.8. Updating the DGX Station System BIOS (p. 48)
    4.9. Maintaining the GPU Liquid Cooling System (p. 49)
    4.9.1. Monitoring GPU Temperatures...
  • Page 4

    C.16. United States/Canada (p. 74)
    C.17. Vietnam (p. 74)
    Appendix D. DGX Station Hardware Specifications (p. 75)
    D.1. Environmental Conditions (p. 75)
    D.2. Component Specifications (p. 75)
    D.3. Mechanical Specifications (p. 76)
    D.4. Power Specifications (p. 76)

    www.nvidia.com DGX Station DU-08255-001 _v2.1 | iv...
  • Page 5: About This Guide

    For details about the DGX OS Desktop software for the DGX Station, refer to DGX OS Desktop Release Notes. For information about how to use the DGX Station to download and run containers for deep learning frameworks, refer to DGX Container Registry User Guide.
  • Page 7: Chapter 1. Introduction To The Nvidia® Dgx Station

    AI analytics. You can use the DGX Station to run neural networks, and deploy deep learning models. Because the DGX Station is software compatible with the NVIDIA DGX-1 server, you can also use the DGX Station to optimize applications to run on a production DGX-1 cluster.
  • Page 8: What's In The Box

    1.2. DGX OS Desktop Software Summary The DGX OS Desktop software that is supplied with the DGX Station includes the software that you need for downloading and running containers for deep learning frameworks. The software is already installed on the DGX Station, except where licensing requirements mandate that the software be supplied separately.
  • Page 9 Introduction to the NVIDIA® DGX Station™

    System Memory and Storage

    Component      Unit Capacity  Total Capacity  Description
    System memory  32 GB          256 GB          ECC Registered LRDIMM DDR4 SDRAM
    Data storage   1.92 TB        5.76 TB         2.5" 6 Gb/s SATA III SSD in RAID 0 configuration
    OS storage     1.92 TB...
  • Page 10: Chapter 2. Setting Up The Nvidia Dgx Station

    2.1. Siting the DGX Station Caution The DGX Station weighs 88 lbs (40 kg). Do not attempt to lift the DGX Station. Instead, remove the DGX Station from its packaging and move it into position by rolling it on its fitted casters.
  • Page 11: Removing Or Replacing The Packing Inside The Dgx Station

    The power cable, all communications cables, and any peripheral devices such as displays and keyboards are disconnected from the DGX Station. Push the button on the right side of the DGX Station back panel to release the side panel on the right of the DGX Station when viewed from the rear.
  • Page 12 ‣ To replace the foam packing piece, gently push it into position around the GPU cards inside the DGX Station. Align the bottom edge of the side panel with the bottom edge of the DGX Station.
  • Page 13: Connecting And Powering On The Dgx Station

    Display with power cable and connector cable terminated in a DisplayPort connector or HDMI connector If your display connector cable is terminated in an HDMI connector, you can use one of the supplied adapters to connect the cable to the DGX Station. ‣ USB keyboard ‣...
  • Page 14 Configuring the DGX Station To Use Multiple Displays. Use either of the two Ethernet ports to connect the DGX Station to your LAN with Internet connectivity. Connect only one Ethernet port on the DGX Station to the Internet unless you plan to configure the ports manually and disable DHCP on at least one of the ports.
  • Page 15 then alternate between these addresses, causing the OS and applications to malfunction. Make sure that the power supply rocker switch is in the OFF position. (Figures: current units; earlier units.) Connect the supplied power cable from the power socket at the back of the unit to an appropriately rated, grounded AC outlet.
  • Page 16 Connect the display to a suitable AC outlet and power on the display. Move the DGX Station power supply rocker switch to the ON position.
  • Page 17 (Figures: current units; earlier units.) Push the Power button on the front of the unit to power on the DGX Station.
  • Page 18: Completing The Initial Ubuntu Os Configuration

    2.4. Completing the Initial Ubuntu OS Configuration When you power on the DGX Station for the first time, you are prompted to accept end user license agreements for NVIDIA software. You are then guided through the process for completing the initial Ubuntu OS configuration. As part of this process, you are prompted to create your user name and password for logging in to the DGX Station.
  • Page 19: Registering Your Dgx Station

    DisplayPort connectors, enabling you to connect up to three displays to the DGX Station. If you want to use more than one display with the DGX Station, configure it to use multiple displays after you complete the initial Ubuntu OS configuration.
  • Page 20 High-resolution displays consume a large quantity of GPU memory. If you have connected three 4K displays to the DGX Station, they may consume most of the GPU memory on the NVIDIA Tesla V100 GPU card to which they are connected, especially if you are running graphics-intensive applications.
  • Page 21: Enabling Multiple Users To Access The Dgx Station Remotely

    To enable multiple users to access the DGX Station remotely, secure shell (SSH) server is installed and enabled on the DGX Station. Add other Ubuntu OS users to the DGX Station to allow them to log in remotely to the DGX Station through SSH.
  • Page 22: Preventing Ip Address Conflicts Between Docker And The Dgx Station

    and who are aware of the potential risks to the DGX Station of running commands with sudo privileges are able to run Docker containers. Before allowing multiple users to run commands with sudo privileges, consult your IT department to determine whether you would be violating your organization's security policies.
  • Page 23

    container-ip-address-range
        The container IP address range to be used by Docker containers, for example, 192.168.127.128/25.

    This example shows a complete /etc/systemd/system/docker.service.d/docker-override.conf file that has been edited to specify the bridge IP address range and container IP address range to be used by Docker containers.
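Before editing the override file, it can help to confirm that the chosen ranges are consistent. The following is a minimal sketch using Python's standard `ipaddress` module. The container range 192.168.127.128/25 comes from the guide; the bridge address 192.168.127.1/24 and both LAN ranges are assumptions chosen for illustration, and `check_docker_ranges` is a hypothetical helper, not part of the DGX software.

```python
import ipaddress


def check_docker_ranges(bridge_cidr, container_cidr, lan_cidr):
    """Return a list of problems; an empty list means the ranges look consistent."""
    # The bridge value is an address plus prefix length, e.g. 192.168.127.1/24.
    bridge = ipaddress.ip_interface(bridge_cidr)
    containers = ipaddress.ip_network(container_cidr)
    lan = ipaddress.ip_network(lan_cidr)

    problems = []
    # The container range must be a subset of the bridge network.
    if not containers.subnet_of(bridge.network):
        problems.append("container range is not inside the bridge network")
    # The bridge network must not collide with addresses used on the LAN.
    if bridge.network.overlaps(lan):
        problems.append("bridge network overlaps the LAN")
    return problems


# Assumed values: only the container range appears in the guide.
print(check_docker_ranges("192.168.127.1/24", "192.168.127.128/25", "10.0.0.0/8"))
print(check_docker_ranges("192.168.127.1/24", "192.168.127.128/25", "192.168.0.0/16"))
```

The first call models a LAN that does not collide with the Docker bridge; the second models the kind of conflict this section is meant to prevent.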
  • Page 24: Chapter 3. Updating Dgx Station Software

    Ensure that you are logged in to your Ubuntu desktop on the DGX Station as an administrator user. From the Ubuntu system menu at the top right of the desktop, choose About This Computer.
  • Page 25 In the Details window, click Install Updates. In the Software Updater window that opens, review the available updates and click Install Now.
  • Page 26 If the list contains only packages that you want to remove, click Start Upgrade. When prompted to authenticate, type your password into the Password field and click Authenticate. If necessary, restart your DGX Station when prompted to complete the updates.
  • Page 27: Updating Dgx Station Software From The Command Line

    (http://manpages.ubuntu.com/manpages/xenial/en/man8/apt.8.html) command to update DGX Station software from the command line. Ensure that you are logged in to your Ubuntu desktop on the DGX Station as an administrator user. Download information from all configured sources about the latest versions of the packages.
  • Page 28: Updates To The Ubuntu Software On The Dgx Station

    The repository maintained by NVIDIA is enabled by default in Ubuntu Software & Updates, Other Software on the DGX Station, as shown in the following screen capture. Although a Docker repository is also enabled, DGX Station no longer uses this repository to obtain updates to Docker because the repository maintained by NVIDIA takes precedence over the Docker repository.
  • Page 29: Checking For Updates To Dgx Station Software

    By default, the DGX Station does not notify you of available updates or automatically install any updates, including important security updates. To minimize the risk to your DGX Station from security vulnerabilities, you must ensure that it is kept up to date with the latest important security updates.
  • Page 30: Getting Release Information For Dgx Station

    The version number and update date of each over-the-network update applied since the software was last installed from an ISO image You can use this information to determine if your DGX Station is running the current version of the DGX OS Desktop software.
  • Page 31: Updating Software On An Air-Gapped Dgx Station System

    Debian Repository Setup (https://wiki.debian.org/DebianRepository/Setup) on the Debian wiki. Update the sources that provide updates to the DGX Station to use your private repository instead of the public repositories. You can update these sources by modifying the /etc/apt/sources.list file and the contents of the /etc/apt/sources.list.d/ directory, or by using System Settings, Software &...
  • Page 32 Internet connection to the air-gapped system. On a system with an Internet connection, log in to the NVIDIA DGX Container Registry and load the container image that you want.
  • Page 33: Chapter 4. Maintaining And Servicing The Nvidia Dgx Station

    To prevent dust from entering the DGX Station through the ventilation holes under the unit, a mesh filter is fitted to the underside of the DGX Station. Clean this mesh filter periodically to prevent the accumulation of dust on the filter from impeding the flow of air through the DGX Station.
  • Page 34: Collecting Information For Troubleshooting The Dgx Station

    Use compressed air to blow the dust from the mesh filter. Line up the mesh filter with the runners under the DGX Station and slide it back into position under the unit. 4.3. Collecting Information for Troubleshooting...
  • Page 35: Checking The Health Of The Dgx Station

    For DGX OS Desktop releases 3.1.1 through 3.1.3, the file name is sys-info-timestamp.random-number.out Use any method that is convenient for you to send the file to NVIDIA Support Enterprise Services. For example, send the file as an e-mail attachment. 4.4. Checking the Health of the DGX Station The DGX Station provides the NVIDIA System Health Checker (nvhealth) tool to exercise the system and verify its health.
  • Page 36: Replacing The System

    Caution The DGX Station weighs 88 lbs (40 kg). Do not attempt to lift the DGX Station. Instead, move it into position by rolling it on its fitted casters.
  • Page 37 Roll the DGX Station up the ramp into the bottom tray of its shipping carton. Caution Ensure that you have a second person to help you roll the DGX Station into position. Insert the front packing piece into the tray, ensuring that the lip of the packing piece is under the DGX Station.
  • Page 38 Keep the AC power cable to use with your replacement DGX Station. Place both accessory boxes in the slots in the tray on each side of the DGX Station. Ensure that the lugs that protrude from the edges of each accessory box are facing away from the DGX Station.
  • Page 39: Maintaining The Dgx Station Persistent Storage

    Storage. 4.6.1. Changing the RAID Level of the RAID Array As supplied from the factory, the RAID level of the DGX Station RAID array is RAID 0. RAID 0 provides the maximum storage capacity, but does not provide any redundancy.
  • Page 40: Checking The Status Of The Dgx Station Raid Array

    RAID 0 to RAID 5, the total storage capacity of the RAID array is reduced from 5.76 TB to 3.84 TB. Before changing the RAID level of the DGX Station RAID array, back up all data on the array that you want to preserve. Changing the RAID level of the DGX Station RAID array erases all data stored on the array.
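The capacity figures above follow from simple arithmetic. This sketch assumes three data SSDs of 1.92 TB each, which is inferred from the 1.92 TB unit capacity and 5.76 TB total capacity in the storage table; `raid_capacity_gb` is a hypothetical helper written for this illustration. Capacities are kept as whole gigabytes to avoid floating-point rounding.

```python
def raid_capacity_gb(level, drive_count, drive_gb):
    """Usable capacity in GB for equal-sized drives at RAID level 0 or 5."""
    if level == 0:
        # Striping only: every drive contributes its full capacity.
        return drive_count * drive_gb
    if level == 5:
        if drive_count < 3:
            raise ValueError("RAID 5 needs at least three drives")
        # One drive's worth of space is used for parity.
        return (drive_count - 1) * drive_gb
    raise ValueError("only RAID 0 and RAID 5 are handled in this sketch")


DRIVES = 3        # assumption inferred from 5.76 TB / 1.92 TB
DRIVE_GB = 1920   # 1.92 TB per SSD, from the storage table

print(raid_capacity_gb(0, DRIVES, DRIVE_GB))  # RAID 0 capacity in GB
print(raid_capacity_gb(5, DRIVES, DRIVE_GB))  # RAID 5 capacity in GB
```

Under these assumptions the results are 5760 GB and 3840 GB, matching the 5.76 TB and 3.84 TB stated in the guide.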
  • Page 41: Checking The Status Of The Dgx Station Ssds

    4.6.3. Checking the Status of the DGX Station SSDs LEDs on the DGX Station SSDs indicate the status of the SSDs. The SSDs are mounted inside the DGX Station and are visible only when the side panel that covers the SSDs is removed.
  • Page 42: Replacing An Ssd

    The SSD has failed and must be replaced. Replace the side panel of the DGX Station. a) Align the bottom edge of the side panel with the bottom edge of the DGX Station. b) Firmly push the panel back into place to re-engage the latches.
  • Page 43 Pull the drive-tray latch upwards to unseat the drive tray. Slide the drive tray upwards to completely remove it from the unit.
  • Page 44 Using a Phillips screwdriver, remove the four screws attaching the SSD to the drive tray. Save the screws for the replacement SSD. Slide the SSD out of the drive tray. Slide the replacement SSD into the drive tray.
  • Page 45 Replace the side panel of the DGX Station. a) Align the bottom edge of the side panel with the bottom edge of the DGX Station. b) Firmly push the panel back into place to re-engage the latches. What you need to do to return the DGX Station to service depends on whether you replaced an SSD in the RAID array or the OS SSD.
  • Page 46: Rebuilding The Dgx Station Raid Array

    Restoring the DGX Station Software Image. 4.6.5. Rebuilding the DGX Station RAID Array If the DGX Station RAID array is degraded because an SSD failed, replace the SSD as explained in Replacing an SSD. After replacing a failed SSD in the RAID array, you must rebuild the array to add the new SSD to a RAID 0 array or to regenerate the lost data on the new SSD in a RAID 5 array.
  • Page 47: Restoring The Dgx Station Software Image

    4.7. Restoring the DGX Station Software Image If the DGX Station software image becomes corrupted or the OS SSD was replaced after a failure, restore the DGX Station software image to its original factory condition from a pristine copy of the image.
  • Page 48: Creating A Bootable Installation Medium

    USB flash drive that contains the DGX Station software image. Ensure that the following prerequisites are met: ‣ The correct DGX Station software image is saved to your local disk. For more information, see Obtaining the DGX Station Software ISO Image and Checksum File.
  • Page 49: Creating A Bootable Usb Flash Drive By Using Akeo Rufus

    If the DGX Station software image file is not listed, click Other and in the window that opens, navigate to the file, select the file, and click Open. From the Disk to use list, select the USB flash drive and click Make Startup Disk.
  • Page 50 Select the Create a bootable disk using option and from the dropdown menu, select ISO image. Click the optical drive icon and open the DGX Station software ISO image. Click Start. Because the image is a hybrid ISO file, you are prompted to select whether to write the image in ISO Image (file copy) mode or DD Image (disk image) mode.
  • Page 51: Verifying The Bootable Installation Medium

    $ lsblk

    You can identify the USB flash drive from its size, which is much smaller than the size of the SSDs in the DGX Station, and from the mount points of any partitions on the drive, which are under /media.
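The identification rule described here can also be applied programmatically. The sketch below parses the JSON form of the `lsblk` listing (`lsblk --json --bytes --output NAME,SIZE,MOUNTPOINT`) and reports devices that have a partition mounted under /media. The sample data, including the device names, sizes, and the mount point, is entirely hypothetical, and `find_media_devices` is a helper invented for this illustration.

```python
import json

# Hypothetical output of: lsblk --json --bytes --output NAME,SIZE,MOUNTPOINT
SAMPLE_LSBLK_JSON = """
{"blockdevices": [
  {"name": "sda", "size": "1920383410176", "mountpoint": null,
   "children": [
     {"name": "sda1", "size": "536870912", "mountpoint": "/boot/efi"},
     {"name": "sda2", "size": "1919845490688", "mountpoint": "/"}]},
  {"name": "sde", "size": "15518924800", "mountpoint": null,
   "children": [
     {"name": "sde1", "size": "3459317760", "mountpoint": "/media/user/DGXSTATION"}]}
]}
"""


def find_media_devices(lsblk_json):
    """Return (name, size_in_bytes) for devices with a partition mounted under /media."""
    devices = json.loads(lsblk_json)["blockdevices"]
    candidates = []
    for device in devices:
        partitions = device.get("children", [])
        # A missing or null mount point is treated as an empty string.
        mounts = [part.get("mountpoint") or "" for part in partitions]
        if any(mount.startswith("/media/") for mount in mounts):
            # Sizes may arrive as strings or numbers depending on the lsblk version.
            candidates.append((device["name"], int(device["size"])))
    return candidates


print(find_media_devices(SAMPLE_LSBLK_JSON))
```

Treat the result as a list of candidates only, and compare the reported size with the flash drive's capacity before using any device name.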
  • Page 52: Verifying A Bootable Dvd-Rom

    This example computes the checksum of an image on the USB flash drive with device ID /dev/sde1 using a block size of 1 MB.

    $ sudo dd if=/dev/sde1 bs=1M | cksum
    3299+1 records in
    3299+1 records out
    3459317760 bytes (3.5 GB, 3.2 GiB) copied, 164.369 s, 21.0 MB/s...
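The `3299+1` record counts in the `dd` output can be checked against the byte total. In `dd` output, the number before the plus sign counts full blocks and the number after it counts partial blocks. The sketch below reproduces that split for the figures shown in the example; `dd_records` is a hypothetical helper, and `bs=1M` is taken to mean 1,048,576 bytes.

```python
def dd_records(total_bytes, block_size):
    """Split a byte count into (full_blocks, partial_blocks, leftover_bytes)."""
    full_blocks, leftover = divmod(total_bytes, block_size)
    # dd reports a final short read as one partial record.
    partial_blocks = 1 if leftover else 0
    return full_blocks, partial_blocks, leftover


BLOCK_SIZE = 1024 * 1024   # bs=1M
TOTAL_BYTES = 3459317760   # byte count from the example output

full, partial, leftover = dd_records(TOTAL_BYTES, BLOCK_SIZE)
print(f"{full}+{partial} records, last block holds {leftover} bytes")
```

With these inputs the split is 3299 full blocks plus one partial block of 65,536 bytes, which agrees with the `3299+1` lines in the example.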
  • Page 53: Installing The Dgx Station Software Image From A Usb Flash Drive Or Dvd-Rom

    OS SSD and will be erased. However, if you chose to install the DGX Station software and preserve the RAID array contents, persistent data stored in the RAID array is unaffected.
  • Page 54: Updating The Dgx Station System Bios

    Unplug the USB flash drive or optical drive from the DGX Station. 4.8. Updating the DGX Station System BIOS If you need to update the DGX Station system BIOS, you can obtain the current version of it from NVIDIA Support Enterprise Services.
  • Page 55: Maintaining The Gpu Liquid Cooling System

    Press Enter to start the BIOS update process. Caution To avoid the risk of leaving your DGX Station unable to boot, do not shut down or reset the DGX Station during the BIOS update process.
  • Page 56: Checking The Level Of The Liquid In The Gpu Cooling System

    Remove the side panel on the right of the DGX Station when viewed from the rear. a) Push the button on the right side of the DGX Station back panel to release the panel.
  • Page 57 Lift the panel to remove it. Caution To prevent damage from electrostatic discharge, avoid touching any of the components inside the DGX Station other than any components that you are replacing or servicing. Look at the gauge on the side of the cooling system pump to determine the level of the liquid in the cooling system.
  • Page 58 Replenishing the Liquid in the GPU Cooling System. Replace the side panel of the DGX Station. a) Align the bottom edge of the side panel with the bottom edge of the DGX Station. b) Firmly push the panel back into place to re-engage the latch.
  • Page 59: Replenishing The Liquid In The Gpu Cooling System

    1 bottle of EK-CryoFuel Clear Premix coolant Caution Use only EK-CryoFuel Clear coolant. Do not use any other type of coolant. Use of other types of coolant will void the DGX Station hardware warranty and may cause damage to or impair the performance of the system.
  • Page 60 Replace the filler cap at the top of the pump and use the Torx T20 Allen wrench to tighten the cap until it is finger tight. Do not overtighten the filler cap. Power on the DGX Station and let it run for one minute.
  • Page 61 Check the level of the liquid in the cooling system. Power off the DGX Station. Replace the side panel of the DGX Station. a) Align the bottom edge of the side panel with the bottom edge of the DGX Station.
  • Page 62 b) Firmly push the panel back into place to re-engage the latch.
  • Page 63: Appendix  A.  Safety

    To reduce the risk of bodily injury, electrical shock, fire, and equipment damage, read this document and observe all warnings and precautions in this guide before installing or maintaining your product. NVIDIA products are designed to operate safely when installed and used according to the product instructions and general safety practices.
  • Page 64: A.1. Intended Application Uses

    Follow all cautions and instructions marked on the equipment. Do not attempt to defeat safety interlocks (where provided). ‣ Operate the DGX Station in a place where the temperature is always in the range 10°C to 30°C (50°F to 86°F). A.3. Electrical Precautions...
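The Celsius and Fahrenheit limits quoted for the operating range are the same bounds in two units. As a small worked check, the sketch below converts the Celsius limits and tests whether a reading falls inside the range; the function names and the sample readings are hypothetical.

```python
MIN_OPERATING_C = 10.0  # lower limit from the guide
MAX_OPERATING_C = 30.0  # upper limit from the guide


def celsius_to_fahrenheit(celsius):
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


def within_operating_range(celsius):
    """True if the ambient temperature is inside the stated operating range."""
    return MIN_OPERATING_C <= celsius <= MAX_OPERATING_C


print(celsius_to_fahrenheit(MIN_OPERATING_C), celsius_to_fahrenheit(MAX_OPERATING_C))
print(within_operating_range(22.5), within_operating_range(35.0))
```

The converted limits come out to 50°F and 86°F, matching the values given in the guide.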
  • Page 65: A.4. Communications Cable Precautions

    To reduce the risk of exposure to electrical shock hazards from communications cables: ‣ Do not connect communications cables during an electrical storm. There may be a risk of electric shock from lightning. ‣ Do not connect or use communications cables in a wet location.
  • Page 66: A.5. Other Hazards

    Nickel The decorative metal foam on the DGX Station casework contains some nickel. The metal foam is not intended for direct and prolonged skin contact. While nickel exposure is unlikely to be a problem, you should be aware of the possibility in case you’re susceptible to nickel-related reactions.
  • Page 67: Appendix B. Connections, Controls, And Indicators

    Appendix B. CONNECTIONS, CONTROLS, AND INDICATORS

    B.1. Front-Panel Connections and Controls

    Type          Description
    Power Button  Press to turn the DGX Station on or off

    B.2. Rear-Panel Connections and Controls

    Current Units

    Type            Description
    USB 3.1 Type-C  USB 3.1 Type-C port
    Ethernet        10G LAN ports (see...
  • Page 68 Turn the power supply on and off

    Earlier Units

    Type            Description
    USB 3.1 Type-C  USB 3.1 Type-C port
    Ethernet        10G LAN ports (see LAN Port Indicators):
                    ‣ Lower port: LAN 1
                    ‣ Upper port: LAN 2
  • Page 69: B.3. Lan Port Indicators

    Ports for connecting up to 3 displays
    AC Input  Power supply input

    B.3. LAN Port Indicators

    LEDs on each Ethernet LAN port indicate the connection status as illustrated in the following figure and described in the following tables.
  • Page 70: B.4. Audio I/O Connections

    Mic In | Mic In | Mic In
    Black: Rear Speaker | Rear Speaker | Rear Speaker
    Orange: Center/Subwoofer | Center/Subwoofer
    Light Blue: Line In | Line In | Line In | Side Speaker
    Lime Green: Line Out | Front Speaker | Front Speaker | Front Speaker
  • Page 72: Appendix  C.  Compliance

    Appendix C. COMPLIANCE The NVIDIA DGX Station is compliant with the regulations listed in this section. C.1. DGX Station Model Number Model: P2587 C.2. Argentina S-Mark C.3. Australia/New Zealand
  • Page 73: C.4. Brazil

    CAN ICES-3(A)/NMB-3(A) The Class A digital apparatus meets all requirements of the Canadian Interference-Causing Equipment Regulation. Cet appareil numérique de la classe A respecte toutes les exigences du Règlement sur le matériel brouilleur du Canada.
  • Page 74: C.6. China

    C.6. China RoHS Material Content
  • Page 75: C.7. European Union

    RoHS Directive (2011/65/EU) for hazardous substances. ‣ ErP Directive (2009/125/EC) for European Ecodesign. A copy of the Declaration of Conformity to the essential requirements may be obtained directly from NVIDIA GmbH (Floessergasse 2, 81369 Munich, Germany).
  • Page 76: C.8. India

    C.8. India Self Declaration - Conforming to IS13252:2010, R-41078743 C.9. Israel
  • Page 77: C.10. Japan

    C.10. Japan VCCI C.11. Russia CU-TR C.12. South Africa Compliant with SANS IEC 60950 SABS Compliant with SANS 222 CISPR 22
  • Page 78: C.13. South Korea

    C.13. South Korea C.14. Taiwan BSMI
  • Page 79: C.15. United States

    A digital device, pursuant to part 15 of the FCC Rules. These limits are designed to provide reasonable protection against harmful interference when the equipment is operated in a commercial environment. This equipment generates, uses, and can radiate...
  • Page 80: C.16. United States/Canada

    Operation of this equipment in a residential area is likely to cause harmful interference in which case the user will be required to correct the interference at his own expense. C.16. United States/Canada cULus Listing Mark C.17. Vietnam
  • Page 81: Appendix D. Dgx Station Hardware Specifications

    ... units: NVIDIA Tesla V100-DGXS-32GB, featuring:
    ‣ 4×125 TeraFLOPS (500 TeraFLOPS total), FP16
    ‣ 4×32 GB (128 GB total) GPU memory
    ‣ 4×640 (2,560 total) NVIDIA Tensor Cores
    ‣ 4×5,120 (20,480 total) NVIDIA CUDA® cores

    GPU - earlier units: NVIDIA Tesla V100-DGXS-16GB, featuring:
    ‣...
  • Page 82: D.3. Mechanical Specifications

    Input: 115 - 240 VAC, 12-8A, (50 - 60 Hz)
    Comments: The DGX Station power consumption can reach 1,500 W (ambient temperature 30°C) with all system resources under a heavy load. Be aware of your electrical source’s power capability to avoid overloading the circuit.
  • Page 83 LIFE CRITICAL APPLICATION). NVIDIA EXPRESSLY DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY OF FITNESS FOR SUCH HIGH RISK USES. NVIDIA SHALL NOT BE LIABLE TO CUSTOMER OR ANY THIRD PARTY, IN WHOLE OR IN PART, FOR ANY CLAIMS OR DAMAGES ARISING FROM SUCH HIGH RISK USES.
