3.3. Opting in to DGX OS Desktop Patch Updates .... 25
3.4. Available DGX Station Software Updates .... 26
3.4.1. Updates to Docker and Software Exclusive to the DGX Station .... 27
3.4.2. Updates to the Ubuntu Software on the DGX Station .... 28
3.5. Checking for Updates to DGX Station Software .... 29
3.6. Getting Release Information for DGX Station .... 29
...
4.8.3. Verifying the Bootable Installation Medium .... 67
4.8.3.1. Verifying a Bootable USB Flash Drive .... 67
4.8.3.2. Verifying a Bootable DVD-ROM .... 68
4.8.4. Installing the DGX Station Software Image from a USB Flash Drive or DVD-ROM .... 68
4.9. Updating the DGX Station System BIOS .... 69
4.10. Maintaining the GPU Liquid Cooling System .... 70
4.10.1. Monitoring GPU Temperatures .... 71
...
C.13. South Korea .... 93
C.14. Taiwan .... 93
C.15. United States .... 94
C.16. United States/Canada .... 95
C.17. Vietnam .... 95
Appendix D. DGX Station Hardware Specifications .... 96
D.1. Environmental Conditions .... 96
D.2. Component Specifications .... 96
D.3. Mechanical Specifications .... 97
D.4. Power Specifications .... 97
Appendix E. Customer Support for the NVIDIA DGX Station .... 98
DGX Station DU-08255-001 _v4.6 | iv
Note: The instructions in this guide for software administration apply only to the DGX OS Desktop. They don't apply if the DGX OS Desktop software that is supplied with the DGX Station has been replaced with the DGX software for Red Hat Enterprise Linux or CentOS.
The NVIDIA DGX Station is a fast, multi-GPU workstation for deep learning and AI analytics. You can use the DGX Station to run neural networks and deploy deep learning models. Because the DGX Station is software compatible with the NVIDIA DGX-1 server, you can also use the DGX Station to optimize applications to run on a production DGX-1 cluster.
1.2. DGX OS Desktop Software Summary
The DGX OS Desktop software that is supplied with the DGX Station includes the software that you need for downloading and running containers for deep learning frameworks. The software is already installed on the DGX Station, except where licensing requirements mandate that the software be supplied separately.
GPU - current units: NVIDIA Tesla V100-DGXS-32GB with 32 GB per GPU (128 GB total) of GPU memory
GPU - earlier units: NVIDIA Tesla V100-DGXS-16GB with 16 GB per GPU (64 GB total) of GPU memory
System Memory and Storage Unit Total...
Siting the DGX Station
CAUTION: The DGX Station weighs 88 lbs (40 kg). Do not attempt to lift the DGX Station. Instead, remove the DGX Station from its packaging and move it into position by rolling it on its fitted casters.
The power cable, all communications cables, and any peripheral devices such as displays and keyboards are disconnected from the DGX Station.
1. Push the button on the right side of the DGX Station back panel to release the side panel on the right of the DGX Station when viewed from the rear.
To replace the foam packing piece, gently push it into position around the GPU cards inside the DGX Station.
4. Align the bottom edge of the side panel with the bottom edge of the DGX Station.
Display with power cable and connector cable terminated in a DisplayPort connector or HDMI connector
If your display connector cable is terminated in an HDMI connector, you can use one of the supplied adapters to connect the cable to the DGX Station.
‣ USB keyboard
‣...
Ubuntu OS configuration, you can configure the DGX Station to use multiple displays. For details, see Configuring the DGX Station To Use Multiple Displays.
2. Use either of the two Ethernet ports to connect the DGX Station to your LAN with Internet connectivity.
...
4. Connect the supplied power cable from the power socket at the back of the unit to an appropriately rated, grounded AC outlet. For details of the power consumption, input voltage, and current rating of the DGX Station, see Power Specifications.
5. Connect the display to a suitable AC outlet and power on the display.
6. Move the DGX Station power supply rocker switch to the ON position.
Setting Up the NVIDIA DGX Station
[Figures: current units; earlier units]
7. Push the Power button on the front of the unit to power on the DGX Station.
Note: To protect the DGX Station from unauthorized access, choose a strong password. The strength of the password you choose is indicated as you type it. After the Ubuntu OS configuration is complete, you can log in to the DGX Station to access your Ubuntu desktop.
One of the NVIDIA Tesla V100 GPU cards in the DGX Station provides three DisplayPort connectors, enabling you to connect up to three displays to the DGX Station. If you want to use more than one display with the DGX Station, configure it to use multiple displays after you complete the initial Ubuntu OS configuration.
High-resolution displays consume a large quantity of GPU memory. If you have connected three 4K displays to the DGX Station, they may consume most of the GPU memory on the NVIDIA Tesla V100 GPU card to which they are connected, especially if you are running graphics-intensive applications.
To enable multiple users to access the DGX Station remotely, a secure shell (SSH) server is installed and enabled on the DGX Station. Add other Ubuntu OS users to the DGX Station to allow them to log in remotely through SSH.
Preparing the DGX Station for Use with Docker
Some initial setup of the DGX Station is required to ensure that users have the required privileges to run Docker containers and to prevent IP address conflicts between Docker and the DGX Station.
If addresses within this 172.17.0.0/16 range are already used on the DGX Station network, change the Docker network to specify the bridge IP address range and container IP address range to be used by Docker containers. This task requires sudo privileges.
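Before changing the Docker network, it can help to confirm whether an address on your LAN actually falls inside 172.17.0.0/16. The following is a minimal sketch in POSIX shell, not part of the DGX software; the helper names `ip_to_int` and `in_cidr` and the two sample addresses are hypothetical.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
# The function body runs in a subshell so the IFS change stays local.
ip_to_int() (
  IFS=.
  set -- $1
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Succeed (exit status 0) if address $1 lies inside CIDR block $2.
in_cidr() (
  addr=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")
  bits=${2#*/}
  mask=$(( (4294967295 << (32 - $bits)) & 4294967295 ))
  [ $(( $addr & $mask )) -eq $(( $net & $mask )) ]
)

# Check two hypothetical LAN addresses against the range named above.
for ip in 172.17.40.5 192.168.1.20; do
  if in_cidr "$ip" 172.17.0.0/16; then
    echo "$ip conflicts with 172.17.0.0/16"
  else
    echo "$ip is outside 172.17.0.0/16"
  fi
done
```

If any address that the DGX Station must reach reports a conflict, that is the case in which the bridge and container address ranges need to be changed.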
2.10. Managing CPU Mitigations
DGX OS Desktop includes security updates to mitigate CPU speculative side-channel vulnerabilities. These mitigations can decrease the performance of deep learning and machine learning workloads. If your installation of DGX systems incorporates other measures to mitigate these vulnerabilities, such as measures at the cluster level, you can disable the CPU mitigations for individual DGX nodes and thereby increase performance.
1. Install the nv-mitigations-off package.
   sudo apt install nv-mitigations-off -y
2. Reboot the system.
3. Verify CPU mitigations are disabled.
   cat /sys/devices/system/cpu/vulnerabilities/*
   The output should include several lines. See Determining the CPU Mitigation Vulnerable State of the DGX System for example output.
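The verification step can be made easier to read by labeling each status with the name of the file it came from. The sketch below is an illustration only: `report_mitigations` is a hypothetical helper, and it relies on the Linux kernel convention that each file in this directory contains a status beginning with "Not affected", "Vulnerable", or "Mitigation:".

```shell
# Print "name: status" for each file in a vulnerabilities directory and
# count the entries whose status starts with "Vulnerable".
# The directory defaults to the kernel's sysfs location.
report_mitigations() (
  dir=${1:-/sys/devices/system/cpu/vulnerabilities}
  vulnerable=0
  for f in "$dir"/*; do
    [ -f "$f" ] || continue
    status=$(cat "$f")
    printf '%s: %s\n' "$(basename "$f")" "$status"
    case $status in
      Vulnerable*) vulnerable=$((vulnerable + 1)) ;;
    esac
  done
  echo "entries reporting Vulnerable: $vulnerable"
)

report_mitigations "$@"
```

A count of zero after the reboot would suggest that the mitigations are still active, assuming the CPU is affected by at least one of the listed vulnerabilities.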
For details about the available updates, see Available DGX Station Software Updates. These updates may contain important security updates. To protect your DGX Station, keep your system up to date with the latest important security updates. For information about security updates for the Ubuntu OS, see Ubuntu Security Notices (https://usn.ubuntu.com/).
Major Release from the Software Updater Application
Use the Software Updater application to upgrade DGX Station in the same major release. Ensure that you are logged in to your Ubuntu desktop on the DGX Station as an administrator user.
1. Press the Super key.
DGX Station software within the same major release from the command line. Ensure that you are logged in to your Ubuntu desktop on the DGX Station as an administrator user. 1. Download information from all configured sources about the latest versions of the packages.
4. Start the DGX OS Desktop release upgrade process.
   sudo dgx-release-upgrade
   If you are logged in to the DGX Station remotely through secure shell (SSH), you are asked if you want to continue running under SSH.
   Continue running under SSH?
   This session appears to be running under ssh.
Upgrading DGX OS Desktop Software on DGX Station
   To continue please press [ENTER]
b). In response to the prompt, press Enter to continue.
You are warned that third-party sources are disabled.
   Third party sources disabled
   Some third party entries in your sources.list were disabled. You can re-enable them after the upgrade with the 'software-properties' tool or your package manager.
Ensure that the following prerequisites are met:
‣ You are logged in to your Ubuntu desktop on the DGX Station as an administrator user.
‣ Your DGX Station is upgraded to DGX OS Desktop release .
DGX Station is preset to obtain from these repositories updates to the following software:
‣ Docker
‣ Software that is exclusive to the DGX Station, including the CUDA Toolkit and CUDA Drivers packages
‣ Ubuntu software
For more information about repositories, see Repositories/Ubuntu (https://help.ubuntu.com/...
Updates to Docker and Software Exclusive to the DGX Station
Updates to Docker and to software that is exclusive to the DGX Station, including the CUDA Toolkit and CUDA Drivers packages, are available from a repository maintained by NVIDIA.
CAUTION:
‣...
Note: By default, the DGX Station does not notify you of available updates or automatically install any updates, including important security updates. To minimize the risk to your DGX Station from security vulnerabilities, you must ensure that it is kept up to date with the latest important security updates.
Updates to another LTS base OS version are blocked because they can disrupt the DGX Station software and disable the NVIDIA graphics drivers.
3.5. Checking for Updates to DGX Station Software
To check for software updates and to configure updates from the Ubuntu software repositories, use Software &...
The version number and update date of each over-the-network update applied since the software was last installed from an ISO image You can use this information to determine if your DGX Station is running the current version of the DGX OS Desktop software.
Debian Repository Setup (https://wiki.debian.org/DebianRepository/Setup) on the Debian wiki.
3. Update the sources that provide updates to the DGX Station to use your private repository instead of the public repositories. You can update these sources by modifying the /etc/apt/sources.list file and the...
4. On the air-gapped system, load the container image from the local copy of the archive file that contains the image.
   docker load -i framework.tar
5. Confirm that the image is loaded on the air-gapped system.
To prevent dust from entering the DGX Station through the ventilation holes under the unit, a mesh filter is fitted to the underside of the DGX Station. Clean this mesh filter periodically to prevent the accumulation of dust on the filter from impeding the flow of air through the DGX Station.
Maintaining and Servicing the NVIDIA DGX Station
3. Use compressed air to blow the dust from the mesh filter.
4. Line up the mesh filter with the runners under the DGX Station and slide it back into position under the unit.
...
Dump Health in NVIDIA System Management User Guide. To help diagnose and resolve issues, the DGX Station provides a tool to collect troubleshooting information for NVIDIA Support Enterprise Services. The tool verifies basic functionality and performance of the DGX Station and collects the following information in an xz-compressed tar archive: ‣...
3.1.4
Releases 3.1.1 through 3.1.3: /tmp/nvidia-sys-info-timestamp.random-number.out
Use any method that is convenient for you to send the file to NVIDIA Support Enterprise Services. For example, send the file as an e-mail attachment.
4.5. DGX OS Desktop 4.3.0 and Earlier: Checking the Health of the DGX Station
Note: Starting with release 4.4.0, the NVIDIA System Health Checker (...
NVIDIA unless you are directed otherwise. The following components are customer-replaceable:
‣ Solid State Drives (SSDs)
  Note: If you want to add SSDs for data storage to the DGX Station, obtain the SSDs from NVIDIA Enterprise Support.
‣ DIMMs
  Note: DIMMs are customer replaceable if a DIMM fails or to increase the system memory capacity to 512 GB.
4.6.2. Repacking the DGX Station for Shipment
If you are returning the DGX Station to NVIDIA under an RMA, repack it in the packaging in which the replacement unit was shipped in advance, to prevent damage during shipment.
CAUTION: The DGX Station weighs 88 lbs (40 kg). Do not attempt to lift the DGX Station. Instead, move it into position by rolling it on its fitted casters.
2. Roll the DGX Station up the ramp into the bottom tray of its shipping carton.
CAUTION: Ensure that you have a second person to help you roll the DGX Station into position.
Keep the AC power cable to use with your replacement DGX Station. 6. Place both accessory boxes in the slots in the tray on each side of the DGX Station. Ensure that the lugs that protrude from the edges of each accessory box are facing away from the DGX Station.
32-GB DIMMs with 64-GB DIMMs to give a total capacity of 512 GB. Before attempting to replace a faulty DIMM, contact NVIDIA Enterprise Customer support for help in determining the location ID of the faulty DIMM that needs replacement.
1. Turn off the DGX Station and disconnect the network and power cables. 2. Remove the side panel on the right of the DGX Station when viewed from the rear. a). Push the button on the right side of the DGX Station back panel to release the panel. ...
CAUTION: To prevent damage from electrostatic discharge, avoid touching any of the components inside the DGX Station other than any components that you are replacing or servicing. 3. If you are replacing a faulty DIMM, use the following figure as a guide to locate the faulty DIMM.
When the DIMM is correctly seated, the latch should be closed as shown in the following figure.
6. Replace the side panel of the DGX Station.
a). Align the bottom edge of the side panel with the bottom edge of the DGX Station.
Replacing the CMOS Power Cell in the DGX Station
The CMOS power cell in the DGX Station provides power to the Real Time Clock (RTC) to maintain BIOS settings such as the system time and date while the DGX Station is disconnected from the AC power supply.
1. Turn off the DGX Station and disconnect the network and power cables. 2. Remove the side panel on the right of the DGX Station when viewed from the rear. a). Push the button on the right side of the DGX Station back panel to release the panel. ...
+ sign facing you and press it into position.
5. Replace the side panel of the DGX Station.
a). Align the bottom edge of the side panel with the bottom edge of the DGX Station.
8. If necessary, set the system date and system time to the current time and date. a). At the first NVIDIA screen to appear while the system is rebooting, press F2 to access the UEFI BIOS Utility - EZ Mode screen.
Changing the RAID Level of the RAID Array
As supplied from the factory, the RAID level of the DGX Station RAID array is RAID 0. RAID 0 provides the maximum storage capacity, but does not provide any redundancy. If a single SSD in the array fails, all data stored on the array is lost.
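The capacity trade-off between the two RAID levels is simple arithmetic: RAID 0 exposes the combined capacity of all drives, while RAID 5 gives up the equivalent of one drive to parity. The sketch below illustrates this with a hypothetical array of three 1920 GB drives; the function name and drive figures are assumptions for illustration, not specifications taken from this guide.

```shell
# Usable capacity of an array of identical drives.
# usage: raid_usable_gb LEVEL DRIVE_COUNT DRIVE_SIZE_GB
raid_usable_gb() (
  level=$1
  count=$2
  size=$3
  case $level in
    0) echo $(( $count * $size )) ;;
    5)
      if [ "$count" -lt 3 ]; then
        echo "RAID 5 needs at least 3 drives" >&2
        exit 1
      fi
      echo $(( ($count - 1) * $size )) ;;
    *) echo "unsupported RAID level: $level" >&2; exit 1 ;;
  esac
)

# Hypothetical array: three 1920 GB data SSDs.
echo "RAID 0: $(raid_usable_gb 0 3 1920) GB usable, tolerates 0 drive failures"
echo "RAID 5: $(raid_usable_gb 5 3 1920) GB usable, tolerates 1 drive failure"
```

With these example figures, moving from RAID 0 to RAID 5 costs one third of the usable capacity in exchange for surviving a single SSD failure.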
After you change the RAID level to RAID 5, the RAID array is rebuilt. A RAID array that is being rebuilt is online and ready to be used, but a check on the health of the DGX Station reports the status of the RAID volume as unhealthy. Therefore, avoid checking the health of the DGX Station while the RAID array is being rebuilt.
1. Remove the side panel on the left of the DGX Station when viewed from the rear. a). Push the button on the left side of the DGX Station back panel to release the panel. b). Lift the panel to remove it.
If you want to increase the capacity of the DGX Station RAID array, you can add four SSDs to the empty drive bays in the DGX Station. If an SSD in the DGX Station fails, replace the SSD to return the system to operation.
1. Remove the side panel on the left of the DGX Station when viewed from the rear. a). Push the button on the left side of the DGX Station back panel to release the panel. b). Lift the panel to remove it.
4. Slide the drive tray upwards to completely remove it from the unit.
5. If you are replacing an SSD, remove the failed SSD from the drive tray.
a). Using a Phillips screwdriver, remove the four screws attaching the SSD to the drive tray.
7. Secure the new or replacement SSD to the drive tray using the four screws that were supplied with the new SSD or that secured the failed SSD.
8. With the drive-tray eject button at the right, insert the drive tray into the appropriate drive bay, then slide the drive tray all the way into the drive bay.
Rebuilding the DGX Station RAID Array
After adding SSDs to the DGX Station, you must rebuild the RAID array to add the new SSDs to the array. After replacing a failed SSD in the RAID array, you must rebuild the array to add the new SSD to a RAID 0 array or to regenerate the lost data on the new SSD in a RAID 5 array.
If any data that you want to preserve is stored on the SSDs for data storage, move this data to another file system. 1. Optional: If the SSDs in the DGX Station for data storage are configured in a RAID 5 array, change the RAID level of the array to RAID 0.
This example shows a complete /etc/cachefilesd.conf file for configuring the cache daemon for the DGX Station. The LSM security context is the default security context of the cachefilesd daemon.
###############################################################################
# Copyright (C) 2006,2010 Red Hat, Inc. All Rights Reserved.
Sanitizing the DGX Station persistent storage permanently destroys all the data that was stored there. After the data is destroyed, it cannot be recovered. Sanitizing the DGX Station persistent storage involves sanitizing all the SSDs for data storage and the SSD for the operating system.
2. Load the USB flash drive or DVD-ROM into the DGX Station. ‣ If you are using a USB flash drive, plug it into one of the USB ports of the DGX Station. ‣ If you are using a DVD-ROM, connect an external optical drive to the DGX Station and load the DVD-ROM into the drive.
You can identify the SSDs from their size, which is much larger than the size of any removable media that might be connected to the DGX Station, such as the USB flash drive from which you are running the Ubuntu Desktop LiveCD session.
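The size comparison described above can be sketched as a small filter over a list of disk names and sizes. This is an illustration under stated assumptions: `classify_disks` is a hypothetical helper, the 256 GB cut-off is an arbitrary example value, and the sample device names and sizes are made up. The `lsblk` options shown in the comment (`-b`, `-d`, `-n`, `-o`) are standard util-linux options.

```shell
# Read "NAME SIZE_IN_BYTES" pairs on stdin and label each disk by size.
# The threshold (in GB) is a hypothetical cut-off; adjust it for your media.
classify_disks() (
  threshold_gb=${1:-256}
  awk -v limit="$threshold_gb" '{
    gb = $2 / 1000000000
    kind = (gb > limit) ? "likely internal SSD" : "likely removable media"
    printf "/dev/%s %.0f GB %s\n", $1, gb, kind
  }'
)

# Made-up sample input: one large SSD and one small USB flash drive.
printf 'sda 1920383410176\nsdb 16008609792\n' | classify_disks

# On a live system, feed it whole-disk sizes in bytes instead:
#   lsblk -b -d -n -o NAME,SIZE | classify_disks
```

Treat the labels as a hint only, and confirm each device name before sanitizing it, because the operation cannot be undone.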
6. When prompted by the Ubuntu Desktop OS, remove the installation medium and press Enter. After sanitizing all the DGX Station SSDs, return the DGX Station to service by installing the DGX Station software and re-initializing the RAID array. For instructions, see Installing the DGX Station Software Image from a USB Flash Drive or DVD-ROM.
Note: Updates to the DGX Station software might have been made available after the latest available ISO image file was created. To ensure that you have the latest DGX Station software, including security updates, check for updates and install any available updates after you restore the software image.
DGX Station software image. Ensure that the following prerequisites are met:
‣ The correct DGX Station software image is saved to your local disk. For more information, see Obtaining the DGX Station Software ISO Image and Checksum File.
If the DGX Station software image file is not listed, click Other and in the window that opens, navigate to the file, select the file, and click Open.
5. From the Disk to use list, select the USB flash drive and click Make Startup Disk.
4. Select the Create a bootable disk using option and from the dropdown menu, select ISO image. 5. Click the optical drive icon and open the DGX Station software ISO image. 6. Click Start. Because the image is a hybrid ISO file, you are prompted to select whether to write the image in ISO Image (file copy) mode or DD Image (disk image) mode.
You can identify the USB flash drive from its size, which is much smaller than the size of the SSDs in the DGX Station, and from the mount points of any partitions on the drive, which are under /media...
Before installing the DGX Station software image, ensure that you have a bootable USB flash drive or DVD-ROM that contains the current DGX Station software image.
CAUTION: Installing the DGX Station software image erases all data stored on the OS SSD. The /home partition, where all users' documents, software settings, bookmarks, and other personal files are stored, resides on the OS SSD and will be erased.
2. Load the USB flash drive or DVD-ROM into the DGX Station. ‣ If you are using a USB flash drive, plug it into one of the USB ports of the DGX Station. ‣ If you are using a DVD-ROM, connect an external optical drive to the DGX Station and load the DVD-ROM into the drive.
11. In the Folder list, use the up arrow and down arrow keys to select the BIOS file.
12. Press Enter to start the BIOS update process.
CAUTION: To avoid the risk of leaving your DGX Station unable to boot, do not shut down or reset the DGX Station during the BIOS update process.
Server Settings
2. Click the NVIDIA X Server Settings icon.
3. Under each GPU in the list of GPUs in the NVIDIA X Server Settings window, click Thermal Settings.
Thermal sensor information for the GPU is displayed, including its current temperature and an indication of whether the temperature is within the GPU's operating range.
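The same temperature readings are also available from the command line through `nvidia-smi`, which can be convenient when you are logged in over SSH. The sketch below is an assumption-laden illustration rather than an NVIDIA procedure: `check_gpu_temps` is a hypothetical helper, and its 85-degree default limit is a placeholder, not a documented operating limit for the DGX Station GPUs.

```shell
# Read one GPU temperature (degrees C) per line on stdin and flag any
# reading above a limit. The default limit is a placeholder value.
check_gpu_temps() (
  limit=${1:-85}
  index=0
  status=0
  while read -r temp; do
    if [ "$temp" -gt "$limit" ]; then
      echo "GPU $index: $temp C exceeds $limit C"
      status=1
    else
      echo "GPU $index: $temp C"
    fi
    index=$((index + 1))
  done
  exit "$status"
)

# Made-up sample readings for four GPUs.
printf '45\n52\n48\n50\n' | check_gpu_temps

# On a system with the NVIDIA driver installed:
#   nvidia-smi --query-gpu=temperature.gpu --format=csv,noheader | check_gpu_temps
```

The function exits with a nonzero status when any reading is over the limit, so it can be used in a script or scheduled job that raises an alert.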
1. Remove the side panel on the right of the DGX Station when viewed from the rear. a). Push the button on the right side of the DGX Station back panel to release the panel. ...
Replenishing the Liquid in the GPU Cooling System. 3. Replace the side panel of the DGX Station. a). Align the bottom edge of the side panel with the bottom edge of the DGX Station. b). Firmly push the panel back into place to re-engage the latch.
CAUTION: Use only the coolant that is supplied with the kit. Do not use any other type of coolant. Use of other types of coolant will void the DGX Station hardware warranty and may cause damage to or impair the performance of the system.
7. Power on the DGX Station and let it run for one minute. If the pump makes a grinding noise, power off and power on the DGX Station four times. 8. Ensure that the level of the liquid in the cooling system is at the Maximum Level in the reservoir.
Check the level of the liquid in the cooling system.
9. Power off the DGX Station.
10. Replace the side panel of the DGX Station.
a). Align the bottom edge of the side panel with the bottom edge of the DGX Station.
...
To reduce the risk of bodily injury, electrical shock, fire, and equipment damage, read this document and observe all warnings and precautions in this guide before installing or maintaining your product. NVIDIA products are designed to operate safely when installed and used according to the product instructions and general safety practices. The guidelines included in this document explain the potential risks associated with computer operation and provide important safety practices designed to minimize these risks.
Follow all cautions and instructions marked on the equipment. Do not attempt to defeat safety interlocks (where provided).
‣ Operate the DGX Station in a place where the temperature is always in the range 10°C to 30°C (50°F to 86°F).
A.3. ...
Do not connect communications cables during an electrical storm. There may be a risk of electric shock from lightning.
‣ Do not connect or use communications cables in a wet location.
‣ Disconnect the communications cables before opening a product enclosure, or touching or installing internal components.
Nickel
The decorative metal foam on the DGX Station casework contains some nickel. The metal foam is not intended for direct and prolonged skin contact. While nickel exposure is unlikely to be a problem, you should be aware of the possibility in case you’re susceptible to nickel-related reactions.
Appendix B. Connections, Controls, and Indicators
B.1. Front-Panel Connections and Controls
Type            Description
Power Button    Press to turn the DGX Station on or off
B.2. Rear-Panel Connections and Controls
Current Units
Type            Description
USB 3.1 Type-C  USB 3.1 Type-C port
Ethernet...
Turn the power supply on and off
Earlier Units
Type            Description
USB 3.1 Type-C  USB 3.1 Type-C port
Ethernet        10G LAN ports (see LAN Port Indicators):
                ‣ Lower port: LAN 1
                ‣ Upper port: LAN 2
USB 3.0         USB 3.0 ports
Ports for connecting up to 3 displays
AC Input        Power supply input
B.3. LAN Port Indicators
LEDs on each Ethernet LAN port indicate the connection status as illustrated in the following figure and described in the following tables.
Mic In Mic In Black Rear Speaker Rear Speaker Rear Speaker Orange Center/ Center/ Subwoofer Subwoofer Light Blue Line In Line In Line In Side Speaker Lime Green Line Out Front Speaker Front Speaker Front Speaker
Appendix C. Compliance
The NVIDIA DGX Station is compliant with the regulations listed in this section.
C.1. DGX Station Model Number
Model: P2587
C.2. Argentina
S-Mark
C.3. Australia/New Zealand
Innovation, Science and Economic Development Canada (ISED)
CAN ICES-3(A)/NMB-3(A)
The Class A digital apparatus meets all requirements of the Canadian Interference-Causing Equipment Regulation.
Cet appareil numérique de la classe A respecte toutes les exigences du Règlement sur le matériel brouilleur du Canada.
RoHS Directive (2011/65/EU) for hazardous substances.
‣ ErP Directive (2009/125/EC) for European Ecodesign.
A copy of the Declaration of Conformity to the essential requirements may be obtained directly from NVIDIA GmbH (Floessergasse 2, 81369 Munich, Germany).
C.10. Japan
VCCI
C.11. Russia
CU-TR
C.12. South Africa
Compliant with SANS IEC 60950
SABS
Compliant with SANS 222 CISPR 22
NOTE: This equipment has been tested and found to comply with the limits for a Class A digital device, pursuant to part 15 of the FCC Rules. These limits are designed to provide reasonable protection against harmful interference when the equipment is operated in a...
Operation of this equipment in a residential area is likely to cause harmful interference in which case the user will be required to correct the interference at his own expense.
C.16. United States/Canada
cULus Listing Mark
C.17. Vietnam
Input: 115 - 240 VAC, 12-8A, (50 - 60 Hz)
Comments: The DGX Station power consumption can reach 1,500 W (ambient temperature 30°C) with all system resources under a heavy load. Be aware of your electrical source’s power capability to avoid overloading the circuit.
Appendix E. Customer Support for the NVIDIA DGX Station
There are several options for contacting NVIDIA Customer Support for assistance in reporting, troubleshooting, or diagnosing problems with your DGX Station.
NVIDIA Enterprise Support Portal
The best way to file an incident is to log on to NVIDIA Enterprise Support (https://nvid.nvidia.com/dashboard/).
NVIDIA accepts no liability related to any default, damage, costs, or problem which may be based on or attributable to: (i) the use of the NVIDIA product in any manner that is contrary to this document or (ii) customer product designs.