HPC Server - 2U DP SXM4 A100 4-GPU Server Liquid Cooling Solution (141 pages)
Summary of Contents for Gigabyte G262-IR0
Page 1
G262-IR0 HPC Server - NVIDIA HGX A100 4-GPU - 3rd Gen. Intel® Xeon® Scalable GPU Server User Manual Rev. 1.0...
Page 2
GIGABYTE's prior written permission. Documentation Classifications In order to assist in the use of this product, GIGABYTE provides the following types of documentation: User Manual: detailed information & steps about the installation, configuration and use of this product (e.g.
Page 3
Conventions The following conventions are used in this user's guide: NOTE! Gives bits and pieces of additional information related to the current topic. CAUTION! Gives precautionary measures to avoid possible hardware or software problems. WARNING! Alerts you to any damage that might result from doing or not doing specific actions.
Page 4
Server Warnings and Cautions Before installing a server, be sure that you understand the following warnings and cautions. WARNING! To reduce the risk of electric shock or damage to the equipment: • Do not disable the power cord grounding plug. The grounding plug is an important safety feature.
Page 5
Electrostatic Discharge (ESD) CAUTION! ESD CAN DAMAGE DRIVES, BOARDS, AND OTHER PARTS. WE RECOMMEND THAT YOU PERFORM ALL PROCEDURES AT AN ESD WORKSTATION. IF ONE IS NOT AVAILABLE, PROVIDE SOME ESD PROTECTION BY WEARING AN ANTI-STATIC WRIST STRAP ATTACHED TO CHASSIS GROUND -- ANY UNPAINTED METAL SURFACE -- ON YOUR SERVER WHEN HANDLING PARTS.
Page 6
CAUTION! Risk of explosion if battery is replaced incorrectly or with an incorrect type. Replace the battery only with the same or equivalent type recommended by the manufacturer. Dispose of used batteries according to the manufacturer’s instructions.
Table of Contents
Chapter 1 Hardware Installation ................... 11
Installation Precautions .................. 11
Product Specifications .................. 12
System Block Diagram ................... 15
Chapter 2 System Appearance .................. 16
Front View ...................... 16
Rear View ....................... 17
Front Panel LED and Buttons ................ 18
Front System LAN LEDs ................
Page 8
Chapter 5 BIOS Setup .................... 43
The Main Menu .................... 45
Advanced Menu ..................... 48
5-2-1 Trusted Computing .................... 49
5-2-2 Serial Port Console Redirection ................ 50
5-2-3 SIO Configuration .................... 54
5-2-4 PCI Subsystem Settings .................. 55
5-2-5 USB Configuration .................... 56
5-2-6 Network Stack Configuration .................. 57
5-2-7 Post Report Configuration .................. 58
5-2-8...
Chapter 1 Hardware Installation Installation Precautions The motherboard/system contain numerous delicate electronic circuits and components which can become damaged as a result of electrostatic discharge (ESD). Prior to installation, carefully read the user manual and follow these procedures: • Prior to installation, do not remove or break motherboard S/N (Serial Number) sticker or warranty sticker provided by your dealer.
Product Specifications NOTE: We reserve the right to make any changes to the product specifications and product-related information without prior notice. Dual 3rd Generation Intel® Xeon® Scalable Processors Intel® Xeon® Platinum Processor, Intel® Xeon® Gold Processor, Intel® Xeon® Silver Processor 10nm technology, CPU TDP up to 270W ...
Page 11
Front I/O: 2 x USB 3.0; 1 x D-Sub VGA; 1 x MLAN; 1 x Power button with LED; 1 x ID button with LED ...
Page 12
System Management: Aspeed® AST2600 management controller; GIGABYTE Management Console (AMI MegaRAC SP-X) web interface; Dashboard; HTML5 KVM; Sensor Monitor (Voltage, RPM, Temperature, CPU Status, etc.) ...
Chapter 2 System Appearance Front View HDD0 HDD2 HDD1 HDD3 7 6 5 Description Front Panel LEDs and Buttons PCIe Card Slot PCIe Card Slot ID Button with LED USB 3.0 Port x 2 10/100/1000 Server Management LAN Port Mezzanine Card Slot (Option/OCP3) The Green HDD Latches Support NVMe •...
Front Panel LED and Buttons Name Color Status Description Reset Button Press the button to reset the system. Green Indicates the system is powered on. Power button Green Blink System is in ACPI S1 state (sleep mode). with LED • System is not powered on or in ACPI S5 state (power off) •...
Front System LAN LEDs Name Color Status Description Yellow 1 Gbps data rate 1GbE Green 100 Mbps data rate Speed LED 10 Mbps data rate Link between system and 1GbE Green network or no access Link/ Blink Data transmission or receiving is occurring Activity No data transmission or receiving is occurring...
Power Supply Unit (PSU) LED PSU LED Color Status Description No AC power to all power supplies AC present / only standby on / Cold redundant mode Green 1Hz Blinking Green 2Hz Blinking Power supply firmware updating mode AC cord unplugged or AC power lost; with a second power supply in parallel still with AC input power Amber Power supply critical event causing shut down:...
Hard Disk Drive LEDs LED#1 LED#2 HDD Present RAID SKU HDD Fault Rebuilding LED1 Locate Access (No Access) Disk LED ON(*1) BLINK (*2) Green (LED on Back Panel) Amber No RAID configuration ON(*1) Green (via HBA) Removed HDD Slot (LED on Back Panel) Amber BLINK (*2) Green...
Chapter 3 System Hardware Installation Pre-installation Instructions System components and electronic circuit boards can be damaged by discharges of static electricity. Working on systems that are still connected to a power supply can be extremely dangerous. Follow the simple guidelines below to avoid damage to your system or injury to yourself. •...
Removing and Installing the Top Rear Cover Before you remove the top rear cover: • Make sure the system is not turned on or connected to AC power. Follow these instructions to remove the top rear cover: Remove the two screws on the sides of the top cover. Unlock the plastic handle and pull the grip handle to open the panel cover.
Removing and Installing the Fan Duct Follow these instructions to remove/install the fan duct: GPU Fan Duct: Remove the screws securing the metal fan duct. Lift up to remove the fan duct. To install the fan duct, align the fan duct with the guiding groove. Push the fan duct down into the chassis until it is firmly seated.
Removing the Heat Sink Follow these instructions to remove the heat sink: Loosen the captive screws securing the heat sink in place in reverse order (4→3→2→1). Lift and remove the heat sink from the system. To reinstall the heat sink, reverse steps 1-2 while ensuring that you tighten the captive screws in sequential order (1→2→3→4) as seen in the image below.
Installing the CPU Read the following guidelines before you begin to install the CPU: • Make sure that the motherboard supports the CPU. • Always turn off the computer and unplug the power cord from the power outlet before installing the CPU to prevent hardware damage.
Installing Memory Read the following guidelines before you begin to install the memory: • Make sure that the motherboard supports the memory. It is recommended that memory of the same capacity, brand, speed, and chips be used. • Always turn off the computer and unplug the power cord from the power outlet before installing the memory to prevent hardware damage.
3-5-2 Installing the Memory Before installing a memory module, make sure to turn off the computer and unplug the power cord from the power outlet to prevent damage to the memory module. Be sure to install DDR4 DIMMs on this motherboard. Follow these instructions to install the Memory: Insert the DIMM memory module vertically into the DIMM slot, and push it down.
3-5-4 Processor and Memory Module Matrix Table CPU0 CPU1 Memory Q’ty for each CPU 1 DIMM 2 DIMM 4 DIMM 6 DIMM 8 DIMM NOTE! • There should be at least one DDR4 DIMM per socket. • If only one DIMM is populated in a channel, then populate it in the slot furthest away from the CPU of that channel. • Channel 0's on each memory controller (A/E/C/G, I/M/K/O) must be populated with the same total capacity per channel (if populated).
Installing the PCI Expansion Card • Voltages can be present within the server whenever an AC power source is connected. This voltage is present even when the main power switch is in the off position. Ensure that the system is powered down and all power sources have been disconnected from the server prior to installing a PCI card.
Replacing the System FAN Module CAUTION! Before you remove or install the system fans follow these steps: • Make sure the system is not turned on or connected to the AC power. • Disconnect all necessary cable connections. Failure to observe these warnings could result in personal injury or damage to the equipment.
Installing the Hard Disk Drive • Read the following guidelines before you begin to install the Hard disk drive: • Take note of the drive tray orientation before sliding it out. • The tray will not fit back into the bay if inserted incorrectly. •...
Removing and Installing the Power Supply CAUTION! • In order to reduce the risk of injury from electric shock, disconnect AC power from the power supply before removing the power supply from the system. • Please see Section 2-2 "Rear View" for installation sequence. Follow these instructions to replace the power supply: Flip up and then grasp the power supply handle.
Jumper Setting MB_SW Stop initial power on S3_MASK Normal [Default] when BMC is not ready PMBUS_SEL BMC [Default] SMB_SEL PCH [Default] ME_UPDATE Force ME update Normal [Default] Default Enable Default Enable Default Enable Default Enable BIOS Password Force Recovery Clear CMOS Clear Update 1 2 3...
Chapter 5 BIOS Setup BIOS (Basic Input and Output System) records hardware parameters of the system in the EFI on the motherboard. Its major functions include conducting the Power-On Self-Test (POST) during system startup, saving system parameters, loading the operating system etc. The BIOS includes a BIOS Setup program that allows the user to modify basic system configuration settings or to activate certain system features.
Page 42
Main This setup page includes all the items of the standard compatible BIOS. Advanced This setup page includes all the items of AMI BIOS special enhanced features. (ex: Auto detect fan and temperature status, automatically configure hard disk parameters.) ...
The Main Menu Once you enter the BIOS Setup program, the Main Menu (as shown below) appears on the screen. Use arrow keys to move among the items and press <Enter> to accept or enter other sub-menu. Main Menu Help The on-screen description of a highlighted setup option is displayed on the bottom line of the Main Menu.
Page 44
Parameter Description BIOS Information Project Name Displays the project name information. Project Version Displays version number of the BIOS setup utility. Build Date and Time Displays the date and time when the BIOS setup utility was created. BMC Information (Note1) BMC Firmware Version Displays BMC firmware version information.
Page 45
Parameter Description Memory Frequency (Note2) Displays the frequency information of the installed memory. CPLD Boot Information Boot Status Displays the boot status information. System Date Sets the date following the weekday-month-day-year format. System Time Sets the system time following the hour-minute-second format. (Note2) This section will display capacity and frequency information of the memory that the customer has installed.
Advanced Menu The Advanced Menu displays submenu options for configuring the function of various hardware components. Select a submenu item, then press <Enter> to access the related submenu screen. BIOS Setup - 48 -...
5-2-1 Trusted Computing Parameter Description Configuration Security Device Support: Enable/Disable BIOS support for security device. When disabled, the OS will not show the security device, and the TCG EFI protocol and INT1A interface will not be available. Options available: Enable, Disable. Default setting is Enable. BIOS Setup - 49 -...
5-2-2 Serial Port Console Redirection Parameter Description COM1 Console Redirection (Note): Console redirection enables users to manage the system from a remote location. Options available: Enabled, Disabled. Default setting is Disabled. Press [Enter] to configure advanced items. Please note that this item is configurable when COM1 Console Redirection is set to Enabled....
Page 49
Parameter Description Parity – A parity bit can be sent with the data bits to detect some transmission errors. – Even: parity bit is 0 if the number of 1's in the data bits is even. – Odd: parity bit is 0 if the number of 1's in the data bits is odd. –...
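The Even and Odd parity rules described above can be illustrated with a short Python sketch. This is not firmware code; the function name `parity_bit` and its `mode` argument are hypothetical, and only the Even and Odd settings described above are modeled.

```python
def parity_bit(data: int, mode: str = "even") -> int:
    """Return the parity bit sent alongside the data bits (illustrative only)."""
    if data < 0:
        raise ValueError("data bits must be a non-negative integer")
    ones = bin(data).count("1")  # number of 1's in the data bits
    if mode == "even":
        # Even parity: the bit is 0 when the number of 1's is already even.
        return ones % 2
    if mode == "odd":
        # Odd parity: the bit is 0 when the number of 1's is already odd.
        return (ones + 1) % 2
    raise ValueError(f"unsupported parity mode: {mode!r}")


# Example: 0b1011000 contains three 1's.
print(parity_bit(0b1011000, "even"))
print(parity_bit(0b1011000, "odd"))
```

A receiver can recompute the bit from the data it received and compare it with the transmitted parity bit; a mismatch indicates that an odd number of bits were corrupted in transit.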
Page 50
Parameter Description Legacy Console Redirection Press [Enter] to configure advanced items. Redirection COM Port – Selects a COM port for Legacy serial redirection. – Default setting is COM1. Resolution – Selects the number of rows and columns used in Console Redirection for legacy OS support.
Page 51
Parameter Description Serial Port for Out-of-Band EMS Console Redirection: Flow Control EMS – Flow control can prevent data loss from buffer overflow. When sending data, if the receiving buffers are full, a 'stop' signal can be sent to stop the data flow. Once the buffers are empty, a 'start' signal can be sent to restart the flow....
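The stop/start behavior described for flow control can be modeled with a small simulation. This is a conceptual sketch only: the `Receiver` class, its `capacity` parameter, and the `transfer` function are hypothetical names, and the sketch does not represent how the BIOS or the UART hardware implements flow control.

```python
from collections import deque


class Receiver:
    """Toy receiver with a bounded buffer (hypothetical model)."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.buffer = deque()
        self.paused = False  # True once a 'stop' signal has been raised

    def accept(self, byte: int) -> None:
        self.buffer.append(byte)
        if len(self.buffer) >= self.capacity:
            self.paused = True  # buffer full: signal 'stop'

    def drain(self) -> list:
        data = list(self.buffer)
        self.buffer.clear()
        self.paused = False  # buffer empty: signal 'start'
        return data


def transfer(data, receiver):
    """Send data, pausing whenever the receiver has signaled 'stop'."""
    received, stops = [], 0
    for byte in data:
        if receiver.paused:
            stops += 1
            received.extend(receiver.drain())  # wait until the buffer is empty
        receiver.accept(byte)
    received.extend(receiver.drain())
    return received, stops


received, stops = transfer(list(range(10)), Receiver(capacity=4))
print(received, stops)
```

Because the sender waits whenever the receiver is paused, no byte is ever written into a full buffer, which is the data-loss case that flow control is meant to prevent.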
5-2-3 SIO Configuration Description Parameter Displays the AMI SIO driver version information. AMI SIO Driver Version Super IO Chip Logical Device(s) Configuration Press [Enter] to configure advanced items. Use This Device – When set to Enabled allows you to configure the serial port settings. When set to Disabled, displays no configuration for the serial port.
5-2-4 PCI Subsystem Settings Parameter Description PCI Bus Driver Version Displays the PCI Bus Driver version information. Slot_# Lanes Configuration (Note1) / OCP 3.0 Lanes Configuration: Change the PCIe lanes. Options available: Disabled, Auto, x16, x8x8, x8x4x4, x4x4x8, x4x4x4x4. Default setting is Auto. Slot_# I/O ROM (Note1): When enabled, this setting will initialize the device expansion...
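The lane options listed above each describe how one slot's lanes are split into separate links. The following Python sketch parses those option strings; the names `LANE_OPTIONS` and `parse_lanes` are hypothetical, and the 16-lane total is an assumption inferred from the listed options rather than a value stated in this section.

```python
import re
from typing import List

# Bifurcation options listed for the lanes configuration setting
# (Disabled and Auto carry no explicit widths and are left out).
LANE_OPTIONS = ["x16", "x8x8", "x8x4x4", "x4x4x8", "x4x4x4x4"]


def parse_lanes(option: str) -> List[int]:
    """Split an option string such as 'x8x4x4' into link widths [8, 4, 4]."""
    if not re.fullmatch(r"(x\d+)+", option):
        raise ValueError(f"not a lane option: {option!r}")
    return [int(width) for width in re.findall(r"\d+", option)]


for option in LANE_OPTIONS:
    widths = parse_lanes(option)
    # Assumption: the slot provides 16 lanes in total.
    assert sum(widths) == 16
    print(option, "->", widths)
```

The order of the widths matters (x8x4x4 and x4x4x8 are distinct options), which is why the sketch returns a list rather than a sorted or deduplicated collection.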
5-2-5 USB Configuration Parameter Description USB Configuration USB Devices: Displays the USB devices connected to the system. XHCI Hand-off: Enable/Disable the XHCI (USB 3.0) Hand-off support. Options available: Enabled, Disabled. Default setting is Enabled. USB Mass Storage Driver Support: Enable/Disable the USB Mass Storage Driver Support. Options available: Enabled, Disabled....
5-2-9 Chipset Configuration Parameter Description Restore on AC Power Loss (Note): Defines the power state to resume to after a system shutdown that is due to an interruption in AC power. When set to Last State, the system will return to the active power state prior to shutdown. When set to Power Off, the system remains off after power shutdown....
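The two policies described above can be summarized as a small decision function. This is a sketch for illustration, not BMC or BIOS code: the function name and the `was_on_before_loss` flag are hypothetical, and only the Last State and Power Off settings described in the text are modeled (any further options are cut off in this extract).

```python
def state_after_ac_restore(policy: str, was_on_before_loss: bool) -> str:
    """Return 'on' or 'off' for the system once AC power returns."""
    if policy == "Last State":
        # Return to whatever power state was active before the outage.
        return "on" if was_on_before_loss else "off"
    if policy == "Power Off":
        # Stay off regardless of the state before the outage.
        return "off"
    raise ValueError(f"policy not modeled in this sketch: {policy!r}")


print(state_after_ac_restore("Last State", was_on_before_loss=True))
print(state_after_ac_restore("Power Off", was_on_before_loss=True))
```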
5-2-10 Tls Auth Configuration Parameter Description Press [Enter] for configuration of advanced items. Enroll Cert – Press [Enter] to enroll a certificate • Enroll Cert Using File • Cert GUID Server CA Configuration Input digit characters in 11111111-2222-3333-4444-1234567890ab format. –...
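A value for the Cert GUID field can be checked before it is typed into the setup screen. The sketch below assumes the field expects the standard GUID text layout of 8-4-4-4-12 hexadecimal characters; the names `GUID_PATTERN` and `is_valid_cert_guid` are hypothetical and are not part of any BIOS tool.

```python
import re

# Assumed layout: five hyphen-separated groups of 8, 4, 4, 4 and 12 hex digits.
GUID_PATTERN = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def is_valid_cert_guid(text: str) -> bool:
    """Return True when text matches the assumed 8-4-4-4-12 GUID layout."""
    return GUID_PATTERN.fullmatch(text) is not None


print(is_valid_cert_guid("11111111-2222-3333-4444-1234567890ab"))
print(is_valid_cert_guid("not-a-guid"))
```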
Chipset Menu Chipset Setup menu displays submenu options for configuring the function of Platform Controller Hub(PCH). Select a submenu item, then press <Enter> to access the related submenu screen. BIOS Setup - 63 -...
Description Parameter Processor Configuration Press [Enter] to configure advanced items. CPU Socket 0/1 Configuration – Core Disable Bitmap(Hex) • Number of Cores to enable. 0 means all cores. FFFFFFF Pre-Socket Configuration means to disable all cores. The maximum value depends on the number of CPUs available.
Page 64
Description Parameter Options available: Enable, Disable. Default setting is Disable. Debug Consent Enable/Disable total memory encryption (TME). Total Memory Encryption (TME) Options available: Enabled, Disabled. Default setting is Disabled. BIOS Setup - 66 -...
5-3-2 Common RefCode Configuration Parameter Description Common RefCode Configuration Selects the MMIO High Base setting. MMIO High Base Options available: 56T, 40T, 32T, 24T, 16T, 4T, 2T, 1T, 512G, 3584T. Default setting is 56T. Selects the allocation size used to assign memory-mapped I/O (MMIO) resources.
Page 66
Parameter Description Virtual Numa: Divides physical NUMA nodes into evenly sized virtual NUMA nodes in the ACPI table. This may improve Windows performance on CPUs with more than 64 logical processors. Options available: Enable, Disable. Default setting is Disable. UMA-Based Clustering: Options include Disable (ALL2ALL), Hemisphere (2cluster), and Quadrant (4cluster, not supported on ICX)....
5-3-3 UPI Configuration Description Parameter Press [Enter] to configure advanced items. Uncore Status – Press [Enter] to view the Uncore status. Link Frequency Select – Selects the UPI link frequency. – Options available: 9.6GT/s, 10.4GT/s, 11.2GT/s, Auto. Default setting is Auto.
5-3-4 Memory Configuration Description Parameter Integrated Memory Controller (iMC) Enforce POR: When set to Enable, the system enforces Plan Of Record restrictions for DDR4 frequency and voltage programming. Options available: POR, Disable. Default setting is Disable. Memory Frequency: Configures the maximum memory frequency. If Enforce POR is disabled, the user will be able to run at higher frequencies than the memory supports (limited by processor support)....
Page 69
Description Parameter Enable/Disable Automatic restoring of NVDIMMs. Restore NVDIMMs Options available: Enable, Disable. Default setting is Enable. Controls if NVDIMMs are interleaved together or not. Interleave NVDIMMs Options available: Enable, Disable. Default setting is Enable. Enable/Disable Assert ADR on Reset. Assert ADR on Reset Options available: Enabled, Disabled.
Page 70
Description Parameter Leaky bucket time window based interface – Enable/Disable leaky bucket time window based interface. – Options available: Disabled, Enabled. Default setting is Disabled. Leaky bucket low bit – Configures leaky bucket low bit (1-63). – Press the <+> / <-> keys to increase or decrease the desired values.
5-3-5 IIO Configuration Description Parameter IIO Configuration Press [Enter] to configure advanced items. Intel® VT for Directed I/O – Enable/Disable the Intel VT for Directed I/O (VT-d) support function by reporting the I/O device assignment to VMM through DMAR ACPI Tables....
5-3-6 Advanced Power Management Configuration Description Parameter Advanced Power Management Configuration Press [Enter] to configure advanced items. SpeedStep (Pstates) – Conventional Intel SpeedStep Technology switches both voltage and frequency in tandem between high and low levels in response to processor load. –...
Page 74
Description Parameter Press [Enter] to configure advanced items. Hardware P-States – When this item is disabled, the processor hardware chooses a P-state based on OS Request (Legacy P-States). – In Native mode, the processor hardware chooses a P-state based Hardware PM State Control on OS guidance.
5-3-7 PCH Configuration (Note 1) Only appears when HDD sets to RAID Mode. BIOS Setup - 77 -...
Page 76
Description Parameter PCH Configuration Press [Enter] to configure advanced items. SATA Controller – Enable/Disable SATA controller. – Options available: Enable, Disable. Default setting is Enable. Configure SATA as – Configures on chip SATA type. – AHCI Mode: When set to AHCI, the SATA controller enables its AHCI functionality.
Page 77
sSATA Controller – Enable/Disable sSATA controller. – Options available: Enable, Disable. Default setting is Enable. Configure sSATA as – Configures on chip SATA type. – AHCI Mode: When set to AHCI, the SATA controller enables its AHCI functionality. The RAID function is then disabled, and the RAID setup utility cannot be accessed at boot time....
5-3-8 Miscellaneous Configuration Description Parameter Miscellaneous Configuration Active Video: Selects the active video type. Options available: Auto, Onboard Device, PCIE Device, Specific PCIE Device. Default setting is Auto. BIOS Setup - 80 -...
5-3-9 Server ME Configuration Parameter Description General ME Configuration Oper. Firmware Version Displays the operational firmware version. ME Firmware Status #1/#2 Displays ME Firmware status information. Current State Displays ME Firmware current status information. Error Code Displays ME Firmware status error code. Recovery Cause Displays ME Firmware recovery cause.
5-3-11 Power Policy Description Parameter Power Policy Quick Settings: Selects a Power Policy Quick Setting. Options available: Standard, Best Performance, Energy Efficient, Turbo Lock. Default setting is Standard. SpeedStep (Pstates): Conventional Intel SpeedStep Technology switches both voltage and frequency in tandem between high and low levels in response to processor load....
Page 83
Description Parameter Hyper-Threading [ALL]: The Hyper-Threading Technology allows a single processor to execute two or more separate threads concurrently. When hyper-threading is enabled, multi-threaded software applications can execute their threads concurrently, thereby improving performance. Options available: Enabled, Disabled. Default setting is Enabled. Options available: Enabled, Disabled....
Server Management Menu Parameter Description FRB-2 Timer: Enable/Disable FRB-2 timer (POST timer). Options available: Enabled, Disabled. Default setting is Disabled. FRB-2 Timer timeout (Note1): Configures the FRB2 Timer timeout. The value is between 1 and 30 minutes. Default setting is 6 minutes. Configures the FRB2 Timer policy....
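The timeout range and default stated above can be captured in a simple validation helper, for example when preparing settings in a deployment checklist or script. This is a sketch only; `FRB2_DEFAULT_MINUTES` and `validate_frb2_timeout` are hypothetical names and do not correspond to any GIGABYTE or AMI tool.

```python
from typing import Optional

FRB2_MIN_MINUTES = 1
FRB2_MAX_MINUTES = 30
FRB2_DEFAULT_MINUTES = 6  # default stated for the FRB-2 Timer timeout


def validate_frb2_timeout(minutes: Optional[int] = None) -> int:
    """Return a timeout within the documented 1 to 30 minute range."""
    if minutes is None:
        return FRB2_DEFAULT_MINUTES
    if not FRB2_MIN_MINUTES <= minutes <= FRB2_MAX_MINUTES:
        raise ValueError(
            f"FRB-2 timeout must be {FRB2_MIN_MINUTES} to "
            f"{FRB2_MAX_MINUTES} minutes, got {minutes}"
        )
    return minutes


print(validate_frb2_timeout())
print(validate_frb2_timeout(10))
```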
Page 85
Parameter Description System Event Log Press [Enter] to configure advanced items. View FRU Press [Enter] to view the FRU information. Information BMC VLAN Press [Enter] to configure advanced items. Configuration BMC network Press [Enter] to configure advanced items. Configuration IPv6 BMC Network Press [Enter] to configure advanced items.
5-4-1 System Event Log Parameter Description Enabling / Disabling Options Change this item to enable or disable all features of System Event SEL Components Logging during boot. Options available: Enabled, Disabled. Default setting is Enabled. Erasing Settings Choose options for erasing SEL. Options available: No, Erase SEL Yes, On next reset,...
5-4-2 View FRU Information The FRU page is a simple display page for basic system ID information, as well as System product information. Items on this window are non-configurable. (Note) The model name varies depending on the product you purchased. BIOS Setup - 89 -...
5-4-3 BMC VLAN Configuration Description Parameter BMC VLAN Configuration BMC VLAN ID: Select to configure the BMC VLAN ID. The valid range is from 0 to 4094. When BMC VLAN ID is set to 0, BMC VLAN ID will be disabled. BMC VLAN Priority: Select to configure the BMC VLAN Priority. The valid range is from 0 to 7. When BMC VLAN ID is set to 0, BMC VLAN Priority will not be selected....
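The relationship between the two VLAN fields described above (ID 0 disables the VLAN, and the priority only applies to a non-zero ID) can be sketched as follows. The function name `bmc_vlan_config` and the dictionary layout are hypothetical and are used only to illustrate the documented ranges.

```python
def bmc_vlan_config(vlan_id: int, priority: int = 0) -> dict:
    """Validate BMC VLAN settings against the documented ranges."""
    if not 0 <= vlan_id <= 4094:
        raise ValueError("BMC VLAN ID must be between 0 and 4094")
    if vlan_id == 0:
        # ID 0 disables the VLAN, so the priority is not selectable.
        return {"enabled": False, "vlan_id": 0, "priority": None}
    if not 0 <= priority <= 7:
        raise ValueError("BMC VLAN Priority must be between 0 and 7")
    return {"enabled": True, "vlan_id": vlan_id, "priority": priority}


print(bmc_vlan_config(0))
print(bmc_vlan_config(100, priority=3))
```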
5-4-4 BMC Network Configuration Parameter Description BMC network configuration Lan Channel 1 Configuration Address source: Selects to configure LAN channel parameters statically or dynamically (DHCP). The Do nothing option will not modify any BMC network parameters during BIOS phase. Options available: Unspecified, Static, DynamicBmcDhcp. Default setting is DynamicBmcDhcp....
5-4-5 IPv6 BMC Network Configuration Parameter Description IPv6 BMC network configuration IPv6 BMC Lan Channel 1 Enable/Disable IPv6 BMC LAN channel function. When this item is disabled, the system will not modify any BMC network during BIOS IPv6 BMC Lan Option phase.
Security Menu The Security menu allows you to safeguard and protect the system from unauthorized use by setting up access passwords. There are two types of passwords that you can set: • Administrator Password Entering this password will allow the user to access and change all settings in the Setup Utility. •...
5-5-1 Secure Boot The Secure Boot submenu is applicable when the Windows® 8 (or above) operating system is installed on your device. Parameter Description System Mode: Displays if the system is in User mode or Setup mode. Secure Boot: Enable/Disable the Secure Boot function. Options available: Enabled, Disabled....
Page 93
Parameter Description Press [Enter] to configure advanced items. Please note that this item is configurable when Secure Boot Mode is set to Custom. Factory Key Provision – Allows to provision factory default Secure Boot keys when system is in Setup Mode.
Page 94
Parameter Description Authorized TimeStamps (DBT) – Displays the current status of the Authorized TimeStamps Database. – Press [Enter] to configure a new DBT or load additional DBT from storage devices. Key Management – Options available: Update, Append. (continued) OsRecovery Signatures ...
Boot Menu The Boot menu allows you to set the drive priority during system boot-up. BIOS setup will display an error message if the legacy drive(s) specified is not bootable. Parameter Description Boot Configuration Setup Prompt Timeout: Number of seconds to wait for setup activation key. 65535 (0xFFFF) means indefinite waiting....
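The special meaning of 65535 (0xFFFF) for the setup prompt timeout can be expressed as a small helper. This is an illustrative sketch; `INDEFINITE` and `prompt_wait_seconds` are hypothetical names, and treating the field as a 16-bit unsigned value is an assumption based on the 0xFFFF sentinel.

```python
from typing import Optional

INDEFINITE = 0xFFFF  # 65535: wait indefinitely for the setup activation key


def prompt_wait_seconds(value: int) -> Optional[int]:
    """Return the wait time in seconds, or None for indefinite waiting."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("timeout must fit in the range 0 to 65535")
    return None if value == INDEFINITE else value


print(prompt_wait_seconds(5))
print(prompt_wait_seconds(65535))
```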
Page 96
Parameter Description FIXED BOOT ORDER Priorities Boot Option #1 / #2 / #3 / #4 / #5: Press [Enter] to configure the boot order priority. By default, the server searches for boot devices in the following sequence: Hard drive. CD-ROM/DVD drive. USB device....
Save & Exit Menu The Save & Exit menu displays the various options to quit from the BIOS setup. Highlight any of the exit options then press <Enter>. Parameter Description Save Options Saves changes made and closes the BIOS setup. Save Changes and Exit Options available: Yes, No.
Page 98
Parameter Description Restore Defaults: Loads the default settings for all BIOS setup parameters. Setup Defaults are quite demanding in terms of resource consumption. If you are using low-speed memory chips or other kinds of low-performance components and you choose to load these settings, the system might not function properly....
BIOS POST Beep code (AMI standard) 5-8-1 PEI Beep Codes # of Beeps Description Memory not Installed. Memory was installed twice (InstallPeiMemory routine in PEI Core called twice) Recovery started DXEIPL was not found DXE Core Firmware Volume was not found Recovery failed S3 Resume failed Reset PPI is not available...