How to Use This Guide
This guide provides detailed information on the AI Server, including how to install components and maintain the system. To deploy the device effectively and ensure trouble-free operation, first read the relevant sections of this guide so that you are familiar with all of its features.
Conventions
The following conventions are used throughout this guide to show information:
Note: Emphasizes important information or calls your attention to related features or instructions.
Caution: Alerts you to a potential hazard that could cause loss of data, or damage the system or equipment.
Contents How to Use This Guide Contents Figures Tables 1 System Overview System Specifications Front Panel of the AGS8200 Rear Panel of the AGS8200 System Status LEDs and Buttons GPU Module QSFP-DD Port LEDs Removing the GPU Module CPU Module...
Contents 2 Device Installation Installation Precautions Rack Mount Guidelines Rack Cooling Device Cooling Requirements Package Contents Installing the Device in a Rack Connecting Power 3 Device Connections Connecting to QSFP-DD/QSFP28 Ports Inserting Transceivers Connecting to Fiber Optic Ports Connecting to the VGA Ports Connecting to the Console Port Connecting to the 1000BASE-T MGMT Port 4 Troubleshooting...
Figures Figure 1: System Overview Figure 2: Front Panel Features Figure 3: Rear Panel Features Figure 4: System LEDs and Buttons Figure 5: QSFP-DD Port LEDs Figure 6: Removing the GPU Module Figure 7: MGMT Port LEDs Figure 8: UID Button/LED Figure 9: OCP Module Port LEDs Figure 10: Removing the CPU Module Figure 11: HDD/SSD Drive Bay LEDs...
Tables Table 1: Hardware Specifications Table 2: System Buttons/LEDs Table 3: QSFP-DD Port Status LEDs Table 4: MGMT Port Status LED Table 5: UID Button/LED Table 6: OCP Module Port Status LEDs Table 7: HDD/SSD Drive Bay LEDs Table 8: System PSU Status LED Table 9: GPU Module PSU Status LED Table 10: Fan Tray Status LED Table 11: Console Cable Wiring...
System Overview
This chapter includes the following sections:
• “System Specifications” on page 10
• “Front Panel of the AGS8200” on page 12
• “Rear Panel of the AGS8200” on page 13
• “System Status LEDs and Buttons” on page 14
...
System Specifications
The AGS8200 system is built around eight Intel® Gaudi® 2 AI Accelerators and dual Xeon® Sapphire Rapids processors. Each Gaudi® 2 AI Accelerator integrates 96 GB of HBM2E memory and 24 x 100 Gbps RoCEv2 RDMA NICs. The 24 x 100 Gbps NICs offer all-to-all connectivity and scale-out, internally and externally, for training, fine-tuning, and other deep learning processing.
Chapter 1 | System Overview
Table 1: Hardware Specifications (Continued)
Item              Specification
Storage           Internal: 2 x M.2 SATA SSD 480 GB (total 960 GB)
                  Front: 16 x 960 GB 2.5" SATA SSD + 8 x 1920 GB 2.5" NVMe
Management I/O    BMC Chip: AST2600
                  Front: 2 x USB 2.0, 1 x VGA, 1 x UID, 1 x PWR
                  Rear: 2 x USB 3.0, 1 x VGA, 1 x RJ-45, 1 x UID
Expansion Slots...
Front Panel of the AGS8200
The front panel of the AGS8200 provides system LEDs and buttons, as well as access to fan trays and HDD/SSD drive bays.
Figure 2: Front Panel Features...
Rear Panel of the AGS8200
The rear panel of the AGS8200 provides access to the GPU QSFP-DD ports, all the server ports, and system PSUs.
Figure 3: Rear Panel Features
GPU module...
System Status LEDs and Buttons
The top-left side of the front panel includes the system LEDs and buttons.
Figure 4: System LEDs and Buttons
Power button/LED
System Fault LED
UID button/LED
Power Fault LED
Reset button
System Fan LED...
Table 2: System Buttons/LEDs (Continued)
LED               Condition    Status
System LED        Solid Red    Critical system error detected.
                               Normal operation.
Power LED         Solid Red    System power failure detected.
                               Normal operation.
System Fan LED    Solid Red    Fan fault detected....
GPU Module
The GPU module contains eight Intel® Gaudi® 2 AI Accelerators with six 400G (4 x 100G PAM4) QSFP-DD ports on the panel, for a total of 2.4 Tbps of external scale-out bandwidth.
QSFP-DD Port LEDs
Each QSFP-DD port includes four status LEDs. Each LED indicates the status of one 100G connection....
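The 2.4 Tbps figure quoted for the GPU module follows directly from the port count and lane speed. The short sketch below reproduces that arithmetic; the constant and function names are illustrative only and do not come from the manual.

```python
# Values taken from the GPU module description: six 400G QSFP-DD ports,
# each made up of 4 x 100G PAM4 lanes. Names are illustrative.
QSFP_DD_PORTS = 6
LANES_PER_PORT = 4
GBPS_PER_LANE = 100


def external_bandwidth_gbps(ports: int = QSFP_DD_PORTS,
                            lanes: int = LANES_PER_PORT,
                            lane_gbps: int = GBPS_PER_LANE) -> int:
    """Return the aggregate external scale-out bandwidth in Gbps."""
    return ports * lanes * lane_gbps


total_gbps = external_bandwidth_gbps()
print(f"{total_gbps} Gbps = {total_gbps / 1000} Tbps")  # 2400 Gbps = 2.4 Tbps
```

Because each 100G lane has its own status LED, the same 6 x 4 breakdown also gives the number of port LEDs on the panel (24).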
Figure 6: Removing the GPU Module
GPU module release levers
CPU Module
The CPU module includes PSUs for the CPU and GPU modules, VGA, console, management, and USB ports, plus eight PCIe slots and an OCP module slot for expansion....
Table 4: MGMT Port Status LED
LED             Condition         Status
Link LED        Solid Green       1000M link up.
                Solid Amber       10/100M link up.
                                  No network link.
Activity LED    Blinking Green    Network activity.
                                  No network activity.
UID Button/LED
The UID (Unit ID) button on the CPU module is duplicated on the system front panel (see “System Status LEDs and Buttons”...
OCP Module Port LEDs
The OCP module includes two 100G QSFP28 ports. Each 100G QSFP28 port includes two status LEDs.
Figure 9: OCP Module Port LEDs
Link LED
Activity LED
Table 6: OCP Module Port Status LEDs
Condition Status
Link LED...
Figure 10: Removing the CPU Module
Module release levers
HDD Module
The HDD module includes 24 drive bays for 2.5-inch HDD/SSD or NVMe drives in the following configurations:
• 16 x HDD/SSD + 8 x NVMe
...
Table 7: HDD/SSD Drive Bay LEDs
LED             Condition       Status
Fault LED       Blinking Red    A drive error has occurred.
                                The drive is operating normally.
Locate LED      Solid Blue      Drive locator is active.
                                Drive locator is not active.
Activity LED    Solid Green     An NVMe drive is present....
Figure 13: Removing the HDD Module
Module release levers
Power Supplies
The AGS8200 includes two 2700 W AC system power supply units (PSUs), and the GPU module includes six 3000 W AC PSUs.
System PSUs
The CPU module has dual 1+1 redundant, hot-swappable PSUs, each with a single status LED....
Table 8: System PSU Status LED
LED        Condition              Status
PSU LED    Solid Green            The PSU is operating normally.
           1 Hz Blinking Green    AC present. Only 12 V standby on (power supply off), or PSU in Smart standby mode.
           2 Hz Blinking Green    PSU firmware updating....
Replacing Power Supplies
The CPU module has dual 1+1 redundant, hot-swappable 2700 W PSUs and the GPU module has six 3+3 redundant, hot-swappable 3000 W PSUs. The procedure for replacing a PSU is the same for both PSU types. The device does not need to be powered off before replacing a PSU....
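To make the 1+1 and 3+3 redundancy figures concrete, the sketch below models each PSU group and works out how much power remains available with full redundancy. It assumes the usual reading of "N+M" (N units carry the load, M are spares); the class and field names are hypothetical and not part of the manual.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PsuGroup:
    """One redundant PSU group, read as N+M: N required units, M spares."""
    name: str
    required: int     # the "N" in N+M (assumed interpretation)
    spare: int        # the "M" in N+M (assumed interpretation)
    watts_each: int

    @property
    def installed(self) -> int:
        return self.required + self.spare

    @property
    def redundant_capacity_w(self) -> int:
        # Power that can be drawn while every spare is still held in reserve.
        return self.required * self.watts_each

    def tolerates(self, failed: int) -> bool:
        # The group keeps its rated capacity while failures do not exceed spares.
        return 0 <= failed <= self.spare


# Figures from the manual: 1+1 x 2700 W (CPU module), 3+3 x 3000 W (GPU module).
CPU_PSUS = PsuGroup("CPU module", required=1, spare=1, watts_each=2700)
GPU_PSUS = PsuGroup("GPU module", required=3, spare=3, watts_each=3000)

for group in (CPU_PSUS, GPU_PSUS):
    print(f"{group.name}: {group.installed} PSUs installed, "
          f"{group.redundant_capacity_w} W available with full redundancy")
```

Under this reading, hot-swapping a single PSU leaves either group within its redundancy margin, which is consistent with the statement that the device does not need to be powered off.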
Fan Trays
The AGS8200 includes 15 hot-swappable fan trays for system cooling. At least 14 fan trays must be installed at all times. If a fan fails, replace the fan tray as soon as possible....
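The fan tray rule (15 trays in total, at least 14 installed at all times) can be expressed as a simple check. This is a minimal sketch for illustration; the function name and the status strings are assumptions, not something the system firmware reports.

```python
TOTAL_FAN_TRAYS = 15       # from the manual
MIN_INSTALLED_TRAYS = 14   # from the manual


def fan_tray_status(installed: int) -> str:
    """Classify an installed-tray count against the manual's minimum."""
    if not 0 <= installed <= TOTAL_FAN_TRAYS:
        raise ValueError(f"installed must be between 0 and {TOTAL_FAN_TRAYS}")
    if installed < MIN_INSTALLED_TRAYS:
        return "below minimum"          # the system must not be run like this
    if installed < TOTAL_FAN_TRAYS:
        return "minimum met, replace missing tray"
    return "fully populated"


print(fan_tray_status(15))  # fully populated
print(fan_tray_status(14))  # minimum met, replace missing tray
print(fan_tray_status(13))  # below minimum
```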
Figure 18: Replacing Fan Trays
Follow this procedure to replace a fan tray:
1. Press the latch on the fan tray.
2. Using the fan tray handle, pull firmly until the fan tray disengages from the connector....
Device Installation
This chapter includes the following sections:
• “Installation Precautions” on page 29
• “Rack Mount Guidelines” on page 29
• “Device Cooling Requirements” on page 30
• “Installing the Device in a Rack” on page 31
• “Connecting Power” on page 33
...
Chapter 2 | Device Installation
Installation Precautions
Warning: This device uses transceivers with lasers to transmit signals over fiber optic cable. The lasers are compliant with the requirements of a Class 1 Laser Product and are inherently eye safe in normal operation. However, you should never look directly into the laser light emitted by a transceiver....
Rack Cooling
When mounting the device in an enclosed rack or cabinet, follow these guidelines to prevent overheating:
• Make sure that enough cool air can flow into the enclosure for the equipment it ...
Package Contents
After unpacking the device, check the package contents to be sure you have received all the items.
• AGS8200 AI Server
• Rack Mounting Kit: contains two (left and right) mounting brackets
• 18 x M6x12 rack screws
...
Figure 20: Removing the Transport Screws
Transport screws
Following your rack plan, mark the holes in the rack where the device will be installed.
Install the device’s rack-mounting brackets in the rack and secure each bracket using four rack screws at the back and three at the front....
Figure 22: Installing the Device in a Rack
Rack screws
Connecting Power
To supply AC power to the device, first verify that the external AC power supply can provide 90 to 264 VAC, 50/60 Hz. The device requires two 2700 W AC PSUs to be installed for the CPU module and six 3000 W AC PSUs for the GPU module....
Insert the plug on the other end of the power cord directly into the socket on the AC PSU.
Check the LED indicator on the PSU to verify that power is being received. If not, recheck the PSU and power cord connections at the AC supply source and PSU....
Device Connections
This chapter includes the following sections:
• “Connecting to QSFP-DD/QSFP28 Ports” on page 37
• “Connecting to the VGA Ports” on page 39
• “Connecting to the Console Port” on page 39
• “Connecting to the 1000BASE-T MGMT Port” on page 41
...
Chapter 3 | Device Connections
Connecting to QSFP-DD/QSFP28 Ports
The device includes 6 QSFP-DD slots for 400 Gbps QSFP-DD transceivers. The supported transceiver types are listed below:
• 400GBASE DAC, AOC, SR8, DR4, and FR4
The OCP module includes 2 QSFP28 slots for 100 Gbps QSFP28 transceivers. The supported transceiver types are listed below:
• 100GBASE CR4, AOC, SR4, LR4, and PSM4
...
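When planning cabling, the two supported-transceiver lists above can be captured as a small lookup table. The sketch below is illustrative only: the dictionary layout and function name are assumptions, and the table contains just the types named in this section, so it should not be treated as an exhaustive compatibility list.

```python
# Transceiver types named in this section, keyed by slot type.
SUPPORTED_TRANSCEIVERS = {
    "QSFP-DD": {"speed_gbps": 400, "types": {"DAC", "AOC", "SR8", "DR4", "FR4"}},
    "QSFP28": {"speed_gbps": 100, "types": {"CR4", "AOC", "SR4", "LR4", "PSM4"}},
}


def is_supported(slot: str, transceiver_type: str) -> bool:
    """Return True if the type is listed for the given slot in this section."""
    entry = SUPPORTED_TRANSCEIVERS.get(slot)
    return entry is not None and transceiver_type.upper() in entry["types"]


print(is_supported("QSFP-DD", "DR4"))   # True
print(is_supported("QSFP28", "SR8"))    # False: SR8 is listed for QSFP-DD only
```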
Connecting to Fiber Optic Ports
Follow these steps to connect cables to QSFP-DD/QSFP28 transceiver ports.
Warning: This device uses transceivers with lasers to transmit signals over fiber optic cable. The transceivers are compliant with the requirements of a Class 1 Laser Product and are inherently eye safe in normal operation....
Connecting to the VGA Ports
The DE-15 VGA ports on the CPU module’s panel and on the device front panel are for connecting a VGA monitor to the system. The USB ports can also be used for keyboard and mouse connections to the server....
• Parity: None
• Stop bit: One
• Data bits: 8
• Flow control: None
Follow these steps to connect to the console port:
1. Attach one end of the console cable to the DE-9 COM port connector on a management PC....
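The console settings listed above map naturally onto the parameters of a serial terminal or serial library. The sketch below collects them into a dictionary whose keys mirror the keyword arguments of pyserial's `serial.Serial` constructor; using pyserial is an assumption made here for illustration, not something the manual prescribes. The baud rate is deliberately left as a required argument because it is not shown in this part of the guide and must be taken from the full console port settings.

```python
def console_settings(baudrate: int) -> dict:
    """Build serial parameters for the console port.

    Parity, stop bits, data bits, and flow control come from the manual.
    The baud rate is NOT given in this section and must be supplied by the caller.
    """
    if baudrate <= 0:
        raise ValueError("baudrate must be a positive integer")
    return {
        "baudrate": baudrate,
        "bytesize": 8,       # Data bits: 8
        "parity": "N",       # Parity: None
        "stopbits": 1,       # Stop bit: One
        "xonxoff": False,    # Flow control: none (software)
        "rtscts": False,     # Flow control: none (hardware)
    }
```

With pyserial installed, the result could be passed as `serial.Serial(port_name, **console_settings(baudrate))`, where `port_name` is whatever device name the management PC assigns to its COM port.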
Connecting to the 1000BASE-T MGMT Port
The RJ-45 10/100/1000BASE-T MGMT port on the CPU module supports an out-of-band (OOB) network connection to any other network device. The connection requires unshielded twisted-pair (UTP) or shielded twisted-pair (STP) cables with RJ-45 connectors at both ends....
Troubleshooting
When possible, before checking specific troubleshooting options, always look for POST messages by first rebooting the device using one of these methods:
• Reset the system through OS software.
• Power down by pressing the power on/off button.
• Remove power to the unit.
...
Chapter 4 | Troubleshooting
Cooling and Fans
If the system is running hot, check these items:
• Check to be sure the ambient temperature is not too high.
• Make sure all fans are running properly.
• Check the fan settings in the BIOS. The fans might need to run at a higher ...
Safety and Regulatory Information
FCC Class A
This equipment has been tested and found to comply with the limits for a Class A digital device, pursuant to part 15 of the FCC Rules. These limits are designed to provide reasonable protection against harmful interference when the equipment is operated in a commercial environment....
Chapter 5 | Safety and Regulatory Information
• Immunity to conducted disturbances, induced by radio-frequency fields: IEC 61000-4-6
• Power frequency magnetic field immunity test according to IEC 61000-4-8
• Voltage dips, short interruptions and voltage variations immunity test according to IEC 61000-4-11
EN 62368-1:2014/A11: 2017 LVD: ...
Warnhinweis: Faseroptikanschlüsse - Optische Sicherheit
LASERPRODUKT DER KLASSE 1
Die Laser entsprechen den Anforderungen eines Laserprodukts der Klasse 1 und sind im Normalbetrieb grundsätzlich augensicher. Allerdings sollten Sie niemals direkt in das von einem Transceiver ausgesendete Laserlicht blicken.
警告:光纤端口安全...
• The appliance coupler (the connector to the unit and not the wall plug) must have a configuration for mating with an EN 60320/IEC 320 appliance inlet.
• The socket outlet must be near the unit and easily accessible. You can only ...
Chapter 5 | Safety and Regulatory Information Avertissement: L’installation et la dépose de ce groupe doivent être confiés à un personnel qualifié. Ne branchez pas votre appareil sur une prise secteur (alimentation électrique) lorsqu'il n'y a pas de connexion de mise à la terre (mise à la masse). Vous devez raccorder ce groupe à...
Chapter 5 | Safety and Regulatory Information Bitte unbedingt vor dem Einbauen das Gerät die folgenden Sicherheitsanweisungen durchlesen: Warnung: Die Installation und der Ausbau des Geräts darf nur durch Fachpersonal erfolgen. Das Gerät sollte nicht an eine ungeerdete Wechselstromsteckdose angeschlossen werden.
Warnings and Cautionary Messages
Warning: This product does not contain any user-serviceable parts.
Warning: Installation and removal of the unit must be carried out by qualified personnel only.
Warning: When connecting this device to a power outlet, connect the field ground lead on the tri-pole power plug to a valid earth ground line to prevent electrical hazards....