HPE Apollo sx40 Server
Maintenance and Service Guide
Abstract:
This document provides an overview of the HPE Apollo sx40 server hardware and
information and procedures used to maintain and service it.
Part Number: P05957-002
Published: April 2019
Edition: 2


Summary of Contents for HPE Apollo sx40

  • Page 2 © Copyright 2019 Hewlett Packard Enterprise Development LP Notices The information contained herein is subject to change without notice. The only warranties for Hewlett Packard Enterprise products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. Hewlett Packard Enterprise shall not be liable for technical or editorial errors or omissions contained herein.
  • Page 3 Record of Revision
      Version   Description
      -001      February 2019: Original printing
      -002      April 2019: Added PCIe accelerator part numbers and updated GPU module replacement procedure
  • Page 5: Table Of Contents

    Contents Hardware overview..................1-1 Introduction....................1-1 Front view ...................... 1-2 Rear view ....................... 1-4 Top view ......................1-5 Block diagram....................1-6 1.5.1 Motherboard................. 1-7 Processors ...................... 1-9 Memory DIMMs.................... 1-9 GPUs......................1-11 Power ......................1-13 1.10 Cooling......................1-14 Customer self repair ..................2-1 Parts catalog ....................3-1 Server spare parts...................
  • Page 6 4.4.8 GPU module replacement ............4-15 4.4.9 I/O port module (third PCIe riser) replacement......4-16 4.4.10 Memory DIMM replacement............. 4-17 4.4.11 Power supply replacement ............4-18 4.4.12 Processor replacement ............... 4-19 4.4.13 Motherboard replacement ............4-21 4.4.14 PCIe card replacement ............... 4-22 4.4.15 PCIe riser replacement...............
  • Page 7 Remote support ....................9-2 Warranty information ..................9-3 Regulatory information.................. 9-3 Documentation feedback ................9-4  P05957-002 Contents...
  • Page 9: Hardware Overview

    This chapter provides an overview of the hardware used in an HPE Apollo sx40 server. Introduction The HPE Apollo sx40 server is a 1U chassis with two Intel processors that supports four NVIDIA Pascal or Volta SXM2 GPUs. Figure 1-1 HPE Apollo sx40 server ...
  • Page 10: Front View

    Front view 1. HPE Apollo sx40 chassis (1U) 4. Unit Identification (UID) LED/button 2. Two SFF hot-swap SATA drive bays 5. Power button 3. Internal fans 6. Status LED Figure 1-2 Front view Figure 1-3 shows the control panel in more detail.
  • Page 11 Table 1-1 Control panel features Item Feature Description Information LED Provides system status: Continuously on and red: An overheat condition has occurred. (This might be caused by cable congestion.) Blinking red: Fan failure, check for an inoperative fan. Solid blue: Local UID has been activated. Use this function to locate the server in a rackmount environment.
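The Information LED states listed above can be kept as a small lookup table when logging front-panel observations. This is a minimal sketch covering only the three states visible in this excerpt; the names `INFO_LED_STATES` and `describe_info_led` are illustrative and do not come from the guide.

```python
# Information LED states from Table 1-1 (only the states shown in this excerpt).
INFO_LED_STATES = {
    ("red", "solid"): "Overheat condition (might be caused by cable congestion)",
    ("red", "blinking"): "Fan failure; check for an inoperative fan",
    ("blue", "solid"): "Local UID activated; use it to locate the server in a rack",
}


def describe_info_led(color: str, pattern: str) -> str:
    """Return the documented meaning, or a fallback for states not covered here."""
    key = (color.strip().lower(), pattern.strip().lower())
    return INFO_LED_STATES.get(key, "State not listed in this excerpt; see Table 1-1")


print(describe_info_led("Red", "Blinking"))
```

Any state missing from the dictionary falls through to the fallback message, so the full table in the guide remains the authority.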
  • Page 12: Rear View

    Rear view 1. Two 2000W Titanium level power supplies 5. Two USB 3.0 connectors 2. Embedded 1 Gb NIC 1 6. Dedicated IPMI LAN port 3. Embedded 1 Gb NIC 2 7. Two full height PCIe Gen3 x16 slots 4. Two half height PCIe Gen3 x16 slots Figure 1-4 Rear view ...
  • Page 13: Top View

    Top view 1. Processor 1 4. Processor 2 2. DIMMs for processor 1 5. DIMMs for processor 2 3. NVIDIA SXM2 GPUs Figure 1-5 Top view  P05957-002 Hardware overview...
  • Page 14: Block Diagram

    Block diagram Figure 1-6 shows the block diagram. [Figure 1-6: block diagram. Labels recoverable from the figure: two processor sockets (ID 0 and ID 1, 205W each), UPI 10.4G/11.2G link, VR13 5+1 phase regulators, DMI3, PCI-E X16 G3 (RSC-G-6)]
  • Page 15: Motherboard

    1.5.1 Motherboard The motherboard uses the Intel C621 PCH chipset. Figure 1-7 shows the motherboard layout and components. [Figure 1-7: Motherboard layout; board marking X11DGQ REV:1.00]
  • Page 16 Figure 1-8 describes the motherboard jumpers, connectors, and LEDs. [Figure 1-8: Motherboard connectors, jumpers, and LEDs. Column headings: Jumper, Description, Default Setting; Connectors, Description; Description, State, Status]
  • Page 17: Processors

    Processors The system supports two Intel Xeon processors from the following processor families: • Intel Xeon Bronze 3100 series • Intel Xeon Silver 4100 series • Intel Xeon Gold 5100 and 6100 series • Intel Xeon Platinum 8100 series See Chapter 3, “Parts catalog,” for a listing of specific processors that are supported. Processor 2 Processor 1 Figure 1-9...
  • Page 18 Figure 1-10 DIMM slot numbering  1-10 Hardware overview P05957-002...
  • Page 19: Gpus

    GPUs The system supports up to four Pascal or Volta SXM2 GPUs running over NVLINK connections. The SXM2 GPUs attach to the GPU interface board (SXM2 add-on module) in the front section of the chassis. SXM2 GPUs Figure 1-11 SXM2 GPUs The following SXM2 GPU accelerators are supported: •...
  • Page 20 Figure 1-12 shows the GPU block diagram. Figure 1-12 GPU block diagram  1-12 Hardware overview P05957-002...
  • Page 21: Power

    Two hot-swappable 2000-W power supplies supply power to the system. The power supplies have an 80 Plus Titanium level rating. Note: The HPE Apollo sx40 server is only considered to be N+1 in the 200-240V range; the 100-127V range requires both power supplies to be operating.
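The redundancy note can be expressed as a small check. The sketch below assumes a two-supply system and reads "N+1 in the 200-240V range" as "one working supply is enough at high line"; it ignores actual system load, and the function name `psu_status` is hypothetical.

```python
def psu_status(input_voltage: float, working_supplies: int) -> str:
    """Classify the power state from input voltage and number of working supplies."""
    high_line = 200 <= input_voltage <= 240  # N+1 range per the note
    low_line = 100 <= input_voltage <= 127   # both supplies required per the note
    if not (high_line or low_line):
        return "outside documented input ranges"
    required = 1 if high_line else 2
    if working_supplies < required:
        return "insufficient power"
    if working_supplies > required:
        return "operating with redundancy"
    return "operating without redundancy"


print(psu_status(208, 2))
print(psu_status(120, 2))
```

At low line a single supply failure takes the system down, which is why the second call reports no redundancy even with both supplies working.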
  • Page 22: Cooling

    1.10 Cooling There are seven fans located at the front of the system. Airflow is from the front to the back of the chassis. Fans Figure 1-14 Fans  1-14 Hardware overview P05957-002...
  • Page 23: Customer Self Repair

    Chapter 2 Customer self repair Hewlett Packard Enterprise products are designed with many Customer Self Repair (CSR) parts to minimize repair time and allow for greater flexibility in performing defective parts replacement. If during the diagnosis period Hewlett Packard Enterprise (or Hewlett Packard Enterprise service providers or service partners) identifies that the repair can be accomplished by the use of a CSR part, Hewlett Packard Enterprise will ship that part directly to you for replacement.
  • Page 24 Parts only warranty service Your Hewlett Packard Enterprise Limited Warranty may include a parts only warranty service. Under the terms of parts only warranty service, Hewlett Packard Enterprise will provide replacement parts free of charge. For parts only warranty service, CSR part replacement is mandatory. If you request Hewlett Packard Enterprise to replace these parts, you will be charged for the travel and labor costs of this service ...
  • Page 25: Parts Catalog

    Hewlett Packard Enterprise PartSurfer website (http://www.hpe.com/info/partssurfer). Table 3-1 Server spare parts Component HPE spare part number Description CSR? HPE Apollo sx40 4x P100 SXM2 GPU CTO Server (Q5S69A) Cables P0001680-001 55cm 30AWG SATA S-S cable P00239-001 GPU 5.5CM,2x4F/RA/CPUto2x4F/RA/CPU,P4.2 P00240-001 GPU 30CM,2x4F/CPUto2x4F/CPU,P4.2...
  • Page 26 HPE spare part number Description CSR? Power P0005138-001 1U 2000W Redundant Titanium Power Supply W/PMbus 73.5x40x265mm,RoHS/REACH (Q9B47A) HPE Apollo sx40 2SFF 4x V100 SXM2 CTO Server Cables P0001680-001 55cm 30AWG SATA S-S cable P00239-001 GPU 5.5CM,2x4F/RA/CPUto2x4F/RA/CPU,P4.2 P00240-001 GPU 30CM,2x4F/CPUto2x4F/CPU,P4.2 P0005127-001...
  • Page 27: Options Spare Parts

    Table 3-1 (continued) Server spare parts Component HPE spare part number Description CSR? P05912-001 SPS-PCA, HPE NVIDIA Tesla V100 SXM2 32GB Optional P0005132-001 GPU Bridge Board connecting X10DGQ to AOM-SXM2 P03782-001 Quad Volta SXM 2.0 modules with Rev 2.00 board Optional...
  • Page 28 Table 3-3 (continued) Hard drives Product SKU HPE spare part number Description CSR? P03594-B21 P04111-001 SPS-DRV SSD 240GB SFF SATA MU RW DS P03596-B21 P04112-001 SPS-DRV SSD 480GB SFF SATA MU RW DS P03598-B21 P04113-001 SPS-DRV SSD 960GB SFF SATA MU RW DS...
  • Page 29 Table 3-8 PCIe NVMe Product SKU HPE spare part number Description CSR? 877831-B21 880418-001 SPS-HPE 4TB PCIe x4 RI HH DS Card 877825-B21 879772-001 SPS-1.6TB PCIe x8 MU HH DS Card 877827-B21 879773-001 SPS-3.2TB PCIe x8 MU HH DS Card 877829-B21 879774-001 SPS-6.4TB PCIe x8 MU HH DS Card...
  • Page 30 Table 3-9 Processors HPE spare part number Description CSR? Q5S81A 874727-001 SPS-CPU SKL Xeon-P 8176 28c 2.1G 165W Q5S82A 874728-001 SPS-CPU SKL Xeon-P 8170 26c 2.1G 165W Q5S83A 875729-001 SPS-CPU SKL Xeon-P 8164 26c 150W Q5S84A 874729-001 SPS-CPU SKL Xeon-P 8160 24C 2.1G 150W...
  • Page 33: Part Replacement Procedures

    Be sure to observe all ESD, electrical, and safety precautions when you perform the part replacement procedures. Teardown video Before performing any part replacement, view the system teardown video to acquaint yourself with part location and removal. This video is included in the HPE Apollo sx40 Gen10 Server online training.  P05957-002...
  • Page 34: Accessing Internal Components

    Accessing internal components The chassis has two covers that provide access to internal components. Remove the appropriate cover to access an internal component. 4.3.1 Removing the front top cover To access components under the front cover: Remove the two screws in the middle of the chassis. ...
  • Page 35 Loosen the two screws on the front of the chassis. Push the cover towards the front of the chassis and lift the cover off of the chassis.  P05957-002 Part replacement procedures...
  • Page 36: Removing The Center Brace

    4.3.2 Removing the center brace To remove the center brace: Remove the front top cover. Remove the two screws that secure the center brace. Lift the brace out of the chassis.  Part replacement procedures P05957-002...
  • Page 37: Removing The Rear Top Cover

    4.3.3 Removing the rear top cover To access components under the rear cover: Remove the two screws in the middle of the chassis.  P05957-002 Part replacement procedures...
  • Page 38 Loosen the two screws on the rear of the chassis. Slide the cover towards the rear of the chassis, and lift the cover off of the chassis.  Part replacement procedures P05957-002...
  • Page 39: Part Replacement Procedures

    Part replacement procedures 4.4.1 Bridge board replacement To replace a bridge board: Power down the system. Disconnect the system from power. Remove the front top cover. Remove the center brace. Loosen the screw that secures the bridge board. Lift the bridge board out of the chassis. Remove the bridge board from the metal bracket.
  • Page 40: Control Panel Replacement

    4.4.2 Control panel replacement To replace the control panel: Power down the system. Disconnect the system from power. Remove the front top cover. Remove the foam block next to the control panel. Remove the fan next to the control panel. Disconnect the cable from the control panel.
  • Page 41: Disk Drive Replacement (External Drive)

    10. Secure the control panel with three screws. 11. Connect the cable to the control panel. 12. Install the fan next to the control panel. 13. Install the foam block over the control panel. 14. Install the front top cover. 15. Connect the system to site power. 16.
  • Page 42: Disk Drive Replacement (Internal Drive)

    4.4.4 Disk drive replacement (internal drive) To replace an internal drive: Power down the system. Disconnect the system from power. Remove the front top cover. Disconnect cables from the SATA interface board. Disconnect the cables from both drives in the internal drive pair. ...
  • Page 43 Loosen the screw that secures the drive pair carrier. Lift the drive pair carrier out of the chassis. Remove the failing drive from the drive pair carrier. Install the replacement drive in the drive pair carrier. 10. Insert the drive pair carrier into the chassis. 11.
  • Page 44: Fan Replacement

    4.4.5 Fan replacement To replace a fan: Power down the system. Disconnect the system from power. Remove the front top cover. Disconnect the fan cable from the fan control board. Lift the fan out of the chassis. Insert the replacement fan into the chassis. Connect the fan cable to the fan control board.
  • Page 45: Gpu Interface Board Replacement

    Lift the fan control board out of the chassis. Insert the replacement fan control board. 10. Secure the fan control board with screws. 11. Connect the fan cables to the fan control board. 12. Connect the other cables to the fan control board. 13.
  • Page 46 Remove the 11 screws that secure the GPU interface board. Remove the GPU interface board. Caution: The GPU connectors on the GPU interface board are very delicate. Be careful not to damage them on the existing board or the replacement board. 10.
  • Page 47: Gpu Module Replacement

    4.4.8 GPU module replacement To replace a GPU module: Power down the system. Disconnect the system from power. Remove the front top cover. Loosen the four screws that secure the GPU module’s heatsink. Loosen the screws in an “X” pattern. Lift the heatsink out of the chassis.
  • Page 48: I/O Port Module (Third Pcie Riser) Replacement

    12. Apply new TIM to the GPU module. Use one tube of TIM and apply four dots and an “X” pattern. Caution: Verify that there is no debris or particle contamination in the TIM application. Debris or particle contamination can cause damage to the GPU. If there is debris or particle contamination, clean off the TIM and apply fresh TIM.
  • Page 49: 4.4.10 Memory Dimm Replacement

    To replace the third PCIe riser or the I/O port module: Power down the system. Disconnect the system from power. Remove the rear top cover. Disconnect any cables from the I/O ports. Disconnect the VGA cable from the third PCIe riser. Lift the third PCIe riser out of the chassis.
  • Page 50: 4.4.11 Power Supply Replacement

    Push down on the DIMM until it locks in place. Install the rear top cover. Connect power to the system. 10. Power up the system. 4.4.11 Power supply replacement To replace a power supply: Disconnect the power cord from site power. Disconnect the power cord from the power supply.
  • Page 51: 4.4.12 Processor Replacement

    4.4.12 Processor replacement To replace a processor: Power down the system. Disconnect the system from power. Remove the rear top cover. Loosen the screws in the heatsink in the order shown on the heatsink. Lift the heatsink and processor out of the chassis. Pull the heatsink off of the processor and clean the heatsink.
  • Page 52 Align pin one on the processor (indicated by the gold triangle) under the number “1” on the heatsink. 10. Attach the replacement processor to the heatsink. 11. Inspect the processor socket for any debris or bent pins. Remove any debris and fix any bent pins before installing a processor in the socket.
  • Page 53: 4.4.13 Motherboard Replacement

    13. Tighten the screws on the heatsink in the order shown on the heatsink. 14. Install the rear top cover. 15. Connect power to the system. 16. Power up the system. 4.4.13 Motherboard replacement To replace a motherboard: Power down the system. Disconnect the system from power.
  • Page 54: 4.4.14 Pcie Card Replacement

    Remove the jackpost and 10 screws that secure the motherboard. (Two screws secure the air shroud for the power supplies.) 10. Lift the motherboard out of the chassis. 11. Insert the replacement motherboard and secure it with the screws and jackpost. 12.
  • Page 55: 4.4.15 Pcie Riser Replacement

    4.4.15 PCIe riser replacement This procedure works for the first and second PCIe risers. To replace the third PCIe riser (which also contains the I/O port module), refer to “I/O port module (third PCIe riser) replacement” on page 4-16. To replace a PCIe riser: Power down the system.
  • Page 56 Remove the three screws that secure the PCIe riser to the bracket. 10. Remove the PCIe riser from the metal bracket. 11. Install the metal bracket on the replacement PCIe riser. 12. Secure the PCIe riser to the metal bracket with three screws. 13.
  • Page 57: 4.4.16 Sata Interface Board Replacement

    4.4.16 SATA interface board replacement To replace the SATA interface board: Power down the system. Disconnect the system from power. Remove the front top cover. Disengage any external drives from the SATA interface board. Disconnect cables from the SATA interface board. Remove the internal drive pair.
  • Page 59: Troubleshooting

    Chapter 5 Troubleshooting This chapter provides guidance for troubleshooting HPE Apollo sx40 server failures. No power Make sure that no short circuits exist between the motherboard and the chassis. Verify that all jumpers are set to their default positions. Turn the power switch on and off to test the system.
  • Page 60: Bios Beep Codes

    BIOS beep codes Table 5-1 BIOS error beep (POST) codes
      Beep code         Error message                     Description
      1 short           Refresh                           Circuits have been reset (ready to power up)
      5 short, 1 long   Memory error                      No memory detected in the system
      5 long, 2 short   Display memory read/write error   Video adapter is missing or has faulty memory
      ...
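The beep codes in Table 5-1 map naturally onto a lookup keyed by the beep sequence. This is a sketch limited to the three codes visible in this excerpt; the message and description split follows the table as reconstructed from the flattened text, and `BEEP_CODES` and `decode_beeps` are illustrative names.

```python
from typing import Dict, Tuple

Beep = Tuple[int, str]  # (count, "short" or "long"), in the order heard

# Only the rows visible in this excerpt of Table 5-1.
BEEP_CODES: Dict[Tuple[Beep, ...], Tuple[str, str]] = {
    ((1, "short"),): ("Refresh", "Circuits have been reset (ready to power up)"),
    ((5, "short"), (1, "long")): ("Memory error", "No memory detected in the system"),
    ((5, "long"), (2, "short")): (
        "Display memory read/write error",
        "Video adapter is missing or has faulty memory",
    ),
}


def decode_beeps(*groups: Beep) -> str:
    """Translate a heard beep sequence into the documented message and description."""
    entry = BEEP_CODES.get(tuple(groups))
    if entry is None:
        return "Beep pattern not listed in this excerpt; see Table 5-1"
    message, description = entry
    return f"{message}: {description}"


print(decode_beeps((5, "short"), (1, "long")))
```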
  • Page 61: System Loses Its Setup Configuration

    System loses its setup configuration If the system becomes unstable during or after OS installation, check the following: • CPU/BIOS support: Make sure that your CPU is supported and that you have the latest BIOS installed in your system. • Memory support: Make sure that the memory modules are supported by testing the modules with a memory diagnostic utility.
  • Page 63: Firmware

    Chapter 6 Firmware This chapter describes how to update the firmware used in an HPE Apollo sx40 server. About firmware There are two main types of firmware in the server: • BIOS • BMC Note: Other add-on components (for example, HCA cards) can also contain firmware. Refer to an add-on component’s product documentation for information about flashing this...
  • Page 64: Using The Sumtool To Flash The Bios

    Access the USB filesystem. (For example, if the USB filesystem is listed as FS0, type FS0: to access the USB filesystem.) Change to the directory where the BIOS file is stored on the thumb drive. Run the flash.nsh script: flash.nsh <bios_filename> When flashing is complete, AC power cycle the server.
  • Page 65: Flashing The Bmc

    Flashing the BMC There are three methods to flash the BMC: • Linux command AlUpdate • SUMTool • Web GUI 6.3.1 Using a Linux command to flash the BMC Use the AlUpdate command to flash the BMC from the Linux operating system: AlUpdate -f <bmc_image_file>...
  • Page 66: Using The Sumtool To Flash The Bmc

    6.3.2 Using the SUMTool to flash the BMC Use the sum command to flash the BMC with SUMTool. In-band: Upload the new BMC file to the server. Run the command to flash the firmware: # sum -c UpdateBmc --file <bmc_filename> --overwrite_cfg --overwrite_sdr When flashing is complete, AC power cycle the server.
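When the in-band update is scripted, it helps to build the argument list once and review it before running anything. The sketch below only assembles and prints the command shown above; it does not execute it. The file name `bmc_image.bin` is a placeholder, and `build_sum_bmc_update` is a hypothetical helper, not part of SUMTool.

```python
import shlex
from typing import List


def build_sum_bmc_update(bmc_filename: str) -> List[str]:
    """Assemble the in-band BMC update command from the procedure, as an argv list."""
    return [
        "sum",
        "-c", "UpdateBmc",
        "--file", bmc_filename,
        "--overwrite_cfg",
        "--overwrite_sdr",
    ]


# Placeholder file name; substitute the BMC image uploaded to the server.
command = build_sum_bmc_update("bmc_image.bin")
print(shlex.join(command))  # review the command before running it as root
```

Passing the list form to a process launcher avoids shell quoting problems if the image path contains spaces.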
  • Page 67 Click Enter Update Mode. A warning message appears. Click Yes to enter update mode. Click Browse.  P05957-002 Firmware...
  • Page 68 Select the BMC firmware file. Click Upload Firmware. Uncheck all three Preserve check boxes. These must be unchecked for the update to be successful (and to return to the default settings). Click Start Upgrade. Warning: To properly update the firmware, do not interrupt the upgrade process. Once the process completes, the system automatically reboots.
  • Page 69: Specifications

    Chapter 7 Specifications Physical specifications Table 7-1 Physical specifications Feature Description Dimensions (H x W x D) 4.3 x 43.7 x 89.4 cm (1.7 x 17.2 x 35.2 in) Weight (Approximate) 21.8 kg (48 lb) (Two hard drives and two processors installed) Environmental specifications Table 7-2 Environmental specifications...
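The metric values in Table 7-1 can be converted to imperial units with standard factors, which is useful when checking rack depth or lift limits. This is a minimal sketch; the helper names are illustrative, and results are rounded to one decimal place.

```python
CM_PER_INCH = 2.54      # exact by definition
LB_PER_KG = 2.20462     # pounds per kilogram, rounded


def cm_to_in(cm: float) -> float:
    """Convert centimeters to inches, rounded to one decimal place."""
    return round(cm / CM_PER_INCH, 1)


def kg_to_lb(kg: float) -> float:
    """Convert kilograms to pounds, rounded to one decimal place."""
    return round(kg * LB_PER_KG, 1)


dimensions_cm = (4.3, 43.7, 89.4)  # chassis dimensions from Table 7-1
print([cm_to_in(d) for d in dimensions_cm])
print(kg_to_lb(21.8))              # approximate weight from Table 7-1
```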
  • Page 70 Table 7-2 (continued) Environmental specifications Feature Description Extended ambient operating support: System performance during standard operating support may be reduced if operating with a fan fault or above 30°C (86°F). For approved hardware configurations, the supported system inlet range is extended to be: 5° to 10°C (41° to 50°F) and 35° to 40°C (95°...
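The inlet temperature rules above can be sketched as a classifier. Two assumptions are labeled in the code: the standard range of 10° to 35°C is inferred from the gap between the two extended bands and is not stated in this excerpt, and the extended bands apply only to approved hardware configurations. The function names are hypothetical.

```python
from typing import Tuple


def c_to_f(celsius: float) -> float:
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


def classify_inlet(temp_c: float, fan_fault: bool = False) -> Tuple[str, bool]:
    """Return (support band, performance_may_be_reduced) for an inlet temperature."""
    if 10 <= temp_c <= 35:
        band = "standard"      # ASSUMPTION: inferred from the extended bands
    elif 5 <= temp_c < 10 or 35 < temp_c <= 40:
        band = "extended"      # approved hardware configurations only
    else:
        band = "unsupported"
    # Performance may be reduced with a fan fault or above 30 C (86 F).
    reduced = band != "unsupported" and (fan_fault or temp_c > 30)
    return band, reduced


print(classify_inlet(32))
print(c_to_f(30))
```

Treat the result as a planning aid only; the complete Table 7-2 in the guide defines the supported ranges.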
  • Page 71: Websites

    Chapter 8 Websites General websites • Hewlett Packard Enterprise Information Library: www.hpe.com/info/EIL • Single Point of Connectivity Knowledge (SPOCK) Storage compatibility matrix: www.hpe.com/storage/spock • Storage white papers and analyst reports: www.hpe.com/storage/whitepapers For additional websites, see Chapter 9, “Support and other resources.”...
  • Page 73: Support And Other Resources

    Support and other resources Accessing Hewlett Packard Enterprise Support • For live assistance, go to the Contact Hewlett Packard Enterprise Worldwide website: http://www.hpe.com/assistance • To access documentation and support services, go to the Hewlett Packard Enterprise Support Center website: http://www.hpe.com/support/hpesc 9.1.1...
  • Page 74: Customer Self Repair

    Note: Access to some updates might require product entitlement when accessed through the Hewlett Packard Enterprise Support Center. You must have an HPE Passport set up with relevant entitlements. Customer self repair Hewlett Packard Enterprise customer self repair (CSR) programs allow you to repair your product.
  • Page 75: Warranty Information

    HPE Proactive Care service: Supported products list www.hpe.com/services/proactivecaresupportedproducts HPE Proactive Care advanced service: Supported products list www.hpe.com/services/proactivecareadvancedsupportedproducts Proactive Care customer information Proactive Care central www.hpe.com/services/proactivecarecentral Proactive Care service activation www.hpe.com/services/proactivecarecentralgetstarted Warranty information To view the warranty for your product or to view the Safety and Compliance Information for...
  • Page 76: Documentation Feedback

    Hewlett Packard Enterprise is committed to providing documentation that meets your needs. To help us improve the documentation, send any errors, suggestions, or comments to Documentation Feedback (docsfeedback@hpe.com). When submitting your feedback, include the document title, part number, edition, and publication date located on the front cover of the document. For online help content, include the product name, product version, help edition, and publication date located on the legal notices page.
