Nvidia ConnectX-5 User Manual

InfiniBand/VPI Adapter Cards for OCP Spec 2.0
NVIDIA ConnectX-5 InfiniBand/VPI Adapter
Cards for OCP Spec 2.0 User Manual
 
 
Exported on Sep/19/2022 09:37 AM

Summary of Contents for Nvidia ConnectX-5

  • Page 1 NVIDIA ConnectX-5 InfiniBand/VPI Adapter Cards for OCP Spec 2.0 User Manual. Exported on Sep/19/2022 09:37 AM...
  • Page 2: Table Of Contents

    Table of Contents
    Introduction, 7
    Products Overview, 7
    OCP Spec 2.0 Stacking Heights, 8
    OCP Spec 2.0 Type 1 Stacking Height, 8
    OCP Spec 2.0 Type 2 Stacking Height, 8
    Features and Benefits, 8
    Multi-Host Technology, 10
    Operating Systems/Distributions, 11
    Connectivity ...
  • Page 3
    UEFI Secure Boot, 45
    Performance Tuning, 47
    VMware Driver Installation, 47
    Hardware and Software Requirements, 47
    Installing NVIDIA NATIVE ESXi Driver for VMware vSphere, 47
    Removing Earlier NVIDIA Drivers, 48
    Firmware Programming, 48
    Updating Adapter Firmware, 50
    Troubleshooting, 51 ...
  • Page 4
    Board Mechanical Drawing and Dimensions, 57
    Finding the GUID/MAC and Serial Number on the Adapter Card, 61
    MCX545A-ECAN Board Labels (Example), 61
    MCX545B-ECAN Board Labels (Example), 62
    MCX545M-ECAN Board Labels (Example), 62
    Document Revision History, 63 ...
  • Page 5 Compute Project (OCP), Spec 2.0. It provides details as to the interfaces of the board, specifications, required software and firmware for operating the board, and relevant documentation. Ordering Part Numbers The table below provides the ordering part numbers (OPN) for the available ConnectX-5 VPI adapter cards for OCP Spec 2.0. Marketing Description Model...
  • Page 6 NVIDIA ConnectX® NATIVE ESXi stack. See VMware® ESXi Drivers Documentation. NVIDIA Firmware Utility (mlxup) User NVIDIA firmware update and query utility used to update the Manual and Release Notes firmware. Refer to Firmware Utility (mlxup) Documentation. NVIDIA Firmware Tools (MFT) User Manual User Manual describing the set of MFT firmware management tools for a single node.
  • Page 7: Introduction

    PCI Express Gen 3.0/4.0 servers used in Enterprise Data Centers and High-Performance Computing environments. The following provides the ordering part number, port speed, number of ports, and PCI Express speed.  Products Overview Model ConnectX-5 VPI Cards for OCP Spec 2.0 Part Number MCX545A-ECAN MCX545B-ECAN MCX545M-ECAN MCX546A-EDAN OCP Spec 2.0 Type...
  • Page 8: OCP Spec 2.0 Stacking Heights

    OCP Spec 2.0 Stacking Heights OCP Spec 2.0 Type 1 Stacking Height This section applies to MCX542A-ACAN. The single-port 100Gb/s adapter card complies with OCP Spec 2.0 Type 1 with 8mm stacking height. OCP Spec 2.0 Type 2 Stacking Height ...
  • Page 9 NVGRE and VXLAN. While this solves network scalability issues, it hides the TCP packet from the hardware offloading engines, placing higher loads on the host CPU. ConnectX-5 effectively addresses this by providing advanced NVGRE and VXLAN hardware offloading engines that encapsulate and de-capsulate the overlay protocol.
  • Page 10: Multi-Host Technology

    PCIe connections to each of the four CPUs in the server. The ConnectX-5 PCIe x16 interface is separated into four independent PCIe x4 interfaces. Each interface is connected to a separate host with no performance degradation.
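As a rough illustration of the lane split described above, the following Python sketch computes the encoded line rate of a PCIe link. The per-lane rates (8 GT/s for Gen 3.0, 16 GT/s for Gen 4.0) and the 128b/130b encoding come from the PCI Express specifications, not from this manual, and the function name is hypothetical. The result is an upper bound that ignores packet and protocol overhead.

```python
# Per-lane signalling rates in GT/s (from the PCIe specifications; not stated in this manual).
GT_PER_LANE = {3: 8.0, 4: 16.0}
ENCODING_EFFICIENCY = 128 / 130  # 128b/130b line coding used by Gen 3.0 and Gen 4.0


def pcie_link_gbps(gen: int, lanes: int) -> float:
    """Return the encoded line rate of a PCIe link in Gb/s (upper bound, no protocol overhead)."""
    if gen not in GT_PER_LANE:
        raise ValueError(f"unsupported PCIe generation: {gen}")
    if lanes <= 0:
        raise ValueError(f"lane count must be positive: {lanes}")
    return GT_PER_LANE[gen] * lanes * ENCODING_EFFICIENCY


if __name__ == "__main__":
    full = pcie_link_gbps(3, 16)     # the undivided x16 interface
    per_host = pcie_link_gbps(3, 4)  # one of the four independent x4 interfaces
    print(f"Gen 3.0 x16: {full:.1f} Gb/s, x4 per host: {per_host:.1f} Gb/s")
```

Under these assumptions the four x4 interfaces together carry exactly the same line rate as the single x16 interface, so the split itself does not reduce the total PCIe bandwidth available to the card.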
  • Page 11: Operating Systems/Distributions

    receive network traffic independently without the need to send network data to other CPUs using QPI bus. Operating Systems/Distributions In MCX545M-ECAN, only OFED is supported.
    • RHEL/CentOS
    • Windows
    • FreeBSD
    • VMware
    • OpenFabrics Enterprise Distribution (OFED)
    • OpenFabrics Windows Distribution (WinOF-2)
    Connectivity •...
  • Page 12: Interfaces

     The adapter card includes special circuits to protect from ESD shocks to the card/server when plugging copper cables. PCI Express Interface The table below describes the supported PCIe interface in ConnectX-5 adapter cards. ConnectX-5 IC Supported PCIe Interface Features ConnectX-5 Ex PCIe Gen 3.0/4.0 (1.1 and 2.0 compatible)
  • Page 13: Heat Sink Interface

    Heat Sink Interface A heatsink is attached to the ConnectX-5 IC in order to dissipate the heat from the ConnectX-5 IC. It is attached either by using four spring-loaded push pins that insert into four mounting holes or by screws.
  • Page 15: Hardware Installation

    Hardware Installation Installation and initialization of ConnectX-5 adapter cards for OCP Spec 2.0 require attention to the mechanical attributes, power specification, and precautions for electronic equipment. Safety Warnings  Safety warnings are provided here in the English language. For safety warnings in other...
  • Page 16: Installation Procedure Overview

    CLASS 1 LASER PRODUCT and reference to the most recent laser standards: IEC 60825-1:1993 + A1:1997 + A2:2001 and EN 60825-1:1994 + A1:1996 + A2:20 Installation Procedure Overview The installation procedure of ConnectX-5 adapter cards for OCP Spec 2.0 involves the following steps: Direct Link Procedure Check the system’s hardware and software...
  • Page 17: Airflow Requirements

    (http://www.opencompute.org/wiki/Server/SpecsAndDesigns). Airflow Requirements ConnectX-5 adapter cards are offered with two airflow patterns: from the heatsink to the network ports, and vice versa, as shown below. Please refer to the TBD chapter for airflow numbers for each specific card model.
  • Page 18: Adapter Card Installation Instructions

    Shut down your system if active: Turn off the power to the system, and disconnect the power cord. Refer to the system documentation for instructions. Before you install the ConnectX-5 card, make sure that the system is disconnected from power.
  • Page 19: Cables And Modules

    PCI Express slot until firmly seated. Secure the adapter with the adapter clip or screw. To uninstall the adapter card, see Uninstalling the Card. Cables and Modules  To obtain the list of supported NVIDIA cables for your adapter, please refer to the Cables Reference Table at http://www.nvidia.com/products/interconnect/cables-configurator.php.
  • Page 20: Identifying The Card In Your System

    Get the device location on the PCI bus by running lspci and locating lines with the string “Mellanox Technologies”:
        lspci | grep -i Mellanox
        Network controller: Mellanox Technologies MT28800 Family [ConnectX-5]
    On Windows: Open Device Manager on the server. Click Start => Run, and then enter devmgmt.msc.
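The lspci check above can also be scripted. The following is a minimal Python sketch, assuming lspci is on the PATH of a Linux host; the function names are hypothetical and the filter only mirrors what grep -i Mellanox does.

```python
import shutil
import subprocess


def find_mellanox_lines(lspci_output: str) -> list[str]:
    """Return the lspci lines that mention Mellanox, case-insensitively (like grep -i)."""
    return [line.strip() for line in lspci_output.splitlines() if "mellanox" in line.lower()]


def scan_host() -> list[str]:
    """Run lspci on this host if it is installed; return an empty list otherwise."""
    if shutil.which("lspci") is None:
        return []
    result = subprocess.run(["lspci"], capture_output=True, text=True, check=False)
    return find_mellanox_lines(result.stdout)


if __name__ == "__main__":
    for line in scan_host():
        print(line)
```

Keeping the text filter separate from the subprocess call makes it possible to check the filter against saved lspci output from another machine.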
  • Page 21: Adapter Cards Extraction Instructions

    In the Value display box, check the fields VEN and DEV (fields are separated by ‘&’). In the display example above, notice the sub-string “PCI\VEN_15B3&DEV_1003”: VEN is equal to 0x15B3 – this is the Vendor ID of NVIDIA; and DEV is equal to 1018 (for ConnectX-5) – this is a valid NVIDIA PCI Device ID.
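To make the VEN and DEV fields concrete, here is a small Python sketch that extracts both values from a hardware ID string of the form quoted above. The function name is hypothetical, and the sketch only parses the string; it makes no claim about which device IDs correspond to which adapter.

```python
import re

# Matches the VEN_ and DEV_ fields of a Windows hardware ID such as PCI\VEN_15B3&DEV_1003.
_HWID_RE = re.compile(r"VEN_([0-9A-Fa-f]{4})&DEV_([0-9A-Fa-f]{4})")


def parse_hardware_id(hwid: str) -> tuple[int, int]:
    """Return (vendor_id, device_id) as integers parsed from a hardware ID string."""
    match = _HWID_RE.search(hwid)
    if match is None:
        raise ValueError(f"no VEN/DEV fields in {hwid!r}")
    return int(match.group(1), 16), int(match.group(2), 16)


if __name__ == "__main__":
    vendor, device = parse_hardware_id(r"PCI\VEN_15B3&DEV_1003")
    print(f"vendor=0x{vendor:04X} device=0x{device:04X}")
```

Comparing the parsed vendor value against 0x15B3 is then a one-line check.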
  • Page 22: Card Extraction

    Remove any metallic objects from your hands and wrists. It is strongly recommended to use an ESD strap or other antistatic devices. Turn off the system and disconnect the power cord from the server. Card Extraction  Please note that the following images are for illustration purposes only. Verify that the system is powered off and unplugged.
  • Page 23 6. To remove the card, gently pull the adapter card upwards.
  • Page 24: Driver Installation

    VMware Driver Installation Windows Driver Installation For Windows, download and install the latest NVIDIA WinOF-2 for Windows software package available via the NVIDIA web site at: http://www.nvidia.com > Products > Software > Ethernet Drivers > Download. Follow the installation instructions included in the download package (also available from the download page).
  • Page 25: Installing NVIDIA WinOF-2 Driver

    Both Attended and Unattended installations require administrator privileges. WinOF-2 supports adapter cards based on the NVIDIA ConnectX®-4 and above family of adapter IC devices only. If you have ConnectX-3 and ConnectX-3 Pro on your server, you will need to install the WinOF driver. For details on how to install the WinOF driver, please refer to the WinOF User Manual.
  • Page 26 Click Next in the Welcome screen. Read then accept the license agreement and click Next. Select the target folder for the installation. The firmware upgrade screen will be displayed in the following cases: If the user has an OEM card. In this case, the firmware will not be displayed.
  • Page 27 If the user has a standard NVIDIA card with an older firmware version, the firmware will be updated accordingly. However, if the user has both an OEM card and an NVIDIA card, only the NVIDIA card will be updated. Select a Complete or Custom installation, then follow Step a onward.
  • Page 28 Diagnostic Tools - installation tools used for diagnostics, such as mlx5cmd Click Next to install the desired tools. Click Install to start the installation. If the firmware upgrade option was checked in Step 7, you will be notified if a firmware upgrade is required. See TBD.
  • Page 29 Click Finish to complete the installation. Unattended Installation  If no reboot options are specified, the installer restarts the computer whenever necessary without displaying any prompt or warning to the user. Use the /norestart or /forcerestart standard command-line options to control reboots. The following is an example of an unattended installation session.
  • Page 30: Installation Results

    Device Manager. Upon installation completion, the inf files can be located at: %ProgramFiles%\Mellanox\MLNX_WinOF2\Drivers\<OS> To see the NVIDIA network adapters, display the Device Manager and pull down the Network adapters menu. Extracting Files Without Running Installation To extract the files without running installation, perform the following steps.
  • Page 31 Click Next to create a server image. Click Change and specify the location to which the files are extracted. Click Install to extract this folder, or click Change to install to a different folder.
  • Page 32: Uninstalling NVIDIA WinOF-2 Driver

    /v"/qn" Firmware Upgrade If the machine has a standard NVIDIA card with an older firmware version, the firmware will be automatically updated as part of the WinOF-2 package installation. For information on how to upgrade firmware manually, please refer to the MFT User Manual...
  • Page 33: Deploying The Driver On A Nano Server

    Deploying the Driver on a Nano Server Offline Installation To deploy the Driver on a Nano Server: Go to the NVIDIA WinOF web page at http://www.nvidia.com > Products > Ethernet Drivers > Windows SW/Drivers. Download the driver (MLNX_WinOF2_MLNX_WinOF2-1_64_mlx5_All_win2016_x64_fre_1_64_15407.exe). Extract the driver to a local directory (see Extracting Files Without...
  • Page 34: Linux Driver Installation

    To upgrade, it is recommended to run a script that will execute all the required upgrade commands. Linux Driver Installation This section describes how to install and test the NVIDIA OFED for Linux package on a single server with an NVIDIA ConnectX-5 adapter card installed. Prerequisites...
  • Page 35: Installing NVIDIA OFED

    You will be prompted to acknowledge the deletion of the old packages. • If you need to install NVIDIA OFED on an entire (homogeneous) cluster, a common strategy is to mount the ISO image on one of the cluster nodes and then copy it to a shared file system...
  • Page 36 RAM. For your machine to be part of the InfiniBand/VPI fabric, a Subnet Manager must be running on one of the fabric nodes. At this point, NVIDIA OFED for Linux has already installed the OpenSM Subnet Manager on your machine.
  • Page 37 Device #1: ---------- Device Type: ConnectX-5 Part Number: MCX545M-ECAN Description: ConnectX®-5 VPI network interface card for OCP with Multi-Host, , with host management, EDR IB (100Gb/s) and 100GbE, single-port QSFP28, PCIe3.0 x16, no bracket.
  • Page 38 /etc/infiniband/info. Most of the NVIDIA OFED components can be configured or reconfigured after the installation, by modifying the relevant configuration files. See the relevant chapters in this manual for details.
  • Page 39 OPENIBD_PRE_START OPENIBD_POST_START OPENIBD_PRE_STOP OPENIBD_POST_STOP Example: OPENIBD_POST_START=/sbin/openibd_post_start.sh  An example of OPENIBD_POST_START script for activating all interfaces is provided in the MLNX_OFED package under the docs/scripts/openibd-post-start-configure-interfaces/ folder. Driver Load Upon System Boot Upon system boot, the NVIDIA drivers will be loaded automatically.
  • Page 40: Installing MLNX_OFED Using YUM

    Failed to start the mst driver Uninstalling MLNX_OFED Use the script /usr/sbin/ofed_uninstall.sh to uninstall the NVIDIA OFED package. The script is part of the ofed-scripts RPM. Installing MLNX_OFED Using YUM This type of installation is applicable to RedHat/OL, Fedora, and XenServer operating systems.
  • Page 41
    # mount -o ro,loop MLNX_OFED_LINUX-<ver>-<OS label>-<CPU arch>.iso /mnt
    Download and install NVIDIA GPG-KEY: The key can be downloaded via the following link: http://www.nvidia.com/downloads/ofed/RPM-GPG-KEY-Mellanox
    # wget http://www.nvidia.com/downloads/ofed/RPM-GPG-KEY-Mellanox
    --2014-04-20 13:52:30-- http://www.nvidia.com/downloads/ofed/RPM-GPG-KEY-Mellanox
    Resolving www.nvidia.com... 72.3.194.0
    Connecting to www.nvidia.com|72.3.194.0|:80... connected.
    HTTP request sent, awaiting response... 200 OK
    Length: 1354 (1.3K) [text/plain]...
  • Page 42 --> Processing Dependency: kmod-isert = 1.0-OFED.3.1.0.1.2.1.g832a737.rhel7u1 package: mlnx-ofed-all-3.1-0.1.2.noarch ........qperf.x86_64 0:0.4.9-9 rds-devel.x86_64 0:2.0.7-1.12 rds-tools.x86_64 0:2.0.7-1.12 sdpnetstat.x86_64 0:1.60-26 srptools.x86_64 0:1.0.2-12 Complete! Uninstalling MLNX_OFED Using the YUM Tool Use the script /usr/sbin/ofed_uninstall.sh to uninstall the NVIDIA OFED package. The script is part of the ofed-scripts RPM.
  • Page 43: Installing MLNX_OFED Using apt-get Tool

    Setting up MLNX_OFED apt-get Repository Log into the installation machine as root. Extract the MLNX_OFED package on a shared location in your network. You can download it from http://www.nvidia.com > Products > Software > InfiniBand/VPI Drivers. Create an apt-get repository configuration file called "/etc/apt/sources.list.d/mlnx_ofed.list"...
  • Page 44: Updating Firmware After Installation

    The firmware can be updated either manually or automatically (upon system boot), as described in the sections below. Updating the Device Online To update the device online on the machine from the NVIDIA site, use the following command line: mlxfwmanager --online -u -d <device> Example: mlxfwmanager --online -u -d 0000:09:00.0...
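For scripted updates, the command line above can be assembled programmatically. The following Python sketch is an illustration only: the function name is hypothetical, and it accepts only the PCI-address form of <device> shown in the manual's example, which is an assumption of this sketch rather than a limit of mlxfwmanager.

```python
import re

# PCI address in domain:bus:device.function form, e.g. 0000:09:00.0 (the manual's example).
_BDF_RE = re.compile(r"^[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]$")


def online_update_command(device: str) -> list[str]:
    """Build the argument list for `mlxfwmanager --online -u -d <device>`."""
    if not _BDF_RE.match(device):
        raise ValueError(f"not a PCI address: {device!r}")
    return ["mlxfwmanager", "--online", "-u", "-d", device]


if __name__ == "__main__":
    # Print the command rather than run it; flashing firmware should be a deliberate step.
    print(" ".join(online_update_command("0000:09:00.0")))
```

Returning an argument list instead of a single string lets the caller pass it directly to a process launcher without shell quoting.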
  • Page 45: UEFI Secure Boot

    OEM card and now you wish to (manually) update firmware on your adapter card(s), you need to perform the steps below. The following steps are also appropriate if you wish to burn newer firmware that you have downloaded from the NVIDIA web site (http://www.nvidia.com >...
  • Page 46 In order to support loading MLNX_OFED drivers when an OS supporting Secure Boot boots on a UEFI-based system with Secure Boot enabled, the NVIDIA x.509 public key should be added to the UEFI Secure Boot key database and loaded onto the system key ring by the kernel.
  • Page 47: Performance Tuning

    ESXi 6.5 Installer Privileges The installation requires administrator privileges on the target machine. Installing NVIDIA NATIVE ESXi Driver for VMware vSphere  Please uninstall all previous NVIDIA driver packages prior to installing the new version. See Removing Earlier NVIDIA Drivers for further information.
  • Page 48: Removing Earlier NVIDIA Drivers

    Please unload the previously installed drivers before removing them. To remove all the drivers: Log into the ESXi server with root permissions. List all the existing NATIVE ESXi driver modules. (See Step 4 in Installing NVIDIA NATIVE ESXi Driver for VMware vSphere.) Remove each module: #>...
  • Page 49  The following procedure requires custom boot image downloading, mounting and booting from a USB device.
  • Page 50: Updating Adapter Firmware

    To check that your card is programmed with the latest available firmware version, download the mlxup firmware update and query utility. The utility can query for available NVIDIA adapters and indicate which adapters require a firmware update. If the user confirms, mlxup upgrades the firmware using embedded images.
  • Page 51: Troubleshooting

    Troubleshooting General Troubleshooting Server unable to find the adapter
    • Ensure that the adapter is placed correctly
    • Make sure the adapter slot and the adapter are compatible
    • Install the adapter in a different PCI Express slot
    • Use the drivers that came with the adapter or download the latest
    •...
  • Page 52: Windows Troubleshooting

    NVIDIA Firmware Tool (MFT) Download and install MFT: http://www.nvidia.com/content/pages.php?pg=management_tools&menu_section=34 Refer to the User Manual for installation instructions. Once installed, run:
        mst start
        mst status
        flint -d <mst_device> q
    Ports Information
        ibstat
        ibv_devinfo
    Firmware Version Upgrade To download the latest firmware version refer to http://...
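When gathering output for a support case, the commands listed above can be collected in one pass. This is a minimal Python sketch under stated assumptions: the function names are hypothetical, the mst device name must be supplied by the caller (it is not given in this manual), and the tools typically need root privileges to run.

```python
import shutil
import subprocess
from typing import Callable, Optional

# Commands listed in the manual, in order. flint needs an mst device, filled in by the caller.
BASE_COMMANDS = [["mst", "start"], ["mst", "status"], ["ibstat"], ["ibv_devinfo"]]


def diagnostic_commands(mst_device: Optional[str] = None) -> list[list[str]]:
    """Return the diagnostic command lines, adding the flint query when a device is given."""
    commands = [list(cmd) for cmd in BASE_COMMANDS]
    if mst_device:
        # Insert after `mst status`, matching the order in the manual.
        commands.insert(2, ["flint", "-d", mst_device, "q"])
    return commands


def run_available(commands: list[list[str]],
                  which: Callable[[str], Optional[str]] = shutil.which) -> dict[str, str]:
    """Run each command whose executable is installed and collect its combined output."""
    results = {}
    for cmd in commands:
        if which(cmd[0]) is None:
            continue  # tool not installed on this host; skip it
        proc = subprocess.run(cmd, capture_output=True, text=True, check=False)
        results[" ".join(cmd)] = proc.stdout + proc.stderr
    return results
```

Calling run_available(diagnostic_commands()) on a host without MFT or the InfiniBand utilities simply returns an empty dictionary, so the sketch is safe to try on any machine.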
  • Page 53: Specifications

    Specifications MCX545A-ECAN Specifications
    Physical
    Size: 3.07 in. x 4.33 in. (78.00mm x 110.05mm)
    Connector: Single QSFP28 InfiniBand and Ethernet (copper and optical)
    Ethernet: 100GBASE-CR4, 100GBASE-KR4, 100GBASE-SR4, 50GBASE-R2, 50GBASE-R4, 40GBASE-CR4, 40GBASE-KR4, 40GBASE-SR4, 40GBASE-LR4, 40GBASE-ER4, 40GBASE-R2, 25GBASE-R, 20GBASE-KR2, 10GBASE-LR, 10GBASE-ER, 10GBASE-CR, 10GBASE-KR, SGMII, 1000BASE-CX, 1000BASE-KX, 10GBASE-SR
    InfiniBand: IBTA v1.3 Auto-Negotiation: 1X/2X/4X SDR (2.5Gb/s per lane), DDR (5Gb/s per lane), QDR (10Gb/s per lane),...
  • Page 54: MCX545B-ECAN Specifications

    CE / FCC / VCCI / ICES / RCM RoHS RoHS compliant a. ConnectX-5 adapters supplement the IBTA auto-negotiation specification to get better bit error rates and longer cable reaches. This supplemental feature only initiates when connected to another NVIDIA InfiniBand product.
  • Page 55: MCX545M-ECAN Specifications

    CE / FCC / VCCI / ICES / RCM RoHS RoHS compliant a. ConnectX-5 adapters supplement the IBTA auto-negotiation specification to get better bit error rates and longer cable reaches. This supplemental feature only initiates when connected to another NVIDIA InfiniBand product.
  • Page 56: MCX546A-EDAN Specifications

    CE / FCC / VCCI / ICES / RCM RoHS RoHS compliant a. ConnectX-5 adapters supplement the IBTA auto-negotiation specification to get better bit error rates and longer cable reaches. This supplemental feature only initiates when connected to another NVIDIA InfiniBand product. ...
  • Page 57: Board Mechanical Drawing And Dimensions

    CE / FCC / VCCI / ICES / RCM RoHS RoHS compliant a. ConnectX-5 adapters supplement the IBTA auto-negotiation specification to get better bit error rates and longer cable reaches. This supplemental feature only initiates when connected to another NVIDIA InfiniBand product. ...
  • Page 58 All dimensions are in millimeters. All the mechanical tolerances are +/- 0.1mm. For the 3D Model of the card, please refer to 3D Models at http://www.nvidia.com/page/3d_models. MCX545B-ECAN Mechanical Drawing and Dimensions MCX545A-ECAN and MCX545M-ECAN Mechanical Drawing and Dimensions...
  • Page 59 MCX546A-EDAN Mechanical Drawing and Dimensions...
  • Page 61: Finding the GUID/MAC and Serial Number on the Adapter Card

    Finding the GUID/MAC and Serial Number on the Adapter Card Each NVIDIA adapter card has a different identifier printed on the label: serial number and the card MAC for the Ethernet protocol and the card GUID for the InfiniBand protocol. VPI cards have both a GUID and a MAC (derived from the GUID).
  • Page 62: MCX545B-ECAN Board Labels (Example)

    MCX545B-ECAN Board Labels (Example) MCX545M-ECAN Board Labels (Example)
  • Page 63: Document Revision History

    Document Revision History
    Date | Revision Description of Changes
    Sep. 2022 | Added a note on FRU EEPROM memory component under Features and Benefits table.
    Feb. 2021 | Added Standby Mode power numbers for passive cables for OPNs MCX545B-ECAN.
    Jan. 2021 | Updated MCX546A-EDAN Specifications and LED Interface....
  • Page 64 NVIDIA accepts no liability related to any default, damage, costs, or problem which may be based on or attributable to: (i) the use of the NVIDIA product in any manner that is contrary to this document or (ii) customer product designs.
  • Page 65 Copyright © 2022 NVIDIA Corporation & affiliates. All Rights Reserved.

This manual is also suitable for:

MCX545A-ECAN, MCX545B-ECAN, MCX545M-ECAN, MCX546A-EDAN
