Nvidia ConnectX-5 User Manual

Ethernet Adapter Cards for OCP Spec 2.0
 
NVIDIA ConnectX-5 Ethernet Adapter Cards
for OCP Spec 2.0 User Manual
25 and 100 Gb/s Ethernet Adapter Cards; Intelligent RDMA (RoCE) enabled NICs. Supporting
OCP Specification 2.0 type 1 and type 2.
 
Exported on Dec/05/2023 11:03 AM


Summary of Contents for Nvidia ConnectX-5

  • Page 2: Table Of Contents

    Table of Contents
    Introduction (p. 7)
    OCP Spec 2.0 Stacking Heights (p. 8)
    OCP Spec 2.0 Type 1 Stacking Height - Single-port Card (p. 8)
    OCP Spec 2.0 Type 1 Stacking Height - Dual-port Card (p. 8)
    OCP Spec 2.0 Type 2 Stacking Height - Single-port Card (p. 9)
    OCP Spec 2.0 Type 2 Stacking Height - Dual-port Card ...
  • Page 3
    Performance Tuning (p. 43)
    VMware Driver Installation (p. 43)
    Hardware and Software Requirements (p. 43)
    Installing NATIVE ESXi Driver for VMware vSphere (p. 44)
    Removing Earlier NVIDIA Drivers (p. 44)
    Firmware Programming (p. 45)
    Updating Adapter Firmware (p. 46)
    Troubleshooting (p. 47)
    General Troubleshooting ...
  • Page 5 EOL'd (End of Life) Ordering Part Numbers The table below provides the ordering part numbers (OPN) for ConnectX-5 Ex and ConnectX-5 Ethernet adapter cards for OCP Spec 2.0. IC in...
  • Page 6 Customers who purchased NVIDIA Global Support Services, please see your contract for details regarding Technical Support. Customers who purchased NVIDIA products through an NVIDIA-approved reseller should first seek assistance through their reseller. Related Documentation MLNX_OFED for Linux User Manual: User Manual describing OFED features, performance, band diagnostic, tools content, and configuration.
  • Page 7: Introduction

    Centers and High-Performance Computing environments. The following provides the ordering part number, port speed, number of ports, and PCI Express speed.  ConnectX-5 Ex Ethernet Adapter Cards Model ConnectX-5 Ex Cards for OCP Spec 2.0 Part Number MCX546A-BCAN MCX546A-CDAN Ethernet Data Rate...
  • Page 8: Ocp Spec 2.0 Stacking Heights

    4119 for Physical Function (PF) 4120 for Virtual Function (VF) a. NVIDIA recommends populating MCX542B-ACAN in a standard PCIe x8 OCP connector which exposes PCIe lanes in a straight manner.    In case the OCP slot exposes PCIe lanes in a reversed manner, MCX542B-ACAN supports automatic lane reversal with FW image from April 2019 release and above.
  • Page 9: Ocp Spec 2.0 Type 2 Stacking Height - Single-Port Card

    The dual-port 10/25Gb/s Ethernet adapter card complies with OCP Spec 2.0 Type 1 with 8mm stacking height.
    OCP Spec 2.0 Type 2 Stacking Height - Single-port Card
    This section applies to MCX545A-CCAN and MCX545A-CCUN. The single-port 100Gb/s adapter card follows OCP Spec 2.0 Type 2 with 12mm stacking height.
    OCP Spec 2.0 Type 2 Stacking Height - Dual-port Card ...
  • Page 10: Features And Benefits

    NVGRE and VXLAN. While this solves network scalability issues, it hides the TCP packet from the hardware offloading engines, placing higher loads on the host CPU. ConnectX-5 effectively addresses this by providing advanced NVGRE and VXLAN hardware offloading engines that encapsulate and de-capsulate the overlay protocol.
  • Page 11: Operating Systems/Distributions

    Support for port-based Quality of Service enabling various application requirements for latency and SLA. Hardware-based I/O Virtualization ConnectX-5 provides dedicated adapter resources and guaranteed isolation and protection for virtual machines within the server. Storage Acceleration A consolidated compute and storage network achieves significant cost-performance advantages over multi-fabric networks.
  • Page 12: Interfaces

    The adapter card includes special circuits to protect from ESD shocks to the card/server when plugging copper cables. PCI Express Interface The table below describes the supported PCIe interface in ConnectX-5 and ConnectX-5 Ex adapter cards. IC Model Supported PCIe Interface...
  • Page 13: Smbus Interface

    BMC using MCTP over SMBus or MCTP over PCIe protocols as if it is a standard NVIDIA PCIe stand-up adapter. For configuring the adapter for the specific manageability solution in use by the server, please contact NVIDIA Support.
  • Page 14: Hardware Installation

    Hardware Installation Installation and initialization of ConnectX-5 adapter cards for OCP Spec 2.0 require attention to the mechanical attributes, power specification, and precautions for electronic equipment. Safety Warnings  Safety warnings are provided here in the English language. For safety warnings in other...
  • Page 15: Hardware Requirements

    April 2019 release and above. Airflow Requirements ConnectX-5 adapter cards are offered with two airflow patterns: from the heatsink to the network ports, and vice versa, as shown below. Please refer to the "Specifications" chapter for airflow numbers for each specific card model.
  • Page 16: Adapter Cards Installation Instructions

    Shut down your system if active: Turn off the power to the system, and disconnect the power cord. Refer to the system documentation for instructions. Before you install the ConnectX-5 card, make sure that the system is disconnected from power.
  • Page 17: Cables And Modules

    Applying even pressure on four corners of the card (as shown in the below picture), insert the adapter card into the PCI Express slot until firmly seated. Secure the adapter with the adapter clip or screw. To uninstall the adapter card, see Uninstalling the Card.
  • Page 18: Identifying The Card In Your System

    Get the device location on the PCI bus by running lspci and locating lines with the string “Mellanox Technologies”:
        lspci | grep -i Mellanox
        Network controller: Mellanox Technologies MT28800 Family [ConnectX-5]
    On Windows
    Open Device Manager on the server. Click Start => Run, and then enter devmgmt.msc.
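
The Linux lspci check on this page can be wrapped in a small filter. The sketch below is illustrative only: the function name `find_mellanox` and the sample bus addresses are assumptions, not output from a real system. It reads lspci-style text on standard input and prints the PCI address (the first field) of each matching line.

```shell
#!/bin/sh
# find_mellanox: read lspci-style lines on stdin and print the PCI
# address (first field) of every line naming "Mellanox Technologies".
# The function name is hypothetical; it is not part of any NVIDIA tool.
find_mellanox() {
    grep -i 'Mellanox Technologies' | awk '{print $1}'
}

# Sample input. The bus addresses and the first device are made up for
# illustration; the Mellanox device string is the one quoted in the manual.
sample='00:1f.2 SATA controller: Example Vendor AHCI Controller
04:00.0 Network controller: Mellanox Technologies MT28800 Family [ConnectX-5]'

printf '%s\n' "$sample" | find_mellanox
```

On a live system, `lspci | find_mellanox` would apply the same filter to real output.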
  • Page 19: Adapter Cards Extraction Instructions

    In the Value display box, check the fields VEN and DEV (fields are separated by ‘&’). In the display example above, notice the sub-string “PCI\VEN_15B3&DEV_1003”: VEN is equal to 0x15B3 – this is the Vendor ID of NVIDIA; and DEV is equal to 1018 (for ConnectX-5) – this is a valid NVIDIA PCI Device ID.
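
Device Manager shows the VEN and DEV fields in hexadecimal, while another page of this manual lists PCI device IDs in decimal (4119 for the Physical Function, 4120 for the Virtual Function). The sketch below shows one way to split a hardware ID string and convert the device ID to decimal. The function name `parse_hwid` and the two sample strings are assumptions made for illustration; they are not taken from a real Device Manager screen.

```shell
#!/bin/sh
# parse_hwid: split a Windows hardware ID such as PCI\VEN_15B3&DEV_1017
# into its vendor and device IDs and show the device ID in decimal.
# The function name is hypothetical.
parse_hwid() {
    ven=$(printf '%s\n' "$1" | sed -n 's/.*VEN_\([0-9A-Fa-f]\{4\}\).*/\1/p')
    dev=$(printf '%s\n' "$1" | sed -n 's/.*DEV_\([0-9A-Fa-f]\{4\}\).*/\1/p')
    if [ -z "$ven" ] || [ -z "$dev" ]; then
        echo "unrecognized hardware ID" >&2
        return 1
    fi
    # $((0x...)) converts the hexadecimal device ID to decimal.
    echo "vendor=0x$ven device=0x$dev decimal=$((0x$dev))"
}

# Illustrative inputs: 0x1017 is 4119 and 0x1018 is 4120 in decimal.
parse_hwid 'PCI\VEN_15B3&DEV_1017'
parse_hwid 'PCI\VEN_15B3&DEV_1018'
```

Comparing the decimal value with the IDs listed in the manual helps distinguish a physical function from a virtual function.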
  • Page 20: Card Extraction

    Card Extraction  Please note that the following images are for illustration purposes only. Verify that the system is powered off and unplugged. Wait 30 seconds. To remove the card, disengage clip 1 and 2 on connector A side. To disconnect connector A, gently pull the adapter card upwards. Disengage clip 3 and clip 4 on the adapter card on Connector B side.
  • Page 21 6. To remove the card, gently pull the adapter card upwards.
  • Page 22: Driver Installation

    VMware Driver Installation Windows Driver Installation For Windows, download and install the latest WinOF-2 for Windows software package available via the NVIDIA website at: WinOF-2 webpage. Follow the installation instructions included in the download package (also available from the download page).
  • Page 23: Installing Winof-2 Driver

    Go to the WinOF-2 web page at: https://www.nvidia.com/en-us/networking/ > Products > Software > InfiniBand Drivers (Learn More) > Nvidia WinOF-2. Download the .exe image according to the architecture of your machine (see Step 1).  The name of the .exe is in the following format: MLNX_WinOF2-<version>_<arch>.exe.
  • Page 24 "ERROR!!! Installation failed due to following errors: MlxRshim drivers installation disabled and MlxRshim drivers Installed, Please remove the following oem inf files from driver store: <oem inf list>" [Optional] If you want to skip the check for unsupported devices, run: MLNX_WinOF2_<revision_version>_All_Arch.exe /v"...
  • Page 25 • If the user has a standard NVIDIA® card with an older firmware version, the firmware will be updated accordingly. However, if the user has both an OEM card and a NVIDIA® card, only the NVIDIA® card will be updated.
  • Page 26 Select a Complete or Custom installation, follow Step a onward. Select the desired feature to install: • Performances tools - install the performance tools that are used to measure performance in user environment • Documentation - contains the User Manual and Release Notes...
  • Page 27 • Management tools - installation tools used for management, such as mlxstat • Diagnostic Tools - installation tools used for diagnostics, such as mlx5cmd Click Next to install the desired tools. Click Install to start the installation.
  • Page 28 If the firmware upgrade option was checked in Step 7, you will be notified if a firmware upgrade is required. Click Finish to complete the installation.
  • Page 29 Unattended Installation  If no reboot options are specified, the installer restarts the computer whenever necessary without displaying any prompt or warning to the user. To control the reboots, use the /norestart or /forcerestart standard command-line options. The following is an example of an unattended installation session. Open a CMD console: click Start -> Task Manager -> File -> Run new task, and enter CMD.
  • Page 30: Firmware Upgrade

    Firmware Upgrade If the machine has a standard NVIDIA® card with an older firmware version, the firmware will be automatically updated as part of the NVIDIA® WinOF-2 package installation. For information on how to upgrade firmware manually, please refer to MFT User Manual. ...
  • Page 31: Installing Mlnx_Ofed

    Scroll down to the Download wizard, and click the Download tab. Choose your relevant package depending on your host operating system. Click the desired ISO/tgz package. To obtain the download link, accept the End User License Agreement (EULA).     3. Use the Hash utility to confirm the file integrity of your ISO image. Run the following command and compare the result to the value provided on the download page.
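
The integrity check in step 3 amounts to computing a digest and comparing it with the published value. The sketch below is one way to script that comparison. The function name `verify_md5` is hypothetical, and the choice of MD5 is an assumption; use whichever algorithm the download page actually lists. The demonstration runs on a temporary file rather than a real ISO image.

```shell
#!/bin/sh
# verify_md5: compare a file's MD5 digest with an expected value.
# Returns 0 on match, 1 on mismatch. The function name is hypothetical,
# and MD5 is an assumption; use the algorithm the download page lists.
verify_md5() {
    file=$1
    expected=$2
    actual=$(md5sum "$file" | awk '{print $1}')
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file (got $actual)" >&2
        return 1
    fi
}

# Self-contained demonstration on a temporary file, not a real ISO.
demo=$(mktemp)
printf 'demo payload' > "$demo"
verify_md5 "$demo" "$(md5sum "$demo" | awk '{print $1}')"
rm -f "$demo"
```

For a real download, the second argument would be the value copied from the download page, and a non-zero exit status would mean the image should be downloaded again.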
  • Page 32 • If you need to install OFED on an entire (homogeneous) cluster, a common strategy is to mount the ISO image on one of the cluster nodes and then copy it to a shared file system such as NFS. To install on all the cluster nodes, use cluster-aware tools (such as pdsh). •...
  • Page 33 For the list of installation options, run:
        ./mlnxofedinstall --h
    Installation Procedure
    This section describes the installation procedure of MLNX_OFED on NVIDIA adapter cards.
    Log in to the installation machine as root.
    Mount the ISO image on your machine:
        host1# mount -o ro,loop MLNX_OFED_LINUX-<ver>-<OS label>-<CPU arch>.iso /mnt
    Run the installation script.
  • Page 34 FW XX.XX.XXXX Status: No matching image found Error message #2: The firmware for this device is not distributed inside NVIDIA driver: 0000:01:00.0 (PSID: IBM2150110033) To obtain firmware for this device, please contact your HW vendor. 4. Case A: If the installation script has performed a firmware update on your network adapter, you need to either restart the driver or reboot your system before the firmware update can take effect.
  • Page 35 In case your machine has an unsupported network adapter device, no firmware update will occur and the error message below will be printed. "The firmware for this device is not distributed inside NVIDIA driver: 0000:01:00.0 (PSID: IBM2150110033) To obtain firmware for this device, please contact your HW vendor."...
  • Page 36: Driver Load Upon System Boot

    Logs dir: /tmp/MLNX_OFED_LINUX-4.4-1.0.0.0.IBMM2150110033.logs
    Driver Load Upon System Boot
    Upon system boot, the NVIDIA drivers will be loaded automatically.
    To prevent the automatic load of the NVIDIA drivers upon system boot:
    Add the following lines to the "/etc/modprobe.d/mlnx.conf" file.
        blacklist mlx5_core
        blacklist mlx5_ib
    Set “ONBOOT=no”...
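
The blacklist step can be scripted so that running it twice does not duplicate entries. The sketch below is a minimal example under stated assumptions: the helper name `add_blacklist` is hypothetical, and the demonstration writes to a temporary file instead of /etc/modprobe.d/mlnx.conf so it can run without root privileges.

```shell
#!/bin/sh
# add_blacklist: append "blacklist <module>" to a modprobe config file
# unless that exact line is already present. Hypothetical helper name.
add_blacklist() {
    conf=$1
    module=$2
    line="blacklist $module"
    # -x: match the whole line, -F: fixed string, -q: quiet.
    if ! grep -qxF "$line" "$conf" 2>/dev/null; then
        printf '%s\n' "$line" >> "$conf"
    fi
}

# Demonstrate on a temporary file; on a real host the target would be
# /etc/modprobe.d/mlnx.conf and the script would need root privileges.
conf=$(mktemp)
add_blacklist "$conf" mlx5_core
add_blacklist "$conf" mlx5_ib
add_blacklist "$conf" mlx5_core   # second call adds nothing
cat "$conf"
rm -f "$conf"
```

The whole-line match keeps the check from being fooled by a module name that merely appears inside a longer line.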
  • Page 37: Additional Installation Procedures

    In case your machine has an unsupported network adapter device, no firmware update will occur and the error message below will be printed. "The firmware for this device is not distributed inside NVIDIA driver: 0000:01:00.0 (PSID: IBM2150110033) To obtain firmware for this device, please contact your HW vendor."...
  • Page 38 Mount the ISO image on your machine and copy its content to a shared location in your network. # mount -o ro,loop MLNX_OFED_LINUX-<ver>-<OS label>-<CPU arch>.iso /mnt Download and install NVIDIA's GPG-KEY: The key can be downloaded via the following link:  http://www.mellanox.com/downloads/ofed/RPM-GPG-KEY-Mellanox # wget http://www.mellanox.com/downloads/ofed/RPM-GPG-KEY-Mellanox...
  • Page 39 # mount -o ro,loop MLNX_OFED_LINUX-<ver>-<OS label>-<CPU arch>.iso /mnt Build the packages with kernel support and create the tarball.  # /mnt/mlnx_add_kernel_support.sh --make-tgz <optional --kmp> -k $(uname -r) -m /mnt/ Note: This program will create MLNX_OFED_LINUX TGZ rhel7.6 under /tmp directory. Do you want to continue?[y/N]:y See log file /tmp/mlnx_iso.4120_logs/mlnx_ofed_iso.4120.log Checking all needed packages are installed...
  • Page 40 (User Space packages only
    where:
        mlnx-ofed-all: Installs all available packages in MLNX_OFED
        mlnx-ofed-basic: Installs basic packages required for running NVIDIA cards
        mlnx-ofed-guest: Installs packages required by guest OS
        mlnx-ofed-hpc: Installs packages required for HPC
        mlnx-ofed-hypervisor: Installs packages required by hypervisor OS...
  • Page 41 Create an apt-get repository configuration file called "/etc/apt/sources.list.d/mlnx_ofed.list" with the following content:
        deb file:/<path to extracted MLNX_OFED package>/DEBS ./
    Download and install NVIDIA's GPG-KEY.
        # wget -qO - http://www.mellanox.com/downloads/ofed/RPM-GPG-KEY-Mellanox | sudo apt-key add -
    Verify that the key was successfully imported. ...
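
The repository line on this page is built from the directory where the MLNX_OFED package was extracted. The sketch below generates that line from an absolute path. The helper name `ofed_apt_source` is hypothetical, and `/tmp/MLNX_OFED_LINUX` is a placeholder path, not a real package name.

```shell
#!/bin/sh
# ofed_apt_source: print the apt source line for an extracted MLNX_OFED
# package directory. Expects an absolute path. Hypothetical helper name.
ofed_apt_source() {
    case $1 in
        /*) ;;
        *) echo "path must be absolute: $1" >&2; return 1 ;;
    esac
    # Strip one trailing slash so the result has no "//".
    printf 'deb file:%s/DEBS ./\n' "${1%/}"
}

# /tmp/MLNX_OFED_LINUX is a placeholder path used only for illustration.
ofed_apt_source /tmp/MLNX_OFED_LINUX
```

Redirecting the output to /etc/apt/sources.list.d/mlnx_ofed.list as root would produce the configuration file described on this page.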
  • Page 42 # mount -o ro,loop MLNX_OFED_LINUX-<ver>-<OS label>-<CPU arch>.iso /mnt Build the packages with kernel support and create the tarball.  # /mnt/mlnx_add_kernel_support.sh --make-tgz <optional --kmp> -k $(uname -r) -m /mnt/ Note: This program will create MLNX_OFED_LINUX TGZ rhel7.6 under /tmp directory. Do you want to continue?[y/N]:y See log file /tmp/mlnx_iso.4120_logs/mlnx_ofed_iso.4120.log Checking all needed packages are installed...
  • Page 43: Performance Tuning

    Depending on the application of the user's system, it may be necessary to modify the default configuration of network adapters based on the ConnectX® adapters. If tuning is required, please refer to the Performance Tuning Guide for NVIDIA Network Adapters. VMware Driver Installation This section describes VMware Driver Installation.
  • Page 44: Installing Native Esxi Driver For Vmware Vsphere

    PartnerSupported 2017-01-31  After the installation process, all kernel modules are loaded automatically upon boot. Removing Earlier NVIDIA Drivers  Please unload the previously installed drivers before removing them. To remove all the drivers: Log into the ESXi server with root permissions.
  • Page 45: Firmware Programming

    Reboot the server. Firmware Programming Download the VMware bootable binary images v4.6.0 from the Firmware Tools (MFT) site. ESXi 6.5 File: mft-4.6.0.48-10EM-650.0.0.4598673.x86_64.vib MD5SUM: 0804cffe30913a7b4017445a0f0adbe1 Install the image according to the steps described in the MFT User Manual.  The following procedure requires custom boot image downloading, mounting and booting from a USB device.
  • Page 46: Updating Adapter Firmware

    To check that your card is programmed with the latest available firmware version, download the mlxup firmware update and query utility. The utility can query for available NVIDIA adapters and indicate which adapters require a firmware update. If the user confirms, mlxup upgrades the firmware using embedded images.
  • Page 47: Troubleshooting

    Troubleshooting
    General Troubleshooting
    Server unable to find the adapter:
    • Ensure that the adapter is placed correctly
    • Make sure the adapter slot and the adapter are compatible
    • Install the adapter in a different PCI Express slot
    • Use the drivers that came with the adapter or download the latest
    •...
  • Page 48: Linux Troubleshooting

    -d <mst_device> q
    Ports Information: ibstat, ibv_devinfo
    Firmware Version Upgrade: To download the latest firmware version, refer to the NVIDIA Update and Query Utility.
    Collect Log File: cat /var/log/messages; dmesg >> system.log; journalctl (Applicable on new operating systems); cat /var/log/syslog
    Windows Troubleshooting...
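
The log files named on this page do not all exist on every distribution, so a collection script benefits from skipping the ones that are missing. The sketch below is a minimal example; the helper name `collect_logs` is hypothetical, and the demonstration uses temporary stand-in files rather than real system logs.

```shell
#!/bin/sh
# collect_logs: append each readable file from the argument list to an
# output file, preceded by a header naming its source. Missing files are
# skipped, since /var/log/messages and /var/log/syslog do not both exist
# on every distribution. Hypothetical helper name.
collect_logs() {
    out=$1
    shift
    for f in "$@"; do
        if [ -r "$f" ]; then
            printf '===== %s =====\n' "$f" >> "$out"
            cat "$f" >> "$out"
        fi
    done
}

# Demonstration with temporary stand-in files.
dir=$(mktemp -d)
echo "sample kernel message" > "$dir/messages"
collect_logs "$dir/system.log" "$dir/messages" "$dir/syslog"
cat "$dir/system.log"
rm -rf "$dir"
```

On a real host, `collect_logs system.log /var/log/messages /var/log/syslog` followed by `dmesg >> system.log` would gather the files this page lists into one place.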
  • Page 49: Specifications

    The non-operational storage temperature specifications apply to the product without its package. MCX542B-ACAN/MCX542B-ACUN Specifications  NVIDIA recommends populating MCX542B-ACAN in a standard PCIe x8 OCP connector which exposes PCIe lanes in a straight manner. In case the OCP slot exposes PCIe lanes in a reversed manner, MCX542B-ACAN supports...
  • Page 50: Mcx545B-Gcun Specifications

    Maximum power available through QSFP28 port: 1.5W Cable Type Heatsink to Port Port to Heatsink Passive Cable 400LFM 400LFM Airflow Active 1.5W Cable 1200LFM (In NVIDIA 0.8W Cables only) Temperature Operational 0°C to 55°C Environmental Non-operational -40°C to 70°C Humidity Operational 10% to 85% relative humidity ...
  • Page 51: Mcx545B-Ccun Specifications

    Voltage: 3.3VAUX, 5VAUX, 12V Power and Airflow Power Cable Type Active Mode Standby Mode Passive Cables 13.77W – Typical Power Maximum Power Passive Cables 16.35W 5VAUX/12V: 5.8W 3.3VAUX: 0.1 1.5W Active Cables 17.85W – Maximum power available through QSFP28 port: 3.5W Cable Type Heatsink to Port to Heatsink...
  • Page 52: Mcx545A-Ccan And Mcx545A-Ccun Specifications

    Cable Type Heatsink to Port Port to Heatsink Passive Cable 400LFM 800LFM Airflow tested in a ducted Active 1.5W Cable NA 400LFM tunnel Temperature Operational 0°C to 55°C Environmental Non-operational -40°C to 70°C Humidity Operational 10% to 85% relative humidity  Non-operational 10% to 90% relative humidity ...
  • Page 53: Mcx546A-Bcan Specifications

    Altitude (Operational) 3050m Safety CB / cTUVus / CE Regulatory CE / FCC / VCCI / ICES / RCM RoHS RoHS compliant a. Typical power for ATIS traffic load. b. Airflow is measured on ambient 55°C. c. The non-operational storage temperature specifications apply to the product without its package. MCX546A-BCAN Specifications Size: 3.07 in.
  • Page 54: Mcx546A-Cdan Specifications

    Passive Cables 24.3W 3.5W Active Cables 32.11W Maximum power available through QSFP28 port: 3.5W Heatsink to Port Port to Heatsink Passive Cable 400LFM 400LFM Airflow NVIDIA Active Not supported 1200LFM Cables Temperature Operational 0°C to 55°C Environmental Non-operational -40°C to 70°C Humidity Operational 10% to 85% relative humidity ...
  • Page 55 MCX545B-CCUN MCX545A-CCAN and MCX545A-CCUN MCX546A-CDAN and MCX546A-BCAN...
  • Page 56: Monitoring

    Adapter Heatsink  A heatsink is attached to the ConnectX-5 IC in order to dissipate the heat from the ConnectX-5 IC. It is attached either by using four spring-loaded push pins that insert into four mounting holes or by screws.
  • Page 57: Finding The Mac And Serial Number On The Adapter Card

    Finding the MAC and Serial Number on the Adapter Card Each NVIDIA adapter card has a different identifier printed on the label: serial number and the card MAC for the Ethernet protocol.  The product revisions indicated on the labels in the following figures do not necessarily represent the latest revisions of the card.
  • Page 58: Document Revision History

    Document Revision History
    Date: Description of Changes
    May. 2023: Added non-operational storage temperature specifications.
    Sep. 2022: Added a note on VPD EEPROM memory component under the Features and Benefits table.
    Feb. 2021: Added Standby Mode power numbers for passive cables for OPNs MCX545B-CCUN, MCX542B-AC[A/U]N, and MCX545B-GCUN.
  • Page 59 NVIDIA accepts no liability related to any default, damage, costs, or problem which may be based on or attributable to: (i) the use of the NVIDIA product in any manner that is contrary to this document or (ii) customer product designs.
  • Page 60 Copyright © 2023 NVIDIA Corporation & affiliates. All Rights Reserved.
