About This Manual
This is the User Guide for NVIDIA® Ethernet adapter cards based on the ConnectX®-5 integrated circuit device for Open Compute Project Spec 3.0. These adapters provide the highest-performing, low-latency, and most flexible interconnect solution for PCI Express Gen 3.0/4.0 servers used in Enterprise Data Centers and High-Performance Computing environments.
• URL: https://www.nvidia.com > Support
• E-mail: enterprisesupport@nvidia.com
Customers who purchased NVIDIA Global Support Services, please see your contract for details regarding Technical Support. Customers who purchased NVIDIA products through an NVIDIA-approved reseller should first seek assistance through their reseller.
Related Documentation
User Manual and Release Notes describing MLNX_OFED features,...
LinkX Interconnect Solutions maximize the performance of High-Performance Computing networks, which require high-bandwidth, low-latency connections between compute nodes and switch nodes. NVIDIA offers one of the industry’s broadest portfolios of 40GbE, 56GbE, 100GbE, 200GbE and 400GbE cables, including Direct Attach Copper cables (DACs), copper splitter cables, Active Optical Cables (AOCs) and transceivers in a wide range of lengths from 0.5m...
Document Conventions
When discussing memory sizes, MB and MBytes are used in this document to mean size in megabytes. The use of Mb or Mbits (small b) indicates size in megabits. In this document, PCIe is used to mean PCI Express.
OCP spec 3.0 available at that time did not contain any S&V definitions. A newer version of the OCP spec 3.0 has defined S&V specifications and NVIDIA is in the midst of retesting these cards to comply with OCP spec 3.0.
PCI Express (PCIe): Uses PCIe Gen 3.0 (8GT/s) and Gen 4.0 (16GT/s) through an x8 or x16 edge connector. Gen 1.1 and 2.0 compatible.
Up to 100 Gigabit Ethernet: NVIDIA adapters comply with the following IEEE 802.3 standards:
• 100GbE / 50GbE / 40GbE / 25GbE / 10GbE / 1GbE
•...
NVGRE and VXLAN. While this solves network scalability issues, it hides the TCP packet from the hardware offloading engines, placing higher loads on the host CPU. ConnectX-5 effectively addresses this by providing advanced NVGRE and VXLAN hardware offloading engines that encapsulate and de-capsulate the overlay protocol.
PCIe interface into multiple and independent interfaces. By using NVIDIA Multi Host™, ConnectX-5 lowers the total cost of ownership (TCO) in the data center by reducing CAPEX (cables, NICs, and switch port expenses), and by reducing OPEX through less switch port management and lower overall power usage.
Interfaces
Ethernet SFP28 and QSFP28 Interfaces
The network ports of the ConnectX®-5 adapter card are compliant with the IEEE 802.3 Ethernet standards listed in Features and Benefits. Ethernet traffic is transmitted through the SFP28/QSFP28 connectors on the adapter card. The adapter card includes special circuits that protect the card/server from ESD shocks when plugging in copper cables.
0xA2 and its capacity is 4Kb.
Heat Sink Interface
A heatsink is attached to the ConnectX-5 IC in order to dissipate the heat from the ConnectX-5 IC. It is attached using four spring-loaded push pins that insert into four mounting holes.
Voltage Regulators
The voltage regulator power is derived from the OCP 3.0 edge connector 12V and 3.3V supply pins. These voltage supply pins feed onboard regulators that provide the necessary power to the various components on the card.
CPLD Interface
The adapter card incorporates a CPLD device that controls the networking port LEDs and the scan chain.
Hardware Installation
Installation and initialization of ConnectX-5 adapter cards for OCP Spec 3.0 require attention to the mechanical attributes, power specifications, and precautions for electronic equipment.
Safety Warnings
Safety warnings are provided here in the English language. For safety warnings in other...
Installation Procedure Overview
The installation procedure of ConnectX-5 adapter cards for OCP Spec 3.0 involves the following steps:
Step  Procedure  Direct Link
Check the system’s hardware and software requirements.  Refer to System Requirements
Pay attention to the airflow consideration within the host...
A system with a PCI Express x16 slot for OCP spec 3.0 is required for installing the card.
Airflow Requirements
ConnectX-5 adapter cards are offered with two airflow patterns: from the heatsink to the network ports, and vice versa, as shown below.
It is strongly recommended to use an ESD strap or other antistatic devices.
Pre-Installation Checklist
Unpack the ConnectX-5 adapter card.
Unpack and remove the ConnectX-5 card. Check the parts for visible damage that may have occurred during shipping. Please note that the cards must be placed on an antistatic surface.
• The required Torx tool type as specified in the instructions.
Removing the Existing Bracket
Using the Torx tool type listed in the table below, remove the screws according to the instructions per OCP 3.0 bracket type.
Internal Lock Bracket  Pull-tab (Thumbscrew) Bracket
Ensure that the insulator's front edge is beneath the bracket, as shown in the figure below. Screw on the OCP 3.0 bracket with the screws supplied in the new bracket kit. Use the specified Torx tool type and apply the specified torque on the screws per bracket form factor.
Internal Lock Bracket  Pull-tab (Thumbscrew) Bracket  Ejector-Latch Bracket
Push the card until the connectors are fully mated.
Internal Lock Bracket  Pull-tab (Thumbscrew) Bracket  Ejector-Latch Bracket
Secure the card.
Internal Lock Bracket: A clicking sound is heard once
Pull-tab (Thumbscrew) Bracket: Turn the captive screw clockwise
Ejector-Latch Bracket: Close the ejector.
Get the device location on the PCI bus by running lspci and locating lines with the string “Mellanox Technologies”:
lspci |grep -i Mellanox
Network controller: Mellanox Technologies MT28800 Family [ConnectX-5]
On Windows
Open Device Manager on the server. Click Start => Run, and then enter devmgmt.msc.
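The Linux check above can be wrapped in a small helper. The following is a minimal sketch, assuming each lspci line begins with the PCI address as its first field; the function name mellanox_pci_addresses and the address in the usage comment are illustrative and not part of any NVIDIA tool.

```shell
#!/bin/sh
# Read `lspci` output on stdin and print the PCI address (first field)
# of every line that mentions "Mellanox" (case-insensitive).
# The function name is hypothetical; it only filters text.
mellanox_pci_addresses() {
    grep -i 'mellanox' | awk '{ print $1 }'
}

# On a live system (not executed here):
#   lspci | mellanox_pci_addresses
```

Keeping the filter separate from the lspci call makes it possible to try the helper on saved lspci output from another machine.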
In the Value display box, check the fields VEN and DEV (fields are separated by ‘&’). In the display example above, notice a sub-string of the form “PCI\VEN_15B3&DEV_1003”: VEN is equal to 0x15B3, which is the Vendor ID of NVIDIA, and DEV holds the PCI Device ID (1018 for ConnectX-5), which must be a valid NVIDIA PCI Device ID.
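When hardware IDs are copied out of Device Manager into a text file, the VEN and DEV fields can be pulled apart with a few lines of shell. This is a sketch only, assuming each field is followed by four hexadecimal digits as in the sub-string quoted above; parse_hw_id is a hypothetical helper name.

```shell
#!/bin/sh
# Extract the VEN and DEV fields from a Windows hardware ID string
# such as PCI\VEN_15B3&DEV_1003. Prints "VEN=<id> DEV=<id>" and
# returns 1 when either field is missing.
parse_hw_id() {
    ven=$(printf '%s\n' "$1" | sed -n 's/.*VEN_\([0-9A-Fa-f]\{4\}\).*/\1/p')
    dev=$(printf '%s\n' "$1" | sed -n 's/.*DEV_\([0-9A-Fa-f]\{4\}\).*/\1/p')
    if [ -z "$ven" ] || [ -z "$dev" ]; then
        return 1
    fi
    printf 'VEN=%s DEV=%s\n' "$ven" "$dev"
}

# Single quotes keep the backslash and the '&' literal.
parse_hw_id 'PCI\VEN_15B3&DEV_1003'
```

The VEN value can then be compared against 15B3, and the DEV value against the Device ID expected for the installed card.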
VMware Driver Installation
Windows Driver Installation
For Windows, download and install the latest WinOF-2 for Windows software package, available on the NVIDIA website at the WinOF-2 webpage. Follow the installation instructions included in the download package (also available from the download page).
Go to the WinOF-2 web page at: https://www.nvidia.com/en-us/networking/ > Products > Software > InfiniBand Drivers (Learn More) > NVIDIA WinOF-2.
Download the .exe image according to the architecture of your machine (see Step 1). The name of the .exe is in the following format: MLNX_WinOF2-<version>_<arch>.exe.
"ERROR!!! Installation failed due to following errors: MlxRshim drivers installation disabled and MlxRshim drivers Installed, Please remove the following oem inf files from driver store: <oem inf list>" [Optional] If you want to skip the check for unsupported devices, run. MLNX_WinOF2_<revision_version>_All_Arch.exe /v"...
• If the user has a standard NVIDIA® card with an older firmware version, the firmware will be updated accordingly. However, if the user has both an OEM card and an NVIDIA® card, only the NVIDIA® card will be updated.
Select a Complete or Custom installation, follow Step a onward.
Select the desired feature to install:
• Performances tools - install the performance tools that are used to measure performance in user environment
• Documentation - contains the User Manual and Release Notes...
• Management tools - installation tools used for management, such as mlxstat
• Diagnostic Tools - installation tools used for diagnostics, such as mlx5cmd
Click Next to install the desired tools.
Click Install to start the installation.
If the firmware upgrade option was checked in Step 7, you will be notified if a firmware upgrade is required.
Click Finish to complete the installation.
Unattended Installation
If no reboot options are specified, the installer restarts the computer whenever necessary without displaying any prompt or warning to the user. To control the reboots, use the /norestart or /forcerestart standard command-line options.
The following is an example of an unattended installation session.
Open a CMD console: click Start -> Task Manager -> File -> Run new task, and enter CMD.
Firmware Upgrade
If the machine has a standard NVIDIA® card with an older firmware version, the firmware will be automatically updated as part of the NVIDIA® WinOF-2 package installation. For information on how to upgrade firmware manually, please refer to the MFT User Manual.
NVIDIA web site at nvidia.com/en-us/networking → Products → Software → InfiniBand Drivers → NVIDIA MLNX_OFED
i. Scroll down to the Download wizard, and click the Download tab.
ii. Choose your relevant package depending on your host operating system.
iii. Click the desired ISO/tgz package.
• If you need to install OFED on an entire (homogeneous) cluster, a common strategy is to mount the ISO image on one of the cluster nodes and then copy it to a shared file system such as NFS. To install on all the cluster nodes, use cluster-aware tools (such as pdsh).
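As an illustration of the cluster strategy above, the helper below only prints the pdsh command line so it can be reviewed before anything runs on the nodes. It is a sketch under assumptions: the host list node[01-04] and the NFS path /nfs/mlnx_ofed are hypothetical, cluster_install_cmd is not an NVIDIA tool, and the only pdsh option used is -w (target host list).

```shell
#!/bin/sh
# Print (do not execute) a pdsh command that runs the MLNX_OFED
# installation script on every node in the host list.
# $1: pdsh host list    $2: shared directory holding the ISO contents
cluster_install_cmd() {
    printf "pdsh -w '%s' %s/mlnxofedinstall\n" "$1" "$2"
}

# Hypothetical host names and shared path:
cluster_install_cmd 'node[01-04]' /nfs/mlnx_ofed
```

Printing the command first is a deliberate choice: the host list can be checked by eye, and the line can then be pasted into a root shell on the node that has pdsh installed.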
For the list of installation options, run:
./mlnxofedinstall --h
Installation Procedure
This section describes the installation procedure of MLNX_OFED on NVIDIA adapter cards.
Log in to the installation machine as root.
Mount the ISO image on your machine.
host1# mount -o ro,loop MLNX_OFED_LINUX-<ver>-<OS label>-<CPU arch>.iso /mnt
Run the installation script.
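The mount step above can be guarded with two quick checks before it touches the system. This is a minimal sketch, not part of MLNX_OFED: mount_ofed_iso is a hypothetical wrapper, and the default mount point /mnt is taken from the command shown above.

```shell
#!/bin/sh
# Mount an MLNX_OFED ISO read-only via a loop device, after checking
# that the image exists and that the caller is root.
# $1: path to the ISO image    $2: mount point (defaults to /mnt)
mount_ofed_iso() {
    iso=$1
    mnt=${2:-/mnt}
    if [ ! -f "$iso" ]; then
        echo "ISO image not found: $iso" >&2
        return 1
    fi
    if [ "$(id -u)" -ne 0 ]; then
        echo "This step must be run as root" >&2
        return 1
    fi
    mount -o ro,loop "$iso" "$mnt"
}

# Usage (not executed here):
#   mount_ofed_iso MLNX_OFED_LINUX-<ver>-<OS label>-<CPU arch>.iso /mnt
```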
FW XX.XX.XXXX
Status: No matching image found
Error message #2:
The firmware for this device is not distributed inside NVIDIA driver: 0000:01:00.0 (PSID: IBM2150110033)
To obtain firmware for this device, please contact your HW vendor.
d. Case A: If the installation script has performed a firmware update on your network adapter, you need to either restart the driver or reboot your system before the firmware update can take effect.
(InfiniBand only) Run the hca_self_test.ofed utility to verify whether or not the InfiniBand link is up. The utility also checks for and displays additional information such as:
• HCA firmware version
• Kernel architecture
• Driver version
• Number of active HCA ports along with their states
•...
Logs dir: /tmp/MLNX_OFED_LINUX-4.4-1.0.0.0.IBMM2150110033.logs
Driver Load Upon System Boot
Upon system boot, the NVIDIA drivers will be loaded automatically.
To prevent the automatic load of the NVIDIA drivers upon system boot:
Add the following lines to the "/etc/modprobe.d/mlnx.conf" file.
blacklist mlx5_core
blacklist mlx5_ib
Set “ONBOOT=no”
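The blacklist step above can be scripted so that running it twice does not duplicate lines. The sketch below assumes the file path and module names quoted above; blacklist_mlx5_modules is a hypothetical helper, and the ONBOOT setting is not covered here.

```shell
#!/bin/sh
# Append "blacklist <module>" lines for mlx5_core and mlx5_ib to a
# modprobe configuration file, skipping lines that already exist.
# $1: configuration file (defaults to /etc/modprobe.d/mlnx.conf)
blacklist_mlx5_modules() {
    conf=${1:-/etc/modprobe.d/mlnx.conf}
    for mod in mlx5_core mlx5_ib; do
        line="blacklist $mod"
        # -x: match the whole line, -F: fixed string, -q: quiet
        grep -qxF "$line" "$conf" 2>/dev/null || printf '%s\n' "$line" >> "$conf"
    done
}

# Usage as root (not executed here):
#   blacklist_mlx5_modules
```

Passing a scratch file as the first argument allows the helper to be tried without touching the real configuration.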
"The firmware for this device is not distributed inside NVIDIA driver: 0000:01:00.0 (PSID: IBM2150110033) To obtain firmware for this device, please contact your HW vendor." Installation Logging While installing MLNX_OFED, the install log for each selected package will be saved in a separate log file.
Mount the ISO image on your machine and copy its content to a shared location in your network.
# mount -o ro,loop MLNX_OFED_LINUX-<ver>-<OS label>-<CPU arch>.iso /mnt
Download and install NVIDIA's GPG-KEY:
The key can be downloaded via the following link: http://www.mellanox.com/downloads/ofed/RPM-GPG-KEY-Mellanox
# wget http://www.mellanox.com/downloads/ofed/RPM-GPG-KEY-Mellanox...
gpgcheck=1
Check that the repository was successfully added.
# yum repolist
Loaded plugins: product-id, security, subscription-manager
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
repo id      repo name                            status
mlnx_ofed    MLNX_OFED Repository
rpmforge     RHEL 6Server - RPMforge.net - dag    4,597...
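For orientation, a repository definition that would produce the mlnx_ofed entry listed above could be generated as follows. This is a sketch built on assumptions: only the repository id, its name, and gpgcheck=1 appear in this guide, while the baseurl and gpgkey paths, the file location under /etc/yum.repos.d/, and the helper name write_mlnx_repo are hypothetical placeholders.

```shell
#!/bin/sh
# Write a yum repository definition for a locally extracted MLNX_OFED
# package. All three arguments are absolute paths chosen by the caller.
# $1: repo file to create   $2: directory with the RPMs   $3: GPG key file
write_mlnx_repo() {
    cat > "$1" <<EOF
[mlnx_ofed]
name=MLNX_OFED Repository
baseurl=file://$2
enabled=1
gpgkey=file://$3
gpgcheck=1
EOF
}

# Usage as root (hypothetical paths, not executed here):
#   write_mlnx_repo /etc/yum.repos.d/mlnx_ofed.repo /share/mlnx/RPMS /share/mlnx/RPM-GPG-KEY-Mellanox
```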
(User Space packages only
where:
mlnx-ofed-all         Installs all available packages in MLNX_OFED
mlnx-ofed-basic       Installs basic packages required for running NVIDIA cards
mlnx-ofed-guest       Installs packages required by guest OS
mlnx-ofed-hpc         Installs packages required for HPC
mlnx-ofed-hypervisor  Installs packages required by hypervisor OS...
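Choosing one of the package groups above can be reduced to a small lookup. The sketch below accepts only the five group names that are visible in this excerpt (the original list continues past this point) and prints the matching yum command instead of running it; ofed_install_cmd is a hypothetical helper name.

```shell
#!/bin/sh
# Print the yum command for one of the MLNX_OFED package groups
# listed above. Returns 1 for any group name not in that list.
ofed_install_cmd() {
    case $1 in
        all|basic|guest|hpc|hypervisor)
            printf 'yum install mlnx-ofed-%s\n' "$1"
            ;;
        *)
            echo "unknown package group: $1" >&2
            return 1
            ;;
    esac
}

ofed_install_cmd hpc
```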
Setting up MLNX_OFED apt-get Repository
Log into the installation machine as root.
Extract the MLNX_OFED package on a shared location in your network. It can be downloaded from https://www.nvidia.com/en-us/networking/ → Products → Software → InfiniBand Drivers.
Create an apt-get repository configuration file called "/etc/apt/sources.list.d/mlnx_ofed.list" with the following content:
Download and install NVIDIA's GPG-KEY.
# wget -qO - http://www.mellanox.com/downloads/ofed/RPM-GPG-KEY-Mellanox | sudo apt-key add -
Verify that the key was successfully imported.
# apt-key list
1024D/A9E4B643 2013-08-11 Mellanox Technologies <support@mellanox.com>
1024g/09FCC269 2013-08-11
Update the apt-get cache.
# sudo apt-get update
Setting up MLNX_OFED apt-get Repository Using --add-kernel-support
Log into the installation machine as root.
Depending on the application of the user's system, it may be necessary to modify the default configuration of network adapters based on the ConnectX® adapters. If tuning is required, please refer to the Performance Tuning Guide for NVIDIA Network Adapters.
VMware Driver Installation
This section describes VMware Driver Installation.
Removing Earlier NVIDIA Drivers
Please unload the previously installed drivers before removing them.
To remove all the drivers:
Log into the ESXi server with root permissions.
List all the existing NATIVE ESXi driver modules. (See Step 4 in Installing NATIVE ESXi Driver for VMware vSphere.)
To check that your card is programmed with the latest available firmware version, download the mlxup firmware update and query utility. The utility can query for available NVIDIA adapters and indicate which adapters require a firmware update. If the user confirms, mlxup upgrades the firmware using embedded images.
Troubleshooting
General Troubleshooting
Server unable to find the adapter
• Ensure that the adapter is placed correctly
• Make sure the adapter slot and the adapter are compatible
• Install the adapter in a different PCI Express slot
• Use the drivers that came with the adapter or download the latest
•...
-d <mst_device> q
Ports Information
ibstat
ibv_devinfo
Firmware Version Upgrade
To download the latest firmware version, refer to the NVIDIA Update and Query Utility.
Collect Log File
cat /var/log/messages
dmesg >> system.log
journalctl (Applicable on new operating systems)
cat /var/log/syslog
Windows Troubleshooting...
Safety CB / cTUVus / CE Regulatory CE / FCC / VCCI / ICES / RCM RoHS RoHS compliant
a. Typical power for ATIS traffic load.
b. The non-operational storage temperature specifications apply to the product without its package.
Board Mechanical Drawing and Dimensions
All dimensions are in millimeters. All the mechanical tolerances are +/- 0.1mm.
MCX562A-ACAI Drawings
MCX562A-ACAB Drawings
MCX566A-CCAI Drawings
MCX566A-CDAB Drawings
MCX566A-CDAI Drawings
MCX565M-CDAI/MCX565M-CDAB Drawings...
Adapter Card Heatsink
A heatsink is attached to the ConnectX-5 IC in order to dissipate the heat from the ConnectX-5 IC. It is attached using four spring-loaded push pins that insert into four mounting holes.
Finding MAC and Serial Number on the Adapter Card
Each NVIDIA adapter card has a different identifier printed on the label: serial number and the card MAC for the Ethernet protocol.
The product revisions indicated on the labels in the following figures do not necessarily represent the latest revisions of the cards.
Document Revision History
Date       Description of Changes
May. 2023  Added non-operational storage temperature specifications.
Jan. 2023  Updated dimensions in board mechanical drawing
Nov. 2022  Updated board label example.
Sep. 2022  Added a note concerning FRU EEPROM memory component under the Features and Benefits table.
NVIDIA accepts no liability related to any default, damage, costs, or problem which may be based on or attributable to: (i) the use of the NVIDIA product in any manner that is contrary to this document or (ii) customer product designs.