NVIDIA BlueField DPU BSP v4.5.3 LTS

Summary of Contents for NVIDIA BlueField DPU BSP v4.5.3 LTS

  • Page 1 NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 2: Table Of Contents

    System Configuration and Services Host-side Interface Configuration Secure Boot UEFI Secure Boot Updating Platform Firmware Management NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 3 Public Key Acceleration IPsec Functionality fTPM over OP-TEE QoS Configuration VirtIO-net Emulated Devices Shared RQ Mode NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 4 General Troubleshooting Installation Troubleshooting and How-Tos Windows Support Document Revision History Legal Notices and 3rd Party Licenses NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 5 About This Document NVIDIA® BlueField® DPU software is built from the BlueField BSP (Board Support Package) which includes the operating system and the DOCA framework. BlueField BSP includes the bootloaders and other essentials for loading and setting software components. The BSP loads the official BlueField operating system (Ubuntu reference Linux distribution) to the DPU.
  • Page 6 Make sure to perform a graceful shutdown of the Arm OS in advance of performing system/host power cycle when required by the manual. Customers who purchased NVIDIA products directly from NVIDIA are invited to contact us through the following methods: E-mail: enterprisesupport@nvidia.com...
  • Page 7 Data encryption key DHCP Dynamic host configuration protocol Direct memory access DOCA DPU SDK DORA Discover; Offer; Request; Acknowledgment Device ownership transfer Data path accelerator; an auxiliary processor designed to accelerate data-path operations NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 8 "Arm host". Host Server host OS refers to the Host Server OS (Linux or Windows) Arm host refers to the AARCH64 Linux OS which is running on the BlueField Arm Cores NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 9 Memory-mapped I/O Most significant bit Memory subsystem Mellanox software tools Network address translation Network interface card NIST National Institute of Standards and Technology Namespace On-chip debugger Out-of-band Operating system Open vSwitch Peak burst size NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 10 Ethernet and RDMA over converged Ethernet Receive queue RShim Random Shim Round-trip time Receive Security association SBSA Server base system architecture Software development kit Sub-function or scalable function Scatter-gather Secure hash algorithm SMMU System memory management unit NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 11 Unified extensible firmware interface UPVS UEFI persistent variable store Virtual function Virtio full emulation Virtual machine Virtual protocol interconnect Virtual switch tagging WorkQ Work queue workq Work queue elements Write WRDB Write data buffer NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 12: Related Documentation

    See WinOF Release Notes and User Manual. NVIDIA BlueField BMC Software User Manual – this document provides general information concerning the BMC on the NVIDIA® BlueField® DPU, and is intended for those who want to familiarize themselves with the functionality provided by the BMC...
  • Page 13 BlueField InfiniBand/Ethernet DPUs User Guide The NVIDIA DOCA™ SDK enables developers to rapidly create applications and services on top of NVIDIA® BlueField® data processing NVIDIA DOCA units (DPUs), leveraging industry-standard APIs. With DOCA, developers can deliver breakthrough networking, security, and storage performance by harnessing the power of NVIDIA's DPUs.
  • Page 14 Programming Guide – This document is intended to guide a new crypto application developer or a public key user space driver. It offers programmers the basic information required to code their own PKA-based application for NVIDIA® BlueField® DPU. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 15: Release Notes

    Release Notes The release note pages provide information for NVIDIA® BlueField® DPU family software such as changes and new features, supported platforms, and reports on software known issues as well as bug fixes. Changes and New Features Supported Platforms and Interoperability...
  • Page 16: Supported Platforms And Interoperability

    IB; Single-port QSFP112; PCIe Gen5.0 x16; 8 Arm cores; 00NN- 16GB on board DDR; integrated BMC; Crypto Disabled 900- NVIDIA BlueField-3 B3220 P-Series FHHL DPU; 200GbE (default mode) / 9D3B6- NDR200 IB; Dual-port QSFP112; PCIe Gen5.0 x16 with x16 PCIe 00CV- extension option;...
  • Page 17 Description 900- NVIDIA BlueField-3 B3140L E-Series FHHL DPU; 400GbE / NDR IB (default 9D3B4- mode); Single-port QSFP112; PCIe Gen5.0 x16; 8 Arm cores; 16GB on- 00PN- board DDR; integrated BMC; Crypto Disabled 900- NVIDIA BlueField-3 B3240 P-Series Dual-slot FHHL DPU; 400GbE / NDR 9D3B6- IB (default mode);...
  • Page 18 IB; Single-port QSFP112; PCIe Gen5.0 x16; 8 Arm cores; 00EN- 16GB on board DDR; integrated BMC; Crypto Enabled 900- NVIDIA BlueField-3 B3140L E-Series FHHL DPU; 400GbE / NDR IB (default 9D3B4- mode); Single-port QSFP112; PCIe Gen5.0 x16; 8 Arm cores; 16GB on- 00EN- board DDR;...
  • Page 19 00CC- extension option; 16 Arm cores; 32GB on-board DDR; integrated BMC; Crypto Enabled 900- NVIDIA BlueField-3 B3220SH E-Series FHHL Storage Controller; 200GbE 9D3C6- (default mode) / NDR200 IB; Dual-port QSFP112; PCIe Gen5.0 x16 with 00CV- x16 PCIe extension option; 16 Arm cores; 48GB on-board DDR;...
  • Page 20 BMC; PCIe Gen4 x8; Secure Boot Enabled; Crypto Enabled; 0083- 0765 32GB on-board DDR; 1GbE OOB management; FHHL AECOT 900- MBF2 MT_0 BlueField-2 P-Series DPU 100GbE Dual-Port QSFP56; 9D208- H536C 00000 integrated BMC; PCIe Gen4 x16; Secure Boot Enabled with NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 21: Embedded Software

    0766 32GB on-board DDR; 1GbE OOB management; FHHL AESOT Embedded Software The DOCA local repo package for DPU installation for this release is DOCA_2.5.2_BSP_4.5.3_Ubuntu_22.04-2.23-07.prod.bfb The following software components are embedded in it: NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 22 For more information about embedded software components and drivers, refer to the DOCA 2.5.3 Release Notes. Supported DPU Linux Distributions (aarch64) Ubuntu 22.04 Supported DPU Host OS Distributions The default operating system of the BlueField DPU (Arm) is Ubuntu 22.04. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 23 Oracle Linux 8.7 5.15 RHEL/Rocky Linux 5.14 x86 / BCLinux 21.10 SP2 4.19.90 aarch64 x86 / CTYunOS2.0 4.19.90 aarch64 Debian 10.9 4.19.0-16 x86 / Debian 11.3 5.10.0-13 aarch64 x86 / Debian 12.1 6.1.0-10 aarch64 NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 24 RHEL/Rocky 9.0 5.14.0-70.46.1.el9_0 aarch64 x86 / RHEL/Rocky 9.1 5.14.0-162.19.1.el9_1 aarch64 x86 / RHEL/Rocky 9.2 5.14.0-284.11.1.el9_2 aarch64 x86 / RHEL/Rocky 9.3 5.14.0-362.8.1.el9_3 aarch64 x86 / RHEL/Rocky 9.4 5.14.0-427.13.1.el9_4 aarch64 x86 / SLES 15 SP3 5.3.18-57 aarch64 NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 25: Bug Fixes In This Version

    Keywords: Monitoring; sensor Reported in version: 4.5.0 Description: Fixed the PMD crash while dumping a rule with invalid rule pointer. Check the validity of the pointer. Keywords: Invalid rule pointer; dumping rule Reported in version: 4.5.0 NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 26: Known Issues

    Example for a BlueField-2 device: $ mlx-mkbfb -x default.bfb $ mlx-mkbfb \ --bl2r-v1=dump-bl2r-v1 \ --bl2r-cert-v1=dump-bl2r-cert-v1 \ --bl2-v1=dump-bl2-v1 \ --bl2-cert-v1=dump-bl2-cert-v1 \ --bl31-v1=dump-bl31-v1 \ --bl31-cert-v1=dump-bl31-cert-v1 \ --bl31-key-cert-v1=dump-bl31-key-cert-v1 \ --bl33-v0=dump-bl33-v0 \ --bl33-cert-v1=dump-bl33-cert-v1 \ --bl33-key-cert-v1=dump-bl33-key-cert-v1 \ --boot-acpi-v0=dump-boot-acpi-v0 \ --boot-args-v0=dump-boot-args-v0 \ NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 27 Description: On Debian 12, Arm ports remain in Legacy mode after multiple Arm reboot iterations. The following error message appears in /var/log/syslog: mlnx_bf_configure[2601]: ERR: Failed to configure switchdev mode for 0000 00.0 after retries Workaround: Run: NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 28 Keyword: fTPM over OP-TEE Reported in version: 4.5.0 Description: Debian 12 OS does not support CT tunnel offload. Workaround: Recompile the kernel with CONFIG_NET_TC_SKB_EXT set. Keyword: Connection tracking; Linux Reported in version: 4.5.0 NVIDIA BlueField DPU BSP v4.5.3 LTS...
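A minimal sketch of the recompile workaround above, assuming a Debian 12 kernel source tree that is already unpacked and configured (the source path and package target are illustrative):
    # enable the connection-tracking SKB extension and rebuild the kernel packages
    cd /usr/src/linux-source-6.1
    scripts/config --enable CONFIG_NET_TC_SKB_EXT
    make olddefconfig
    make -j"$(nproc)" bindeb-pkg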
  • Page 29 Workaround: Restart the driver on the host after the DPU is up. Keyword: Reboot; VFs Reported in version: 4.5.0 Description: When the NIC subsystem is in recovery mode, the interface toward the NVMe is not accessible. Thus, the SSD boot device would not be available. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 30 OP-TEE feature is not functional. Workaround: Reboot the DPU. Keyword: fTPM over OP-TEE Reported in version: 4.5.0 Description: XFRM rules must be deleted before driver restart or warm reboot are performed. Workaround: N/A Keyword: IPsec NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 31 Description: When the public key is deleted while Redfish is enabled, UEFI secure boot is disabled and UEFI reverts to Setup Mode (i.e., the SecureBootEnable Redfish property is reset to false). If later, the public key is re-enrolled, the NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 32 Description: Downgrading BSP software from 4.2.0 fails if UEFI secure boot is enabled. Workaround: Disable UEFI secure boot before downgrading. Keyword: Software; downgrade Reported in version: 4.2.0 Description: Virtio hotplug is not supported in GPU-HOST mode on the NVIDIA Converged Accelerator. Workaround: N/A Keyword: Virtio; Converged Accelerator Reported in version: 4.2.0 Description: PXE boot over ConnectX interface might not work due to an invalid MAC address in the UEFI boot entry.
  • Page 33 Workaround: Stop virtio-net-controller service before cleaning up bond configuration. Keyword: Virtio-net; LAG Reported in version: 4.2.0 Description: mlxfwreset is not supported in this release. Workaround: Perform graceful shutdown and power cycle the host. Keyword: mlxfwreset; support Reported in version: 4.0.2 NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 34 SF with same ID number in a stressful manner may cause the setup to hang due to a race between create and delete commands. Workaround: N/A Keywords: Hang; mlnx-sf Reported in version: 4.0.2 NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 35 Workaround: Reload the host driver or reboot the host. Keywords: Modes of operation; driver Reported in version: 4.0.2 Description: When an NVMe controller, SoC management controller, and DMA controller are configured, the maximum number of VFs is limited to 124. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 36 Workaround: N/A Keywords: Redfish; BootOptionEnabled Reported in version: 3.9.2 Description: The ethtool -I --show-fec command is not supported by the DPU with kernel 5.4. Workaround: N/A Keywords: Kernel; show-fec Reported in version: 3.9.0 NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 37 Description: Running I/O traffic and toggling both physical ports status in a stressful manner on the receiving-end machine may cause traffic loss. Workaround: N/A Keywords: MLNX_OFED; RDMA; port toggle Reported in version: 3.8.5 NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 38 SFs or VFs. Workaround: N/A Keywords: Firmware; SF; VF Reported in version: 3.8.0 Description: Some devlink commands are only supported by mlnx devlink ( /opt/mellanox/iproute2/sbin/devlink ). The default devlink from the OS NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 39 Description: When a device is hot-plugged from the virtio-net controller, the host OS may hang when warm reboot is performed on the host and Arm at the same time. Workaround: Reboot the host OS first and only then reboot DPU. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 40 OPENSSL_CONF is aimed at using a custom config file for applications. In this case, it is used to point to a config file where dynamic engine (PKA engine) is not enabled. Keywords: OpenSSL; curl NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 41 Reported in version: 3.5.0.11563 Description: On a BlueField device operating in Embedded CPU mode, PXE driver will fail to boot if the Arm side is not fully loaded and the OVS bridge is not configured. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 42: Validated And Supported Cables And Modules

    Description: Defining namespaces with certain Micron disks (Micron_9300_MTFDHAL3T8TDP) using consecutive attach-ns commands can cause errors. Workaround: Add delay between attach-ns commands. Keywords: Micron; disk; namespace; attach-ns Reported in version: 2.2.0.11000 Validated and Supported Cables and Modules Cables Lifecycle Legend NVIDIA BlueField DPU BSP v4.5.3 LTS...
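A minimal sketch of the attach-ns workaround above, assuming nvme-cli on the Arm side; the device path, namespace IDs, and controller ID are illustrative:
    # attach namespaces one at a time with a short delay between attach-ns calls
    nvme attach-ns /dev/nvme0 --namespace-id=1 --controllers=0x1
    sleep 2
    nvme attach-ns /dev/nvme0 --namespace-id=2 --controllers=0x1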
  • Page 43 Preliminary 9I08T- DQ8FNM NVIDIA Select 400GbE QSFP-DD AOC 50m 00W050 050-NML 980- NVIDIA Active copper splitter cable, IB MCA7J6 Prototype 9I81B- twin port NDR 800Gb/s to 2x400Gb/s, 5-N004 00N004 OSFP to 2xQSFP112, 4m NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 44 NVIDIA passive copper splitter cable, IB MCP7Y4 9I75R- twin port NDR 800Gb/s to 4x200Gb/s, P-Rel 0-N003 00N003 OSFP to 4xQSFP112, 3m 980- MCP7Y4 NVIDIA passive copper splitter cable, IB P-Rel 9I75D- 0-N01A twin port NDR 800Gb/s to 4x200Gb/s, NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 45 9I75S- twin port NDR 800Gb/s to 4x200Gb/s, P-Rel 0-N02A 00N02A OSFP to 4xQSFP112, 2.5m 980- MFP7E1 NVIDIA passive fiber cable, MMF, MPO12 9I73U- 0-N003 APC to MPO12 APC, 3m 000003 980- MFP7E1 NVIDIA passive fiber cable, MMF, MPO12 9I73V-
  • Page 46 APC to MPO12 APC, 1m 00N001 980- MFP7E3 NVIDIA passive fiber cable, SMF, MPO12 9I559- 0-N002 APC to MPO12 APC, 2m 00N002 980- MFP7E3 NVIDIA passive fiber cable, SMF, MPO12 9I55A- 0-N003 APC to MPO12 APC, 3m NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 47 APC to MPO12 APC, 60m 00N050 980- MFP7E3 NVIDIA passive fiber cable, SMF, MPO12 9I582- 0-N070 APC to MPO12 APC, 70m 00N050 980- MFP7E3 NVIDIA passive fiber cable, SMF, MPO12 9I58I- 0-N100 APC to MPO12 APC, 100m 00N100 NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 48 NVIDIA passive fiber cable, SMF, MPO12 9I56R- 0-N050 APC to 2xMPO12 APC, 50m 000050 980- NVIDIA single port transceiver, MMA1Z0 9I693- 400Gbps, NDR, QSFP112, MPO12 APC, P-Rel 0-NS400 00NS00 850nm MMF, up to 50m, flat top HDR / 200GbE Cables NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 49 NVIDIA Passive Copper cable, 200GbE, 200G 9I54H- 200Gb/s, QSFP56, LSZH, 0.5m, black V00AE3 [HVM] 00V00A pulltab, 30AWG MCP16 980- NVIDIA Passive Copper cable, 200GbE, 200G 9I54I- 200Gb/s, QSFP56, LSZH, 1.5m, black V01AE3 [HVM] 00V01A pulltab, 30AWG NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 50 200GbE 200Gb/s to 2x100Gb/s, QSFP56 V01AR3 [HVM] 00V01A to 2xQSFP56, colored, 1.5m, 30AWG MCP7H 980- NVIDIA passive copper hybrid cable, 200G 9I98M- 200GbE 200Gb/s to 2x100Gb/s, QSFP56 V02AR2 [HVM] 00V02A to 2xQSFP56, colored, 2.5m, 26AWG NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 51 9I93N- 400(2x200)Gbps to 4x100Gbps, OSFP to 00H001 H001 4xQSFP56, 1m, fin to flat 980- MCP7Y NVIDIA passive copper splitter cable, 200G 9I93O- 400(2x200)Gbps to 4x100Gbps, OSFP to 00H002 H002 4xQSFP56, 2m, fin to flat NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 52 NVIDIA passive copper splitter cable, 200G 9I47P- 400(2x200)Gbps to 4x100Gbps, OSFP to 00H01A H01A 4xQSFP56, 1.5m, fin to flat 980- MFS1S0 NVIDIA active fiber cable, IB HDR, up to 9I124- 200Gb/s, QSFP56, LSZH, black pulltab, 3m [HVM] 00H003 H003E 980- MFS1S0 200G
  • Page 53 MFS1S0 200G Nvidia active optical cable, up to 200Gbps 9I440- , QSFP56 to QSFP56, 30m 00H030 H030V 980- MFS1S0 NVIDIA active fiber cable, IB HDR, up to 9I455- 200Gb/s, QSFP56, LSZH, black pulltab, [HVM] 00H050 H050E 980- MFS1S0 200G Nvidia active optical cable, up to 200Gbps
  • Page 54 9I44W- 200Gb/s, QSFP56, LSZH, black pulltab, 0-V100E [HIBERN/ 00V100 100m ATE] 980- MFS1S5 NVIDIA active fiber splitter cable, IB HDR, 9I452- 200Gb/s to 2x100Gb/s, QSFP56 to [HVM] 00H003 H003E 2xQSFP56, LSZH, 3m 980- MFS1S5 Nvidia active optical splitter cable,
  • Page 55 NVIDIA Legacy LifeCycle Data Data Description Phase Rate Rate 980- MFS1S5 NVIDIA active fiber splitter cable, IB HDR, 9I95E- 200Gb/s to 2x100Gb/s, QSFP56 to [HVM] 00H015 H015E 2xQSFP56, LSZH, 15m 980- MFS1S5 Nvidia active optical splitter cable, 200G 9I96H-
  • Page 56 MFS1S5 9I95V- 200Gb/s to 2x100Gb/s, QSFP56 to 0-V030E [HVM] 00V030 2xQSFP56, LSZH, black pulltab, 30m 980- MFS1S9 NVIDIA active fiber splitter cable, IB HDR, 9I961- 2x200Gb/s to 2x200Gb/s, 2xQSFP56 to [HVM] 00H010 H010E 2xQSFP56, LSZH, 10m 980- MFS1S9 NVIDIA active fiber splitter cable, IB HDR,
  • Page 57 NVIDIA Passive Copper cable, ETH 100G 9I622- 100GbE, 100Gb/s, QSFP, 3m, LSZH, EOL [MP] 00C003 C003LZ 26AWG MCP160 980- NVIDIA Passive Copper cable, ETH 100G 9I625- 100GbE, 100Gb/s, QSFP28, 5m, Black, C005E2 00C005 26AWG, CA-L NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 58 100G MCP160 EOL [P- 9I62M- 100GbE, 100Gb/s, QSFP, PVC, 3.5m 0-C03A Rel] 00C03A 26AWG 980- 100G MCP160 NVIDIA Passive Copper cable, IB EDR, up 9I62P- 0-E001 to 100Gb/s, QSFP, LSZH, 1m 30AWG [HVM] 00C001 NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 59 NVIDIA Legacy LifeCycle Data Data Description Phase Rate Rate 980- MCP160 NVIDIA Passive Copper cable, IB EDR, up 9I62Q- to 100Gb/s, QSFP28, 1m, Black, 30AWG 00E001 E001E30 980- 100G MCP160 NVIDIA Passive Copper cable, IB EDR, up 9I62S- 0-E002 to 100Gb/s, QSFP, LSZH, 2m 28AWG...
  • Page 60 Legacy LifeCycle Data Data Description Phase Rate Rate 980- 100G MCP160 NVIDIA Passive Copper cable, IB EDR, up 9I623- 0-E01A to 100Gb/s, QSFP, LSZH, 1.5m 30AWG [HVM] 00C01A MCP160 980- NVIDIA Passive Copper cable, IB EDR, up 9I624- E01AE3 to 100Gb/s, QSFP28, 1.5m, Black, 30AWG...
  • Page 61 NVIDIA® passive copper hybrid cable, ETH 100G Prelimina 9I61C- 100Gb/s to 2x50Gb/s, QSFP28 to 00C005 G00000 2xQSFP28, 5m, Colored, 26AWG, CA-L 100G 980- MCP7H0 NVIDIA passive copper hybrid cable, ETH 9I61D- 0-G001 100Gb/s to 2x50Gb/s, QSFP28 to [HVM] NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 62 100Gb/s to 2x50Gb/s, QSFP28 to G003R3 [HVM] 00C003 2xQSFP28, 3m, Colored, 30AWG, CA-L MCP7H0 980- NVIDIA passive copper hybrid cable, ETH 100G 9I99S- 100Gb/s to 2x50Gb/s, QSFP28 to G004R2 [HVM] 00C004 2xQSFP28, 4m, Colored, 26AWG, CA-L NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 63 100Gb/s, QSFP, LSZH, 15m 00C015 980- 100G MFA1A0 NVIDIA active fiber cable, ETH 100GbE, 9I13F- 0-C020 100Gb/s, QSFP, LSZH, 20m 00C020 100G 980- MFA1A0 NVIDIA active fiber cable, ETH 100GbE, 9I13N- 0-C030 100Gb/s, QSFP, LSZH, 30m NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 64 NVIDIA active fiber cable, ETH 100GbE, 9I13B- 0-C100 100Gb/s, QSFP, LSZH, 100m [HVM] 00C100 980- MFA1A0 NVIDIA active fiber cable, IB EDR, up to 9I13D- 0-E001 100Gb/s, QSFP, LSZH, 1m 00E001 980- MFA1A0 NVIDIA active fiber cable, IB EDR, up to
  • Page 65 NVIDIA Legacy LifeCycle Data Data Description Phase Rate Rate 980- MFA1A0 NVIDIA active fiber cable, IB EDR, up to 9I135- 0-E100 100Gb/s, QSFP, LSZH, 100m [HVM] 00E100 980- NVIDIA active fiber hybrid solution, ETH 100G MFA7A2 9I37H- 100GbE to 2x50GbE, QSFP28 to
  • Page 66 NVIDIA® transceiver, 100GbE, QSFP28, Prelimina 9I17D- MPO, 850nm, up to 100m, OTU4 00CS00 C100T 980- MMA1B NVIDIA transceiver, IB EDR, up to 100Gb/s, 9I17L- 00-E100 QSFP28, MPO, 850nm, SR4, up to 100m 00E000 980- NVIDIA optical transceiver, 100GbE, 100G MMA1L...
  • Page 67 MC2207 NVIDIA active fiber cable, VPI, up 56GE 9I15X- EOL [HVM] 31V-015 to 56Gb/s, QSFP, 15m 00L015 56GE 980- MC2207 NVIDIA active fiber cable, VPI, up EOL [HVM] 9I15Y- 31V-020 to 56Gb/s, QSFP, 20m NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 68 NVIDIA passive copper cable, VPI, 56GE 9I678- EOL [P-Rel] L-F00A up to 56Gb/s, QSFP, LSZH, 0.5m 00L00A 980- EOL [P-Rel] MCP170 NVIDIA passive copper cable, VPI, 56GE 9I679- [HIBERN/ATE L-F01A up to 56Gb/s, QSFP, LSZH, 1.5m 00L01A NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 69 NVIDIA passive copper hybrid cable, EOL [HVM] 10GE 9I65U- 9130- ETH 10GbE, 10Gb/s, QSFP to SFP+, [HIBERN/A 00J00A 0.5m 980- MC330 NVIDIA passive copper cable, ETH 10GE 9I682- 9124- EOL [HVM] 10GbE, 10Gb/s, SFP+, 4m 00J004 NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 70 NVIDIA passive copper cable, ETH EOL [HVM] 10GE 9I68B- 10GbE, 10Gb/s, SFP+, 2m, Blue Pulltab, [HIBERN/A 00J002 X002B Connector Label 10GE 980- MCP21 NVIDIA passive copper cable, ETH EOL [HVM] 9I68C- 10GbE, 10Gb/s, SFP+, 3m, Blue Pulltab, NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 71 NVIDIA optical module, ETH 10GbE, MFM1T0 10GE 02A-SR- 10Gb/s, SFP+, LC-LC, 850nm, SR up to 2A-SR-F 300m MFM1T NVIDIA optical module, ETH 10GbE, MFM1T0 10GE 02A-SR- 10Gb/s, SFP+, LC-LC, 850nm, SR up to 2A-SR-P 300m 10GbE Cables NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 72 00J006 980- MC330 NVIDIA passive copper cable, ETH 10GE 9I685- 9124- EOL [HVM] 10GbE, 10Gb/s, SFP+, 7m 00J007 10GE 980- MC330 NVIDIA passive copper cable, ETH EOL [HVM] 9I686- 9130- 10GbE, 10Gb/s, SFP+, 1m NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 73 9I68F- 10GbE, 10Gb/s, SFP+, 2m, Black EOL [HVM] 00J002 X002B Pulltab, Connector Label 980- MCP21 NVIDIA passive copper cable, ETH 10GE 9I68G- 10GbE, 10Gb/s, SFP+, 3m, Black EOL [HVM] 00J003 X003B Pulltab, Connector Label NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 74 NVIDIA Optical module, ETH 1GbE, MC3208 EOL [P- 9I270- 1Gb/s, SFP, LC-LC, SX 850nm, up to 011-SX Rel] 00IM00 500m 980- MC3208 NVIDIA module, ETH 1GbE, 1Gb/s, SFP, 9I251- 411-T Base-T, up to 100m 00IS00 NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 75 FTLC9152RGPL 100G 100M QSFP28 SWDM4 OPT TRANS 100G FTLC9555REP 100m Parallel MMF 100GQSFP28Optical Transceiver M3-E6 100G NDAAFJ-C102 SF-NDAAFJ100G-005M 100G QSFP-100G- 30m (98ft) Cisco QSFP-100G-AOC30M Compatible 100G AOC30M QSFP28 Active Optical Cable 100G QSFP28-LR4-AJ CISCO-PRE 100GbE LR4 QSFP28 Transceiver Module NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 76 Cisco 40GBASE-SR-BiDi, duplex MMF Supported Cables and Modules for BlueField-2 NDR / 400GbE Cables Part Number Marketing Description MCP1660- NVIDIA Direct Attach Copper cable, 400GbE, 400Gb/s, QSFP-DD, W001E30 1m, 30AWG MCP1660- NVIDIA Direct Attach Copper cable, 400GbE, 400Gb/s, QSFP-DD, W002E26...
  • Page 77 Part Number Marketing Description W02AE26 2.5m, 26AWG MCP7F60- NVIDIA DAC splitter cable, 400GbE, 400Gb/s to 4x100Gb/s, QSFP- W001R30 DD to 4xQSFP56, 1m, 30AWG MCP7F60- NVIDIA DAC splitter cable, 400GbE, 400Gb/s to 4x100Gb/s, QSFP- W002R26 DD to 4xQSFP56, 2m, 26AWG MCP7F60-...
  • Page 78 Part Marketing Description Number MFS1S00- NVIDIA active fiber cable, 200GbE, 200Gb/s, QSFP56, LSZH, black V050E pulltab, 50m MFS1S00- NVIDIA active fiber cable, 200GbE, 200Gb/s, QSFP56, LSZH, black V100E pulltab, 100m MCP1650- NVIDIA Passive Copper cable, 200GbE, 200Gb/s, QSFP56, LSZH, 1m, NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 79 NVIDIA Passive Copper cable, 200GbE, 200Gb/s, QSFP56, LSZH, V02AE26 2.5m, black pulltab, 26AWG MCP7H50- NVIDIA passive copper hybrid cable, 200GbE 200Gb/s to 2x100Gb/s, V001R30 QSFP56 to 2xQSFP56, colored, 1m, 30AWG MCP7H50- NVIDIA passive copper hybrid cable, 200GbE 200Gb/s to 2x100Gb/s,...
  • Page 80 Spee Part Number Marketing Description MCP1600- NVIDIA passive copper cable, ETH 100GbE, 100Gb/s, QSFP28, 1m, C001E30N black, 30AWG, CA-N MCP1600- NVIDIA passive copper Cable, ETH 100GbE, 100Gb/s, QSFP, 1m, C001LZ LSZH, 30AWG MCP1600- NVIDIA passive copper cable, ETH 100GbE, 100Gb/s, QSFP, PVC,...
  • Page 81 MCP1600- NVIDIA passive copper cable, ETH 100GbE, 100Gb/s, QSFP, PVC, C03A 3.5m 26AWG MCP1600- NVIDIA passive copper cable, IB EDR, up to 100Gb/s, QSFP, LSZH, E001 1m 30AWG MCP1600- NVIDIA passive copper cable, IB EDR, up to 100Gb/s, QSFP, LSZH,...
  • Page 82 Spee Part Number Marketing Description MCP7F00- NVIDIA passive copper hybrid cable, ETH 100GbE to 4x25GbE, A01AR30N QSFP28 to 4xSFP28, 1.5m, colored, 30AWG, CA-N MCP7F00- NVIDIA passive copper hybrid cable, ETH 100GbE to 4x25GbE, A02AR26N QSFP28 to 4xSFP28, 2.5m, colored, 26AWG, CA-N...
  • Page 83 NVIDIA passive copper hybrid cable, ETH 100Gb/s to 2x50Gb/s, G02AR30L QSFP28 to 2xQSFP28, 2.5m, colored, 30AWG, CA-L MFA1A00- NVIDIA active fiber cable, ETH 100GbE, 100Gb/s, QSFP, LSZH, 3m C003 MFA1A00- NVIDIA active fiber cable, ETH 100GbE, 100Gb/s, QSFP, LSZH, 5m NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 84 FDR / 56GbE Cables Spee Part Number Marketing Description 56Gb MC2207126- NVIDIA passive copper cable, VPI, up to 56Gb/s, QSFP, 4m 56Gb MC2207128- NVIDIA passive copper cable, VPI, up to 56Gb/s, QSFP, 3m 56Gb MC2207128- NVIDIA passive copper cable, VPI, up to 56Gb/s, QSFP, 2.5m...
  • Page 85 NVIDIA active fiber cable, VPI, up to 56Gb/s, QSFP, 75m 56Gb MC220731V- NVIDIA active fiber cable, VPI, up to 56Gb/s, QSFP, 100m 56Gb MCP1700- NVIDIA passive copper cable, VPI, up to 56Gb/s, QSFP, 1m, Red F001C pull-tab NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 86 NVIDIA passive copper cable, VPI, up to 56Gb/s, QSFP, 1m, F001D Yellow pull-tab 56Gb MCP1700- NVIDIA passive copper cable, VPI, up to 56Gb/s, QSFP, 2m, Red F002C pull-tab 56Gb MCP1700- NVIDIA passive copper cable, VPI, up to 56Gb/s, QSFP, 2m,...
  • Page 87 2.5m, black pulltab, 26AWG FDR10 / 40GbE Cables Part Marketing Description Number MC220612 NVIDIA passive copper cable, VPI, up to 40Gb/s, QSFP, 4m 8-004 MC220612 NVIDIA passive copper cable, VPI, up to 40Gb/s, QSFP, 5m 8-005 MC220613 NVIDIA passive copper cable, VPI, up to 40Gb/s, QSFP, 1m...
  • Page 88 0-100 MC221041 NVIDIA optical module, 40Gb/s, QSFP, MPO, 850nm, up to 300m 1-SR4E MC260912 NVIDIA passive copper hybrid cable, ETH 40GbE to 4x10GbE, QSFP to 5-005 4xSFP+, 5m MC260913 NVIDIA passive copper hybrid cable, ETH 40GbE to 4x10GbE, QSFP to...
  • Page 89 NVIDIA passive copper cable, ETH 40GbE, 40Gb/s, QSFP, 2.5m, black B02AE pull-tab MCP7900- NVIDIA passive copper hybrid cable, ETH 40GbE to 4x10GbE, QSFP to X01AA 4xSFP+, 1.5m, blue pull-tab, customized label MCP7904- NVIDIA passive copper hybrid cable, ETH 40GbE to 4x10GbE, QSFP to...
  • Page 90 Spee Part Number Marketing Description 25Gb MCP2M00- NVIDIA passive copper cable, ETH, up to 25Gb/s, SFP28, 3m, A003E26N black, 26AWG, CA-N 25Gb MCP2M00- NVIDIA passive copper cable, ETH, up to 25Gb/s, SFP28, 3m, A003E30L black, 30AWG, CA-L 25Gb MCP2M00- NVIDIA passive copper cable, ETH, up to 25Gb/s, SFP28, 4m,...
  • Page 91 MMA2P00-AS 150m 10GbE Cables Spee Part Number Marketing Description MAM1Q00A- NVIDIA cable module, ETH 10GbE, 40Gb/s to 10Gb/s, QSFP to SFP+ MC2309124 NVIDIA passive copper hybrid cable, ETH 10GbE, 10Gb/s, QSFP to -005 SFP+, 5m MC2309124 NVIDIA passive copper hybrid cable, ETH 10GbE, 10Gb/s, QSFP to...
  • Page 92 NVIDIA passive copper cable, ETH 10GbE, 10Gb/s, SFP+, 1.5m -0A1 MC3309130 NVIDIA passive copper cable, ETH 10GbE, 10Gb/s, SFP+, 2.5m -0A2 MCP2100- NVIDIA passive copper cable, ETH 10GbE, 10Gb/s, SFP+, 1m, blue X001B pull-tab, connector label MCP2100- NVIDIA passive copper cable, ETH 10GbE, 10Gb/s, SFP+, 2m, blue X002B...
  • Page 93: Release Notes Change Log History

    NVIDIA SFP+ optical module for 10GBASE-SR 1GbE Cables Spee Part Number Marketing Description MC3208011- NVIDIA optical module, ETH 1GbE, 1Gb/s, SFP, LC-LC, SX 850nm, 1GbE up to 500m MC3208411- 1GbE NVIDIA module, ETH 1GbE, 1Gb/s, SFP, Base-T, up to 100m Release Notes Change Log History Changes and New Features in 4.5.2...
  • Page 94 device=/dev/mmcblk0 is used by default for the Linux rootfs installation and can be overloaded with a bf.cfg pushed together with the BFB. Info Installing on NVMe causes DPU booting to stay at the UEFI shell when changing to Livefish mode. Info NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 95 OS configuration – enabled tmpfs in /tmp Changes and New Features in 3.9.2 Added support for Arm host Enroll new NVIDIA certificates to DPU UEFI database Warning Important: User action required! See known issue #3077361 for details. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 96 BFB package Added support for rate limiting VF groups Changes and New Features in 3.8.5 PXE boot option is enabled automatically and is available for the ConnectX and OOB network interfaces NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 97: Bug Fixes History

    Changes and New Features in 3.8.0 Added ability to perform warm reboot on BlueField-2 based devices Added support for DPU BMC with OpenBMC Added support for NVIDIA Converged Accelerator (900-21004-0030-000) Bug Fixes History Issue Description Description: Host PCIe driver hangs when hot plugging a device due to SF creation and error flow handling failure.
  • Page 98 Keyword: Thermal; hang Reported in version: 4.5.0 Description: When enrolling a certificate to the UEFI DB, a failure message ERROR: Unsupported file type! is displayed when the DB was full. Keyword: SNAP; UEFI NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 99 ERR[UEFI]: PC=0xF4B72E0C ERR[UEFI]: PC=0xF4B72E70 ERR[UEFI]: PC=0xF4B73570 ERR[UEFI]: PC=0xF4B74904 ERR[UEFI]: PC=0xF4F04444 ERR[UEFI]: PC=0xF4F044F8 ERR[UEFI]: PC=0xF4F05160 ERR[UEFI]: PC=0xF4F02030 ERR[UEFI]: PC=0xFDFC3A38 (0xFDFB0000+0x13A38) [ 1] DxeCore.dll ERR[UEFI]: PC=0xF56E3594 (0xF56D4000+0xF594) [ 2] BdsDxe.dll ERR[UEFI]: PC=0xF56F1FFC (0xF56D4000+0x1DFFC) [ 2] BdsDxe.dll NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 100 Description: On BlueField-3, when booting virtio-net emulation device using a GRUB2 bootloader, the bootloader may attempt to close and re-open the virtio-net device. This can result in unexpected behavior and possible system failure to boot. Keywords: BlueField-3; virtio-net; UEFI Fixed in version: 4.5.0 NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 101 " output does not match " " output. Keywords: IPMI; print Fixed in version: 4.2.1 Description: Failure to ssh to Arm via 1GbE OOB interface is experienced after performing warm reboot on the DPU. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 102 Description: OpenSSL is not working with PKA engine on CentOS 7.6 with 4.23 5.4 5.10 kernels because multiple versions of OpenSSL (1.0.2k and 1.1.1k) are installed. Keywords: OpenSSL; PKA Fixed in version: 4.2.0 Description: 699140280000 OPN is not supported. Keywords: SKU; support NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 103 Keywords: Virtio-net, hotplug Fixed in version: 4.2.0 Description: Assert errors may be observed in the RShim log after reset/reboot. These errors are harmless and may be ignored. Keywords: RShim; log; error Fixed in version: 4.0.3 NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 104 The host may treat this as a fatal error and crash. Keywords: RShim; ATF Fixed in version: 3.9.2 Description: Virtio-net-controller recovery may not work for a hot-plugged device because the system assigns a BDF (string identifier) of 0 for the hot-plugged NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 105 Description: Eye-opening is not supported on 25GbE integrated-BMC BlueField-2 DPU. Keywords: Firmware, eye-opening Fixed in version: 3.9.0 Description: Virtio full emulation is not supported by NVIDIA® BlueField®-2 multi-host cards. Keywords: Virtio full emulation; multi-host Fixed in version: 3.9.0 Description: After BFB installation, Linux crash may occur with efi_call_rts messages in the call trace which can be seen from the UART console. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 106 Fixed in version: 3.8.5 Description: The available RShim logging buffer may not have enough space to hold the whole register dump which may cause buffer wraparound. Keywords: RShim; logging Fixed in version: 3.8.5 NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 107 OpenSSL is configured to use a dynamic engine (e.g. BlueField PKA engine). Keywords: OpenSSL; curl Fixed in version: 3.8.0 Description: UEFI secure boot enables the kernel lockdown feature which blocks access by mstmcra. Keywords: Secure boot NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 108 Fixed in version: 3.8.0 Description: Various errors related to the UPVS store running out of space are observed. Keywords: UPVS; errors Fixed in version: 3.8.0 Description: oob_net0 cannot receive traffic after a network restart. Keywords: oob_net0 NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 109 Fixed in version: 3.7.0 Description: When shared RQ mode is enabled and offloads are disabled, running multiple UDP connections from multiple interfaces can lead to packet drops. Keywords: Offload; shared RQ Fixed in version: 3.7.0 NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 110 " ". Keywords: strongSwan; ip xfrm; IPsec Fixed in version: 3.7.0 Description: Server crashes after configuring PCI_SWITCH_EMULATION_NUM_PORT to a value higher than the number of PCIe NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 111 Keywords: Boot Fixed in version: 3.6.0.11699 Description: Creating a bond via NetworkManager and restarting the driver (openibd restart) results in no pf0hpf and bond creation failure. Keywords: Bond; LAG; network manager; driver reload NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 112 OpenSSL configuration being linked to use PKA hardware, but that hardware is not available since crypto support is disabled on these platforms. Keywords: PKA; Crypto Fixed in version: 3.6.0.11699 Description: All NVMe emulation counters (Ctrl, SQ, Namespace) return "0" when queried. Keywords: Emulated devices; NVMe NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 113 RShim PF exposed and probed by the RShim driver. Keywords: RShim; multi-host Fixed in version: 3.5.0.11563 Description: When moving to separate mode on the DPU, the OVS bridge remains and no ping is transmitted between the Arm cores and the remote server. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 114 Keywords: SmartNIC; operation modes Fixed in version: 3.5.0.11563 Description: Pushing the BFB image v3.5 with a WinOF-2 version older than 2.60 can cause a crash on the host side. Keywords: Windows; RShim Fixed in version: 3.5.0.11563 NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 115: Bluefield Software Overview

    BlueField SW ships with the NVIDIA ® BlueField ® Reference Platform. BlueField SW is a reference Linux distribution based on the Ubuntu Server distribution extended to include the MLNX_OFED stack for Arm and a Linux kernel which supports NVMe-oF.
  • Page 116 Arm AArch64 processors and a ConnectX-6 Dx (for BlueField-2) or ConnectX-7 (for BlueField-3) network controller, each with its own rich software ecosystem. As such, almost any of the programmer-visible software interfaces in BlueField come from existing standard interfaces for the respective components. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 117: System Connections

    The Arm related interfaces (including those related to the boot process, PCIe connectivity, and cryptographic operation acceleration) are standard Linux on Arm interfaces. These interfaces are enabled by drivers and low-level code provided by NVIDIA as part of the BlueField software delivered and upstreamed to respective open-source projects, such as Linux.
  • Page 118: System Consoles

    Virtual RShim console ( on the Arm cores) is driven by the RShim PCIe driver (does not require a cable, but the system cannot be in isolation mode, as isolation mode disables the PCIe device needed) NVIDIA BlueField DPU BSP v4.5.3 LTS...
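Opening the virtual RShim console from the host side is typically done through the character device exposed by the RShim driver; a minimal sketch (the rshim0 index and the use of screen are assumptions):
    # attach to the BlueField Arm console exposed by the RShim PCIe driver
    sudo screen /dev/rshim0/console 115200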
  • Page 119: Network Interfaces

    (e.g., for file transfer protocols, SSH, and PXE boot). The OOB port is not a path for the BlueField-2 boot stream (i.e., any attempt to push a BFB to this port will not work). NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 120: Software Installation And Upgrade

    The NVIDIA® BlueField® DPU is shipped with the BlueField software based on Ubuntu 20.04 pre-installed. The DPU's Arm execution environment has the capability of being functionally isolated from the host server and uses a dedicated network management interface (separate from the host server's management interface).
  • Page 121: Deploying Bluefield Software Using Bfb From Host

    Upgrade the firmware on your DPU Firmware Upgrade Uninstall Previous Software from Host If an older DOCA software version is installed on your host, make sure to uninstall it before proceeding with the installation of the new version: NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 122 SNAP Controller DMA controller: Mellanox Technologies MT42822 BlueField- 00.3 SoC Management Interface // This is the RShim PF. RShim is compiled as part of the doca-tools package in the doca-host-repo-ubuntu<version>_amd64 file (.deb/.rpm) NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 123 To install Procedure 1. Download the DOCA Tools host package from the "Installation Files" section in the NVIDIA DOCA Installation Guide for Linux . 2. Unpack the deb repo. Run: host# sudo dpkg -i doca-host-repo- ubuntu<version>_amd64.deb 3. Perform apt update. Run:...
  • Page 124 Procedure 1. Download the DOCA Tools host package from the "Installation Files" section in the NVIDIA DOCA Installation Guide for Linux . 2. Unpack the RPM repo. Run: host# sudo rpm -Uvh doca-host-repo- rhel<version>.x86_64.rpm 3. Enable new dnf repos. Run: CentOS/RHEL 8.x or Rocky 8.6...
  • Page 125 # cat /dev/rshim<N>/misc | grep DEV_NAME DEV_NAME pcie-0000:04:00.2 This output indicates that the RShim service is ready to use. Installing Ubuntu on BlueField Note It is important to know your device name (e.g., mt41686_pciconf0) NVIDIA BlueField DPU BSP v4.5.3 LTS...
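In addition to the check above, the RShim back-end can be confirmed through its systemd service on the host; a minimal sketch (service and device names assume the default rshim packaging):
    # ensure the rshim service is enabled, then inspect the device state
    sudo systemctl enable --now rshim
    sudo systemctl status rshim
    cat /dev/rshim0/misc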
  • Page 126 Ubuntu users are prompted to change the default password (ubuntu) for the default user (ubuntu) upon rst login. Logging in will not be possible even if the login prompt appears DPU is ready /dev/rshim0/misc until all services are up (" " message appears in NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 127 2. Add the password hash in quotes to bf.cfg: # vim bf.cfg ubuntu_PASSWORD='$1$3B0RIrfX$TlHry93NFUJzg3Nya00rE1' The bf.cfg file is used with the bfb-install script in the steps that follow. Note Password policy: Minimum password length – 8 At least one upper-case letter NVIDIA BlueField DPU BSP v4.5.3 LTS...
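The ubuntu_PASSWORD value above is a standard crypt-style hash; a minimal sketch of generating one on the host (the use of openssl and the sample password are assumptions, any crypt-compatible tool works):
    # generate an MD5-crypt ($1$...) hash and paste it into bf.cfg as ubuntu_PASSWORD='<hash>'
    openssl passwd -1 'MyNewPassword1!'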
  • Page 128 # vim bf.cfg grub_admin_PASSWORD='grub.pbkdf2.sha512.10000.5EB1FF92FDD89BDAF3395174282C77430656A6DBEC To get a new encrypted password value use the grub-mkpasswd-pbkdf2 command. After the installation, the password can be updated by editing the file /etc/grub.d/40_custom and then running the update-grub command which NVIDIA BlueField DPU BSP v4.5.3 LTS...
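A minimal sketch of the GRUB password flow described above, assuming the GRUB superuser is named admin (the user name and 40_custom layout are illustrative):
    # 1. generate a PBKDF2 hash interactively
    grub-mkpasswd-pbkdf2
    # 2. after installation, reference the hash in /etc/grub.d/40_custom:
    #      set superusers="admin"
    #      password_pbkdf2 admin grub.pbkdf2.sha512.10000.<hash>
    # 3. regenerate the GRUB configuration
    sudo update-grub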
  • Page 129 2. Utilize the newly created BFB file, , while following the instructions below. A pre-built BFB of Ubuntu 20.04 with DOCA Runtime and DOCA packages installed is available on the NVIDIA DOCA SDK developer zone page. Note NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 130 All new BlueField-2 devices are secure boot enabled, hence all SW images must be signed by NVIDIA in order to boot. All formally published SW images are signed. To install Ubuntu BFB, run on the host side: # bfb-install -h syntax: bfb-install --bfb|-b <BFBFILE>...
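A minimal usage sketch based on the syntax above; the --config and --rshim options and the BFB file name are assumptions shown for illustration:
    # push the BFB (optionally with a bf.cfg) to the first RShim device on the host
    sudo bfb-install --bfb DOCA_2.5.2_BSP_4.5.3_Ubuntu_22.04-<version>.bfb --config bf.cfg --rshim rshim0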
  • Page 131 INFO[MISC]: Ubuntu installation started INFO[MISC]: Installing OS image INFO[MISC]: Changing the default password for user ubuntu INFO[MISC]: Running bfb_modify_os from bf.cfg INFO[MISC]: ===================== bfb_modify_os ===================== INFO[MISC]: Installation finished INFO[MISC]: Rebooting... Verify BFB is Installed NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 132: Firmware Upgrade

    Make sure all the services (including cloud-init) are started on BlueField and to perform a graceful shutdown before power cycling the host server. BlueField OS image version is stored under /etc/mlnx-release in the DPU. # cat /etc/mlnx-release DOCA_2.6.0_BSP_4.6.0_Ubuntu_22.04-<version> Firmware Upgrade Note NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 133 2. SSH to your DPU via 192.168.100.2 (preconfigured). The default credentials for Ubuntu are as follows. Username Password ubuntu Set during installation For example: ssh ubuntu@192.168.100.2 Password: <unique-password> 3. Upgrade the firmware on the DPU. Run: sudo /opt/mellanox/mlnx-fw-updater/mlnx_fw_updater.pl --force-fw-update Example output: Device #1: NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 134 # sudo mlxconfig -d /dev/mst/<device-id> -y reset   Reset configuration for device /dev/mst/<device-name>? (y/n) [n] : y Applying... Done! -I- Please reboot machine to load new configurations. Note To learn the device ID of the DPUs on your setup, run: NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 135   BlueField2(rev:1) /dev/mst/mt41686_pciconf0 3b:00.0 mlx5_0 net-ens1f0   BlueField3(rev:1)       /dev/mst/mt41692_pciconf0.1   e2:00.1   mlx5_1          net-ens7f1np1             4   BlueField3(rev:1)       /dev/mst/mt41692_pciconf0     e2:00.0   mlx5_0          net-ens7f0np0             4 The device IDs for the BlueField-2 and BlueField-3 DPUs in this /dev/mst/mt41686_pciconf0 example are NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 136 – called after the file system is extracted on the target partitions. It can be used to modify files or create new files on the target file system mounted under /mnt. So the file path should look as follows: NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 137 # cat /root/bf.cfg   bfb_modify_os() log ===================== bfb_modify_os ===================== log "Disable OVS bridges creation upon boot" sed -i -r -e 's/(CREATE_OVS_BRIDGES=).*/\1"no"/' /mnt/etc/mellanox/mlnx-ovs.conf   bfb_pre_install() log ===================== bfb_pre_install =====================   bfb_post_install() log ===================== bfb_post_install ===================== NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 138 SF Index: pci/0000:03:00.0/229408 Parent PCI dev: 0000:03:00.0 Representor netdev: en3f0pf0sf0 Function HWADDR: 02:61:f6:21:32:8c Auxiliary device: mlx5_core.sf.2 netdev: enp3s0f0s0 RDMA dev: mlx5_2   SF Index: pci/0000:03:00.1/294944 Parent PCI dev: 0000:03:00.1 Representor netdev: en3f1pf1sf0 Function HWADDR: 02:30:13:6a:2d:2c NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 139 MAC address that better suit your network needs. 4. Two OVS bridges are created: # ovs-vsctl show f08652a8-92bf-4000-ba0b-7996c772aff6 Bridge ovsbr2 Port ovsbr2 Interface ovsbr2 type: internal Port p1 Interface p1 Port en3f1pf1sf0 Interface en3f1pf1sf0 NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 140 OVS_BRIDGE1="ovsbr1" OVS_BRIDGE1_PORTS="p0 pf0hpf en3f0pf0sf0" OVS_BRIDGE2="ovsbr2" OVS_BRIDGE2_PORTS="p1 pf1hpf en3f1pf1sf0" OVS_HW_OFFLOAD="yes" OVS_START_TIMEOUT=30 Note If failures occur in /sbin/mlnx_bf_configure when configuration changes happen (e.g., switching to separated host mode), OVS bridges are not created even if CREATE_OVS_BRIDGES="yes" NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 141 # network: {config: disabled} network: ethernets: tmfifo_net0: addresses: - 192.168.100.2/30 dhcp4: false nameservers: addresses: - 192.168.100.1 routes: metric: 1025 to: 0.0.0.0/0 via: 192.168.100.1 oob_net0: dhcp4: true renderer: NetworkManager version: 2   # cat /etc/netplan/60-mlnx.yaml NVIDIA BlueField DPU BSP v4.5.3 LTS...
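After editing the netplan files listed above, the configuration is applied on the Arm side; a minimal sketch (netplan try is optional and simply adds an automatic rollback):
    # validate and apply the updated netplan configuration
    sudo netplan try
    sudo netplan apply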
  • Page 142 LLv6 address, restart the RShim driver: systemctl restart rshim Ubuntu Boot Time Optimizations To improve the boot time, the following optimizations were made to Ubuntu OS image: NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 143 Grub menu (i.e., SHIFT or Esc) does not work. Function key F4 can be used to enter the Grub menu. System Services: docker.service is disabled in the default Ubuntu OS image as it dramatically affects boot time. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 144 2048 104447 102400 50M EFI System /dev/nvme0n1p2 104448 114550086 114445639 54.6G Linux filesystem /dev/nvme0n1p3 114550087 114652486 102400 50M EFI System /dev/nvme0n1p4 114652487 229098125 114445639 54.6G Linux filesystem /dev/nvme0n1p5 229098126 250069645 20971520 10G Linux filesystem NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 145 – root FS partition for the first OS image /dev/mmcblk0p3 – boot EFI partition for the second OS image /dev/mmcblk0p4 – root FS partition for the second OS image /dev/mmcblk0p5 – common partition for both OS images NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 146 16GB, dual boot support is disabled by default, but it can be forced by setting the following parameter in bf.cfg: FORCE_DUAL_BOOT=yes To modify the default size of the /common partition, add the following parameter: COMMON_SIZE_SECTORS=<number-of-sectors> NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 147 1. Download the new BFB to the BlueField DPU into the partition. Use bfb_tool.py script to install the new BFB on the inactive BlueField DPU partition: /opt/mellanox/mlnx_snap/exec_files/bfb_tool.py --op fw_activate_bfb --bfb <BFB> 2. Reset BlueField DPU to load the new OS image: /sbin/shutdown -r 0 NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 148 Boot0040* focal0 ..2 Note efibootmgr -o Modifying the boot order with does not remove unused boot options. For example, changing a boot order from 0001,0002, 0003 to just 0001 does not actually remove 0002 and NVIDIA BlueField DPU BSP v4.5.3 LTS...
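A minimal sketch of reordering and then explicitly removing a stale entry, since reordering alone does not delete it (the boot numbers are illustrative):
    # set the boot order, then delete an unused boot option
    sudo efibootmgr -o 0040,0001
    sudo efibootmgr -b 0002 -B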
  • Page 149: Deploying Bluefield Software Using Bfb From Bmc

    Verify that RShim is already running on BMC Ensure RShim is Running on BMC Change the default credentials using bf.cfg file (optional) Changing Default Credentials Using bf.cfg Install the Ubuntu BFB image BFB Installation NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 150 -v Example output: MST modules: ------------ MST PCI module is not loaded MST PCI configuration module loaded PCI devices: ------------ DEVICE_TYPE RDMA NUMA BlueField2(rev:1) /dev/mst/mt41686_pciconf0.1 3b:00.1 mlx5_1 net-ens1f1   BlueField2(rev:1) /dev/mst/mt41686_pciconf0 3b:00.0 mlx5_0 net-ens1f0 NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 151 Ubuntu users are prompted to change the default password (ubuntu) for the default user (ubuntu) upon rst login. Logging in will not be possible even if the login prompt appears DPU is ready /dev/rshim0/misc until all services are up (" " message appears in NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 152 2. Add the password hash in quotes to bf.cfg: # vim bf.cfg ubuntu_PASSWORD='$1$3B0RIrfX$TlHry93NFUJzg3Nya00rE1' The bf.cfg file is used with the bfb-install script in the steps that follow. Note Password policy: Minimum password length – 8 At least one upper-case letter NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 153 BFB installation, refer to section "bf.cfg Parameters". BFB Installation To update the software on the NVIDIA® BlueField® device, the BlueField must be booted up without mounting the eMMC flash device. This requires an external boot flow where a BFB (which includes ATF, UEFI, Arm OS, NIC firmware, and initramfs) is pushed from an external host via USB or PCIe.
  • Page 154 RShim device. This can be done by either running SCP directly or using the Redfish interface. Redfish Interface The following is a simple sequence diagram illustrating the flow of the BFB installation process. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 155 The following are detailed instructions outlining each step in the diagram: 1. Confirm the identity of the remote server (i.e., host holding the BFB image) and BMC. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 156 – the IP address of the server hosting the BFB file remote_server_public_key – remote server's public key from the ssh-keyscan response, which contains both the type and the public <type> <public_key> key with a space between the two fields (i.e., " "). NVIDIA BlueField DPU BSP v4.5.3 LTS...
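A minimal sketch of obtaining remote_server_public_key as described above (the key type is illustrative; any type returned by ssh-keyscan may be used):
    # query the BFB-hosting server for its host key; output format is "<ip> <type> <public_key>"
    ssh-keyscan -t rsa <remote_server_ip>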
  • Page 157 "@odata.type": "#Message.v1_1_1.Message", "Message": "The request completed successfully.", "MessageArgs": [], "MessageId": "Base.1.15.0.Success", "MessageSeverity": "OK", "Resolution": "None" 4. If the remote server public key must be revoked, use the following command before repeating the previous step: NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 158 "ImageURI":"<image_uri>","Targets": ["redfish/v1/UpdateService/FirmwareInventory/DPU_OS"], "Username":"<username>"}' https://<bmc_ip>/redfish/v1/UpdateService/Actions/UpdateService Info After the BMC boots, it may take a few seconds (6-8 in NVIDIA® BlueField®-2, and 2 in BlueField-3) until the DPU BSP (DPU_OS) is up. Note This command uses SCP for the image transfer, initiates a soft reset on the BlueField and then pushes the boot stream.
  • Page 159  "error": {    "@Message.ExtendedInfo": [        "@odata.type": "#Message.v1_1_1.Message",        "Message": "The requested resource of type Target named '/dev/rshim0/boot' was not found.",        "MessageArgs": [          "Target", NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 160 Username was missing from the request.", "MessageArgs": [ "Username" "MessageId": "Base.1.15.0.CreateFailedMissingReqProperties", "MessageSeverity": "Critical", "Resolution": "Correct the body to include the required property with a valid value and resubmit the request if the operation failed." NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 161 , and a keep-alive message is generated every 5 minutes with the content "Transfer is still in progress (X minutes elapsed). Please wait". Once the transfer is completed, the PercentComplete is set to 100, and the TaskState is updated to Completed NVIDIA BlueField DPU BSP v4.5.3 LTS...
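A minimal sketch of polling for that completion state, reusing the task URI pattern shown elsewhere in this document (task_id and credentials are illustrative):
    # poll the Redfish task until PercentComplete reaches 100 and TaskState becomes Completed
    curl -k -H "X-Auth-Token: <token>" -X GET https://<bmc_ip>/redfish/v1/TaskService/Tasks/<task_id> | jq -r '.PercentComplete, .TaskState'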
  • Page 162 "Message": "Transfer of image '<file_name>' to '/dev/rshim0/boot' failed.", "MessageArgs": [ "<file_name>, "/dev/rshim0/boot" "MessageId": "Update.1.0.TransferFailed", "Resolution": " Unknown Host: Please provide server's public key using PublicKeyExchange ", "Severity": "Critical" … "PercentComplete": 0, "StartTime": "<start_time>", "TaskMonitor": "/redfish/v1/TaskService/Tasks/<task_id>/Monitor", "TaskState": "Exception", "TaskStatus": "Critical" NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 163 "Resolution": "Unauthorized Client: Please use the PublicKeyExchange action to receive the system's public key and add it as an authorized key on the remote server", "Severity": "Critical" … "PercentComplete": 0, "StartTime": "<start_time>", "TaskMonitor": "/redfish/v1/TaskService/Tasks/<task_id>/Monitor", "TaskState": "Exception", "TaskStatus": "Critical" NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 164 "Message": "Transfer of image '<file_name>' to '/dev/rshim0/boot' failed.", "MessageArgs": [ "<file_name>", "/dev/rshim0/boot" "MessageId": "Update.1.0.TransferFailed", "Resolution": "Failed to launch SCP", "Severity": "Critical" … "PercentComplete": 0, "StartTime": "<start_time>", "TaskMonitor": "/redfish/v1/TaskService/Tasks/<task_id>/Monitor", "TaskState": "Exception", "TaskStatus": "Critical" The keep-alive message: NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 165 "/redfish/v1/TaskService/Tasks/<task_id>/Monitor", "TaskState": "Running", "TaskStatus": "OK" Upon completion of transfer of the BFB image to the DPU, the following is received: "@odata.type": "#MessageRegistry.v1_4_1.MessageRegistry", "Message": "Device 'DPU' successfully updated with image '<file_name>'.", "MessageArgs": [ "DPU", "<file_name>" NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 166 BlueField Logs" for information on dumping the which contains the current RShim miscellaneous messages. 5. Verify that the new BFB is running by checking its version: curl -k -u root:'<password>' -H "Content-Type: application/json" -X GET https://<bmc_ip>/redfish/v1/UpdateService/FirmwareInventory/DPU Direct SCP NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 167 "DPU is ready" indicates that all the relevant services are up and users can log into the system. After the installation of the Ubuntu 20.04 BFB, the configuration detailed in the following sections is generated. Note NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 168 1. Set a temporary static IP on the host. Run: sudo ip addr add 192.168.100.1/24 dev tmfifo_net0 2. SSH to your DPU via 192.168.100.2 (preconfigured). The default credentials for Ubuntu are as follows. Username Password ubuntu Set during installation NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 169 Important! To apply NVConfig changes, stop here and follow the steps in section "Updating NVConfig Params". 4. Perform a graceful shutdown and power cycle the host for the changes to take effect. Updating NVConfig Params NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 170 To learn the device ID of the DPUs on your setup, run: mst start mst status -v Example output: MST modules: ------------ MST PCI module is not loaded MST PCI configuration module loaded PCI devices: ------------ DEVICE_TYPE RDMA NUMA BlueField2(rev:1) /dev/mst/mt41686_pciconf0.1 3b:00.1 mlx5_1 net-ens1f1   NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 171 Ethernet, please run the following configuration: sudo mlxconfig -d <device-id> s LINK_TYPE_P1=2 LINK_TYPE_P2=2 4. Perform a graceful shutdown and power cycle the host for the mlxconfig settings to take effect. NVIDIA BlueField DPU BSP v4.5.3 LTS...
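A minimal sketch of verifying the new link type before the power cycle; the query form mirrors the set command above and the device path is illustrative:
    # confirm both ports are now set to Ethernet (2)
    sudo mlxconfig -d /dev/mst/mt41686_pciconf0 q LINK_TYPE_P1 LINK_TYPE_P2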
  • Page 172: Deploying Nvidia Converged Accelerator

    Deploying BlueField Software Using BFB from BMC Deploying BlueField Software Using BFB with PXE NVIDIA® CUDA® (GPU driver) must be installed in order to use the GPU. For information on how to install CUDA on your Converged Accelerator, refer to NVIDIA CUDA Installation Guide for Linux.
  • Page 173 MST PCI module is not loaded MST PCI configuration module loaded PCI devices: ------------ DEVICE_TYPE RDMA NUMA BlueField2(rev:1) /dev/mst/mt41686_pciconf0.1 3b:00.1 mlx5_1 net-ens1f1   BlueField2(rev:1) /dev/mst/mt41686_pciconf0 3b:00.0 mlx5_0 net-ens1f0 BlueField-X Mode 1. Run the following command from the host: NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 174: Standard Mode

    2. Perform a graceful shutdown and power cycle the host for the configuration to take effect. Verifying Configured Operational Mode Use the following command from host or from DPU: $ sudo mlxconfig -d /dev/mst/<device-name> q PCI_DOWNSTREAM_PORT_OWNER[4] Example of Standard mode output: Device #1: NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 175 The GPU is no longer visible from the host: root@host:~# lspci | grep -i nv None The GPU is now visible from the DPU: ubuntu@dpu:~$ lspci | grep -i nv 06:00.0 3D controller: NVIDIA Corporation GA20B8 (rev a1) NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 176 Starts secure firmware update and starts tracking the update progress: curl -k -H "X-Auth-Token: <token>" -H "Content-Type: application/octet-stream" -X POST -T <package_path> https://<bmc_ip>/redfish/v1/UpdateService/update Where: bmc_ip – BMC IP address token – session token received when establishing connection NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 177 – session token received when establishing connection task_id – Task ID 4. Reset/reboot – Resets/reboots the BMC: curl -k -H "X-Auth-Token: <token>" -H "Content-Type: application/json" -X POST -d '{"ResetType": "GracefulRestart"}' https://<bmc_ip>/redfish/v1/Manage NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 178 | jq -r ' .Version' Where: bmc_ip – BMC IP address token – session token received when establishing connection For BlueField-2: curl -k -H "X-Auth-Token: <token>" -X GET https://<bmc_ip>/redfish/v1/UpdateService/FirmwareInventory Fetch the current firmware ID and then perform: NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 179 Fetch running firmware version – Fetches the running firmware version from Service/FirmwareInventory/Bluefield_FW_ERoT | jq -r ' .Version' Where: bmc_ip – BMC IP address token – session token received when establishing connection BMC Update Info NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 180 Command #3 is used to verify the task has completed because during the update procedure the reboot option is disabled. When "PercentComplete" reaches 100, command #4 is used to reboot the BMC. For example: NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 181 "MessageId": "Base.1.13.0.Success", "MessageSeverity": "OK", "Resolution": "None" Command #5 can be used to verify the current BMC firmware version after reboot: For BlueField-3: curl -k -H "X-Auth-Token: <token>" -X GET https://<bmc_ip>/redfish/v1/UpdateService/FirmwareInventory/BMC | jq -r ' .Version' NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 182 2. Use command #5 with the fetched firmware ID in the previous step: curl -k -H "X-Auth-Token: <token>" -X GET https://<bmc_ip>/redfish/v1/UpdateService/FirmwareInventory | jq -r ' .Version'   % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 183 "@odata.type": "#Task.v1_4_3.Task", "Id": "0", "TaskState": "Running" Command #3 can be used to track the progress of the CEC rmware update. For example: curl -k -H "X-Auth-Token: <token>" -X GET https://<bmc_ip>/redfish/v1/TaskService/Tasks/0 | jq -r ' .PercentComplete' NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 184 % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 1172 0 --:--:-- --:--:-- - -:--:-- 1172 19-4 CEC Background Update Status Info This section is relevant only for BlueField-3. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 185 "Oem": { "Nvidia": { "@odata.type": "#NvidiaChassis.v1_0_0.NvidiaChassis", "AutomaticBackgroundCopyEnabled": true, "BackgroundCopyStatus": "Completed", "InbandUpdatePolicyEnabled": true … Info InProgress The background update initially indicates while the inactive copy of the image is being updated with the copy. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 186 Check the status of the ongoing firmware update by looking at the TaskCollection resource. Redfish task hangs – the Redfish task URI previously returned by the Redfish server is no longer accessible NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 187 The Redfish client may retry the firmware update. Firmware image validation failure – the Redfish task monitoring the firmware update indicates a failure: TaskState is set to Exception; TaskStatus is set to Warning NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 188   root@dpu:~# i2cset -y 3 0x4f 0x5c 0x05 0x08 0x00 0x80 s root@dpu:~# i2cget -y 3 0x4f 0x5c ip 5 5: 0x04 0x05 0x08 0x00 0x5f root@dpu:~# i2cget -y 3 0x4f 0x5d ip 5 NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 189 39 32 2e 30 30 2e 36 42 2e 30 30 2e 30 31 00 00 92.00.6B.00.01 Updating GPU Firmware root@dpu:~# scp root@10.23.201.227:/<path-to-fw- bin>/1004_0230_891__92006B0001-dbg-ota.bin /tmp/gpu_images/ root@10.23.201.227's password: 1004_0230_891__92006B0001-dbg-ota.bin 100% 384KB 384.4KB/s 00:01   root@dpu:~# cat /tmp/gpu_images/progress.txt TaskState="Running" TaskStatus="OK" NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 190: Installing Repo Package On Host Side

    Ubuntu host# for f in $( dpkg --list | grep doca | awk '{print $2}' ); do echo $f ; apt remove --purge $f -y ; done NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 191 0.0.6.2.5.2003.1.al8.23.10.3.2.2.0.x86_64.rpm aarch doca-host-repo-bclinux2110sp2-2.5.2- 0.0.6.23.10.3.2.2.0.oe1.bclinux.aarch64.rpm BCLinux 21.10 doca-host-repo-bclinux2110sp2-2.5.2- 0.0.6.23.10.3.2.2.0.oe1.bclinux.x86_64.rpm aarch doca-host-repo-ctyunos20-2.5.2- 0.0.6.23.10.3.2.2.0.ctl2.aarch64.rpm CTyunOS 2.0 doca-host-repo-ctyunos20-2.5.2- 0.0.6.23.10.3.2.2.0.ctl2.x86_64.rpm aarch doca-host-repo-ctyunos2301-2.5.2- 0.0.6.23.10.3.2.2.0.ctl3.aarch64.rpm CTyunOS 23.01 doca-host-repo-ctyunos2301-2.5.2- 0.0.6.2.5.2003.1.ctl3.23.10.3.2.2.0.x86_64.rpm doca-host-repo-debian1013_2.5.2- Debian 10.13 0.0.6.2.5.2003.1.23.10.3.2.2.0_amd64.deb doca-host-repo-debian108_2.5.2- Debian 10.8 0.0.6.2.5.2003.1.23.10.3.2.2.0_amd64.deb doca-host-repo-debian109_2.5.2- Debian 10.9 0.0.6.23.10.3.2.2.0_amd64.deb NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 192 Oracle Linux 8.7 x86 0.0.6.2.5.2003.1.el8.23.10.3.2.2.0.x86_64.rpm Oracle Linux 9.0 x86 doca-host-repo-ol90-2.5.2-0.0.6.23.10.3.2.2.0.x86_64.rpm aarch doca-host-repo-openeuler2003sp3-2.5.2- 0.0.6.23.10.3.2.2.0.aarch64.rpm openEuler 20.03 doca-host-repo-openeuler2003sp3-2.5.2- 0.0.6.23.10.3.2.2.0.x86_64.rpm aarch doca-host-repo-openeuler2203-2.5.2- 0.0.6.23.10.3.2.2.0.aarch64.rpm openEuler 22.03 doca-host-repo-openeuler2203-2.5.2- 0.0.6.23.10.3.2.2.0.x86_64.rpm RHEL/CentOS doca-host-repo-rhel72-2.5.2-0.0.6.23.10.3.2.2.0.x86_64.rpm RHEL/CentOS doca-host-repo-rhel74-2.5.2-0.0.6.23.10.3.2.2.0.x86_64.rpm aarch doca-host-repo-rhel76-2.5.2- 0.0.6.2.5.2003.1.el7a.23.10.3.2.2.0.aarch64.rpm RHEL/CentOS doca-host-repo-rhel76-2.5.2- 0.0.6.2.5.2003.1.el7.23.10.3.2.2.0.x86_64.rpm RHEL/CentOS doca-host-repo-rhel77-2.5.2-0.0.6.23.10.3.2.2.0.x86_64.rpm NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 193 RHEL/CentOS doca-host-repo-rhel83-2.5.2-0.0.6.23.10.3.2.2.0.x86_64.rpm aarch doca-host-repo-rhel84-2.5.2-0.0.6.23.10.3.2.2.0.aarch64.rpm RHEL/CentOS doca-host-repo-rhel84-2.5.2-0.0.6.23.10.3.2.2.0.x86_64.rpm aarch doca-host-repo-rhel85-2.5.2-0.0.6.23.10.3.2.2.0.aarch64.rpm RHEL/CentOS doca-host-repo-rhel85-2.5.2-0.0.6.23.10.3.2.2.0.x86_64.rpm aarch doca-host-repo-rhel86-2.5.2- 0.0.6.2.5.2003.1.el8.23.10.3.2.2.0.aarch64.rpm RHEL/Rocky 8.6 doca-host-repo-rhel86-2.5.2- 0.0.6.2.5.2003.1.el8.23.10.3.2.2.0.x86_64.rpm aarch doca-host-repo-rhel88-2.5.2- 0.0.6.2.5.2003.1.el8.23.10.3.2.2.0.aarch64.rpm RHEL/Rocky 8.8 doca-host-repo-rhel88-2.5.2- 0.0.6.2.5.2003.1.el8.23.10.3.2.2.0.x86_64.rpm aarch doca-host-repo-rhel89-2.5.2-0.0.6.23.10.3.2.2.0.aarch64.rpm RHEL/Rocky 8.9 doca-host-repo-rhel89-2.5.2-0.0.6.23.10.3.2.2.0.x86_64.rpm RHEL/Rocky aarch doca-host-repo-rhel810-2.5.2-0.0.6.23.10.3.2.2.0.aarch64.rpm 8.10 NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 194 RHEL/Rocky 9.4 doca-host-repo-rhel94-2.5.2-0.0.6.23.10.3.2.2.0.x86_64.rpm aarch doca-host-repo-sles12sp4-2.5.2- 0.0.6.23.10.3.2.2.0.aarch64.rpm SLES 12 SP4 doca-host-repo-sles12sp4-2.5.2- 0.0.6.23.10.3.2.2.0.x86_64.rpm aarch doca-host-repo-sles12sp5-2.5.2- 0.0.6.23.10.3.2.2.0.aarch64.rpm SLES 12 SP5 doca-host-repo-sles12sp5-2.5.2- 0.0.6.23.10.3.2.2.0.x86_64.rpm aarch doca-host-repo-sles15sp2-2.5.2- 0.0.6.23.10.3.2.2.0.aarch64.rpm SLES 15 SP2 doca-host-repo-sles15sp2-2.5.2- 0.0.6.23.10.3.2.2.0.x86_64.rpm aarch doca-host-repo-sles15sp3-2.5.2- 0.0.6.23.10.3.2.2.0.aarch64.rpm SLES 15 SP3 doca-host-repo-sles15sp3-2.5.2- 0.0.6.23.10.3.2.2.0.x86_64.rpm NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 195 1. Install DOCA local repo package for host: Procedure 1. Download the DOCA SDK and DOCA Runtime packages from Downloading DOCA Runtime Packages section for the host. 2. Unpack the deb repo. Run: host# sudo dpkg -i doca-host-repo- ubuntu<version>_amd64.deb NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 196 DOCA runtime, tools, and SDK. host# sudo yum install -y doca-runtime doca-sdk 1. Open a RedHat account. 1. Log into RedHat website via the developers tab. 2. Create a developer user. 2. Run: host# subscription-manager register --username= NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 197 4. Install the DOCA local repo package for host. Run: host# rpm -Uvh doca-host-repo- rhel<version>.x86_64.rpm host# sudo yum install -y doca-runtime doca-sdk 5. Sign out from your RHEL account. Run: host# subscription-manager remove --all NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 198: Installing Popular Linux Distributions On Bluefield

    Installing Popular Linux Distributions on BlueField Building Your Own BFB Installation Image Users wishing to build their own customized NVIDIA® BlueField® OS image can use the BFB build environment. See this GitHub webpage for more information. Note NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 199 Boot mode, you must provision the SPI ROM by booting a dedicated bootstream that allows the SPI ROM to be configured by the MFT running on the BlueField Arm cores. There are multiple ways to access the RedHat installation media from a BlueField device for installation. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 200 Managing Driver Disk NVIDIA provides a number of pre-built driver disks, as well as a documented flow for building one for any particular RedHat version. Normally a driver disk can be placed on removable media (like a CDROM or USB stick) and is auto-detected by the RedHat installer.
  • Page 201 DesignWare eMMC ( Installing O cial CentOS Distributions Contact NVIDIA Enterprise Support for information on the installation of CentOS distributions. BlueField Linux Drivers The following table lists the BlueField drivers which are part of the Linux distribution.
  • Page 202 HCA in the BlueField SoC. fish mlxb BlueField PKA kernel module Performance monitoring counters. The driver provides access to mlxb sysfs available performance modules through the interface. The performance modules in BlueField are present in several NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 203: Updating Dpu Software Packages Using Standard Linux Tools

    This dpu-upgrade procedure enables upgrading DOCA components using standard Linux tools (e.g., apt update and yum update). This process utilizes native package manager repositories to upgrade DPUs without the need for a full installation, and has the following benefits: NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 204 Action / Instructions (Ubuntu/Debian): Remove mlxbf-bootimages package – <dpu> $ apt remove --purge mlxbf-bootimages* -y ; <dpu> $ apt update. Install the GPG key – <dpu> $ apt install gnupg2 NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 205 Add GPG key to APT trusted keyring /etc/apt/trusted.gpg.d/GPG-KEY- Mellanox.pub <dpu> $ echo "deb [signed- by=/etc/apt/trusted.gpg.d/GPG-KEY- Add DOCA online repository Mellanox.pub] $DOCA_REPO ./" > /etc/apt/sources.list.d/doca.list <dpu> $ apt update Update index Upgrade UEFI/ATF Run: rmware NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 206 To prevent automatic upgrade, run: <dpu> $ export RUN_FW_UPDATER=no. Upgrade system – <dpu> $ apt upgrade. Apply the new changes, NIC firmware, and UEFI/ATF – <dpu> $ mlxfwreset -d /dev/mst/mt*_pciconf0 -y -l 3 --sync 1 r Note NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 207 NIC firmware upgrade to take effect. (CentOS/Anolis/Rocky) Remove mlxbf-bootimages, libreswan, and openvswitch-ipsec packages – <dpu> $ yum -y remove mlxbf-bootimages* ; <dpu> $ yum remove libreswan openvswitch-ipsec ; <dpu> $ yum makecache NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 208 CentOS 7.6 with 5.4 kernel – https://linux.mellanox.com/public/repo/doca/2.5.2/rhel7 .6/dpu-arm64/ Rocky Linux 8.6 – https://linux.mellanox.com/public/repo/doca/2.5.2/rhel8 .6/dpu-arm64/ echo "[doca] name=DOCA Online Repo baseurl=$DOCA_REPO enabled=1 Add DOCA online repository gpgcheck=0 priority=10 cost=10" > /etc/yum.repos.d/doca.repo /etc/yum.repos.d/doca.repo A le is created under NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 209 Upgrade BlueField DPU NIC firmware – <dpu> $ yum install mlnx-fw-updater-signed.aarch64 Note To prevent automatic flashing of the firmware to the NIC, run the following first: <dpu> $ export RUN_FW_UPDATER=no 00000194-1a08-d1b6-afb7-bb4ffed30003 00000194-1a08-d1b6-afb7-bb4ffed30007 NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 210 /dev/mst/mt*_pciconf0 -y -l 3 --sync 1 r – apply the new changes, NIC firmware, and UEFI/ATF. Note If mlxfwreset is not supported, a graceful shutdown and host power cycle are required for the NIC firmware upgrade to take effect. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 211: Initial Con Guration

    – network interface configuration /var/lib/cloud/seed/nocloud-net/user-data – default users and commands to run on the first boot RDMA and ConnectX Driver Initialization RDMA and NVIDIA® ConnectX® drivers are loaded upon boot by the openibd.service Note NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 212 :KUBE-KUBELET-CANARY - [0:0] COMMIT *filter :INPUT ACCEPT [41:3374] :FORWARD ACCEPT [0:0] :OUTPUT ACCEPT [32:3672] :DOCKER-USER - [0:0] :KUBE-FIREWALL - [0:0] :KUBE-KUBELET-CANARY - [0:0] :LOGGING - [0:0] :POSTROUTING - [0:0] :PREROUTING - [0:0] -A INPUT -j KUBE-FIREWALL NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 213 --hashlimit-above 60/sec --hashlimit-burst 20 -- hashlimit-mode srcip --hashlimit-name hashlimit_2 --hashlimit- htable-expire 30000 -m comment --comment MD_IPTABLES -j DROP -A INPUT -m mark --mark 0xb -m recent --rcheck --seconds 86400 -- name portscan --mask 255.255.255.255 --rsource -m comment -- NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 214 -A INPUT -p tcp -m tcp --dport 179 -m mark --mark 0xb -m conntrack --ctstate NEW,ESTABLISHED -m comment --comment MD_IPTABLES -j ACCEPT -A INPUT -p udp -m udp --dport 68 -m mark --mark 0xb -m conntrack --ctstate NEW,ESTABLISHED -m comment --comment MD_IPTABLES -j NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 215 -A INPUT -p tcp -m tcp --sport 53 -m mark --mark 0xb -m conntrack --ctstate NEW,ESTABLISHED -m comment --comment MD_IPTABLES -j ACCEPT -A INPUT -p udp -m udp --dport 500 -m mark --mark 0xb -m conntrack --ctstate NEW,ESTABLISHED -m comment --comment NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 216 -A INPUT -p udp -m udp --dport 514 -m mark --mark 0xb -m conntrack --ctstate NEW,ESTABLISHED -m comment --comment MD_IPTABLES -j ACCEPT -A INPUT -p udp -m udp --dport 67 -m mark --mark 0xb -m conntrack --ctstate NEW,ESTABLISHED -m comment --comment MD_IPTABLES -j NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 217 -A POSTROUTING -m comment --comment "kubernetes postrouting rules" -j KUBE-POSTROUTING -A KUBE-MARK-DROP -j MARK --set-xmark 0x8000/0x8000 -A KUBE-MARK-MASQ -j MARK --set-xmark 0x4000/0x4000 -A KUBE-POSTROUTING -m mark ! --mark 0x4000/0x4000 -j RETURN -A KUBE-POSTROUTING -j MARK --set-xmark 0x4000/0x0 NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 218: Host-Side Interface Con Guration

    Host-side Interface Configuration The NVIDIA® BlueField® DPU registers on the host OS a "DMA controller" for DPU management over PCIe. This can be verified by running the following: #  lspci -d 15b3: | grep 'SoC Management Interface'
  • Page 219 … systemd[1]: Started rshim driver for BlueField SoC. May 31 14:57:07 … rshim[90323]: Probing pcie-0000:a3:00.2(vfio) May 31 14:57:07 … rshim[90323]: Create rshim pcie-0000:a3:00.2 May 31 14:57:07 … rshim[90323]: rshim pcie-0000:a3:00.2 enable May 31 14:57:08 … rshim[90323]: rshim0 attached NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 220 Multiple DPUs may connect to the same host machine. When the RShim driver is loaded and operating correctly, each board is expected to have its own device directory on sysfs, /dev/rshim<N>, and a virtual Ethernet device, tmfifo_net<N>. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 221 This example deals with two BlueField DPUs installed on the same server (the process is similar for more DPUs). This example assumes that the RShim package has been installed on the host server. Configuring Management Interface on Host Note NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 222 BOOTPROTO="static" IPADDR="192.168.100.1" NETMASK="255.255.255.0" ONBOOT="yes" TYPE="Bridge" 3. Create a configuration file for the first BlueField DPU, tmfifo_net0. Run: vim /etc/sysconfig/network-scripts/ifcfg-tmfifo_net0 4. Inside ifcfg-tmfifo_net0, insert the following content: DEVICE=tmfifo_net0 BOOTPROTO=none ONBOOT=yes NM_CONTROLLED=no BRIDGE=br_tmfifo NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 223 BlueField DPUs arrive with the following factory default configuration for tmfifo_net0: MAC address 00:1a:ca:ff:ff:01, IP address 192.168.100.2. Therefore, if you are working with more than one DPU, you must change the default MAC and IP addresses. Updating RShim Network MAC Address NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 224 3. Inside , insert the new MAC: NET_RSHIM_MAC=00:1a:ca:ff:ff:03 4. Apply the new MAC address. Run: sudo bfcfg 5. Repeat this procedure for the second BlueField DPU (using a different MAC address). Info NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 225 IP address: sudo vim /etc/netplan/50-cloud-init.yaml   tmfifo_net0: addresses: - 192.168.100.2/30 ===>>> 192.168.100.3/30 2. Reboot the Arm. Run: sudo reboot 3. Repeat this procedure for the second BlueField DPU (using a di erent IP address). NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 226 # vim /etc/sysconfig/network-scripts/ifcfg-tmfifo_net0 2. Modify the value for IPADDR: IPADDR=192.168.100.3 3. Reboot the Arm. Run: reboot Or perform netplan apply 4. Repeat this procedure for the second BlueField DPU (using a different IP address). Info NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 227 1. Log into Linux from the Arm console. 2. Run: $ "ls /sys/firmware/efi/efivars". 3. If not mounted, run: $ mount -t efivarfs none /sys/firmware/efi/efivars $ chattr -i /sys/firmware/efi/efivars/RshimMacAddr-8be4df61- 93ca-11d2-aa0d-00e098032b8c $ printf "\x07\x00\x00\x00\x00\x1a\xca\xff\xff\x03" > \ NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 228 0 (0:normal, 1:drop) SW_RESET 0 (1: reset) DEV_NAME pcie-0000:04:00.2 DEV_INFO BlueField-2(Rev 1) PEER_MAC 00:1a:ca:ff:ff:01 (rw) PXE_ID 0x00000000 (rw) VLAN_ID 0 0 (rw) 3. Modify the MAC address. Run: $ echo "PEER_MAC xx:xx:xx:xx:xx:xx" > /dev/rshim0/misc NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 229 It is not recommended to reconfigure the MAC address from the MAC configured during manufacturing. If there is a need to re-configure this MAC for any reason, follow these steps to configure a UEFI variable to hold the new value for the OOB MAC: NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 230 To revert this change and go back to using the MAC as programmed during manufacturing, follow these steps: 1. Log into UEFI from the Arm console, go to "Boot Manager" then "EFI Internal Shell". NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 231 7. Recon gure the original MAC address burned by the manufacturer in the format aa\bb\cc\dd\ee\ff . Run: printf "\x07\x00\x00\x00\x00\<original-MAC-address>" > /sys/firmware/efi/efivars/OobMacAddr-8be4df61-93ca-11d2-aa0d- 00e098032b8c 8. Reboot the device for the change to take e ect. Supported ethtool Options for OOB Interface NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 232 – restart auto-negotiation on the specified Ethernet device if auto-negotiation is enabled For example: $ ethtool oob_net0 Settings for oob_net0: Supported ports: [ TP ] Supported link modes: 1000baseT/Full Supported pause frame use: Symmetric Supports auto-negotiation: Yes NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 233 MLNXBF17:00 supports-statistics: yes supports-test: no supports-eeprom-access: no supports-register-dump: yes supports-priv-flags: no # Display statistics specific to BlueField-2 design (i.e. statistics that are not shown in the output of "ifconfig oob0_net") NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 234 The files that control IP interface configuration are slightly different for CentOS and Ubuntu: CentOS configuration of IP interface: Configuration file for oob_net0: /etc/sysconfig/network-scripts/ifcfg-oob_net0 For example, use the following to enable DHCP (a sketch follows below): NVIDIA BlueField DPU BSP v4.5.3 LTS...
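    A minimal sketch of the DHCP-enabled ifcfg-oob_net0 file (the same content appears in full in the page 322 entry later in this summary); only BOOTPROTO="dhcp" is essential, the remaining keys mirror the manual's example:
      NAME="oob_net0"
      DEVICE="oob_net0"
      NM_CONTROLLED="yes"
      PEERDNS="yes"
      ONBOOT="yes"
      BOOTPROTO="dhcp"
      TYPE=Ethernet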
  • Page 235: Secure Boot

    For Ubuntu configuration of IP interface, refer to section "Default Network Interface Configuration". Secure Boot These pages provide guidelines on how to operate secured NVIDIA® BlueField® DPUs. They provide UEFI secure boot references for the UEFI portion of the secure boot process. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 236 There should be no way to interrupt or bypass the RoT with runtime changes. Supported BlueField DPUs Secured BlueField devices have pre-installed software and rmware signed with NVIDIA signing keys. The on-chip public key hash is programmed into E-FUSEs. To verify whether the DPU in your possession supports secure boot, run the following...
  • Page 237: Uefi Secure Boot

    UEFI Secure Boot Note This feature is available in the NVIDIA® BlueField®-2 and above. UEFI Secure Boot is a feature of the Unified Extensible Firmware Interface (UEFI) specification. The feature defines a new interface between the operating system and firmware/BIOS.
  • Page 238 For that reason, BlueField secured platforms are shipped with all the needed certi cates and signed binaries (which allows working seamlessly with the rst use case in the table above). NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 239 NVIDIA strongly recommends utilizing UEFI secure boot in any case due to the increased security it enables. Verifying UEFI Secure Boot on DPU To verify whether UEFI secure boot is enabled, run the following command from the BlueField console: ubuntu@localhost:~$ sudo mokutil --sb-state
  • Page 240 Disabling secure boot permanently is not recommended in production environments. Info It is also possible to disable UEFI secure boot using Red sh API for DPUs with an on-board BMC. For more details, please refer to your NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 241 Deployment Guide . Existing DPU Certi cates As part of having UEFI secure boot enabled, the UEFI databases are populated with NVIDIA self-signed X.509 certi cates. The Microsoft certi cate is also installed into the UEFI database to ensure that the Ubuntu distribution can boot while UEFI secure boot is enabled (and generally any suitable OS loader signed by Microsoft).
  • Page 242 Secure booting binaries for executing a UEFI application, UEFI driver, OS loader, custom kernel, or loading a custom module depends on the certi cates and public keys available in the UEFI database and the shim's MOK list. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 243 NVIDIA Kernel Modules In this option, the NVIDIA db certi cates should remain enrolled. This is due to the out-of- tree kernel modules and drivers (e.g., OFED) provided by NVIDIA which are signed by NVIDIA and authenticated by this NVIDIA certi cate in the UEFI.
  • Page 244 = "OpenSSL Generated Certificate" To enroll the MOK key certi cate, download the associated key certi cate to the BlueField le system and run the following command: ubuntu@localhost:~$ sudo mokutil --import mok.der NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 245 "MokManager". You may ignore the blue screen showing the error message. Press "OK" to enter the "Shim UEFI key management" screen. Select "Enroll MOK" and follow the menus to nish the enrolling process. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 246 UEFI or, more easily, via shim. Creating a certificate and public key for use in the UEFI secure boot is relatively simple. OpenSSL can do it by running the command NVIDIA BlueField DPU BSP v4.5.3 LTS...
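    The exact OpenSSL invocation is truncated in this summary; a commonly used form that produces the mok.priv/mok.der pair referenced in the kmodsign example later is sketched below (the subject name and validity period are placeholders, not values mandated by the manual):
      ubuntu@localhost:~$ openssl req -new -x509 -newkey rsa:2048 -nodes \
          -keyout mok.priv -outform DER -out mok.der \
          -days 36500 -subj "/CN=Your MOK signing key/"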
  • Page 247 X.509 key certi cate in DER format must be enrolled within the UEFI db. A prerequisite for the following steps is having UEFI secure boot temporarily disabled on the DPU. After temporarily disabling UEFI secure boot per device as in section "Existing NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 248 The resulting capsule file, EnrollYourKeysCap, can be downloaded to the BlueField file system to initiate the key enrollment process. From the BlueField console, execute the following command and then reboot: ubuntu@localhost:~$ bfrec --capsule EnrollYourKeysCap NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 249 From the "UEFI menu", select "Device Manager" entry then "Secure Boot Con guration". Navigate to "Secure Boot Mode" and select "Custom Mode" setup. The secure boot "Custom Mode" setup feature allows a physically present user to modify the UEFI database. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 250 To enroll your DER certi cate le, select "DB Options" and enter the "Enroll Signature" menu. Select "Enroll Signature Using File" and navigate within the EFI System Partition (ESP) to the db DER certi cate le. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 251 The GUID is the platform's way of identifying the key. It serves no purpose other than for you to tell which key is which when you delete it (it is not used at all in signature veri cation). NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 252 If the X.509 key certificate is enrolled in UEFI db or by way of shim, the binary should be loaded without an issue. Signing Kernel Modules The X.509 certificate you added must be visible to the kernel. To verify the keys visible to the kernel, run: NVIDIA BlueField DPU BSP v4.5.3 LTS...
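    The verification command itself is cut off at this page boundary; one common way to list the certificates the kernel can see (an assumption, not necessarily the exact command the manual uses) is to dump the platform keyring, or to list MOK-enrolled certificates via mokutil:
      ubuntu@localhost:~$ sudo keyctl list %:.platform
      ubuntu@localhost:~$ sudo mokutil --list-enrolled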
  • Page 253 If the X.509 certi cate attributes ( , etc.) are con gured properly, you should see your key certi cate information in the result output. In this example, two custom keys are visible to the kernel: NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 254 The signature is appended to the kernel module by kmodsign. But if you would rather keep the original kernel module unchanged, run: ubuntu@localhost:~$ kmodsign sha512 mok.priv mok.der module.ko module-signed.ko Refer to kmodsign --help for more information. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 255 This requires UEFI secure boot to have been enabled using your own keys, which means that you own the signing keys. While UEFI secure boot is enabled, it is possible to update your keys using a capsule le. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 256 It is possible to disable UEFI secure boot through a capsule update. This requires an empty PK key when creating the capsule le. To create a capsule intended to disable UEFI secure boot: 1. Create a dummy empty PK certi cate: NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 257: Updating Platform Firmware

    Deleting the PK certificate will result in UEFI secure boot being disabled, which is not recommended in a production environment. Updating Platform Firmware To update the platform firmware on secured devices, download the latest NVIDIA® BlueField® software images from NVIDIA.com. Updating eMMC Boot Partitions Image
  • Page 258 The capsule le is signed with NVIDIA keys. If UEFI secure boot is enabled, make sure the NVIDIA certi cate les are enrolled into the UEFI database. Please refer to "UEFI Secure Boot" for more information on how to update the UEFI database key certi cates.
  • Page 259 From the BlueField console, using the following command: ubuntu@localhost:~$ /opt/mellanox/mlnx-fw- updater/firmware/mlxfwmanager_sriov_dis_aarch64_<bf-dev> From the PCIe host console, using the following command: # flint -d /dev/mst/mt<bf-dev>_pciconf0 -i firmware.bin b Info bf-dev is 41686 for BlueField-2 or 41692 for BlueField-3. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 260: Management

    SoC Management Interface BlueField OOB Ethernet Interface Performance Monitoring Counters The performance modules in NVIDIA® BlueField® are present in several hardware blocks and each block has a certain set of supported events. mlx_pmc driver provides access to all of these performance modules through a /sys/class/hwmon sysfs interface.
  • Page 261 For blocks that use hardware counters to collect data, each counter present in the block is represented by "event<N>" and "counter<N>" sysfs files. For example: ubuntu@bf:/$ ls /sys/class/hwmon/hwmon0/tile0/ counter0 counter1 counter2 counter3 event0 event1 event2 event3 event_list NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 262: Reading Registers

    Hex Value Name Description AW_REQ Reserved for internal use AW_BEATS Reserved for internal use AW_TRANS Reserved for internal use AW_RESP Reserved for internal use AW_STL Reserved for internal use AW_LAT Reserved for internal use NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 263 MAF_REQUEST 0x49 Reserved for internal use RNF_REQUEST Number of REQs sent by the RN-F selected by HNF_PERF_CTL 0x4a register RNF_SEL eld REQUEST_TYP 0x4b Reserved for internal use MEMORY_REA 0x4c Number of reads to MSS NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 264 Accesses requests (Reads, Writes) from DMA IO devices 0x5f TSO_WRITE Total Store Order write Requests from DMA IO devices TSO_CONFLIC 0x60 Reserved for internal use 0x61 DIR_HIT Requests that hit in directory 0x62 HNF_ACCEPTS Reserved for internal use NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 265 Write requests from A72 clusters 0x72 A72_Read Read requests from A72 clusters 0x73 IO_WRITE Write requests from DMA IO devices 0x74 IO_Reads Read requests from DMA IO devices 0x75 TSO_Reject Reserved for internal use NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 266 PCIe access TRIO_MAP_WRQ_BUF_ 0xaa PCIe write transaction bu er is empty EMPTY TRIO_MAP_CPL_BUF_ 0xab Arm PIO request completion queue is empty EMPTY TRIO_MAP_RDQ0_BUF 0xac The bu er of MAC0's read transaction is empty _EMPTY NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 267 CDN is used for control data. The NDN is used for responses. The DDN is for the actual data transfer. Name Description Value 0x00 DISABLE Reserved for internal use 0x01 CYCLES Timestamp counter Read Transaction control request from the CDN of 0x02 TOTAL_RD_REQ_IN the SkyMesh NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 268 Total EMEM Read Request Bank 0 ANK0 TOTAL_EMEM_RD_REQ_B 0x11 Total EMEM Read Request Bank 1 ANK1 TOTAL_EMEM_WR_REQ_B 0x12 Total EMEM Write Request Bank 0 ANK0 TOTAL_EMEM_WR_REQ_B 0x13 Total EMEM Write Request Bank 1 ANK1 NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 269 Reserved for internal use 0x27 TRB_REJECT_BANK1 Reserved for internal use 0x28 TAG_REJECT_BANK0 Reserved for internal use 0x29 TAG_REJECT_BANK1 Reserved for internal use 0x2a ANY_REJECT_BANK0 Reserved for internal use 0x2b ANY_REJECT_BANK1 Reserved for internal use PCIe TLR Statistics NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 270 Number of cycles that east input port FIFO runs out of 0x17 OF_CRED credits in the CDN network CDN_DIAG_W_OUT_ Number of cycles that west input port FIFO runs out of 0x18 OF_CRED credits in the CDN network NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 271 Number of cycles that east input port FIFO runs out of 0x27 OF_CRED credits in the DDN network DDN_DIAG_W_OUT_ Number of cycles that west input port FIFO runs out of 0x28 OF_CRED credits in the DDN network NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 272 Number of cycles that east input port FIFO runs out of 0x37 OF_CRED credits in the NDN network NDN_DIAG_W_OUT_ Number of cycles that west input port FIFO runs out of 0x38 OF_CRED credits in the NDN network NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 273 To program a counter to monitor one of the events from the event list, the event name or number needs to be written to the corresponding event file. Let us call the folder corresponding to this driver, /sys/class/hwmon/hwmon<N>, BFPERF_DIR. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 274 This resets the accumulator and the counter continues monitoring the same event that has previously been programmed, but starts the count from 0 again. Writing non-zero values to the counter files is not allowed (see the sketch below). NVIDIA BlueField DPU BSP v4.5.3 LTS...
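    A minimal sketch of programming, reading, and resetting a counter, assuming the CYCLES event appears in this block's event_list (the tile0 path and file names are taken from the examples above):
      # program counter 0 of tile0 to count the CYCLES event
      ubuntu@bf:/$ echo CYCLES > $BFPERF_DIR/tile0/event0
      # read the accumulated value
      ubuntu@bf:/$ cat $BFPERF_DIR/tile0/counter0
      # writing 0 clears the accumulator; the counter keeps tracking the same event
      ubuntu@bf:/$ echo 0 > $BFPERF_DIR/tile0/counter0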
  • Page 275: Intelligent Platform Management Interface

    Intelligent Platform Management Interface BMC Retrieving Data from BlueField via IPMB NVIDIA® BlueField® DPU® software will respond to Intelligent Platform Management Bus (IPMB) commands sent from the BMC via its Arm I C bus. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 276 Support monitoring of DDR0 temp (on memory controller 1) ddr1_1_temp Support monitoring of DDR1 temp (on memory controller 1) p0_temp Port 0 temperature p1_temp Port 1 temperature p0_link Port0 link status p1_link Port1 link status Note NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 277 FRU data before the next FRU update. update_timer is a hexadecimal number. NVIDIA® ConnectX® rmware information, Arm rmware version, and MLNX_OFED version. fw_info is in ASCII format. NIC vendor ID, device ID, subsystem vendor ID, and subsystem device ID.
  • Page 278 This FRU file can be used to write the BMC port 0 and port 1 IP addresses to the BlueField. It is empty to begin with. The file passed through the "ipmitool fru write 11 <file>" command must have the following format: NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 279: Supported Ipmi Commands

    Network interface 1 information. Updated once every minute. BlueField UID List of ConnectX interface hardware counters Note On BlueField-2 based boards, DDR sensors and FRUs are not supported. They will appear as no reading. Supported IPMI Commands NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 280 Get device sdr info 35.2 SDR info Get device "sdr get", "sdr list" or 35.3 "sdr elist" Get sensor sdr get <sensor-id> 35.7 hysteresis Set sensor sensor thresh <sensor-id> <threshold> <setting> 35.8 threshold NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 281 <sensor-id> upper <unc> <ucr> <unr> Note The upper non-recoverable <unr> option is not supported Get sensor sdr get <sensor-id> 35.9 threshold Get sensor sdr get <sensor-id> 35.11 event enable Get sensor sensor reading <sensor-id> 35.14 reading NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 282 The following steps are performed from the BlueField CentOS prompt. The BlueField is running CentOS 7.6 with kernel 5.4. The CentOS installation was done using the CentOS everything ISO image. The following drivers need to be loaded on the BlueField running CentOS: NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 283 $ yum remove -y kmod-i2c-mlx $ modprobe -rv i2c-mlx 2. Transfer the i2c-mlx RPM from the BlueField software tarball under distro/SRPM onto the Arm. Run: $ rpmbuild --rebuild /root/i2c-mlx-1.0- 0.g422740c.src.rpm $ yum install -y /root/rpmbuild/RPMS/aarch64/i2c-mlx- 1.0-0.g422740c_5.4.17_mlnx.9.ga0bea68.aarch64.rpm NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 284 4. The i2c-tools package is also required, but the version contained in the CentOS Yum repository is old and does not work with BlueField. Therefore, please download i2c-tools version 4.1, and then build and install it. # Build i2c-tools from a newer source wget http://mirrors.edge.kernel.org/pub/software/utils/i2c-tools/i2c-tools-4.1.tar.gz NVIDIA BlueField DPU BSP v4.5.3 LTS...
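    The build steps following the wget are cut off here; the usual sequence for building i2c-tools from a source tarball is sketched below (the install prefix is an assumption):
      tar xzf i2c-tools-4.1.tar.gz
      cd i2c-tools-4.1
      make
      make install PREFIX=/usr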
  • Page 285 You may obtain this rpm le by means of scp from the server host's Blue eld Distribution folder. For example: $ scp <BF_INST_DIR>/distro/SRPMS/mlx- OpenIPMI-2.0.25-0.g4fdc53d.src.rpm <ip- address>:/<target_directory>/ If there are issues with building the OpenIPMI RPM, verify that the swig package is not installed. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 286 $ yum install -y /root/rpmbuild/RPMS/aarch64/ipmb-dev-int-1.0-0.g304ea0c_5.4.0_49.el7a.aarch64.aarch64.rpm $ yum install -y /root/rpmbuild/RPMS/aarch64/ipmb-host-1.0-0.g304ea0c_5.4.0_49.el7a.aarch64.aarch64.rpm 9. Load the IPMB driver. Run: $ modprobe ipmb-dev-int 10. Install and start rasdaemon package. Run: yum install rasdaemon systemctl enable rasdaemon NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 287 It is possible for the external host to retrieve IPMI data via the OOB interface or the ConnectX interfaces. To do that, set the network interface address properly in progconf. For example, if the OOB IP address is 192.168.101.2, edit the OOB_IP variable in the /etc/ipmi/progconf file (see the sketch below). NVIDIA BlueField DPU BSP v4.5.3 LTS...
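    A sketch of that edit; the exact syntax used inside progconf is an assumption, only the file path, variable name, and address come from the text above:
      $ vim /etc/ipmi/progconf
      # set the OOB interface address used for external IPMI access
      OOB_IP="192.168.101.2"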
  • Page 288 BlueField. So the BlueField needs to load the ipmb_host driver when the BMC is up. If the BMC is not up, ipmb_host will fail to load because it has to execute a handshake with the other end before loading. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 289 The set_emu_param.service script will try to load the driver again. BlueField and BMC I2C Addresses on BlueField Reference Platform BlueField in Responder Mode Device C Address BlueField ipmb_dev_int 0x30 BMC ipmb_host 0x20 BlueField in Requester Mode NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 290: Logging

    RShim logging uses an internal 1KB HW buffer to track booting progress and record important messages. It is written by the NVIDIA® BlueField® networking platform's (DPU or SuperNIC) Arm cores and is displayed by the RShim driver from the USB/PCIe host machine.
  • Page 291 INFO[BL2]: no DDR on MSS0 INFO[BL2]: calc DDR freq (clk_ref 53836948) INFO[BL2]: DDR POST passed INFO[BL2]: UEFI loaded INFO[BL31]: start INFO[BL31]: runtime INFO[UEFI]: eMMC init INFO[UEFI]: eMMC probed INFO[UEFI]: PCIe enum start INFO[UEFI]: PCIe enum end Info NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 292 If it still fails, contact NVIDIA Support. ERR[BL2]: DDR BIST Zero Mem – DDR BIST failed in the zero-memory operation. Power-cycle and retry. If the problem persists, contact your NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 293 FAE if needed hung. daif … PANIC(BL31): PC = xxx Panic in BL31 with register dump. Capture the log, analyze the cause, cptr_el3 System hung. and report to FAE if needed daif … NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 294 Failed to load certi cation update record ERR[BL2]: IROT Info Contact your NVIDIA FAE with this cert sig not Only relevant for information found certain BlueField- 3 devices. INFO[BL31]: PSC enters turtle mode Informational PSC Turtle NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 295 ERR[BL31]: ATX power not detected! – ATX power is not connected. Contact your NVIDIA FAE with this information. INFO[BL31]: PTMERROR: – Unable to detect the OPN on this device. Contact your NVIDIA FAE with this information. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 296 same buffer – Double-bit ECC also detected in the same buffer. Contact your NVIDIA FAE with this information. ERR[BL31]: l3c: double-bit ecc – L3c double-bit ECC error detected. Contact your NVIDIA FAE with this information. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 297 Event message format revision which provides the version of the EvM Rev standard a record is using. This value is 0x04 for all records generated by UEFI. Sensor Sensor type code for sensor that generated the event Type NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 298 BlueField UEFI implements a subset of the IPMI 2.0 SEL standard. Each eld may have the following values: Possible Field Description of Values Values Record Standard SEL record. All events sent by UEFI are standard SEL 0x02 Type records. Event Dir All events sent by UEFI are assertion events NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 299 Code System 0x0F System rmware error (POST error). Firmware Event Data 2: Progress 0x00 0x06 – Unrecoverable EMMC error. Contact NVIDIA support. 0x02 System rmware progress: Informational message, no actions needed. Event Data 2: NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 300 $ ipmitool sel get 0x7d SEL Record ID : 007d Record Type : 02 Timestamp : 01/09/1970 00:07:34 Generator ID : 0001 EvM Revision : 04 Sensor Type : System Firmwares Sensor Number : 06 NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 301 203a5d49 0a0d0a0d ERR[UEFI]: ..2.672284] [Hardware Error]: 00000010: 636e7953 6e6f7268 2073756f 65637845 Synchronous Exce 2.680987] [Hardware Error]: 00000020: 6f697470 7461206e 36783020 37313643 ption at 0x6C617 2.689696] [Hardware Error]: 00000030: 34 37 30 0d 0a NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 302: Soc Management Interface

    Delay in seconds for RShim over PCIe, which is added after chip reset and before pushing the boot stream. PCIE_INTR_POL Interrupt polling interval in seconds when running RShim over L_INTERVAL direct memory mapping. Setting this parameter to 0 disallows RShim memory PCIE_HAS_VFIO NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 303 # Uncomment the line to configure the ignored devices. 'none' #none         usb- #none         pcie-lf- 0000 00.0 Note If any of these con gurations are changed, then the SoC management interface must be restarted by running: NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 304 Host-side Interface Con guration The NVIDIA® BlueField® DPU registers on the host OS a "DMA controller" for DPU management over PCIe. This can be veri ed by running the following: #  lspci -d 15b3: | grep 'SoC Management Interface' 27:00.2 DMA controller: Mellanox Technologies MT42822 BlueField-2...
  • Page 305 Configuration procedures vary for different OSs. The following example configures the host side of tmfifo_net0 with a static IP and enables IPv4-based communication to the DPU OS: #  ip addr add dev tmfifo_net0 192.168.100.1/30 Note NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 306 Using a bridge, with all interfaces on the bridge – the bridge device bears a single IP address on the host while each DPU has a unique IP in the same subnet as the bridge NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 307 DPUs). The example assumes that the RShim package has been installed on the host server. Configuring Management Interface on Host Note This example is relevant for CentOS/RHEL operating systems only. 1. Create a bf_tmfifo interface under /etc/sysconfig/network-scripts. Run: NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 308 . Run: vim /etc/sysconfig/network-scripts/ifcfg-tmfifo_net0 4. Inside ifcfg-tmfifo_net0, insert the following content: DEVICE=tmfifo_net0 BOOTPROTO=none ONBOOT=yes NM_CONTROLLED=no BRIDGE=br_tmfifo 5. Create a configuration file for the second BlueField DPU, tmfifo_net1. Run: DEVICE=tmfifo_net1 BOOTPROTO=none ONBOOT=yes NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 309 BlueField DPUs arrive with the following factory default con gurations for tm fo_net0. Address Value 00:1a:ca:ff:ff:01 192.168.100.2 Therefore, if you are working with more than one DPU, you must change the default MAC and IP addresses. Updating RShim Network MAC Address Note NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 310 5. Repeat this procedure for the second BlueField DPU (using a di erent MAC address). Info Arm must be rebooted for this con guration to take e ect. It is recommended to update the IP address before you do that to NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 311 IP address: sudo vim /etc/netplan/50-cloud-init.yaml   tmfifo_net0: addresses: - 192.168.100.2/30 ===>>> 192.168.100.3/30 2. Reboot the Arm. Run: sudo reboot 3. Repeat this procedure for the second BlueField DPU (using a di erent IP address). Info NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 312 4. Repeat this procedure for the second BlueField DPU (using a di erent IP address). Info Arm must be rebooted for this con guration to take e ect. It is recommended to update the MAC address before you do that to avoid unnecessary reboots. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 313 $ printf "\x07\x00\x00\x00\x00\x1a\xca\xff\xff\x03" > \ /sys/firmware/efi/efivars/RshimMacAddr-8be4df61-93ca-11d2-aa0d-00e098032b8c The printf command sets the MAC address to 00:1a:ca:ff:ff:03 (the last six bytes of the printf value). Either reboot the device or reload the tmfifo driver for the NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 314 For more information and an example of the script that covers multiple DPU installation and con guration, refer to section "Installing Full DOCA Image on Multiple DPUs" of the NVIDIA DOCA Installation Guide . NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 315 Refer to section "SoC :caff:feff:ff0 the DPU Management Interface Driver Support for 1%tmfifo_net<N Multiple DPUs" for more information. > 5 PXE boot Please refer to section "Deploying BlueField over RShim Software Using BFB with PXE" for more NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 316: Bluefield Oob Ethernet Interface

    BlueField boot stream. Any attempt to push a BFB to this port would not work. Refer to "How to use the UEFI boot menu" for more information about UEFI operations related to the OOB interface. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 317 1. Log into Linux from the Arm console. 2. Issue the command ls /sys/firmware/efi/efivars to show whether efivarfs is mounted. If it is not mounted, run: mount -t efivarfs none /sys/firmware/efi/efivars 3. Run: NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 318 4. Log into Linux from the Arm console. ls /sys/firmware/efi/efivars 5. Issue the command to show whether e varfs is mounted. If it is not mounted, run: mount -t efivarfs none /sys/firmware/efi/efivars 6. Run: NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 319 To use the ethtool options available, use the following format: $ ethtool [<option>] <interface> Where <option> may be: <no-argument> – display interface link information; -i – display driver general information; -S – display driver statistics; -d – dump driver register set; -g – display driver ring information NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 320 Link partner advertised pause frame use: Symmetric Link partner advertised auto-negotiation: Yes Link partner advertised FEC modes: Not reported Speed: 1000Mb/s Duplex: Full Port: Twisted Pair PHYAD: 3 Transceiver: internal Auto-negotiation: on MDI-X: Unknown Link detected: yes NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 321 $ ethtool -S oob_net0 NIC statistics: hw_access_errors: 0 tx_invalid_checksums: 0 tx_small_frames: 1 tx_index_errors: 0 sw_config_errors: 0 sw_access_errors: 0 rx_truncate_errors: 0 rx_mac_errors: 0 rx_din_dropped_pkts: 0 tx_fifo_full: 0 rx_filter_passed_pkts: 5549 rx_filter_discard_pkts: 4 IP Address Con guration for OOB Interface NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 322 Configuration file for oob_net0: /etc/sysconfig/network-scripts/ifcfg-oob_net0 For example, use the following to enable DHCP: NAME="oob_net0" DEVICE="oob_net0" NM_CONTROLLED="yes" PEERDNS="yes" ONBOOT="yes" BOOTPROTO="dhcp" TYPE=Ethernet For example, to configure a static IP, use the following: NAME="oob_net0" DEVICE="oob_net0" IPV6INIT="no" NM_CONTROLLED="no" PEERDNS="yes" NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 323 ONBOOT="yes" BOOTPROTO="static" IPADDR="192.168.200.2" PREFIX=30 GATEWAY="192.168.200.1" DNS1="192.168.200.1" TYPE=Ethernet For Ubuntu con guration of IP interface, please refer to section "Default Network Interface Con guration". NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 324: Dpu Operation

    DPU Operation The NVIDIA® BlueField® DPU family delivers the flexibility to accelerate a range of applications while leveraging ConnectX-based network controllers' hardware-based offloads with unmatched scalability, performance, and efficiency.
  • Page 325: Functional Diagram

    Shared RQ Mode RegEx Acceleration Functional Diagram The following is a functional diagram of the NVIDIA® BlueField®-2 DPU. For each BlueField DPU network port, there are 2 physical PCIe networking functions exposed: To the embedded Arm subsystem To the host over PCIe Note NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 326: Modes Of Operation

    The mlx5 drivers and their corresponding software stacks must be loaded on both hosts (Arm and the host server). The OS running on each one of the hosts would probe the drivers. BlueField-2 network interfaces are compatible with NVIDIA® ConnectX®-6 and higher. BlueField-3 network interfaces are compatible with ConnectX-7 and higher.
  • Page 327 DPU entirely through the Arm cores and/or BMC connection instead of through the host. For security and isolation purposes, it is possible to restrict the host from performing operations that can compromise the DPU. The following operations can be restricted NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 328 $ sudo mlxprivhost -d /dev/mst/<device> r --disable_rshim --disable_tracer --disable_counter_rd --disable_port_owner Note Graceful shutdown and power cycle are required if any --disable_* flags are used. Disabling Zero-trust Mode To disable host restriction, set the mode to privileged (see the sketch below). Run: NVIDIA BlueField DPU BSP v4.5.3 LTS...
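    The privileged-mode command is truncated at this page boundary; based on the restricted-mode syntax above, the reverse operation is expected to look like the following (the "p" argument is an assumption mirroring mlxprivhost usage):
      $ sudo mlxprivhost -d /dev/mst/<device> p
    A graceful shutdown and power cycle are required for the change to take effect, as with the --disable_* flags.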
  • Page 329 Note The following instructions presume the DPU to operate in DPU mode. If the DPU is operating in zero-trust mode, please return to DPU mode before continuing. NIC Mode for BlueField-3 Note NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 330 sudo mlxconfig -d /dev/mst/mt41692_pciconf0 s INTERNAL_CPU_MODEL=1 INTERNAL_CPU_OFFLOAD_ENGINE=1 2. Perform a graceful shutdown and power cycle the host. Disabling NIC Mode from Linux To return to DPU mode from NIC mode: 1. Run the following on the host (see the sketch below): NVIDIA BlueField DPU BSP v4.5.3 LTS...
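    The command that returns the device to DPU mode is cut off here; mirroring the enable command above, this step presumably clears the offload-engine setting (the exact parameter set is an assumption):
      host# sudo mlxconfig -d /dev/mst/mt41692_pciconf0 s INTERNAL_CPU_OFFLOAD_ENGINE=0
    followed by a graceful shutdown and power cycle of the host.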
  • Page 331 3. Select "Network Device List". 4. Select the network device that presents the uplink (i.e., select the device with the uplink MAC address). 5. Select "NVIDIA Network adapter - $<uplink-mac>". 6. Select "BlueField Internal Cpu Con guration". NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 332 # bfb-install --bfb <BlueField-BSP>.bfb --rshim rshim0 NIC Mode for BlueField-2 In this mode, the ECPFs on the Arm side are not functional but the user is still able to access the Arm system and update mlxconfig options. Note NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 333 To restrict RShim PF (optional), make sure to configure INTERNAL_CPU_RSHIM=1 as part of the mlxconfig command. 2. Perform a graceful shutdown and power cycle the host. Note Multi-host is not supported when the DPU is operating in NIC mode. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 334 $ mlxconfig -d /dev/mst/<device> s INTERNAL_CPU_MODEL=1 \ INTERNAL_CPU_PAGE_SUPPLIER=0 \ INTERNAL_CPU_ESWITCH_MANAGER=0 \ INTERNAL_CPU_IB_VPORT0=0 \ INTERNAL_CPU_OFFLOAD_ENGINE=0 Note If INTERNAL_CPU_RSHIM=1, then make sure to configure INTERNAL_CPU_RSHIM=0 as part of the mlxconfig command. 3. Perform a graceful shutdown and power cycle the host. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 335: Kernel Representors Model

    The following diagram shows the mapping between the PCIe functions exposed on the host side and the representors. For the sake of simplicity, a single-port model (duplicated for the second port) is shown.
  • Page 336: Multi-Host

    The MTU of host functions (PF/VF) must be smaller than the MTUs of both the uplink and corresponding PF/VF representor. For example, if the host PF MTU is set to 9000, both uplink and PF representor must be set to above 9000. Multi-Host NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 337 Note This is only applicable to NVIDIA® BlueField® networking platforms (DPU or SuperNIC) running on multi-host model. In multi-host mode, each host interface can be divided into up to 4 independent PCIe interfaces. All interfaces would share the same physical port, and are managed by the same multi-physical function switch (MPFS).
  • Page 338 150: pf3hpf: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000 link/ether 0e:0d:8e:03:2e:27 brd ff:ff:ff:ff:ff:ff 151: pf4hpf: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000 link/ether 5e:42:af:05:67:93 brd ff:ff:ff:ff:ff:ff NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 339 Interface p3 Port pf3hpf Interface pf3hpf Bridge armBr-2 Port p2 Interface p2 Port pf2hpf Interface pf2hpf Port armBr-2 Interface armBr-2 type: internal Bridge armBr-5 Port p5 Interface p5 Port pf5hpf Interface pf5hpf Port armBr-5 NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 340 Port p4 Interface p4 Port pf4hpf Interface pf4hpf Port armBr-4 Interface armBr-4 type: internal Bridge armBr-1 Port armBr-1 Interface armBr-1 type: internal Port p1 Interface p1 Port pf1hpf Interface pf1hpf Bridge armBr-6 Port armBr-6 NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 341 UP group default qlen 1000 link/ether 0c:42:a1:70:1d:9a brd ff:ff:ff:ff:ff:ff The implicit mapping is as follows: PF0, PF1 = host controller 1 PF2, PF3 = host controller 2 PF4, PF5 = host controller 3 NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 342: Virtual Switch On Dpu

    NIC embedded switch and avoiding the need to pass every packet through the Arm cores. The control plane remains the same as working with standard OVS. OVS bridges are created by default upon first boot of the DPU after BFB installation.
  • Page 343 Port p1 Interface p1 Port pf1sf0 Interface en3f1pf1sf0       Port pf1hpf Interface pf1hpf Bridge ovsbr1 Port pf0hpf Interface pf0hpf Port p0 Interface p0 Port ovsbr1 Interface ovsbr1 type: internal Port pf0sf0 Interface en3f0pf0sf0 NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 344 DHCP server if it is present. Otherwise it is possible to con gure IP address from the host. It is possible to access BlueField via the SF netdev interfaces. For example: 1. Verify the default OVS con guration. Run: NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 345 "2.14.1" 2. Verify whether the SF netdev received an IP address from the DHCP server. If not, assign a static IP. Run: # ifconfig enp3s0f0s0 enp3s0f0s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> 1500 inet 192.168.200.125 netmask 255.255.255.0 broadcast 192.168.200.255 NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 346 To set the IP address on the Windows side for the RShim or physical network adapter, run the following command in Command Prompt: PS C:\Users\Administrator> New-NetIPAddress -InterfaceAlias "Ethernet 16" -IPAddress "192.168.100.1" -PrefixLength 22 To get the interface name, run the following command in Command Prompt: NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 347 PS C:\Users\Administrator> Get-NetAdapter Output should give us the interface name that matches the description (e.g. NVIDIA BlueField Management Network Adapter). Ethernet 2 NVIDIA ConnectX-4 Lx Ethernet Adapter 6 Not Present 24-8A-07-0D-E8-1D Ethernet 6 NVIDIA ConnectX-4 Lx Ethernet Ad...#2 23 Not Present...
  • Page 348 $ ovs-dpctl show system@ovs-system: lookups: hit:0 missed:0 lost:0 flows: 0 masks: hit:0 total:0 hit/pkt:0.00 port 0: ovs-system (internal) port 1: armbr1 (internal) port 2: p0 port 3: pf0hpf port 4: pf0vf0 port 5: pf0vf1 NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 349 <bridge-name> Issue the ovs-vsctl show command to see already configured OVS bridges. 2. Enable the Open vSwitch service. Run: systemctl start openvswitch 3. Configure huge pages: echo 1024 > /sys/kernel/mm/hugepages/hugepages- NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 350 -- set Bridge br0-ovs datapath_type=netdev -- br-set-external-id br0-ovs bridge-id br0-ovs -- set bridge br0-ovs fail-mode=standalone 7. Add PF to OVS. Run: ovs-vsctl add-port br0-ovs p0 -- set Interface p0 type=dpdk options:dpdk-devargs=0000:03:00.0 8. Add representor to OVS. Run: NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 351 For a reference setup configuration for BlueField-2 devices, refer to the article "Configuring OVS-DPDK Offload with BlueField-2". Configuring DPDK and Running TestPMD 1. Configure hugepages. Run: echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages 2. Run testpmd. For Ubuntu/Debian: NVIDIA BlueField DPU BSP v4.5.3 LTS...
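    The testpmd invocation itself is truncated in this summary; a typical run against the uplink PF and its representors looks like the sketch below (binary name, PCIe address, and memory sizes are assumptions, not values taken from the manual):
      # dpdk-testpmd -a 0000:03:00.0,representor=[0,65535] --socket-mem=1024 -- -i --total-num-mbufs=131000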
  • Page 352 Both source NAT (SNAT) and destination NAT (DNAT) are supported with connection tracking o oad. Con guring Connection Tracking O oad NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 353 4791 and a different source port in each direction of the connection. RoCE traffic is not supported by CT. In order to run RoCE from the host, add the following line before ovs-ofctl add-flow ctBr "table=0,ip,ct_state=-trk,action=ct(table=1)" $ ovs-ofctl add-flow ctBr table=0,udp,tp_dst=4791,action=normal NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 354 1. Con gure untracked IP packets to do nat. Run: ovs-ofctl add-flow ctBr "table=0,ip,ct_state=- trk,action=ct(table=1,nat)" 2. Con gure new established ows to do SNAT, and change source IP to 1.1.1.16. Run: ovs-ofctl add-flow ctBr "table=1,in_port=pf0hpf,ip,ct_state=+trk+new,action=ct(commit, NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 355 To check if specific connections are offloaded from Arm, run conntrack -L for the Ubuntu 22.04 kernel, or cat /proc/net/nf_conntrack for older kernel versions. The following is example output of an offloaded TCP connection: NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 356 The change takes effect immediately if there is no flow inside the FDB table (no traffic running and all offloaded flows are aged out), and it can be dynamically changed without reloading the driver. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 357 This changes the maximum tracked connections (both o oaded and non-o oaded) setting to 1 million. The following option speci es the limit on the number of o oaded connections. For example: # devlink dev param set pci/${pci_dev} name ct_max_offloaded_conns value $max cmode runtime NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 358 VLAN ID 52, you should use the following command when adding its representor to the bridge: $ ovs-vsctl add-port armbr1 pf0vf0 tag=52 Note If the virtual port is already connected to the bridge prior to configuring VLAN, you would need to remove it first (see the sketch below): NVIDIA BlueField DPU BSP v4.5.3 LTS...
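    A short sketch of that removal followed by re-adding the port with the VLAN tag, reusing the bridge and port names from the command above:
      $ ovs-vsctl del-port armbr1 pf0vf0
      $ ovs-vsctl add-port armbr1 pf0vf0 tag=52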
  • Page 359 1.1.1.1 IPv4 address. 2. Remove from any OVS bridge. 3. Build a VXLAN tunnel over OVS arm-ovs. Run: ovs-vsctl add-br arm-ovs -- add-port arm-ovs vxlan11 -- set interface vxlan11 type=vxlan NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 360 For the host PF, in order for VXLAN to work properly with the default 1500 MTU, follow these steps. 1. Disable host PF as the port owner from Arm (see section "Zero- trust Mode"). Run: NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 361 OVS bridge. Note To be consistent with the examples below, it is assumed that is con gured with a 1.1.1.1 IPv4 address and that the remote end of the tunnel is 1.1.1.2. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 362 Run the following: ovs-appctl dpctl/dump-flows type=offloaded recirc_id(0),in_port(3),eth(src=50:6b:4b:2f:0b:74,dst=de:d0:a3:63:0 packets:878, bytes:122802, used:0.440s, actions:set(tunnel(tun_id=0x64,src=1.1.1.1,dst=1.1.1.2,ttl=64,flags tunnel(tun_id=0x64,src=1.1.1.1,dst=1.1.1.2,flags(+key)),recirc_id(0 packets:995, bytes:97510, used:0.440s, actions:3 Note For the host PF, in order for GRE to work properly with the default 1500 MTU, follow these steps. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 363 OVS bridge. gnv0 2. Create an OVS bridge, , with a GENEVE tunnel interface, . Run: ovs-vsctl add-port br0 gnv0 -- set interface gnv0 type=geneve options:local_ip=1.1.1.1 options:remote_ip=1.1.1.2 options:key=100 NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 364 GENEVE tunnel must be smaller than the MTU of the tunnel interfaces ( ) to account for the size of the GENEVE headers. For example, you can set the MTU of P0 to 2000. NVIDIA BlueField DPU BSP v4.5.3 LTS...
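A one-line sketch of that MTU change, assuming p0 is the uplink carrying the tunnel:
$ ip link set dev p0 mtu 2000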
  • Page 365 The following rules push VLAN ID 100 to packets sent from VF0 to the wire (and forward it through the uplink representor) and strip the VLAN when sending the packet to the VF. $ tc filter add dev pf0vf0 protocol 802.1Q parent ffff: \ flower \ NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 366 \ action mirred egress redirect dev pf0vf0 VXLAN Encap/Decap Example $ tc filter add dev pf0vf0 protocol 0x806 parent ffff: \ flower \ skip_sw \ dst_mac e4:11:22:11:4a:51 \ src_mac e4:11:22:11:4a:50 \ NVIDIA BlueField DPU BSP v4.5.3 LTS...
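The encapsulation example above is truncated. As a sketch only, a typical tc flower encap/decap pair combines the tunnel_key action with a VXLAN netdevice; the vxlan0 device, IP addresses, VNI, and MAC below are illustrative assumptions, not values from this manual:
# encapsulate traffic from the VF representor into VXLAN VNI 100
$ tc filter add dev pf0vf0 protocol ip parent ffff: flower skip_sw dst_mac e4:11:22:11:4a:51 \
    action tunnel_key set src_ip 1.1.1.1 dst_ip 1.1.1.2 id 100 dst_port 4789 \
    action mirred egress redirect dev vxlan0
# decapsulate matching tunnelled traffic and forward it back to the representor
$ tc filter add dev vxlan0 protocol ip parent ffff: flower skip_sw enc_src_ip 1.1.1.2 enc_dst_ip 1.1.1.1 enc_key_id 100 enc_dst_port 4789 \
    action tunnel_key unset \
    action mirred egress redirect dev pf0vf0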
  • Page 367: Con Guring Uplink Mtu

    Offload Using ASAP² Direct > VirtIO Acceleration through Hardware vDPA. Configuring Uplink MTU To configure the port MTU while operating in DPU mode, users must restrict the external host port ownership by issuing the following command on the BlueField: NVIDIA BlueField DPU BSP v4.5.3 LTS
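A sketch of that sequence (the mst device name and MTU value are illustrative; restricting port ownership is covered in section "Zero-trust Mode"):
$ mlxprivhost -d /dev/mst/mt41686_pciconf0 r --disable_port_owner
# once the host is restricted, set the uplink MTU from the Arm side
$ ip link set dev p0 mtu 9000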
  • Page 368: Link Aggregation

    It increases network throughput and bandwidth and provides redundancy if one of the interfaces fails. NVIDIA® BlueField® DPU has an option to configure network bonding on the Arm side in a manner transparent to the host. Under such a configuration, the host would only see a single PF.
  • Page 369 In this mode, packets are distributed according to the QPs. 1. To enable this mode, run: $ mlxconfig -d /dev/mst/<device-name> s LAG_RESOURCE_ALLOCATION=0 mt41686_pciconf0 Example device name: /etc/mellanox/mlnx-bf.conf 2. Add/edit the following eld from as follows: NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 370 1. To enable this mode, run: $ mlxconfig -d /dev/mst/<device-name> s LAG_RESOURCE_ALLOCATION=1 mt41686_pciconf0 Example device name: /etc/mellanox/mlnx-bf.conf 2. Add/edit the following eld from as follows: LAG_HASH_MODE="yes" 3. Perform a graceful shutdown and system power cycle. Prerequisites NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 371 ) on the Arm side must be disconnected from any OVS bridge. LAG Configuration 1. Create the bond interface. Run: $ ip link add bond0 type bond $ ip link set bond0 down NVIDIA BlueField DPU BSP v4.5.3 LTS
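A sketch of the remaining bond setup with iproute2 (the bond mode and interface names are illustrative; both uplinks must already be out of any OVS bridge as noted above):
$ ip link set bond0 type bond miimon 100 mode 802.3ad
$ ip link set p0 down
$ ip link set p1 down
$ ip link set p0 master bond0
$ ip link set p1 master bond0
$ ip link set p0 up
$ ip link set p1 up
$ ip link set bond0 up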
  • Page 372 The following is an example of LAG con guration in Ubuntu: # cat /etc/network/interfaces   # interfaces( ) file used by ifup( ) and ifdown( # Include files from /etc/network/interfaces.d: source /etc/network/interfaces.d/* auto lo NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 373 LAG support and VF-to-VF communication on same host. For OVS con guration, the bond interface is the one that needs to be added to the OVS bridge (interfaces should not be added). The PF representor for the rst port NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 374 Removing LAG Con guration LAG_RESOURCE_ALLOCATION=0 1. If Queue A nity mode LAG is con gured (i.e., 1. Delete any installed Scalable Functions (SFs) on the Arm side. 2. Stop driver (openibd) on the host side. Run: NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 375 OVS bridge on the Arm side. Refer to "Virtual Switch on DPU" for instructions on how to perform this. HIDE_PORT2_PF 6. Revert from , on the Arm side. Run: mlxconfig -d /dev/mst/<device-name> s HIDE_PORT2_PF=False NUM_OF_PF=2 NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 376 1. Enable LAG hash mode. 2. Hide the second PF on the host. Run: $ mlxconfig -d /dev/mst/<device-name> s HIDE_PORT2_PF=True NUM_OF_PF=1 3. Make sure NVME emulation is disabled: $ mlxconfig -d /dev/mst/<device-name> s NVME_EMULATION_ENABLE=0 mt41686_pciconf0 Example device name: NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 377 (interfaces p0 and p4 should not be added). The PF representor, must be added to the OVS bridge with the bond interface. The rest of the uplink representors must be added to another OVS bridge along with their PF representors. Consider the following examples: NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 378: Scalable Functions

    An mlx5 SF has its own function capabilities and its own resources. This means that an SF has its own dedicated queues (txq, rxq, cq, eq) which are neither shared nor stolen from the parent PCIe function. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 379 1. Make sure your rmware version supports SFs (20.30.1004 and above). 2. Enable SF support in device. Run: $ mlxconfig -d 0000:03:00.0 s PF_BAR2_ENABLE=0 PER_PF_NUM_SF=1 PF_TOTAL_SF=236 PF_SF_BAR_SIZE=10 3. Cold reboot the system for the con guration to take e ect. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 380 1. Display the physical (i.e. uplink) port of the PF. Run: $ devlink port show pci/0000:03:00.0/65535: type eth netdev p0 flavour physical port 0 splittable false 2. Add an SF. Run: $ mlxdevm port add pci/0000:03:00.0 flavour pcisf pfnum 0 sfnum 88 NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 381 1 pfnum 0 sfnum 88 splittable false function: hw_addr 00:00:00:00:00:00 state inactive opstate detached 3. Show the newly added devlink port by its port index or its representor device. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 382 5. Set SF as trusted (optional). Run: $ mlxdevm port function set pci/0000:03:00.0/229409 trust on pci/0000:03:00.0/229409: type eth netdev en3f0pf0sf88 flavour pcisf controller 0 pfnum 0 sfnum 88 function: hw_addr 00:00:00:00:88:88 state inactive opstate detached trust on NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 383 SF's auxiliary device. 8. By default, SF is attached to the configuration driver mlx5_core.sf_cfg. Users must unbind an SF from the configuration and bind it to the mlx5_core.sf driver to make NVIDIA BlueField DPU BSP v4.5.3 LTS
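A sketch of that unbind/bind sequence through sysfs, assuming the SF's auxiliary device is mlx5_core.sf.4 as in the following steps:
$ echo mlx5_core.sf.4 > /sys/bus/auxiliary/drivers/mlx5_core.sf_cfg/unbind
$ echo mlx5_core.sf.4 > /sys/bus/auxiliary/drivers/mlx5_core.sf/bind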
  • Page 384 "type": "eth", "netdev": "en3f0pf0sf88", "flavour": "pcisf", "controller": 0, "pfnum": 0, "sfnum": 88, "function": { "hw_addr": "00:00:00:00:88:88", "state": "active", "opstate": "detached", "trust": "on" 10. View the auxiliary device of the SF. Run: $ cat /sys/bus/auxiliary/devices/mlx5_core.sf.4/sfnum NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 385 $ devlink port show auxiliary/mlx5_core.sf.4/1 auxiliary/mlx5_core.sf.4/1: type eth netdev enp3s0f0s88 flavour virtual port 0 splittable false 14. View the RDMA device for the SF. Run: $ rdma dev show $ ls /sys/bus/auxiliary/devices/mlx5_core.sf.4/infiniband/ 15. Deactivate SF. Run: NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 386: Rdma Stack Support On Host And Arm System

    RDMA Stack Support on Host and Arm System Full RDMA stack is pre-installed on the Arm Linux system. RDMA, whether RoCE or InfiniBand, is supported on NVIDIA® BlueField® networking platforms (DPUs or SuperNICs) in the configurations listed below. Separate Host Mode RoCE is supported from both the host and Arm system.
  • Page 387: Controlling Host Pf And Vf Parameters

    From the Arm, users may configure the MAC address of the physical function in the host. After sending the command, users must reload the NVIDIA driver in the host to see the newly configured MAC address. The MAC address goes back to the default value in the FW after system reboot.
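As a sketch, and assuming the uplink on the Arm is p0, the host PF MAC can be written through the smart_nic sysfs hierarchy exposed by the BlueField drivers (the path and MAC value are assumptions for illustration; check the release you are running):
$ echo "00:11:22:33:44:55" > /sys/class/net/p0/smart_nic/pf/mac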
  • Page 388 Arm either Without PFs on the host, there can be no SFs on it To disable host networking PFs, run: mlxconfig -d /dev/mst/mt41686_pciconf0 s NUM_OF_PF=0 To reactivate host networking PFs: For single-port DPUs, run: NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 389: Dpdk On Bluefield Dpu

    Please refer to "Mellanox BlueField Board Support Package" in the DPDK documentation. BlueField SNAP on DPU NVIDIA® BlueField® SNAP (Software-de ned Network Accelerated Processing) technology enables hardware-accelerated virtualization of NVMe storage. BlueField SNAP presents networked storage as a local NVMe SSD, emulating an NVMe drive on the PCIe bus. The host OS/Hypervisor makes use of its standard NVMe-driver unaware that the communication is terminated, not by a physical drive, but by the BlueField SNAP.
  • Page 390: Compression Acceleration

    DPU. BlueField SNAP together with the DPU enable a world of applications addressing storage and networking e ciency and performance. To enable BlueField SNAP on your DPU, please contact NVIDIA Support. Compression Acceleration NVIDIA® BlueField® networking platforms (DPUs or SuperNIC) support high-speed compression acceleration.
  • Page 391 1.0.0, 1.1.1, and 3.0.2 are supported. Note With CentOS 7.6, only OpenSSL 1.1 (not 1.0) works with the PKA engine and keygen. Use openssl11 with the PKA engine and keygen. The engine supports the following operations: NVIDIA BlueField DPU BSP v4.5.3 LTS
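A sketch of exercising the engine with the stock OpenSSL command line, assuming the installed engine is registered under the id "pka" (the rsa2048 workload is just an example):
$ openssl engine -t pka
$ openssl speed -engine pka rsa2048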
  • Page 392: Ipsec Functionality

    IPsec operations can be run on the DPU in software on the Arm cores or in the accelerator block. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 393 IPsec packet o oad o oads both IPsec crypto and IPsec encapsulation to the hardware. IPsec packet o oad is con gured on the Arm via the uplink netdev. The following gure illustrates IPsec packet o oad operation in hardware. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 394 2. Restart IB driver (rebooting also works). Run: /etc/init.d/openibd restart Note mlx-regex is running: mlx-regex 1. Disable systemctl stop mlx-regex 2. Restart IB driver according to the command above. mlx-regex 3. Re-enable after the restart has nished: NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 395 Features Overview and Con guration > Ethernet Network > IPsec Crypto O oad > Con guring Security Associations for IPsec O oads but, use "o oad packet" to achieve IPsec Packet o oad. Con guring IPsec Rules with iproute2 Note NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 396 192.168.1.65/24 dst 192.168.1.64/24 proto esp reqid 0x2be60844 mode transport Note The numbers used by the reqid or aead algorithms are random. These same numbers are also used in the configuration of NVIDIA BlueField DPU BSP v4.5.3 LTS
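A sketch of one outbound state/policy pair with packet offload using the bundled iproute2 (the SPI, reqid, key, and addresses are placeholders; "offload packet dev p0 dir out" is what requests the hardware offload on the uplink):
$ /opt/mellanox/iproute2/sbin/ip xfrm state add \
    src 192.168.1.64 dst 192.168.1.65 proto esp spi 0x0451f04e reqid 0x2be60844 mode transport \
    aead 'rfc4106(gcm(aes))' 0x1234567890abcdef1234567890abcdef12345678 128 \
    offload packet dev p0 dir out
$ /opt/mellanox/iproute2/sbin/ip xfrm policy add \
    src 192.168.1.64/24 dst 192.168.1.65/24 dir out \
    tmpl src 192.168.1.64 dst 192.168.1.65 proto esp reqid 0x2be60844 mode transport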
  • Page 397 Support for strongSwan IPsec packet HW o oad requires using VXLAN together with IPSec as shown here . 1. Follow the procedure under section "Enabling IPsec Packet O oad". 2. Follow the procedure under section "VXLAN Tunneling O oad" to con gure VXLAN on Arm. Note NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 398 (corresponding with the gure under section "IPsec Packet O oad strongSwan Support"). In this example, 192.168.50.1 is used for the left PF uplink and 192.168.50.2 for the right PF uplink. connections { BFL-BFR { local_addrs = 192.168.50.1 remote_addrs = 192.168.50.2   local { NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 399 = aes128gcm128-x25519-esn mode = transport policies_fwd_out = yes hw_offload = packet version = 2 mobike = no reauth_time = 0 proposals = aes128-sha256-x25519   secrets { ike-BF { id-host1 = host1 id-host2 = host2 NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 400 Con gure packet HW o oad if supported by the kernel and hardware, fail if not supported Con gure packet HW o oad if supported by the kernel and hardware, do not fail (perform fallback to crypto or no as necessary) Note NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 401 [start | stop | restart] to control IPsec daemons through strongswan.service . For example, to restart, the command systemctl restart strongswan.service will e ectively do the same thing as ipsec restart NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 402 ID is "pka". Procedure: 1. Perform the following on Left and Right devices (corresponding with the gure under section "IPsec Packet O oad strongSwan Support"). # systemctl start strongswan.service # swanctl --load-all The following should appear. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 403 [ENC] parsed IKE_SA_INIT response 0 [ SA KE No N(NATD_S_IP) N(NATD_D_IP) CERTREQ N(FRAG_SUP) N(HASH_ALG) N(CHDLESS_SUP) N(MULT_AUTH) ] [CFG] selected proposal: IKE:AES_CBC_128/HMAC_SHA2_256_128/PRF_HMAC_SHA2_256/CURVE_25519 [IKE] received 1 cert requests for an unknown ca [IKE] authentication of 'host1' (myself) with pre-shared key NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 404 2. Git clone https://github.com/Mellanox/strongswan.git. 3. Git checkout BF-5.9.10. This branch is based on the o cial strongSwan 5.9.10 branch with added packaging and support for DOCA IPsec plugin (check the NVIDIA DOCA IPsec Security Gateway Application Guide for more information regarding the strongSwan DOCA plugin).
  • Page 405 IPsec Packet O oad and OVS O oad IPsec packet o oad con guration works with and is transparent to OVS o oad. This means all packets from OVS o oad are encrypted by IPsec rules. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 406 OVS o oad and IPsec IPv6 do not work together. OVS IPsec To start the service, run: systemctl start openvswitch-ipsec.service Refer to section "Enabling IPsec Packet O oad" for information to prepare the IPsec packet o oad environment. Con guring IPsec Tunnel NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 407 1. Set up OVS bridges in both hosts. Arm_1 1. On ovs-vsctl add-br ovs-br ovs-vsctl add-port ovs-br $REP ovs-vsctl set Open_vSwitch . other_config:hw- offload=true Arm_2 2. On ovs-vsctl add-br ovs-br ovs-vsctl add-port ovs-br $REP ovs-vsctl set Open_vSwitch . other_config:hw- offload=true NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 408 3. Make sure the MTU of the PF used by tunnel is at least 50 bytes larger than VXLAN- REP MTU. 1. Disable host PF as the port owner from Arm (see section "Zero-trust Mode"). Run: $ mlxprivhost -d /dev/mst/mt41682_pciconf0 -- disable_port_owner r NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 409 Arm_1 1. On , run: # ovs-vsctl add-port ovs-br tun -- \ set interface tun type=gre \ options:local_ip=$ip1 \ options:remote_ip=$ip2 \ options:key=100 \ options:dst_port=1723 \ options:psk=swordfish NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 410 , if on Ubuntu, or /etc/strongswan/swanctl/private host1 , if on CentOS. For example, for mv host1-privkey.pem /etc/swanctl/private other_config 4. Set up OVS on both sides. Arm_1 1. On # ovs-vsctl set Open_vSwitch . other_config:certificate=/etc/swanctl/x509/host1- cert.pem \ NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 411 \ options:remote_ip=$ip2 options:key=100 options:dst_port=4789 \ options:remote_cert=/etc/swanctl/x509/host2-cert.pem # service openvswitch-switch restart Arm_2 2. On # ovs-vsctl add-port ovs-br vxlanp0 -- set interface vxlanp0 type=vxlan options:local_ip=$ip2 \ options:remote_ip=$ip1 options:key=100 options:dst_port=4789 \ options:remote_cert=/etc/swanctl/x509/host1-cert.pem # service openvswitch-switch restart NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 412 , if on Ubuntu, or , if on CentOS. /etc/swanctl/private 3. Move the local private key to , if on Ubuntu, or /etc/strongswan/swanctl/private host1 , if on CentOS. For example, for mv host1-privkey.pem /etc/swanctl/private NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 413 1. On # ovs-vsctl set Open_vSwitch . \ other_config:certificate=/etc/strongswan/swanctl/x509/host other_config:private_key=/etc/strongswan/swanctl/private/h privkey.pem \ other_config:ca_cert=/etc/strongswan/swanctl/x509ca/cacert Arm_2 2. On # ovs-vsctl set Open_vSwitch . \ other_config:certificate=/etc/strongswan/swanctl/x509/host other_config:private_key=/etc/strongswan/swanctl/private/h privkey.pem \ other_config:ca_cert=/etc/strongswan/swanctl/x509ca/cacert 6. Set up the tunnel: Arm_1 1. On NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 414: Troubleshooting

    Ensuring IPsec is Configured Run /opt/mellanox/iproute2/sbin/ip xfrm state show. You should be able to see IPsec states with the keyword "mode packet". Troubleshooting For troubleshooting information, refer to Open vSwitch's official documentation. fTPM over OP-TEE NVIDIA BlueField DPU BSP v4.5.3 LTS
  • Page 415 Protected pseudo-persistent store for unlimited amounts of keys and data Extensive choice of authorization methods to access protected keys and data Platform identities Support for platform privacy Signing and verifying digital signatures Certifying the properties of keys and data NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 416 Emulated TPM using an isolated hardware environment Executes in an open-source trusted execution environment (OP-TEE) fTPM trusted application (TA) is part of the OP-TEE binary. This allows early access on bootup, runs only in secure DRAM. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 417 OS and, when invoked via the TEE Dispatcher, runs to completion. The fTPM TA is the only TA NVIDIA® BlueField®-3 currently supports. Any TA loaded by OP- TEE must be signed (signing done externally) and then authenticated by OP-TEE before being allowed to load and execute.
  • Page 418 The RPMB partition stores data in an authenticated, replay-protected manner, making it a perfect complement to fTPM for storing and protecting data. Enabling OP-TEE on BlueField-3 Enable OP-TEE in the UEFI menu: 1. ESC into the UEFI on DPU boot. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 419 Users can see the OP-TEE version during BlueField-3 DPU boot: The following indicators should all be present if fTPM over OP-TEE is enabled: Check "dmesg" for the OP-TEE driver initializing root@localhost ~] # dmesg | grep tee NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 420 0:00 /usr/sbin/tee-supplicant root         715  0.0  0.0      0     0 ?        I<   14:42   0:00 [optee_bus_scan]   [root@localhost ~] # ps axu | grep tpm NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 421: QoS Configuration

    This section explains how to con gure QoS group and settings using devlink located /opt/mellanox/iproute2/sbin/ under . It is applicable to host PF/VF and Arm side SFs. The following uses VF as example. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 422 <DEV>/<GROUP_NAME> Deletes a QoS group. Syntax DEV/GROUP_NAME Speci es group name in string format Description This command deletes QoS group "12_group" from device "pci/0000:03:00.0": Example devlink port function rate del pci/0000:03:00.0/12_group Notes NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 423 This command displays a mapping between VF devlink ports and netdev names: $ devlink port flavour pcivf In the output of this command, VFs are indicated by Notes devlink port function rate set parent NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 424 [<DEV>/<GROUP_NAME> | <DEV>/<PORT_INDEX>] Displays QoS information for a QoS group or devlink port. Syntax Description DEV/GROUP_NAME Specifies the group name to display DEV/PORT_INDEX Specifies the devlink port to display NVIDIA BlueField DPU BSP v4.5.3 LTS
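A sketch that creates a group, caps it, and attaches a VF devlink port to it (the group name, device, port index, and rates are illustrative; the devlink binary is the one under /opt/mellanox/iproute2/sbin):
$ devlink port function rate add pci/0000:03:00.0/12_group
$ devlink port function rate set pci/0000:03:00.0/12_group tx_share 10gbit tx_max 50gbit
$ devlink port function rate set pci/0000:03:00.0/1 parent 12_group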
  • Page 425: Virtio-Net Emulated Devices

    Virtio-net device emulation enables users to create VirtIO-net emulated PCIe devices in the system where the NVIDIA® BlueField® DPU is connected. This is done by the virtio-net- controller software module present in the DPU. Virtio-net emulated devices allow users to hot plug up to 31 virtio-net PCIe PF Ethernet NIC devices or 504 virtio-net PCIe VF Ethernet NIC devices in the host system where the DPU is plugged in.
  • Page 426 SystemD Service Controller systemd service is enabled by default and runs automatically if VIRTIO_NET_EMULATION_ENABLE is true from mlxcon g. 1. To check controller service status, run: NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 427 Default value is . This port is EMU manager when is 1. ib_dev_lag ib_dev_p0 ib_dev_p1 cannot be con gured simultaneously. ib_dev_for_static_pf mlx5_0 – the RDMA device (e.g., ) which the static virtio PF is created on NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 428 – virtio spec-de ned feature bits for static VFs. If unsure, leave features out of the JSON le and a default value is automatically assigned. vfs_per_pf – number of VFs to create on each PF. This is mandatory if mac_base is speci ed. Note NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 429 Default value is 0. For example, the following de nition has all static PFs using mlx5_0 (port 0) as the data path device in a non-lag con guration:  "ib_dev_p0": "mlx5_0",  "ib_dev_p1": "mlx5_1",  "ib_dev_for_static_pf": "mlx5_0", NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 430  "ib_dev_for_static_pf": "mlx5_bond_0",  "is_lag": 1,  "recovery": 1,  "sf_pool_percent": 0,  "sf_pool_force_destroy": 0 User Frontend To communicate with the service, a user frontend program (virtnet) is installed on the DPU. Run the following command to check its usage: NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 431 To operate a particular device, either the VUID or device index can be used to locate the device. Both attributes can be fetched from command "virtnet list". For example, to modify the MAC of a specific VF, you may run either of the following commands: NVIDIA BlueField DPU BSP v4.5.3 LTS
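For example, a sketch using the PF/VF indices reported by "virtnet list" (the MAC value is illustrative; an equivalent command can address the device by its VUID instead):
$ virtnet modify -p 0 -v 0 device -m 0C:C4:7A:FF:22:98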
  • Page 432 OS: Features Msix_num max_queue_size For example: On the guest OS: $ echo "bdf of virtio-dev" > /sys/bus/pci/drivers/virtio-pci/unbind On the Arm side: $ virtnet modify ... NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 433 These les should not be modi ed under normal circumstances. However, if necessary, advanced users may tune settings to meet their requirements. Users are responsible for the validity of the recovery les and should only perform this when the controller is not running. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 434 --force-all -i virtio-net-controller-x.y.z- Ubuntu/Debian 1.mlnx.aarch64.deb rpm -Uvh virtio-net-controller-x.y.z- CentOS/RedHa 1.mlnx.aarch64.rpm --force It is recommended to use the following command to verify the versions of the controller currently running and the one just installed: NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 435 During the update, all existing commands (e.g., ) are still supported. VF creation/deletion works as well. When the update process completes successfully, the command virtnet update status will re ect the status accordingly. Note NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 436 "migrating" for that speci c device so that user can retry later. VirtIO-net PF Devices This section covers managing virtio-net PCIe PF devices using virtio-net controller. VirtIO-net PF Device Con guration 1. Run the following command on the DPU: $ mlxconfig -d /dev/mst/mt41686_pciconf0 s NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 437 VIRTIO_NET_EMULATION_NUM_VF=0 \ VIRTIO_NET_EMULATION_NUM_PF=0 \ VIRTIO_NET_EMULATION_NUM_MSIX=10 \ SRIOV_EN=0 \ PF_SF_BAR_SIZE=10 \ PF_TOTAL_SF=64 $ mlxconfig -d /dev/mst/mt41686_pciconf0.1 s \ PF_SF_BAR_SIZE=10 \ PF_TOTAL_SF=64 5. Cold reboot the host system a second time. Creating Modern Hotplug VirtIO-net PF Device NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 438 This index is used to query and update device attributes. If the device is created successfully, an output appears similar to the following: "bdf": "85:00.0", "vuid": "VNETS1D0F0", "id": 3, "sf_rep_net_device": "en3f0pf0sf2000", "mac": "0C:C4:7A:FF:22:93" NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 439 5. To modify device attributes, for example, changing its MAC address, run: $ virtnet modify -p 0 device -m 0C:C4:7A:FF:22:98 6. Once usage is complete, to hot-unplug a virtio-net device, run: $ virtnet unplug -p 0 Creating Transitional Hotplug VirtIO-net PF Device NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 440 5. To create a transitional hotplug virtio-net device. Run the following command on the --legacy DPU (with additional $ virtnet hotplug -i mlx5_0 -f 0x0 -m 0C:C4:7A:FF:22:93 -t 1500 -n 3 -s 1024 -l NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 441 NIC. The number of supported hotplug transitional virtio device equals: (allocated I/O port space – 4k) / 4k. Virtio-net SR-IOV VF Devices This section covers managing virtio-net PCIe SR-IOV VF devices using virtio-net-controller. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 442 Virtio-net SR-IOV VF is only supported with statically con gured PF, hot-plugged PF is not currently supported. 1. On the DPU, make sure virtio-net-controller service is enabled so that it starts automatically. Run: systemctl status virtio-net-controller.service NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 443 7. Apply the following con guration on the DPU in three steps to support up to 125 VFs per PF (500 VFs in total). $ mst start && mlxconfig -d /dev/mst/mt41686_pciconf0 s PF_BAR2_ENABLE=0 PER_PF_NUM_SF=1 $ mlxconfig -d /dev/mst/mt41686_pciconf0 s \ NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 444 # lspci | grep -i virtio 85:00.3 Network controller: Red Hat, Inc. Virtio network device 2. On the host, make sure virtio_pci and virtio_net are loaded. Run: # lsmod | grep virtio The net device should be created: NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 445 2 VFs should be created from the host: # lspci | grep -i virt 85:00.3 Network controller: Red Hat, Inc. Virtio network device 85:04.5 Network controller: Red Hat, Inc. Virtio network device NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 446 "net_mtu": 1500 You may use the pci-bdf to match the PF/VF on the host to the information showing on DPU. To query all the device con gurations of the virtio-net device of that VF, run: NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 447 When the command returns from the host OS, it does not necessarily mean the controller nished its operations. Look at controller log from the DPU and make sure you see a log like below before removing virtio kernel modules or recreate VFs. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 448 Transitional virtio-net VF devices are not currently supported. Virtio VF PCIe Devices for vHost Acceleration Virtio VF PCIe devices can be attached to the guest VM using vhost acceleration software stack. This enables performing live migration of guest VMs. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 449 Minimum hypervisor kernel version – Linux kernel 5.7 (for VFIO SR-IOV support) Install vHost Acceleration Software Stack Vhost acceleration software stack is built using open-source BSD licensed DPDK. To install vhost acceleration software: 1. Clone the software source code. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 450 -y numactl-devel libev-devel meson build -Dexamples=vdpa ninja -C build To install QEMU: Info Upstream QEMU later then 8.1 can be used or the following QEMU. 1. Clone QEMU sources. git clone https: //github.com/Mellanox/qemu -b stable-8.1-presetup Info NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 451 3. Setup the hypervisor system: 1. Con gure hugepages and libvirt VM XML (see OVS Hardware O oads Con guration for information on doing that). 2. Add a virtio-net interface and a virtio-blk interface in VM XML. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 452 4. Create block device on the DPU: spdk_rpc.py bdev_null_create Null0 1024 512 snap_rpc.py controller_virtio_blk_create --pf_id 0 -- bdev_type spdk mlx5_0 --bdev Null0 --num_queues 1 --admin_q - -force_in_order 5. On BlueField-3 SNAP: spdk_rpc.py bdev_null_create Null0 1024 512 NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 453 2. Enable SR-IOV and create a VF(s): echo 1 > /sys/bus/pci/devices/0000:af:00.2/sriov_numvfs echo 1 > /sys/bus/pci/devices/0000:af:00.3/sriov_numvfs   lspci | grep Virtio af:00.2 Ethernet controller: Red Hat, Inc. Virtio network device af:00.3 Non-Volatile memory controller: Red Hat, Inc. Virtio block device NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 454 # Wait on virtio-net-controller finishing handle PF FLR   # On DPU, change VF MAC address or other device options virtnet modify -p 0 -v 0 device -m 00:00:00:00:33:00 python ./app/vfe-vdpa/vhostmgmt vf -a 0000:af:04.5 -v /tmp/vfe-net0 NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 455 --pf_id 0 -- vf_id 0 --bdev_type spdk --bdev Null0 --force_in_order python ./app/vfe-vdpa/vhostmgmt vf -a 0000:af:05.1 -v /tmp/vfe-blk0 Note If the SR-IOV is disabled and reenabled, the user must re- provision the VFs. Start the VM NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 456: Shared Rq Mode

    When creating 1 send queue (SQ) and 1 receive queue (RQ), each representor consumes ~3MB memory per single channel. Scaling this to the desired 1024 representors (SFs and/or VFs) would require ~3GB worth of memory for single channel. A major chunk of the NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 457 The following behavior is observed in shared RQ mode: rx_bytes rx_packets It is expected to see a in the and valid vport_rx_packets vport_rx_bytes after running tra c. Example output: # ethtool -S pf0hpf NIC statistics: rx_packets: 0 rx_bytes: 0 NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 458: Regex Acceleration

    RX side using ethtool. Changing channels also only a ects the TX side. RegEx Acceleration NVIDIA® BlueField® DPU supports high-speed RegEx acceleration. This allows the host to o oad multiple RegEx jobs to the DPU. This feature can be used from the host or from the Arm side.
  • Page 459 1 > /sys/bus/pci/devices/0000\:03\:00.0/regex/pf/regex_en NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 460: Upgrading Boot Software

    Trusted Boot Firmware BL2 certi cate BL1/BL2R Trusted Boot Firmware BL2 BL1/BL2R trusted-key-cert Trusted key certi cate bl31-key-cert EL3 Runtime Firmware BL3-1 key certi cate bl31-cert EL3 Runtime Firmware BL3-1 certi cate 13 BL2 NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 461 BL2 certi cate are read by BL2R. Thus, the BL2 image and certi cate are read by BL1. BL2R is not booted in BlueField-1 devices. Before explaining the implementation of the solution, the BlueField boot process needs to be expanded upon. BlueField Boot Process NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 462 In most deployments, the Arm cores of BlueField are expected to obtain their bootloader from an on-board eMMC device. Even in environments where the final OS kernel is not kept on eMMC—for instance, systems which boot over a network—the initial booter code still comes from the eMMC. NVIDIA BlueField DPU BSP v4.5.3 LTS
  • Page 463 ROM. It is executed when the device is reset. bl2r.bi The secure rmware (RIoT core) image. This image provides support for crypto operation and calculating measurements for security attestation and is relevant NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 464 "bfrec" uses the default boot le /lib/firmware/mellanox/boot/default.bfb to update the boot partitions of /dev/mmcblk0 device . This might be done directly in an OS using the "mlxbf-bootctl" utility, or at a later stage after reset using the capsule interface. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 465 For example, if the new bootstream le which we would like to install and validate is called newdefault.bfb , download the le to the BlueField and update the eMMC boot partitions by executing the following commands from the BlueField console: NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 466 $ /opt/mellanox/scripts/bfver mlxbf-bootctl mlxbf-bootctl It is also possible to update the eMMC boot partitions directly with the /sbin tool. The tool is shipped as part of the software image (under ) and the sources NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 467 --bootstream (used with " ") – specify a file to which to write the boot partition data (creating it if necessary), rather than using an existing master device and deriving the boot partition device NVIDIA BlueField DPU BSP v4.5.3 LTS
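A sketch of programming the eMMC boot partitions directly with the tool, reusing the device and file names from the earlier example (only options described here are used; exact flags may vary between releases):
$ mlxbf-bootctl --device /dev/mmcblk0 --bootstream newdefault.bfb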
  • Page 468 In the example above, 60 seconds are allowed from system reset until the Linux watchdog kernel driver is loaded. At that point, the user’s application may open /dev/watchdog explicitly, NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 469 (Linux Vendor Firmware Service). LVFS is a free service operated by the Linux Foundation, which allows vendors to host stable rmware images for easy download and installation. Note The DPU must have a functioning connection to the Internet. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 470 UEFI capsule update, without upgrading the root file system. If your system is already at the latest available version, this upgrade command will do nothing. 4. Reboot the DPU to complete the upgrade. NVIDIA BlueField DPU BSP v4.5.3 LTS
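A sketch of the standard fwupd client flow on the DPU, assuming the BlueField firmware is published on LVFS and the fwupd packages are installed:
$ fwupdmgr refresh        # pull the latest metadata from LVFS
$ fwupdmgr get-updates    # list updates applicable to the DPU devices
$ fwupdmgr update         # stage the update; it is applied via UEFI capsule on the next reboot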
  • Page 471 The bootloader images are embedded within the BSD under . It is also possible to build the binary images from sources. Please refer to the following sections for further details. 1. First, set the PATH variable: $ export PATH=$PATH:<BF_INST_DIR>/bin NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 472: Boot Option

    The UEFI boot manager can be con gured; boot entries may be added or removed from the boot menu. The UEFI rmware can also e ectively generate entries in this boot menu, according to the available network interfaces and possibly the disks attached to the system. Boot Option NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 473 List UEFI Boot Options To display the boot option already installed in the NVIDIA® BlueField® system, reboot and go to the UEFI menu screen. To get to the UEFI menu, hit Esc when prompted (in the RShim or UART console) before the countdown timer runs out.
  • Page 474 It is also possible to retrieve more details about the boot entries. To do so, select "EFI Internal Shell" entry from the Boot Manager screen. UEFI Interactive Shell v2.1 EDK II UEFI v2.50 (EDK II, 0x00010000) Mapping table FS1: Alias(s):F1: NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 475 00000010: 74 00 74 00 79 00 41 00-4D 00 41 00 30 00 20 00 *t.t.y.A.M.A.0. .* 00000020: 65 00 61 00 72 00 6C 00-79 00 63 00 6F 00 6E 00 *e.a.r.l.y.c.o.n.* NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 476 00000070: 6C 00 6B 00 30 00 70 00-32 00 20 00 72 00 6F 00 *l.k.0.p.2. .r.o.* 00000080: 6F 00 74 00 77 00 61 00-69 00 74 00 *o.t.w.a.i.t.* Option: 02. Variable: Boot0003 Desc - EFI Misc Device DevPath - VenHw(8C91E049-9BF9-440E-BBAD-7DC5FC082C02) NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 477 Option: 07. Variable: Boot0008 Desc - EFI Internal Shell DevPath - MemoryMapped(0xB,0xFE5FE000,0xFEAE357F)/FvFile(7C04A583-9E3E- 4F1C-AD65-E05268D0B4D1) Optional- N Note Boot arguments are printed in Hex mode, but you may recognize the boot parameters printed on the side in ASCII format. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 478 Reset MFG Info – clears the manufacturing information Note All the above options, except for password and the two reset options, are also programmatically con gurable via the BlueField Linux /etc/bf.cfg . Refer to section "bf.cfg Parameters" for further information. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 479 NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 480: Troubleshooting And How-Tos

    RShim Troubleshooting and How-Tos Another backend already attached Several generations of NVIDIA® BlueField® networking platforms (DPUs or SuperNICs) are equipped with a USB interface in which RShim can be routed, via USB cable, to an external host running Linux and the RShim driver.
  • Page 481 Product Name: BlueField-2 DPU 25GbE Dual-Port SFP56, integrated BMC, Crypto and Secure Boot Enabled, 16GB on-board DDR, 1GbE OOB management, Tall Bracket, FHHL If your BlueField has an integrated BMC, refer to RShim driver not loading on host with integrated BMC. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 482 4. Restart RShim service. Run: sudo systemctl restart rshim If RShim service does not launch automatically, run: sudo systemctl status rshim active (running) This command is expected to display " ". 5. Display the current setting. Run: NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 483 2. Delete RShim on the host. Run: systemctl stop rshim systemctl disable rshim 3. Enable RShim on the BMC. Run: systemctl enable rshim systemctl start rshim 4. Display the current setting. Run: # cat /dev/rshim<N>/misc | grep DEV_NAME NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 484 For Ubuntu/Debian, run: sudo dpkg --force-all -i rshim-<version>.deb For RHEL/CentOS, run: sudo rpm -Uhv rshim-<version>.rpm 3. Restart RShim service. Run: sudo systemctl restart rshim If RShim service does not launch automatically, run: sudo systemctl status rshim NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 485 The product name is supposed to show "integrated BMC" . 2. Access the BMC via the RJ45 management port of BlueField. 3. Delete RShim on the BMC: systemctl stop rshim systemctl disable rshim 4. Enable RShim on the host: NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 486 How to support multiple BlueField devices on the host For more information, refer to section "RShim Multiple Board Support". BFB installation monitoring The BFB installation ow can be traced using various interfaces: From the host: /dev/rshim0/console RShim console ( NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 487: Connectivity Troubleshooting

    Follow this procedure: 1. Connect the UART cable to a USB socket, and nd it in your USB devices. sudo lsusb Bus 002 Device 003: ID 0403:6001 Future Technology Devices International, Ltd FT232 Serial (UART) IC NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 488 2. Install the minicom application. For CentOS/RHEL: sudo yum install minicom -y For Ubuntu/Debian: sudo apt-get install minicom 3. Open the minicom application. sudo minicom -s -c on 4. Go to "Serial port setup" NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 489 : 115200 8N1 | F - Hardware Flow Control : No | G - Software Flow Control : No Change which setting? +-------------------------------------------------------- ---------------+ Driver not loading in host server What this looks like in dmsg: NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 490 The driver is loaded in the BlueField (Arm) The Arm is booted into OS The Arm is not in UEFI Boot Menu The Arm is not hanged Then: 1. Perform a graceful shutdown and a power cycle on the host server. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 491 3. If this problem still persists, please make sure to install the latest bfb image and then restart the driver in host server. Please refer to "Upgrading NVIDIA BlueField DPU Software" for more information. No connectivity between network interfaces of source host to destination device Verify that the bridge is con gured properly on the Arm side.
  • Page 492: Performance Troubleshooting

    Please check that the cables are connected properly into the network ports of the DPU and the peer device. Performance Troubleshooting Degradation in performance Degradation in performance indicates that openvswitch may not be offloaded. Verify offload state. Run: # ovs-vsctl get Open_vSwitch . other_config:hw-offload NVIDIA BlueField DPU BSP v4.5.3 LTS
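If the query returns "false" (or the key is missing), a sketch of re-enabling offload, consistent with the OVS configuration used elsewhere in this document:
$ ovs-vsctl set Open_vSwitch . other_config:hw-offload=true
$ systemctl restart openvswitch-switch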
  • Page 493: Pcie Troubleshooting And How-Tos

    To verify how much power is supported on your host's PCIe slots, run the command lspci -vvv | grep PowerLimit . For example: # lspci -vvv | grep PowerLimit Slot #6, PowerLimit 75.000W; Interlock- NoCompl- NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 494 Be aware that this command is not supported by all host vendors/types. HowTo update PCIe device description lspci may not present the full description for the NVIDIA PCIe devices connected to your host. For example: # lspci | grep -i Mellanox a3:00.0 Infiniband controller: Mellanox Technologies Device a2d6...
  • Page 495: Sr-Iov Troubleshooting

    2. Verify is true and bigger than 1. Run: # mlxconfig -d /dev/mst/mt41686_pciconf0 -e q |grep -i "SRIOV_EN\|num_of_vf" Configurations: Default Current Next Boot NUM_OF_VFS SRIOV_EN True(1) True(1) True(1) 3. Verify that GRUB_CMDLINE_LINUX="iommu=pt intel_iommu=on pci=assign-busses" NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 496 3. Verify VF con guration. Run: $ ovs-vsctl show bb993992-7930-4dd2-bc14-73514854b024 Bridge ovsbr1 Port pf0vf0 Interface pf0vf0 type: internal Port pf0hpf Interface pf0hpf Port pf0sf0 Interface pf0sf0 Port p0 Interface p0 Bridge ovsbr2 Port ovsbr2 Interface ovsbr2 type: internal NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 497: Eswitch Troubleshooting

    # /opt/mellanox/iproute2/sbin/rdma link | grep -i up link mlx5_0/2 state ACTIVE physical_state LINK_UP netdev pf0vf0 link mlx5_1/2 state ACTIVE physical_state LINK_UP netdev pf1vf0 If any VFs are con gured, destroy them by running: # echo 0 > /sys/class/infiniband/mlx5_0/device/mlx5_num_vfs NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 498 RDMA dev: mlx5_2   SF Index: pci/0000:03:00.1/294944 Parent PCI dev: 0000:03:00.1 Representor netdev: en3f1pf1sf0 Function HWADDR: 02:30:13:6a:2d:2c Auxiliary device: mlx5_core.sf.3 netdev: enp3s0f1s0 RDMA dev: mlx5_3 Pay attention to the SF Index values. For example: NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 499   NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 500 Please configure the DPU to work in switchdev mode. Run: devlink dev eswitch set pci/0000:03:00.<0|1> mode switchdev Check if you are working in separated mode: # mlxconfig -d /dev/mst/mt41686_pciconf0 q | grep -i cpu * INTERNAL_CPU_MODEL SEPERATED_HOST(0) NVIDIA BlueField DPU BSP v4.5.3 LTS
  • Page 501: Isolated Mode Troubleshooting And How-Tos

    Ensure that the DPU is placed correctly Make sure the DPU slot and the DPU are compatible Install the DPU in a di erent PCI Express slot Use the drivers that came with the DPU or download the latest NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 502: Installation Troubleshooting And How-Tos

    Link light is on but no communication is established Check that the latest driver is loaded Check that both the DPU and its link are set to the same speed and duplex settings Installation Troubleshooting and How- NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 503 # Enable eMMC boot partition protection. #SYS_BOOT_PROTECT = FALSE   # Enable SPCR table in ACPI. #SYS_ENABLE_SPCR = FALSE   # Disable PCIe in ACPI. #SYS_DISABLE_PCIE = FALSE   # Enable OP-TEE in ACPI. #SYS_ENABLE_OPTEE = FALSE   ################################################################### NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 504 # DHCP class identifier for PXE (arbitrary string up to 32 characters) #PXE_DHCP_CLASS_ID = NVIDIA/BF/PXE   # Create dual boot partition scheme (Ubuntu only) # DUAL_BOOT=yes   # Upgrade NIC firmware # WITH_NIC_FW_UPDATE=yes   NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 505 If the .bfb le cannot recognize the BlueField board type, it reverts to low core operation. The following message will be printed on your screen: ***System type can't be determined*** ***Booting as a minimal system*** NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 506 Please contact NVIDIA Support if this occurs. Unable to load BL2, BL2R, or PSC image The following errors appear in console if images are corrupted or not signed properly: Device Error ERROR: Failed to load BL2 firmware BlueField ERROR: Failed to load BL2R firmware...
  • Page 507 How to upgrade the host RShim driver <BF_INST_DIR>/src/drivers/rshim/README See the readme at How to upgrade the boot partition (ATF & UEFI) without re-installation 1. Boot the target through the RShim interface from a host machine: NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 508 UEFI boot menu and use the arrows to select the menu option. It could take 1-2 minutes to enter the Boot Manager depending on how many devices are installed or whether the EXPROM is programmed or not. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 509 This allows the Linux kernel on BlueField to be debugged over the serial port. A single serial port cannot be used both as a console and by KGDB at the same time. It is /dev/rshim0/console recommended to use the RShim for console access ( ) and the NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 510 <BF_INST_DIR>/sdk/sysroots/x86_64-pokysdk-linux/usr/bin/aarch64- poky-linux/aarch64-poky-linux-gdb <BF_INST_DIR>/sample/vmlinux   (gdb) target remote /dev/ttyUSB3 Remote debugging using /dev/ttyUSB3 arch_kgdb_breakpoint () at /labhome/dwoods/src/bf/linux/arch/arm64/include/asm/kgdb.h:32 asm ("brk %0" : : "I" (KGDB_COMPILED_DBG_BRK_IMM)); (gdb) NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 511 How to change the default console of the install image On UART0: $ echo "console=ttyAMA0 earlycon=pl011,0x01000000 initrd=initramfs" > bootarg $ <BF_INST_DIR>/bin/mlx-mkbfb --boot-args bootarg \ <BF_INST_DIR>/sample/ install.bfb On UART1: $ echo "console=ttyAMA1 earlycon=pl011,0x01000000 initrd=initramfs" > bootarg $ <BF_INST_DIR>/bin/mlx-mkbfb --boot-args bootarg \ NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 512 /var/lib/cloud/seed/nocloud-net/network-config The default content of follows: # cat /var/lib/cloud/seed/nocloud-net/network-config version: 2 renderer: NetworkManager ethernets: tmfifo_net0: dhcp4: false addresses: - 192.168.100.2/30 nameservers: addresses: [ 192.168.100.1 ] routes: - to: 0.0.0.0/0 via: 192.168.100.1 metric: 1025 oob_net0: NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 513 Sanitizing DPU eMMC and SSD Storage During the BFB installation process, DPU storage can be securely sanitized either using shred nvme bf.cfg or the utilities in the con guration le as illustrated in the following subsections. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 514 /tmp/sanitize.emmc.log 2>&1 if [ -e /dev/nvme0n1 ]; then echo Sanitizing /dev/nvme0n1 | tee /dev/kmsg echo Sanitizing /dev/nvme0n1 > /tmp/sanitize.ssd.log nvme sanitize /dev/nvme0n1 -a 2 >> /tmp/sanitize.ssd.log 2>&1 nvme sanitize-log /dev/nvme0n1 >> /tmp/sanitize.ssd.log 2>&1 SANITIZE_DONE=1 NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 515 [ -e /dev/mmcblk0 ]; then echo Sanitizing /dev/mmcblk0 | tee /dev/kmsg echo Sanitizing /dev/mmcblk0 > /tmp/sanitize.emmc.log mmc sanitize /dev/mmcblk0 >> /tmp/sanitize.emmc.log 2>&1 if [ -e /dev/nvme0n1 ]; then echo Sanitizing /dev/nvme0n1 | tee /dev/kmsg NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 516 Graceful shutdown of the Arm OS ensures that data within the eMMC/NVMe cache is properly written to storage, and helps prevent lesystem inconsistencies and le corruption. There are several ways to gracefully shutdown the DPU Arm OS: NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 517 2. After DPU Arm OS shutdown, it is recommended to issue DPU Arm OS state query which indicates whether DPU Arm OS shutdown has completed ( standby indication). This can be done by issuing the Get Smart NIC OS State NC-SI OEM command. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 518: Windows Support

    Verifying RShim Drivers Installation 1. Open the Device Manager when no drivers are installed to make sure a new PCIe device is available as below. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 519 RShim drivers can be connected via PCIe (the drivers we are providing) or via USB (external connection), but not both at the same time. So when the bus driver detects that an external USB is already attached, NVIDIA BlueField DPU BSP v4.5.3 LTS
  • Page 520 3. Run the following command in order to know what to set the "Serial line" eld to: C:\Users\username\Desktop> reg query HKLM\HARDWARE\DEVICEMAP\SERIALCOMM | findstr MlxRshim \MlxRshim\COM3 REG-SZ COM3 In this case use COM3. This name can also be found via Device Manager under "Ports (Com & LPT)". NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 521 4. Press Open and hit Enter. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 522 To access via BlueField management network adapter, con gure an IP address as shown in the example below and run a ping test to con rm con guration. NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 523 The network address of the CA-FE-01-CA-FE- e325-11ce-bfc1- device. The format for a MAC 02 (default) 08002be10318}\ address is: XX-XX-XX-XX-XX-XX. <nn>\*NetworkAddress HKLM\SYSTEM\CurrentControlS The number of receive descriptors 16 – 64 (Default) et\Control\Class\{4d36e972- used by the miniport adapter. e325-11ce-bfc1- NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 524 Push a boot stream le ( ). A BFB le is a generated BlueField boot stream le that contains Linux operating system image that runs on BlueField. BFB les can be downloaded from the NVIDIA DOCA SDK webpage. RshimCmd -RestartSmartNic <Option> -BusNum <BusNum> Usage...
  • Page 525 To include the le into the BFB installation, append the le to BFB le as described below: 1. Copy the BFB le to a local folder. For example: Copy <path>\DOCA_1. .0_BSP_3. .2_Ubuntu_20. .bfb 5.20220707 c:\bf\MlnxBootImage.bfb NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 526 The default location to the trace is at %SystemRoot%\system32\LogFiles\Mlnx\Mellanox-WinOF2-System.etl The following are the Event logs RShim drivers generate: MlxRShimBus Driver Even Severit Message t ID Inform RShim Bus driver loaded successfully ational Inform Device successfully stopped ational NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 527 Informatio Device is successfully stopped Warning Value read from registry is invalid. Therefore use the default value. Error SmartNIC seems stuck as transmit packets are not being drained. Informatio RShim Ethernet driver loaded successfully NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 528: Document Revision History

    Section "Enabling IPsec Packet O oad" Section "Setting IPSec Packet O oad Using strongSwan" Section "Running strongSwan Example" Section "Building strongSwan" Section "IPsec Packet O oad and OVS O oad" Rev 4.2.2 – October 24, 2023 NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 529 Section "Unable to load BL2, BL2R, or PSC image" Updated: Section "Default Ports and OVS Con guration" with new step 2 gpio-mlxbf3 mlxbf-ptm pwr-mlxbf Section "BlueField Linux Drivers" with pinctrl-mlxbf Page "Updating DPU Software Packages Using Standard Linux Tools" NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 530 Section "Disabling Host Networking PFs" by adding instructions for reactivating host networking for single-port DPUs Section "Con guring RegEx Acceleration on BlueField-2" Section "Virtio-net SR-IOV VF Device Con guration" PXE_DHCP_CLASS_ID in section "bf.cfg Parameters" NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 531 Section "Enrolling Certi cates Using Capsule" Section "NIC Mode" with supported MLNX_OFED versions Section "PKA Use Cases" with support for OpenSSL version 3.0.2 Rev 3.9 – May 03, 2022 Added: Section "GRUB Password Protection" NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 532 Section "Building Your Own BFB Installation Image" Section "Con guring VXLAN Tunnel" Step 2 in section "Prerequisites" Section "Enabling IPsec Full O oad" Code block under step 1 in section "LAG Con guration" Rev 3.8.5 – January 19, 2022 NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 533 Added: Section "Another backend already attached" Updated: Section "Ensure RShim Running on Host" NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 534: Legal Notices And 3Rd Party Licenses

    23.10-7 Link BlueField BMC 3 Party Unify Notice Virtio Network Controller Link Link Virtio Network Controller 3 Party Notice 1.7.13 Virtio Network Controller 3 Party Unify Link Notice MLNX LibSnap and virtio-blk 1.6.0-1 Link NVIDIA BlueField DPU BSP v4.5.3 LTS...
  • Page 535 NVIDIA accepts no liability related to any default, damage, costs, or problem which may be based on or attributable to: (i) the use of the NVIDIA product in any manner that is contrary to this document or (ii) customer product designs. No license, either expressed or implied, is granted under any NVIDIA patent right, copyright, NVIDIA BlueField DPU BSP v4.5.3 LTS
  • Page 536 NVIDIA under the patents or other intellectual property rights of NVIDIA. Reproduction of information in this document is permissible only if approved in advance by NVIDIA in writing, reproduced without alteration and in full compliance with all applicable export laws and regulations, and accompanied by all associated conditions, limitations, and notices. THIS NVIDIA BlueField DPU BSP v4.5.3 LTS
