OFED+ Host Software
Release 1.5.4
User Guide
IB0054606-02 A

Summary of Contents for QLogic OFED+ Host

  • Page 1 OFED+ Host Software Release 1.5.4 User Guide IB0054606-02 A...
  • Page 2: Document Revision History

    QLogic Corporation reserves the right to change product specifications at any time without notice. Applications described in this document for any of these products are for illustrative purposes only. QLogic Corporation makes no representation or warranty that such applications are suitable for the specified use without further testing or modification.
  • Page 3: Table Of Contents

    Table of Contents Preface Intended Audience ..........Related Materials .
  • Page 4 Subnet Manager Configuration ........3-10 QLogic Distributed Subnet Administration ......3-12 Applications that use Distributed SA .
  • Page 5 Introduction..........MPIs Packaged with QLogic OFED+ ......
  • Page 6 OFED+ Host Software Release 1.5.4 User Guide Debugging MPI Programs ........4-22 MPI Errors .
  • Page 7 Running programs without using shmemrun ....QLogic SHMEM Relationship with MPI ......
  • Page 8 QLogic SRP Configuration ........
  • Page 9 OFED+ Host Software Release 1.5.4 User Guide Configuring SRP for Native IB Storage ......B-21 Notes ..........B-23 Additional Details .
  • Page 10 OFED+ Host Software Release 1.5.4 User Guide Open MPI Troubleshooting ........D-12 Invalid Configuration Warning .
  • Page 11 OFED+ Host Software Release 1.5.4 User Guide iba_packet_capture....G-21 ibhosts ..... G-22 ibstatus.
  • Page 12 QLogic OFED+ Software Structure ........
  • Page 13 QLogic SHMEM all-to-all benchmark options ......6-29 QLogic SHMEM barrier benchmark options....... 6-30 QLogic SHMEM reduce benchmark options .
  • Page 15: Preface

    Preface. The QLogic OFED+ Host Software User Guide shows end users how to use the installed software to set up the fabric. End users include both the cluster administrator and the Message-Passing Interface (MPI) application programmers, who have different but overlapping interests in the details of the technology.
  • Page 16: License Agreements

    License Agreements Refer to the QLogic Software End User License Agreement for a complete listing of all license agreements affecting this product. IB0054606-02 A...
  • Page 17: Technical Support

    (IB), and Fibre Channel products. From the main QLogic web page at www.qlogic.com, click the Support tab at the top, and then click Training and Certification on the left. The QLogic Global Training portal offers online courses, certification exams, and scheduling of in-person training.
  • Page 18: Knowledge Database

    Technical Support Knowledge Database The QLogic knowledge database is an extensive collection of QLogic product information that you can search for specific solutions. We are constantly adding to the collection of information in our database to provide answers to your most urgent questions.
  • Page 19: Introduction

    Introduction. How this Guide is Organized: The QLogic OFED+ Host Software User Guide is organized into these sections: Section 1 provides an overview and describes interoperability. Section 2 describes how to set up your cluster to run high-performance MPI jobs.
  • Page 20: Overview

    Fabric Software Installation Guide contains information on QLogic software installation. Overview: The material in this documentation pertains to a QLogic OFED+ cluster. A cluster is defined as a collection of nodes, each attached to an InfiniBand®-based fabric...
  • Page 21: Interoperability

    QLogic offers the QLogic Embedded Fabric Manager (FM) for both DDR and QDR switch product lines supplied by your IB switch vendor.  A host-based subnet manager can be used. QLogic provides the QLogic Fabric Manager (FM), as a part of the QLogic InfiniBand Fabric Suite (IFS).
  • Page 23: Step-By-Step Cluster Setup And Mpi Usage Checklists

    QLogic InfiniBand® Adapter Hardware Installation Guide, and software installation and driver configuration has been completed according to the instructions in the QLogic InfiniBand® Fabric Software Installation Guide. To minimize management problems, the compute nodes of the cluster must have very similar hardware configurations and identical software installations.
  • Page 24: Intel

    “Checking Cluster and Software Status” on page 3-44. Using MPI Verify that the QLogic hardware and software has been installed on all the nodes you will be using, and that ssh is set up on your cluster (see all the steps in the Cluster Setup checklist).
  • Page 25: Cluster Setup

    The IB driver ib_qib, QLogic Performance Scaled Messaging (PSM), accelerated Message-Passing Interface (MPI) stack, the protocol and MPI support libraries, and other modules are components of the QLogic OFED+ software. This software provides the foundation that supports the MPI implementation.
  • Page 26: Installed Layout

    License information is found only in /usr/share/doc/infinipath. QLogic OFED+ Host Software user documentation can be found on the QLogic web site on the software download page for your distribution. Configuration files are found in /etc/sysconfig. Init scripts are found in /etc/init.d...
  • Page 27: Ib And Openfabrics Driver Overview

    OpenSM. This component is disabled at startup. QLogic recommends using  the QLogic Fabric Manager (FM), which is included with the IFS or optionally available within the QLogic switches. QLogic FM or OpenSM can be installed on one or more nodes with only one node being the master SM.
  • Page 28 3–InfiniBand Cluster Setup and Administration ® IPoIB Network Interface Configuration This example assumes that no hosts files exist, the host being configured has the IP address 10.1.17.3, and DHCP is not used. NOTE Instructions are only for this static IP address case. Configuration methods for using DHCP will be supplied in a later release.
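    As an illustration only (not text from the guide), a static interface file of the usual Red Hat form for this case might look as follows; the device name, netmask, and broadcast values are placeholders chosen around the 10.1.17.3 example address:
        # /etc/sysconfig/network-scripts/ifcfg-ib0  (illustrative sketch)
        DEVICE=ib0
        BOOTPROTO=static
        IPADDR=10.1.17.3
        NETMASK=255.255.255.0
        ONBOOT=yes
    Bring the interface up with "ifup ib0" (or restart the network service) after creating the file.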
  • Page 29: Ipoib Administration

    Fabric Software Installation Guide for more information on using the QLogic IFS Installer TUI. Refer to the QLogic FastFabric User Guide for more information on using FastFabric. To configure the IPoIB driver from the command line, use the following commands.
  • Page 30: Ib Bonding

    Linux Ethernet Bonding Driver and was adopted to work with IPoIB. Support for IPoIB interfaces is only for the active-backup mode; other modes should not be used. QLogic supports bonding across HCA ports and bonding port 1 and port 2 on the same HCA.
  • Page 31: Red Hat El5 And El6

    3–InfiniBand® Cluster Setup and Administration, IB Bonding: Red Hat EL5 and EL6. The following is an example for bond0 (master). The file is named /etc/sysconfig/network-scripts/ifcfg-bond0:
        DEVICE=bond0
        IPADDR=192.168.1.1
        NETMASK=255.255.255.0
        NETWORK=192.168.1.0
        BROADCAST=192.168.1.255
        ONBOOT=yes
        BOOTPROTO=none
        USERCTL=no
        MTU=65520
        BONDING_OPTS="primary=ib0 updelay=0 downdelay=0"
    The following is an example for ib0 (slave). The file is named /etc/sysconfig/network-scripts/ifcfg-ib0:
        DEVICE=ib0
        USERCTL=no...
  • Page 32: Suse Linux Enterprise Server (Sles) 10 And 11

    3–InfiniBand® Cluster Setup and Administration, IB Bonding: SuSE Linux Enterprise Server (SLES) 10 and 11. The following is an example for bond0 (master). The file is named /etc/sysconfig/network-scripts/ifcfg-bond0:
        DEVICE="bond0"
        TYPE="Bonding"
        IPADDR="192.168.1.1"
        NETMASK="255.255.255.0"
        NETWORK="192.168.1.0"
        BROADCAST="192.168.1.255"
        BOOTPROTO="static"
        USERCTL="no"
        STARTMODE="onboot"
        BONDING_MASTER="yes"
        BONDING_MODULE_OPTS="mode=active-backup miimon=100 primary=ib0 updelay=0 downdelay=0"...
  • Page 33: Verify Ib Bonding Is Configured

    3–InfiniBand® Cluster Setup and Administration, IB Bonding. Verify the following line is set to the value of yes in /etc/sysconfig/boot: RUN_PARALLEL="yes". Verify IB Bonding is Configured: After the configuration scripts are updated, and the service network is restarted or the server is rebooted, use the following CLI commands to verify that IB bonding is configured.
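    Typical verification commands (standard Linux tools, not quoted from the guide; bond0 and ib0 are the interface names from the examples above):
        # cat /proc/net/bonding/bond0    # shows the bonding mode, the currently active slave, and per-slave link status
        # ifconfig bond0                 # confirms the IP address and MTU on the bond master
        # ifconfig ib0                   # the slave should show UP and be enslaved to bond0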
  • Page 34: Subnet Manager Configuration

    RX bytes:141223648 (134.6 Mb) TX bytes:147950000 (141.0 Mb) Subnet Manager Configuration QLogic recommends using the QLogic Fabric Manager to manage your fabric. Refer to the QLogic Fabric Manager User Guide for information on configuring the QLogic Fabric Manager. 3-10 IB0054606-02 A...
  • Page 35 You cannot use OpenSM if any of your IB switches provide a subnet manager, or if you are running a host-based SM, for example the QLogic Fabric Manager.
  • Page 36: Qlogic Distributed Subnet Administration

    Applications that use Distributed SA The QLogic PSM Library has been extended to take advantage of the Distributed SA. Therefore, all MPIs that use the QLogic PSM library can take advantage of the Distributed SA. Other applications must be modified specifically to take advantage of it.
  • Page 37: Virtual Fabrics And The Distributed Sa

    Virtual Fabrics and the Distributed SA The IBTA standard states that applications can be identified by a Service ID (SID). The QLogic Fabric Manager uses SIDs to identify applications. One or more applications can be associated with a Virtual Fabric using the SID. The Distributed SA is designed to be aware of Virtual Fabrics, but to only store records for those Virtual Fabrics that match the SIDs in the Distributed SA's configuration file.
  • Page 38: Multiple Virtual Fabrics Example

    ® QLogic Distributed Subnet Administration If you are using the QLogic Fabric Manager in its default configuration, and you are using the standard QLogic PSM SIDs, this arrangement will work fine and you will not need to modify the Distributed SA's configuration file - but notice that the Distributed SA has restricted the range of SIDs it cares about to those that were defined in its configuration file.
  • Page 39: Virtual Fabrics With Overlapping Definitions

    3–InfiniBand Cluster Setup and Administration ® QLogic Distributed Subnet Administration Figure 3-4. Distributed SA Multiple Virtual Fabrics Configured Example Virtual Fabrics with Overlapping Definitions As defined, SIDs should never be shared between Virtual Fabrics. Unfortunately, it is very easy to accidentally create such overlaps.
  • Page 40: Virtual Fabrics With Psm_Mpi Virtual Fabric Enabled

    3–InfiniBand® Cluster Setup and Administration, QLogic Distributed Subnet Administration. Figure 3-6. Virtual Fabrics with PSM_MPI Virtual Fabric Enabled. In Figure 3-6, the administrator enabled the “PSM_MPI” fabric, and then added a new “Reserved” fabric that uses one of the SID ranges that “PSM_MPI” uses.
  • Page 41: Distributed Sa Configuration File

    Second, the Distributed SA handles overlaps by taking advantage of the fact that Virtual Fabrics have unique numeric indexes. These indexes are assigned by the QLogic Fabric Manager in the order in which the Virtual Fabrics appear in the configuration file. These indexes can be seen by using the iba_saquery -o vfinfo command.
  • Page 42: Sid

    The SIDs identify applications which will use the Distributed SA to determine their path records. The default configuration for the Distributed SA includes all the SIDs defined in the default QLogic Fabric Manager configuration for use by MPI.
  • Page 43: Dbg

    Generally, this will produce too much information for normal use. (Includes Dbg=5)  Dbg=7: Debugging This should only be turned on at the request of QLogic Support. This will generate so much information that system operation will be impacted. (Includes Dbg=6) Other Settings The remaining configuration settings for the Distributed SA are generally only useful in special circumstances and are not needed in normal operation.
  • Page 44: Changing The Mtu Size

    3–InfiniBand Cluster Setup and Administration ® Changing the MTU Size Changing the MTU Size The Maximum Transfer Unit (MTU) size enabled by the IB HCA and set by the driver is 4KB. To see the current MTU size, and the maximum supported by the adapter, type the command: $ ibv_devinfo If the switches are set at 2K MTU size, then the HCA will automatically use this as...
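    A quick way to see the current and maximum MTU the adapter reports (ibv_devinfo is the command named above; the grep is only a convenience and the values shown are illustrative):
        $ ibv_devinfo | grep -i mtu
                max_mtu:        4096 (5)
                active_mtu:     4096 (5)
    If the switch ports are limited to 2K, active_mtu will report 2048 instead.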
  • Page 45: Managing The Ib_Qib Driver

    This should be executed on every switch and both hemispheres of the 9240s.  For the 12000 switches, refer to the QLogic FastFabric User Guide for externally managed switches, and to the QLogic FastFabric CLI Reference Guide for the internally managed switches.
  • Page 46: Configure The Ib_Qib Driver State

    Start, Stop, or Restart ib_qib Driver. Restart the software if you install a new QLogic OFED+ Host Software release, change driver options, or do manual testing. QLogic recommends using /etc/init.d/openibd to stop, start, and restart the ib_qib driver.
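    For example, as root (these are the standard init-script invocations referred to above):
        # /etc/init.d/openibd stop
        # /etc/init.d/openibd start
        # /etc/init.d/openibd restart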
  • Page 47: Unload The Driver/Modules Manually

    3–InfiniBand® Cluster Setup and Administration, Managing the ib_qib Driver. You can check to see if opensmd is configured to autostart by using the following command (as a root user); if there is no output, opensmd is not configured to autostart: # /sbin/chkconfig --list opensmd | grep -w on. Unload the Driver/Modules Manually...
  • Page 48: More Information On Configuring And Loading Drivers

    3–InfiniBand® Cluster Setup and Administration, More Information on Configuring and Loading Drivers. The driver_stats file contains general driver statistics. There is one numbered subdirectory per IB device on the system. Each numbered subdirectory contains the following per-device files: /ipathfs/1/counter_names, /ipathfs/1/counters ...
  • Page 49: Performance Tuning

    3–InfiniBand Cluster Setup and Administration ® Performance Settings and Management Tips Performance Tuning Tuning compute or storage (client or server) nodes with IB HCAs for MPI and verbs performance can be accomplished in several ways:  Run the ipath_perf_tuning script in automatic mode (See “Performance Tuning using ipath_perf_tuning Tool”...
  • Page 50 3–InfiniBand Cluster Setup and Administration ® Performance Settings and Management Tips If cpuspeed or powersaved are being used as part of implementing Turbo modes to increase CPU speed, then they can be left on. With these daemons left on, IB micro-benchmark performance results may be more variable from run-to-run.
  • Page 51: Krcvqs Parameter Settings

    3–InfiniBand Cluster Setup and Administration ® Performance Settings and Management Tips Increasing the number of kernel receive queues allows more CPU cores to be involved in the processing of verbs traffic. This is important when using parallel file systems such as Lustre or IBM's GPFS (General Parallel File System). The module parameter that sets this number is krcvqs.
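    A minimal sketch of setting the parameter (the value 4 is only an illustration, and the exact file name under /etc/modprobe.d may differ on your distribution; reload the driver, for example with /etc/init.d/openibd restart, for the change to take effect):
        # /etc/modprobe.d/ib_qib.conf  (example)
        options ib_qib krcvqs=4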
  • Page 52: Amd Cpu Systems

    CPUs: options ib_qib pcie_caps=0x51 numa_aware=1. On AMD systems, the pcie_caps=0x51 setting will result in the following line in the "DevCtl" section of the lspci -vv output associated with the QLogic HCA: MaxPayload 128 bytes, MaxReadReq 4096 bytes. AMD Interlagos CPU Systems
  • Page 53 3–InfiniBand Cluster Setup and Administration ® Performance Settings and Management Tips For setting all C-States to 0 where there is no BIOS support: Add kernel boot option using the following command: processor.max_cstate=0 Reboot the system. If the node uses a single-port HCA, and is not a part of a parallel file system cluster, there is no need for performance tuning changes to a modprobe configuration file.
  • Page 54: High Risk Tuning For Intel Harpertown Cpus

    3–InfiniBand Cluster Setup and Administration ® Performance Settings and Management Tips High Risk Tuning for Intel Harpertown CPUs For tuning the Harpertown generation of Intel Xeon CPUs that entails a higher risk factor, but includes a bandwidth benefit, the following can be applied: For nodes with Intel Harpertown, Xeon 54xx CPUs, you can add pcie_caps=0x51 and pcie_coalesce=1 to the modprobe.conf file.
  • Page 55: Additional Driver Module Parameter Tunings Available

    3–InfiniBand Cluster Setup and Administration ® Performance Settings and Management Tips Additional Driver Module Parameter Tunings Available Setting driver module parameters on Per-unit or Per-port basis The ib_qib driver allows the setting of different driver parameter values for the individual HCAs and ports. This allows the user to specify different values for each port on a HCA or different values for each HCA in the system.
  • Page 56 3–InfiniBand® Cluster Setup and Administration, Performance Settings and Management Tips. value is the parameter value for the particular unit or port. The fields in the square brackets are optional; however, either a default or a per-unit/per-port value is required. Example usage: To set the default IB MTU to 1K for all ports on all units: ibmtu=3
  • Page 57 3–InfiniBand® Cluster Setup and Administration, Performance Settings and Management Tips. This command lets the driver automatically decide on the allocation behavior and disables this feature on platforms with AMD and Intel Westmere-or-earlier CPUs, while enabling it on newer Intel CPUs. Tunable options: options ib_qib numa_aware=0. This command disables the NUMA awareness when allocating memories...
  • Page 58: Performance Tuning Using Ipath_Perf_Tuning Tool

    3–InfiniBand Cluster Setup and Administration ® Performance Settings and Management Tips For example: # cat /etc/modprobe.d/ib_ipoib.conf alias ib0 ib_ipoib alias ib1 ib_ipoib options ib_ipoib recv_queue_size=512 Performance Tuning using ipath_perf_tuning Tool The ipath_perf_tuning tool is intended to adjust parameters to the IB QIB driver to optimize the IB and application performance.
  • Page 59: Options

    3–InfiniBand® Cluster Setup and Administration, Performance Settings and Management Tips. Table 3-3. Checks Performed by ipath_perf_tuning Tool: cstates: Check whether (and which) C-States are enabled; C-States should be turned off for best performance. services: Check whether certain system services (daemons) are enabled.
  • Page 60: Automatic Vs. Interactive Mode

    3–InfiniBand Cluster Setup and Administration ® Performance Settings and Management Tips AUTOMATIC vs. INTERACTIVE MODE The tool performs different functions when running in automatic mode compared to running in the interactive mode. The differences include the node type selection, test execution, and applying the results of the executed tests. Node Type Selection The tool is capable of configuring compute nodes or storage nodes (see Compute...
  • Page 61: Affected Files

    3–InfiniBand® Cluster Setup and Administration, Performance Settings and Management Tips. Table 3-5. Test Execution Modes: the services test is performed in both modes, but the user is notified of running services only if the tool is in interactive mode.
  • Page 62: Adapter And Other Settings

    Adapter and Other Settings The following adapter and other settings can be adjusted for better performance. NOTE For the most current information on performance tuning refer to the QLogic OFED+ Host Software Release Notes.  Use an IB MTU of 4096 bytes instead of 2048 bytes, if available, with the QLE7340, and QLE7342.
  • Page 63: Remove Unneeded Services

    3–InfiniBand Cluster Setup and Administration ® Performance Settings and Management Tips Remove Unneeded Services The cluster administrator can enhance application performance by minimizing the set of system services running on the compute nodes. Since these are presumed to be specialized computing appliances, they do not need many of the service daemons normally running on a general Linux computer.
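    As an illustration only (these service names are common examples, not a list taken from the guide), services can be disabled on compute nodes with chkconfig; review your site's requirements before turning anything off:
        # chkconfig cups off
        # chkconfig sendmail off
        # chkconfig avahi-daemon off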
  • Page 64: Host Environment Setup For Mpi

    “Erratic Performance” on page D-10 for more information. Host Environment Setup for MPI After the QLogic OFED+ Host software and the GNU (GCC) compilers have been installed on all the nodes, the host environment can be set up for running MPI programs.
  • Page 65: Configuring For Ssh Using Ssh-Agent

    3–InfiniBand® Cluster Setup and Administration, Host Environment Setup for MPI. “Configuring for ssh Using ssh-agent” on page 3-43 shows how an individual user can accomplish the same thing using ssh-agent. The example in this section assumes the following: Both the cluster nodes and the front end system are running the openssh package as distributed in current Linux systems.
  • Page 66 3–InfiniBand® Cluster Setup and Administration, Host Environment Setup for MPI. On each of the IB node systems, create or edit the file /etc/ssh/ssh_known_hosts. You will need to copy the contents of the file /etc/ssh/ssh_host_dsa_key.pub from ip-fe to this file (as a single line), and then edit that line to insert...
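    One common shortcut for populating ssh_known_hosts (a convenience sketch rather than the guide's procedure; ip-fe is the front-end host name used in this example) is ssh-keyscan:
        # ssh-keyscan -t dsa ip-fe >> /etc/ssh/ssh_known_hosts
    The manual copy-and-edit steps described above achieve the same result.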
  • Page 67 3–InfiniBand® Cluster Setup and Administration, Host Environment Setup for MPI. At this point, any end user should be able to log in to the front end system ip-fe and use ssh to log in to any IB node without being prompted for a password or pass phrase.
  • Page 68: Checking Cluster And Software Status

    IB status, link speed, and PCIe bus width can be checked by running the program ipath_control. Sample usage and output are as follows: $ ipath_control -iv QLogic OFED.VERSION yyyy_mm_dd.hh_mm_ss 0: Version: ChipABI VERSION, InfiniPath_QLE7340, InfiniPath1 VERSION, SW Compat 2 0: Serial: RIB0935M31511 LocalBus: PCIe,5000MHz,x8...
  • Page 69: Iba_Opp_Query

    3–InfiniBand Cluster Setup and Administration ® Checking Cluster and Software Status iba_opp_query iba_opp_query is used to check the operation of the Distributed SA. You can run it from any node where the Distributed SA is installed and running, to verify that the replica on that node is working correctly.
  • Page 70: Ibstatus

    3–InfiniBand® Cluster Setup and Administration, Checking Cluster and Software Status. rate pkt_life 0x10 preference resv2 resv3. Another useful program is ibstatus, which reports on the status of the local HCAs. Sample usage and output are as follows: $ ibstatus Infiniband device 'qib0' port 1 status: default gid: fe80:0000:0000:0000:0011:7500:005a:6ad0...
  • Page 71: Ibv_Devinfo

    3–InfiniBand® Cluster Setup and Administration, Checking Cluster and Software Status. ibv_devinfo queries RDMA devices. Use the -v option to see more information. Sample usage: $ ibv_devinfo hca_id: qib0 fw_ver: 0.0.0 node_guid: 0011:7500:00ff:89a6 sys_image_guid: 0011:7500:00ff:89a6 vendor_id: 0x1175 vendor_part_id: 29216 hw_ver: board_id: InfiniPath_QLE7280 phys_port_cnt:...
  • Page 73: Running Mpi On Qlogic Adapters

    Running MPI on QLogic Adapters This section provides information on using the Message-Passing Interface (MPI) on QLogic IB HCAs. Examples are provided for setting up the user environment, and for compiling and running MPI programs. Introduction The MPI standard is a message-passing library or collection of routines used in distributed-memory parallel programming.
  • Page 74: Installation

    Follow the instructions in the QLogic Fabric Software Installation Guide for installing Open MPI. Newer versions of Open MPI released after this QLogic OFED+ release will not be supported (refer to the OFED+ Host Software Release Notes for version numbers). QLogic does not recommend installing any newer versions of Open MPI.
  • Page 75: Create The Mpihosts File

    (gcc, icc, pgcc, etc. ) to determine what options to use for your application. QLogic strongly encourages using the wrapper compilers instead of attempting to link to the Open MPI libraries manually. This allows the specific implementation of Open MPI to change without forcing changes to linker directives in users' Makefiles.
  • Page 76: Further Information On Open Mpi

    4–Running MPI on QLogic Adapters Open MPI The first choice will use verbs by default, and any with the _qlc string will use PSM by default. If you chose openmpi_gcc_qlc-1.4.3, for example, then the following simple mpirun command would run using PSM:...
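    The command itself is truncated in this excerpt; a representative invocation (an illustration, with the process count, hostfile, and program name as placeholders) is:
        $ mpirun -np 4 -machinefile mpihosts ./mpi_app_name
    With one of the _qlc builds selected, this runs over PSM without any extra options.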
  • Page 77: Configuring Mpi Programs For Open Mpi

    F77=mpif77 F90=mpif90 CXX=mpicxx In some cases, the configuration process may specify the linker. QLogic recommends that the linker be specified as mpicc, mpif90, etc. in these cases. This specification automatically includes the correct flags and libraries, rather than trying to configure to pass the flags and libraries explicitly. For example:...
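    For example, a typical configure step for an MPI application (illustrative; the install prefix is a placeholder):
        $ ./configure CC=mpicc CXX=mpicxx F77=mpif77 F90=mpif90 --prefix=/usr/local/myapp
        $ make && make install
    Naming the wrappers here lets the correct MPI flags and libraries be supplied automatically.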
  • Page 78: Portland Group

    4–Running MPI on QLogic Adapters Open MPI The easiest way to use other compilers with any MPI that comes with QLogic OFED+ is to use mpi-selector to change the selected MPI/compiler combination, see “Managing MVAPICH, and MVAPICH2 with the mpi-selector Utility”...
  • Page 79: Compiler And Linker Variables

    Normally MPI jobs are run with each node program (process) being associated with a dedicated QLogic IB adapter hardware context that is mapped to a CPU. If the number of node programs is greater than the available number of hardware contexts, software context sharing increases the number of node programs that can be run.
  • Page 80: Ib Hardware Contexts On The Qdr Ib Adapters

    4–Running MPI on QLogic Adapters, Open MPI. Table 4-5. Available Hardware and Software Contexts (columns: Adapter; Available Hardware Contexts, same as the number of supported CPUs; Available Contexts when Software Context Sharing is Enabled), listing the QLE7342/QLE7340. The default hardware context/CPU mappings can be changed on the QDR IB Adapters (QLE734x).
  • Page 81: Enabling And Disabling Software Context Sharing

    IB contexts to satisfy the job requirement and try to give a context to each process. When context sharing is enabled on a system with multiple QLogic IB adapter boards (units) and the IPATH_UNIT environment variable is set, the number of IB contexts made available to MPI jobs is restricted to the number of contexts available on that unit.
  • Page 82: Restricting Ib Hardware Contexts In A Batch Environment

    PSM environment variables. Setting PSM_SHAREDCONTEXTS_MAX=8 as a clusterwide default would unnecessarily penalize nodes that are dedicated to running single jobs. QLogic recommends that a per-node setting, or some level of coordination with the job scheduler with setting the environment variable should be used.
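    One way to express a per-job rather than clusterwide setting is to export the variable only for that job's processes, for example with Open MPI's -x option (an illustrative command; the counts are placeholders):
        $ mpirun -np 16 -machinefile mpihosts -x PSM_SHAREDCONTEXTS_MAX=8 ./mpi_app_name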
  • Page 83: Context Sharing Error Messages

    PSM contexts. Clean up these processes before restarting the job. Running in Shared Memory Mode Open MPI supports running exclusively in shared memory mode; no QLogic adapter is required for this mode of operation. This mode is used for running applications on a single node rather than on a cluster of nodes.
  • Page 84: Mpihosts File Details

    This is a different behavior than MVAPICH or the no-longer-supported QLogic MPI. In the second format, process_count can be different for each host, and is normally the number of available processors on the node.
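    A sketch of the two mpihosts formats described above (host names are placeholders; the slots= form is the usual Open MPI hostfile syntax for a per-host process count):
        # format 1: one host name per line
        node001
        node002
        # format 2: host name plus a process count
        node001 slots=8
        node002 slots=8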
  • Page 85: Using Open Mpi's Mpirun

    4–Running MPI on QLogic Adapters, Open MPI. The command line option -hostfile can be used as shown in the following command line: $ mpirun -np n -hostfile mpihosts [other options] program-name. The -machinefile option is a synonym for -hostfile. In this case, if the named file cannot be opened, the MPI job fails.
  • Page 86: Console I/O In Open Mpi Programs

    4–Running MPI on QLogic Adapters Open MPI This option spawns n instances of program-name. These instances are called node programs. Generally, mpirun tries to distribute the specified number of processes evenly among the nodes listed in the hostfile. However, if the number of processes exceeds the number of nodes listed in the hostfile, then some nodes will be assigned more than one instance of the program.
  • Page 87: Environment For Node Programs

    4–Running MPI on QLogic Adapters Open MPI NOTE The node that invoked mpirun need not be the same as the node where the MPI_COMM_WORLD rank 0 process resides. Open MPI handles the redirection of mpirun's standard input to the rank 0 process.
  • Page 88: Exported Environment Variables

    4–Running MPI on QLogic Adapters Open MPI Open MPI adds the base-name of the current node’s bindir (the directory where Open MPI’s executables are installed) to the prefix and uses that to set the PATH on the remote node. Similarly, Open MPI adds the base-name of the current node’s libdir (the directory where Open MPI’s libraries are installed) to the...
  • Page 89: Setting Mca Parameters

    4–Running MPI on QLogic Adapters Open MPI Setting MCA Parameters The -mca switch allows the passing of parameters to various Modular Component Architecture (MCA) modules. MCA modules have direct impact on MPI programs because they allow tunable parameters to be set at run time (such as which BTL communication device driver to use, what parameters to pass to that BTL, and so on.).
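    For example (illustrative values; btl is just one of the MCA frameworks that can be set this way):
        $ mpirun -np 2 -machinefile mpihosts -mca btl sm,self,openib ./mpi_app_name
    This selects the shared-memory, self, and OpenFabrics verbs BTL components for that run.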
  • Page 90: Environment Variables

    4–Running MPI on QLogic Adapters Open MPI Environment Variables Table 4-6 contains a summary of the environment variables that are relevant to any PSM including Open MPI. Table 4-7 is more relevant for the MPI programmer or script writer, because these variables are only active after the mpirun command has been issued and while the MPI processes are active.
  • Page 91 4–Running MPI on QLogic Adapters Open MPI Table 4-6. Environment Variables Relevant for any PSM (Continued) Name Description When set to 1, the PSM library will skip trying to IPATH_NO_CPUAFFINITY set processor affinity. This is also skipped if the processor affinity mask is set to a list smaller than the number of processors prior to MPI_Init() being called.
  • Page 92: Job Blocking In Case Of Temporary Ib Link Failures

    4–Running MPI on QLogic Adapters Open MPI Table 4-6. Environment Variables Relevant for any PSM (Continued) Name Description This variable specifies the path to the run-time LD_LIBRARY_PATH library. Default: Unset Table 4-7. Environment Variables Relevant for Open MPI Name Description...
  • Page 93: Open Mpi And Hybrid Mpi/Openmp Applications

    4–Running MPI on QLogic Adapters Open MPI and Hybrid MPI/OpenMP Applications Open MPI and Hybrid MPI/OpenMP Applications Open MPI supports hybrid MPI/OpenMP applications, provided that MPI routines are called only by the master OpenMP thread. This application is called the funneled thread model.
  • Page 94: Debugging Mpi Programs

    4–Running MPI on QLogic Adapters Debugging MPI Programs NOTE With Open MPI, and other PSM-enabled MPIs, you will typically want to turn off PSM's CPU affinity controls so that the OpenMP threads spawned by an MPI process are not constrained to stay on the CPU core of that process, causing over-subscription of that CPU.
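    A hedged example of launching a hybrid MPI/OpenMP job with PSM's affinity control disabled (the process and thread counts are placeholders; IPATH_NO_CPUAFFINITY is the PSM variable listed in Table 4-6, and -x exports it to the launched ranks):
        $ export OMP_NUM_THREADS=4
        $ mpirun -np 2 -machinefile mpihosts -x IPATH_NO_CPUAFFINITY=1 -x OMP_NUM_THREADS ./hybrid_app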
  • Page 95 4–Running MPI on QLogic Adapters Debugging MPI Programs NOTE The TotalView debugger can be used with the Open MPI supplied in this ® release. Consult the TotalView documentation for more information: http://www.open-mpi.org/faq/?category=running#run-with-tv IB0054606-02 A 4-23...
  • Page 97: Using Other Mpis

    Open MPI 1.4.3: compiled with GCC, Intel; runs over Verbs; provides some MPI-2 functionality (one-sided operations and dynamic processes). Available as part of the QLogic download. Can be managed by mpi-selector. MVAPICH version 1.2: compiled with GCC, Intel; runs over Verbs; provides MPI-1 functionality. Available as part of the QLogic download.
  • Page 98: Installed Layout

    By default, the MVAPICH, MVAPICH2, and Open MPI are installed in the following directory tree: /usr/mpi/$compiler/$mpi-mpi_version The QLogic-supplied MPIs precompiled with the GCC, PGI, and the Intel compilers will also have -qlc appended after the MPI version number. For example: /usr/mpi/gcc/openmpi-VERSION-qlc If a prefixed installation location is used, /usr is replaced by $prefix.
  • Page 99: Open Mpi

    Open MPI is an open source MPI-2 implementation from the Open MPI Project. Pre-compiled versions of Open MPI version 1.4.3 that run over PSM and are built with the GCC, PGI, and Intel compilers are available with the QLogic download. Details on Open MPI operation are provided in...
  • Page 100: Further Information On Mvapich

    MVAPICH2 can be managed with the mpi-selector utility, as described in “Managing MVAPICH, and MVAPICH2 with the mpi-selector Utility” on page 5-5. Compiling MVAPICH2 Applications As with Open MPI, QLogic recommends that you use the included wrapper scripts that invoke the underlying compiler (see Table 5-3).
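    Typical mpi-selector usage (standard options of the OFED mpi-selector utility; the MPI name shown is a placeholder for one of the names printed by --list):
        $ mpi-selector --list
        $ mpi-selector --set mvapich2_gcc_qlc-1.7 --user
        $ mpi-selector --query
    Start a new login shell for the new selection to take effect.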
  • Page 101: Running Mvapich2 Applications

     MVAPICH  MVAPICH2 The mpi-selector is an OFED utility that is installed as a part of QLogic OFED+ 1.5.4. Its basic functions include:  Listing available MPI implementations  Setting a default MPI to use (per user or site wide) ...
  • Page 102: Platform Mpi 8

    5–Using Other MPIs Platform MPI 8 The example shell scripts mpivars.sh and mpivars.csh, for registering with mpi-selector, are provided as part of the mpi-devel RPM in $prefix/share/mpich/mpi-selector-{intel, gnu, pgi} directories. For all non-GNU compilers that are installed outside standard Linux search paths, set up the paths so that compiler binaries and runtime libraries can be resolved.
  • Page 103: Compiling Platform Mpi 8 Applications

    5–Using Other MPIs Intel MPI MPI_ICMOD_PSM__PSM_PATH = "^" Compiling Platform MPI 8 Applications As with Open MPI, QLogic recommends that you use the included wrapper scripts that invoke the underlying compiler (see Table 5-4). Table 5-4. Platform MPI 8 Wrapper Scripts...
  • Page 104: Installation

    QLogic OFED+ Host Software package. They can be installed either with the QLogic OFED+ Host Software installation or using the rpm files after the QLogic OFED+ Host Software tar file has been unpacked. For example: Using DAPL 1.2.
  • Page 105 5–Using Other MPIs Intel MPI Using DAPL 2.0. $ rpm -qa | grep dapl dapl-devel-static-2.0.19-1 compat-dapl-1.2.14-1 dapl-2.0.19-1 dapl-debuginfo-2.0.19-1 compat-dapl-devel-static-1.2.14-1 dapl-utils-2.0.19-1 compat-dapl-devel-1.2.14-1 dapl-devel-2.0.19-1 Verify that there is a /etc/dat.conf file. It should be installed by the dapl- RPM. The file dat.conf contains a list of interface adapters supported by uDAPL service providers.
  • Page 106: Compiling Intel Mpi Applications

    Substitute bin if using 32-bit. Compiling Intel MPI Applications: As with Open MPI, QLogic recommends that you use the included wrapper scripts that invoke the underlying compiler. The default underlying compiler is GCC, including gfortran. Note that there are more compiler drivers (wrapper...
  • Page 107: Further Information On Intel Mpi

    5–Using Other MPIs Intel MPI uDAPL 1.2: -genv I_MPI_DEVICE rdma:OpenIB-cma uDAPL 2.0: -genv I_MPI_DEVICE rdma:ofa-v2-ib To help with debugging, you can add this option to the Intel mpirun command: TMI: -genv TMI_DEBUG 1 uDAPL: -genv I_MPI_DEBUG 2 Further Information on Intel MPI For more information on using Intel MPI, see: http://www.intel.com/ IB0054606-02 A...
  • Page 108: Improving Performance Of Other Mpis Over Ib Verbs

    5–Using Other MPIs Improving Performance of Other MPIs Over IB Verbs Improving Performance of Other MPIs Over IB Verbs Performance of MPI applications when using an MPI implementation over IB Verbs can be improved by tuning the IB MTU size. NOTE No manual tuning is necessary for PSM-based MPIs, since the PSM layer determines the largest possible IB MTU for each source/destination path.
  • Page 109: Shmem Description And Configuration

    SHMEM is packaged with the QLogic IFS or QLogic OFED+ Host software. Every node in the cluster must have a QLogic IB adapter card and be running Red Hat Enterprise Linux (RHEL) 6, 6.1, or 6.2. One or more Message Passing Interface (MPI) implementations are required, and Performance Scaled Messaging (PSM) support must be enabled within the MPI.
  • Page 110 6–SHMEM Description and Configuration Installation The -qlc suffix denotes that this is the QLogic PSM version.  MVAPICH version 1.2.0 compiled for PSM. This is provided by QLogic IFS and can be found in the following directories: /usr/mpi/gcc/mvapich-1.2.0-qlc /usr/mpi/intel/mvapich-1.2.0-qlc /usr/mpi/pgi/mvapich-1.2.0-qlc The -qlc suffix denotes that this is the QLogic PSM version.
  • Page 111: Shmem Programs

    6–SHMEM Description and Configuration, SHMEM Programs. By default QLogic SHMEM is installed with a prefix of /usr/shmem/qlogic into the following directory structure:
        /usr/shmem/qlogic
        /usr/shmem/qlogic/bin
        /usr/shmem/qlogic/bin/mvapich
        /usr/shmem/qlogic/bin/mvapich2
        /usr/shmem/qlogic/bin/openmpi
        /usr/shmem/qlogic/lib64
        /usr/shmem/qlogic/lib64/mvapich
        /usr/shmem/qlogic/lib64/mvapich2
        /usr/shmem/qlogic/lib64/openmpi
        /usr/shmem/qlogic/include
    QLogic recommends that /usr/shmem/qlogic/bin be added to your $PATH.
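    For example, in a user's shell startup file:
        export PATH=/usr/shmem/qlogic/bin:$PATH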
  • Page 112: Compiling Shmem Programs

    SHMEM library. The shmemcc script automatically determines the correct directories by finding them relative to its own location. The standard directory layout of the QLogic SHMEM software is assumed. The default C compiler is gcc, and can be overridden by specifying a compiler with the $SHMEM_CC environment variable.
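    A minimal compile with the wrapper (the file names are placeholders; SHMEM_CC is only needed to override the default gcc):
        $ shmemcc -o shmem_hello shmem_hello.c
        $ SHMEM_CC=icc shmemcc -o shmem_hello shmem_hello.c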
  • Page 113: Running Shmem Programs

    There is no need to couple the application binary to a particular MPI, and these symbols will be correctly resolved at run-time. The advantage of this approach is that SHMEM application binaries will be portable across different implementations of the QLogic SHMEM library, including portability over different underlying MPIs. Running SHMEM Programs Using shmemrun The shmemrun script is a wrapper script for running SHMEM programs using mpirun.
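    Because shmemrun wraps mpirun, a launch typically looks like the following sketch (the process count and hostfile are placeholders, and the exact options accepted depend on the underlying MPI's mpirun):
        $ shmemrun -np 16 -machinefile mpihosts ./shmem_hello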
  • Page 114: Running Programs Without Using Shmemrun

    The libraries can be found at: $SHMEM_DIR/lib64/$MPI Where $SHMEM_DIR denotes the top-level directory of the SHMEM installation, typically /usr/shmem/qlogic, and $MPI is your choice of MPI (one of mvapich, mvapich2, or openmpi). Additionally, the PSM receive thread and back-trace must be disabled using the...
  • Page 115: Qlogic Shmem Relationship With Mpi

    These binaries are portable across all MPI implementations supported by QLogic SHMEM. This is true of the get/put micro-benchmarks provided by QLogic SHMEM. The desired MPI can be selected at run time simply by placing the desired mpirun on $PATH, or by using the $SHMEM_MPIRUN environment variable.
  • Page 116: Slurm Integration

    MPI implementation. The slurm web pages describe 3 approaches. Please refer to points 1, 2 and 3 on the following web-page: https://computing.llnl.gov/linux/slurm/mpi_guide.html Below are various options for integration of the QLogic SHMEM and slurm. Full Integration This approach fully integrates QLogic SHMEM start-up into slurm and is available when running over MVAPICH2.
  • Page 117: No Integration

    6–SHMEM Description and Configuration Sizing Global Shared Memory The salloc allocates 16 nodes and runs one copy of shmemrun on the first allocated node which then creates the SHMEM processes. shmemrun invokes mpirun, and mpirun determines the correct set of hosts and required number of processes based on the slurm allocation that it is running inside of.
  • Page 118 $SHMEM_SHMALLOC_INIT_SIZE can also be changed to pre-allocate more memory up front rather than dynamically. By default QLogic SHMEM will use the same base address for the symmetric heap across all PEs in the job. This address can be changed using the $SHMEM_SHMALLOC_BASE_ADDR environment variable.
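    An illustrative use of these variables (the values are placeholders and the accepted value formats are an assumption; consult the variable descriptions in Table 6-1 for the exact syntax):
        $ export SHMEM_SHMALLOC_INIT_SIZE=1073741824     # pre-allocate 1 GB up front, assuming a byte count is accepted
        $ export SHMEM_SHMALLOC_BASE_ADDR=0x7000000000   # alternate symmetric-heap base address (placeholder)
        $ shmemrun -np 16 -machinefile mpihosts ./shmem_app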
  • Page 119: Progress Model

    SHMEM one-sided operations. Passive progress means that progress on SHMEM one-sided operations can occur without the application needing to call into SHMEM. Active progress is the default mode of operation for QLogic SHMEM. Passive progress can be selected using an environment variable where required.
  • Page 120: Active Progress

    SHMEM, since progress will not occur and the program will hang. Instead, SHMEM applications should use one of the wait synchronization primitives provided by SHMEM. In active progress mode QLogic SHMEM will achieve full performance. Passive Progress...
  • Page 121: Active Versus Passive Progress

    16KB by default. Active versus Passive Progress It is expected that most applications will be run with QLogic SHMEM's active progress mode since this gives full performance. The passive progress mode will typically be used in the following circumstances: ...
  • Page 122 6–SHMEM Description and Configuration Environment Variables Table 6-1. SHMEM Run Time Library Environment Variables (Continued) Environment Variable Default Description Shared memory consistency checks $SHMEM_SHMALLOC_CHECK set for 0 to disable and 1 to enable. These are good checks for correctness but degrade the performance of shmal- loc() and shfree().
  • Page 123: Implementation Behavior

    When the timeout value is reached, the mpirun is killed. This variable is intended for testing use. Implementation Behavior Some SHMEM properties are not fully specified by the SHMEM API specification. This section discusses the behavior for the QLogic SHMEM implementation. IB0054606-02 A 6-15...
  • Page 124 Additional properties of the QLogic SHMEM implementation are:  The QLogic SHMEM implementation makes no guarantees as to the ordering in which the bytes of a put operation are delivered into the remote memory. It is *not* a safe assumption to poll or read certain bytes of the put destination buffer (for example, the last 8 bytes) to look for a change in value and then infer that the entirety of the put has arrived.
  • Page 125: Application Programming Interface

    6–SHMEM Description and Configuration Application Programming Interface  8 byte put to a sync location  Target side:  Wait for the sync location to be written  Now it is safe to make observations on all puts prior to fence ...
  • Page 126: Shmem Application Programming Interface Calls

    6–SHMEM Description and Configuration Application Programming Interface Table 6-3. SHMEM Application Programming Interface Calls Operation Calls General Operations shmem_init start_pes my_pe _my_pe shmem_my_pe num_pes _num_pes shmem_n_pes Symmetric heap shmalloc shmemalign shfree shrealloc Contiguous Put Operations shmem_short_p shmem_int_p shmem_long_p shmem_float_p shmem_double_p shmem_longlong_p shmem_longdouble_p shmem_char_put...
  • Page 127 6–SHMEM Description and Configuration Application Programming Interface Table 6-3. SHMEM Application Programming Interface Calls Operation Calls shmem_put shmem_put32 shmem_put64 shmem_put128 shmem_putmem Non-blocking Put Operations shmem_double_put_nb shmem_float_put_nb shmem_int_put_nb shmem_long_put_nb shmem_longdouble_put_nb shmem_longlong_put_nb shmem_put_nb shmem_put32_nb shmem_put64_nb shmem_put128_nb shmem_putmem_nb shmem_short_put_nb Strided Put Operations shmem_double_iput shmem_float_iput shmem_int_iput shmem_iput...
  • Page 128 6–SHMEM Description and Configuration Application Programming Interface Table 6-3. SHMEM Application Programming Interface Calls Operation Calls shmem_short_iput Indexed Put Operations shmem_ixput shmem_ixput32 shmem_ixput64 Put and Non-blocking Ordering, Flushing shmem_fence and Completion shmem_quiet shmem_wait_nb shmem_test_nb shmem_poll_nb (same as shmem_test_nb, provided for compatibility) Contiguous Get Operations shmem_short_g...
  • Page 129 6–SHMEM Description and Configuration Application Programming Interface Table 6-3. SHMEM Application Programming Interface Calls Operation Calls shmem_get32 shmem_get64 shmem_get128 shmem_getmem Non-blocking Get Operations shmem_double_get_nb shmem_float_get_nb shmem_int_get_nb shmem_long_get_nb shmem_longdouble_get_nb shmem_longlong_get_nb shmem_short_get_nb shmem_get_nb shmem_get32_nb shmem_get64_nb shmem_get128_nb shmem_getmem_nb Strided Get Operations shmem_double_iget shmem_float_iget shmem_int_iget shmem_iget shmem_iget32...
  • Page 130 6–SHMEM Description and Configuration Application Programming Interface Table 6-3. SHMEM Application Programming Interface Calls Operation Calls Indexed Get Operations shmem_ixget shmem_ixget32 shmem_ixget64 Barriers barrier shmem_barrier_all shmem_barrier Broadcasts shmem_broadcast shmem_broadcast32 shmem_broadcast64 Concatenation shmem_collect shmem_collect32 shmem_collect64 shmem_fcollect shmem_fcollect32 shmem_fcollect64 Synchronization operations shmem_int_wait shmem_long_wait shmem_longlong_wait shmem_short_wait...
  • Page 131 6–SHMEM Description and Configuration Application Programming Interface Table 6-3. SHMEM Application Programming Interface Calls Operation Calls shmem_short_swap shmem_int_swap shmem_long_swap shmem_longlong_swap shmem_swap shmem_short_cswap shmem_int_cswap shmem_long_cswap shmem_longlong_cswap shmem_short_mswap shmem_int_mswap shmem_long_mswap shmem_longlong_mswap shmem_short_inc shmem_int_inc shmem_long_inc shmem_longlong_inc shmem_short_add shmem_int_add shmem_long_add shmem_longlong_add shmem_short_finc shmem_int_finc shmem_long_finc shmem_longlong_finc shmem_short_fadd shmem_int_fadd...
  • Page 132 6–SHMEM Description and Configuration Application Programming Interface Table 6-3. SHMEM Application Programming Interface Calls Operation Calls shmem_long_fadd shmem_longlong_fadd Reductions shmem_int_and_to_all shmem_long_and_to_all shmem_longlong_and_to_all shmem_short_and_to_all shmem_int_or_to_all shmem_long_or_to_all shmem_longlong_or_to_all shmem_short_or_to_all shmem_int_xor_to_all shmem_long_xor_to_all shmem_longlong_xor_to_all shmem_short_xor_to_all shmem_double_min_to_all shmem_float_min_to_all shmem_int_min_to_all shmem_long_min_to_all shmem_longdouble_min_to_all shmem_longlong_min_to_all shmem_short_min_to_all shmem_double_max_to_all shmem_float_max_to_all shmem_int_max_to_all shmem_long_max_to_all shmem_longdouble_max_to_all...
  • Page 133 6–SHMEM Description and Configuration Application Programming Interface Table 6-3. SHMEM Application Programming Interface Calls Operation Calls shmem_short_max_to_all shmem_complexd_sum_to_all complex collectives are not implemented shmem_complexf_sum_to_all complex collectives are not implemented shmem_double_sum_to_all shmem_float_sum_to_all shmem_int_sum_to_all shmem_long_sum_to_all shmem_longdouble_sum_to_all shmem_longlong_sum_to_all shmem_short_sum_to_all shmem_complexd_prod_to_all complex collectives are not implemented shmem_complexf_prod_to_all complex collectives are not implemented shmem_double_prod_to_all...
  • Page 134 6–SHMEM Description and Configuration Application Programming Interface Table 6-3. SHMEM Application Programming Interface Calls Operation Calls shmem_clear_lock shmem_test_lock Events clear_event set_event wait_event test_event General Operations globalexit (for compatibility) allows any process to abort the job shmem_finalize call to terminate the SHMEM library shmem_pe_accessible tests PE for accessibility shmem_addr_accessible...
  • Page 135: Shmem Benchmark Programs

    SHMEM performance within a single node. The micro-benchmarks have the command line options shown in Table 6-4. Table 6-4. QLogic SHMEM micro-benchmark options: -a: log2 of desired alignment for buffers (default = 12)
  • Page 136: Qlogic Shmem Random Access Benchmark Options

    Usage: shmem-rand [options] [list of message sizes]. Message sizes are specified in bytes (default = 8). Options: See Table 6-5. Table 6-5. QLogic SHMEM random access benchmark options: use automatic (NULL) handles for NB ops (default explicit handles)
  • Page 137: Qlogic Shmem All-To-All Benchmark Options

    6–SHMEM Description and Configuration SHMEM Benchmark Programs Table 6-5. QLogic SHMEM random access benchmark options Option Description choose OP from get, getnb, put, putnb -o OP for blocking puts, no quiet every window (this is the default) for blocking puts, use quiet every window...
  • Page 138: Qlogic Shmem Barrier Benchmark Options

    6–SHMEM Description and Configuration, SHMEM Benchmark Programs. Table 6-6. QLogic SHMEM all-to-all benchmark options: enable communication to local ranks (including self); -m INTEGER[K]: memory size in MB (default = 8MB), or in KB with a K suffix; use non-pipelined mode for NB ops (default pipelined)
  • Page 139: Qlogic Shmem Reduce Benchmark Options

    6–SHMEM Description and Configuration SHMEM Benchmark Programs Table 6-8. QLogic SHMEM reduce benchmark options Option Description number of barriers between reduces (default 0) -b INTEGER displays the help page outer iterations (default 1) -i INTEGER[K] inner iterations (default 10000) -r INTEGER...
  • Page 141: Virtual Fabric Support In Psm

    (vFabric) integration, allowing users to specify IB Service Level (SL) and Partition Key (PKey), or to provide a configured Service ID (SID) to target a vFabric. Support for using IB path record queries to the QLogic Fabric Manager during connection setup is also available, enabling alternative switch topologies such as Mesh/Torus.
  • Page 142: Virtual Fabric Support

    PSM. Sixteen unique Service IDs have been allocated for PSM-enabled MPI vFabrics to ease their testing; however, any Service ID can be used. Refer to the QLogic Fabric Manager User Guide on how to configure vFabrics.
  • Page 143: Using Service Id

    PSM_IB_SERVICE_ID=SID # Service ID to use. SL2VL mapping from the Fabric Manager: PSM is able to use the SL2VL table as programmed by the QLogic Fabric Manager. Prior releases required manual specification of the SL2VL mapping via an environment variable.
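    When launching with Open MPI, such a PSM variable is typically passed to the ranks with -x (an illustrative command; the SID value is a placeholder that must match a Service ID configured for a vFabric in your Fabric Manager):
        $ mpirun -np 16 -machinefile mpihosts -x PSM_IB_SERVICE_ID=0x1000117500000001 ./mpi_app_name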
  • Page 144: Verifying Sl2Vl Tables On Qlogic 7300 Series Adapters

    Adapters. iba_saquery can be used to get the SL2VL mapping for any given port; however, QLogic 7300 series adapters export the SL2VL mapping via sysfs files. These files are used by PSM to implement the SL2VL tables automatically. The SL2VL tables are per port and available under /sys/class/infiniband/hca name/ports/port #/sl2vl.
  • Page 145: Dispersive Routing

    Dispersive Routing. InfiniBand® uses deterministic routing that is keyed from the Destination LID (DLID) of a port. The Fabric Manager programs the forwarding tables in a switch to determine the egress port a packet takes based on the DLID. Deterministic routing can create hotspots even in full bisection bandwidth (FBB) fabrics for certain communication patterns if the communicating node pairs map onto a common upstream link, based on the forwarding tables.
  • Page 146 8–Dispersive Routing Internally, PSM utilizes dispersive routing differently for small and large messages. Large messages are any messages greater-than or equal-to 64K. For large messages, the message is split into message fragments of 128K by default (called a window). Each of these message windows is sprayed across a distinct path between ports.
  • Page 147 8–Dispersive Routing  Static_Dest: The path selection is based on the CPU index of the destination process. Multiple paths can be used if data transfer is to different remote processes within a node. If multiple processes from Node A send a message to a single process on Node B only one path will be used across all processes.
  • Page 149: Gpxe Setup

    A boot server or http server (can be the same as the DHCP server)   A node to be booted Use a QLE7340 or QLE7342 adapter for the node. The following software is included with the QLogic OFED+ installation software package:  gPXE boot image ...
  • Page 150: Required Steps

    Required Steps Download a copy of the gPXE image. Located at:  The executable to flash the EXPROM on the QLogic IB adapters is located at: /usr/sbin/ipath_exprom  The gPXE driver for QLE7300 series IB adapters (the EXPROM image) is located at: /usr/share/infinipath/gPXE/iba7322.rom...
  • Page 151: Installing Dhcp

    DHCP server runs on a machine that supports IP over IB. NOTE Prior to installing DHCP, make sure that QLogic OFED+ is already installed on your DHCP server. Download and install the latest DHCP server from www.isc.org.
  • Page 152: Configuring Dhcp

    9–gPXE Preparing the DHCP Server in Linux Configuring DHCP From the client host, find the GUID of the HCA by using p1info or look at the GUID label on the IB adapter. Turn the GUID into a MAC address and specify the port of the IB adapter that is going to be used at the end, using b0 for port0 or b1 for port1.
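    A skeletal dhcpd.conf host entry of the kind this procedure produces (illustrative only: the MAC shown is a made-up placeholder for the value derived from your adapter's GUID plus the b0/b1 port suffix, the matching option and addresses depend on your boot server setup, and the filename mirrors the HTTP-boot example later in this guide):
        host ib-node01 {
            hardware ethernet 00:11:75:xx:xx:xx;    # placeholder: address derived from the HCA GUID as described above
            fixed-address 10.252.252.101;           # placeholder address on the boot network
            filename "http://10.252.252.1/images/uniboot/uniboot.php";
        }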
  • Page 153: Netbooting Over Ib

    NOTE: The dhcpd and apache configuration files referenced in this example are included as examples, and are not part of the QLogic OFED+ installed software. Your site boot servers may be different; see their documentation for equivalent information.
  • Page 154 9–gPXE Netbooting Over IB. Install Apache. Create an images.conf file and a kernels.conf file and place them in the /etc/httpd/conf.d directory. This sets up the /images and /kernels aliases and tells Apache where to find them: /images — http://10.252.252.1/images/ /kernels — http://10.252.252.1/kernels/ The following is an example of the images.conf file: Alias /images /vault/images <Directory "/vault/images">...
  • Page 155 To add an IB driver into the initrd file, The IB modules need to be copied to the diskless image. The host machine needs to be pre-installed with the QLogic OFED+ Host Software that is appropriate for the kernel version the diskless image will run. The QLogic OFED+ Host Software is available for download from http://driverdownloads.qlogic.com/QLogicDriverDownloads_UI/default.aspx...
  • Page 156 9–gPXE Netbooting Over IB The infinipath rpm will install the file /usr/share/infinipath/gPXE/gpxe-qib-modify-initrd with contents similar to the following example. You can either run the script to generate a new initrd image, or use it as an example, and customize as appropriate for your site. # This assumes you will use the currently running version of linux, and # that you are starting from a fully configured machine of...
  • Page 157 9–gPXE Netbooting Over IB # extract previous contents gunzip -dc ../initrd-ib-${kern}.img | cpio --quiet -id # add infiniband modules mkdir -p lib/ib find /lib/modules/${kern}/updates -type f | \ egrep '(iw_cm|ib_(mad|addr|core|sa|cm|uverbs|ucm|umad|ipoib|qib ).ko|rdma_|ipoib_helper)' | \ xargs -I '{}' cp -a '{}' lib/ib # Some distros have ipoib_helper, others don't require it if [ -e lib/ib/ipoib_helper ];...
  • Page 158 9–gPXE Netbooting Over IB IFS=' ' v6cmd='/sbin/insmod /lib/'${xfrm}'.ko '"$v6cmd" crypto=$(modinfo -F depends $xfrm) if [ ${crypto} ]; then cp $(find /lib/modules/$(uname -r) -name ${crypto}.ko) lib IFS=' ' v6cmd='/sbin/insmod /lib/'${crypto}'.ko '"$v6cmd" # we need insmod to load the modules; if not present it, copy it mkdir -p sbin grep -q insmod ../Orig-listing || cp /sbin/insmod sbin...
  • Page 159 9–gPXE Netbooting Over IB /sbin/insmod /lib/ib/ib_sa.ko /sbin/insmod /lib/ib/ib_cm.ko /sbin/insmod /lib/ib/ib_uverbs.ko /sbin/insmod /lib/ib/ib_ucm.ko /sbin/insmod /lib/ib/ib_umad.ko /sbin/insmod /lib/ib/iw_cm.ko /sbin/insmod /lib/ib/rdma_cm.ko /sbin/insmod /lib/ib/rdma_ucm.ko $dcacmd /sbin/insmod /lib/ib/ib_qib.ko $helper_cmd /sbin/insmod /lib/ib/ib_ipoib.ko echo "finished loading IB modules" # End of IB module block # first get line number where we append (after last insmod if any, otherwse # at start line=$(egrep -n insmod init | sed -n '$s/:.*//p')
  • Page 160 9–gPXE Netbooting Over IB # and show the differences. echo -e '\nChanges in files in initrd image\n' diff Orig-listing New-listing # copy the new initrd to wherever you have configure the dhcp server to look # for it (here we assume it's /images) mkdir -p /images initrd-${kern}.img /images echo -e '\nCompleted initrd for IB'...
  • Page 161 9–gPXE Netbooting Over IB The following is an example of a uniboot.php file: <? header ( 'Content-type: text/plain' ); function strleft ( $s1, $s2 ) { return substr ( $s1, 0, strpos ( $s1, $s2 ) ); function baseURL() { $s = empty ( $_SERVER["HTTPS"] ) ? '' : ( $_SERVER["HTTPS"] == "on"...
  • Page 162: Steps On The Gpxe Client

    9–gPXE HTTP Boot Setup This is the kernel that will boot. This file can be copied from any machine that has RHEL5.3 installed. Start httpd Steps on the gPXE Client Ensure that the HCA is listed as the first bootable device in the BIOS. Reboot the test node(s) and enter the BIOS boot setup.
  • Page 163 9–gPXE HTTP Boot Setup Create an images.conf file and a kernels.conf file using the examples in Step 2 Boot Server Setup and place them in the /etc/httpd/conf.d directory. Edit /etc/dhcpd.conf file to boot the clients using HTTP filename "http://172.26.32.9/images/uniboot/uniboot.php"; Restart the DHCP server Start HTTP if it is not already running: /etc/init.d/httpd start IB0054606-02 A...
  • Page 164 9–gPXE HTTP Boot Setup 9-16 IB0054606-02 A...
  • Page 165: Benchmark Programs

    They are not representations of actual IB performance characteristics. For additional MPI sample applications refer to Section 5 of the QLogic FastFabric Command Line Interface Reference Guide. Benchmark 1: Measuring MPI Latency Between...
  • Page 166 A–Benchmark Programs Benchmark 1: Measuring MPI Latency Between Two Nodes The program osu_latency, from Ohio State University, measures the latency for a range of message sizes from 0 bytes to 4 megabytes. It uses a ping-pong method, where the rank zero process initiates a series of sends and the rank one process echoes them back, using the blocking MPI send and receive calls for all operations.
  • Page 167 A–Benchmark Programs Benchmark 1: Measuring MPI Latency Between Two Nodes -H (or --hosts) allows the specification of the host list on the command line instead of using a host file (with the -m or -machinefile option). Since only two hosts are listed, this implies that two host programs will be started (as if -np 2 were specified).
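As a concrete illustration, a run of this benchmark might look like the following; the install path of the OSU benchmarks is an assumption and should be adjusted to wherever osu_latency is installed on your system:
    $ mpirun -H host1,host2 /usr/mpi/gcc/openmpi-1.4.3-qlc/tests/osu_benchmarks-3.1.1/osu_latency
Listing two hosts with -H starts one rank on each host, as described above.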
  • Page 168: Benchmark 2: Measuring Mpi Bandwidth Between Two Nodes

    A–Benchmark Programs Benchmark 2: Measuring MPI Bandwidth Between Two Nodes Benchmark 2: Measuring MPI Bandwidth Between Two Nodes The osu_bw benchmark measures the maximum rate that you can pump data between two nodes. This benchmark also uses a ping-pong mechanism, similar to the osu_latency code, except in this case, the originator of the messages pumps a number of them (64 in the installed version) in succession using the non-blocking MPI_Isend function, while the receiving node consumes them as...
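osu_bw is launched the same way as osu_latency; for example (again, the benchmark path is an assumption for illustration):
    $ mpirun -H host1,host2 /usr/mpi/gcc/openmpi-1.4.3-qlc/tests/osu_benchmarks-3.1.1/osu_bw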
  • Page 169 A–Benchmark Programs Benchmark 2: Measuring MPI Bandwidth Between Two Nodes Typical output might look like: # OSU MPI Bandwidth Test v3.1.1 # Size Bandwidth (MB/s) 1 2.35 2 4.69 4 9.38 8 18.80 16 34.55 32 68.89 64 137.87 128 265.80 256 480.19 512 843.70 1024 1353.48 2048 1984.11 4096 2152.61 8192 2249.00...
  • Page 170: Benchmark 3: Messaging Rate Microbenchmarks

    A–Benchmark Programs Benchmark 3: Messaging Rate Microbenchmarks Benchmark 3: Messaging Rate Microbenchmarks OSU Multiple Bandwidth / Message Rate test (osu_mbw_mr) osu_mbw_mr is a multi-pair bandwidth and message rate test that evaluates the aggregate uni-directional bandwidth and message rate between multiple pairs of processes.
  • Page 171 An Enhanced Multiple Bandwidth / Message Rate test (mpi_multibw) mpi_multibw is a version of osu_mbw_mr which has been enhanced by QLogic to, optionally, run in a bidirectional mode and to scale better on the larger multi-core nodes available today. This benchmark is a modified form of the OSU Network-Based Computing Lab’s osu_mbw_mr benchmark (as shown in the...
  • Page 172 A–Benchmark Programs Benchmark 3: Messaging Rate Microbenchmarks • N/2 is dynamically calculated at the end of the run. • You can use the -b option to get bidirectional message rate and bandwidth results. • Scalability has been improved for larger core-count nodes. IB0054606-02 A...
  • Page 173: Mpi_Multibw

    The following is an example output when running mpi_multibw: $ mpirun -H host1,host2 -npernode 12 /usr/mpi/gcc/openmpi-1.4.3-qlc/tests/qlogic/mpi_multibw # PathScale Modified OSU MPI Bandwidth Test (OSU Version 2.2, PathScale $Revision: 1.1.2.1 $) # Running on 12 procs per node (uni-directional traffic for...
  • Page 174 The following is an example output when running with the bidirectional option (-b): $ mpirun -H host1,host2 -np 24 /usr/mpi/gcc/openmpi-1.4.3-qlc/tests/qlogic/mpi_multibw -b # PathScale Modified OSU MPI Bandwidth Test (OSU Version 2.2, PathScale $Revision: 1.1.2.1 $) # Running on 12 procs per node (bi-directional traffic for...
  • Page 175 A–Benchmark Programs Benchmark 3: Messaging Rate Microbenchmarks Note the higher peak bi-directional messaging rate of 34.6 million messages per second at the 1-byte size, compared to 25 million messages/sec when run unidirectionally. IB0054606-02 A A-11...
  • Page 176 A–Benchmark Programs Benchmark 3: Messaging Rate Microbenchmarks A-12 IB0054606-02 A...
  • Page 177: Srp Configuration

    SRP Upper Layer Protocol (ULP). SRP storage can be treated as another device. In this release, two versions of SRP are available: QLogic SRP and OFED SRP. QLogic SRP is available as part of the QLogic OFED Host Software, QLogic IFS, Rocks Roll, and Platform PCM downloads.
  • Page 178: Qlogic Srp Configuration

    OFED modules. The Linux kernel will not allow those OFED modules to be unloaded. QLogic SRP Configuration The QLogic SRP is installed as part of the QLogic OFED+ Host Software or the QLogic IFS. The following sections provide procedures to set up and configure the QLogic SRP.
  • Page 179: Stopping, Starting And Restarting The Srp Driver

    B–SRP Configuration QLogic SRP Configuration Stopping, Starting and Restarting the SRP Driver To stop the qlgc_srp driver, use the following command: /etc/init.d/qlgc_srp stop To start the qlgc_srp driver, use the following command: /etc/init.d/qlgc_srp start To restart the qlgc_srp driver, use the following command: /etc/init.d/qlgc_srp restart...
  • Page 180 B–SRP Configuration QLogic SRP Configuration By the port GUID of the IOC, or By the IOC profile string that is created by the VIO device (i.e., a string containing the chassis GUID, the slot number and the IOC number). FVIC creates the device in this manner; other devices have their own naming method.
  • Page 181 B–SRP Configuration QLogic SRP Configuration The system returns output similar to the following: st187:~/qlgc-srp-1_3_0_0_1 # ib_qlgc_srp_query QLogic Corporation. Virtual HBA (SRP) SCSI Query Application, version 1.3.0.0.1 1 IB Host Channel Adapter present in system. HCA Card 0 : 0x0002c9020026041c Port 1 GUID...
  • Page 182: Determining The Values To Use For The Configuration

    B–SRP Configuration QLogic SRP Configuration 0x0000494353535250 service 3 : name SRP.T10:0000000000000004 id 0x0000494353535250 Target Path(s): HCA 0 Port 1 0x0002c9020026041d -> Target Port GID 0xfe8000000000000000066a21dd000021 HCA 0 Port 2 0x0002c9020026041e -> Target Port GID 0xfe8000000000000000066a21dd000021 SRP IOC Profile : Chassis 0x00066A0050000135, Slot 5, IOC 1...
  • Page 183 # qlgc_srp.cfg file generated by /usr/sbin/ib_qlgc_srp_build_cfg, version 1.3.0.0.17, on Mon Aug 25 13:42:16 EDT 2008 #Found QLogic OFED SRP registerAdaptersInOrder: ON ============================================================= # IOC Name: BC2FC in Chassis 0x0000000000000000, Slot 6, Ioc 1 # IOC GUID: 0x00066a01e0000149 SRP IU SIZE : 320 service 0 : name SRP.T10:0000000000000001 id...
  • Page 184: Specifying An Srp Initiator Port Of A Session By Card And Port Indexes

    B–SRP Configuration QLogic SRP Configuration noverify: 0 description: "SRP Virtual HBA 0" The ib_qlgc_srp_build_cfg command creates a configuration file based on discovered target devices. By default, the information is sent to stdout. In order to create a configuration file, output should be redirected to a disk file. Enter -h for a list and description of the option flags.
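For example, one way to capture the generated configuration is to redirect it to a scratch file, review it, and then copy it over the configuration file edited elsewhere in this appendix (the scratch path is illustrative only):
    # ib_qlgc_srp_build_cfg > /tmp/qlgc_srp.cfg.new
    # cp /tmp/qlgc_srp.cfg.new /etc/sysconfig/qlgc_srp.cfg
Review the scratch file before copying it over /etc/sysconfig/qlgc_srp.cfg.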
  • Page 185: Specifying A Srp Target Port

    B–SRP Configuration QLogic SRP Configuration NOTE When using this method, if the port GUIDs are changed, they must also be changed in the configuration file. Specifying a SRP Target Port The SRP target can be specified in two different ways. To connect to a particular SRP target no matter where it is in the fabric, use the first method (By IOCGUID).
  • Page 186: Specifying A Srp Target Port Of A Session By Iocguid

    B–SRP Configuration QLogic SRP Configuration Specifying a SRP Target Port of a Session by IOCGUID The following example specifies a target by IOC GUID: session begin card: 0 port: 1 targetIOCGuid: 0x00066A013800016c #IOC GUID of the InfiniFibre port • 0x00066a10dd000046 •...
  • Page 187: Restarting The Srp Module

    B–SRP Configuration QLogic SRP Configuration Restarting the SRP Module For changes to take effect, including changes to the SRP map on the VIO card, SRP will need to be restarted. To restart the driver, use the following qlgc_srp command: /etc/init.d/qlgc_srp restart Configuring an Adapter with Multiple Sessions Each adapter can have an unlimited number of sessions attached to it.
  • Page 188 B–SRP Configuration QLogic SRP Configuration When the qlgc_srp module encounters an adapter command, that adapter is assigned all previously defined sessions (that have not been assigned to other adapters). This makes it easy to configure a system for multiple SRP adapters.
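A minimal sketch of that ordering in qlgc_srp.cfg follows; only keys already shown in this appendix are used, the GUID and description values are copied from the examples above, and the full set of parameters should be taken from the file generated by ib_qlgc_srp_build_cfg:
    session begin
       card: 0
       port: 1
       targetIOCGuid: 0x00066a01e0000149
    session begin
       card: 0
       port: 2
       targetIOCGuid: 0x00066a01e0000149
    adapter begin
       description: "SRP Virtual HBA 0"
Both sessions defined before the adapter block are assigned to that adapter, as described above.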
  • Page 189: Configuring Fibre Channel Failover

    B–SRP Configuration QLogic SRP Configuration adapter begin description: "Test Device 1" Configuring Fibre Channel Failover Fibre Channel failover is essentially failing over from one session in an adapter to another session in the same adapter. Following is a list of the different types of failover scenarios: ...
  • Page 190: Failover Configuration File 1: Failing Over From One Srp Initiator Port To Another

    B–SRP Configuration QLogic SRP Configuration Failover Configuration File 1: Failing over from one SRP Initiator port to another In this failover configuration file, the first session (using adapter Port 1) is used to reach the SRP Target Port. If a problem is detected in this session (e.g., the IB cable on port 1 of the adapter is pulled) then the 2nd session (using adapter Port 2) will be used.
  • Page 191: Failover Configuration File 2: Failing Over From A Port On The Vio Hardware Card To Another Port On The Vio Hardware Card

    B–SRP Configuration QLogic SRP Configuration adapterIODepth: 1000 lunIODepth: 16 adapterMaxIO: 128 adapterMaxLUNs: 512 adapterNoConnectTimeout: 60 adapterDeviceRequestTimeout: 2 # set to 1 if you want round robin load balancing roundrobinmode: 0 # set to 1 if you do not want target connectivity...
  • Page 192: Failover Configuration File 3: Failing Over From A Port On A Vio Hardware Card To A Port On A Different Vio Hardware Card Within The Same Virtual I/O Chassis

    B–SRP Configuration QLogic SRP Configuration On the VIO hardware side, the following needs to be ensured: • The target device is discovered and configured for each of the ports that is involved in the failover. • The SRP Initiator is discovered and configured once for each different initiatorExtension.
  • Page 193: Failover Configuration File 4: Failing Over From A Port On A Vio Hardware Card To A Port On A Different Vio Hardware Card In A Different Virtual I/O Chassis

    B–SRP Configuration QLogic SRP Configuration On the VIO hardware side, the following need to be ensured on each FVIC involved in the failover: • The target device is discovered and configured through the appropriate FC port. • The SRP Initiator is discovered and configured once for the proper initiatorExtension.
  • Page 194: Configuring Fibre Channel Load Balancing

    B–SRP Configuration QLogic SRP Configuration • The target device is discovered and configured through the appropriate FC port. • The SRP Initiator is discovered and configured once for the proper initiatorExtension. • The SRP map created for the initiator connects to the same target...
  • Page 195: Adapter Ports And 2 Ports On A Single Vio Module

    B–SRP Configuration QLogic SRP Configuration 2 Adapter Ports and 2 Ports on a Single VIO Module In this example, traffic is load balanced between adapter Port 2/VIO hardware Port 1 and adapter Port 1/VIO hardware Port 1. If one of the sessions goes down (due to an IB cable failure or an FC cable failure), all traffic will begin using the other session.
  • Page 196: Using The Roundrobinmode Parameter

    B–SRP Configuration QLogic SRP Configuration Using the roundrobinmode Parameter In this example, the two sessions use different VIO hardware cards as well as different adapter ports. Traffic will be load-balanced between the two sessions. If there is a failure in one of the sessions (e.g., one of the VIO hardware cards is rebooted) traffic will begin using the other session.
  • Page 197: Configuring Srp For Native Ib Storage

    B–SRP Configuration QLogic SRP Configuration Configuring SRP for Native IB Storage Review the ib_qlgc_srp_query output: QLogic Corporation. Virtual HBA (SRP) SCSI Query Application, version 1.3.0.0.1 1 IB Host Channel Adapter present in system. HCA Card 1 : 0x0002c9020026041c Port 1 GUID : 0x0002c9020026041d...
  • Page 198 B–SRP Configuration QLogic SRP Configuration Edit /etc/sysconfig/qlgc_srp.cfg to add this information. service : name SRP.T10:0000000000000001 id 0x0000494353535250 session begin card: 0 port: 1 #portGuid: 0x0002c903000010f1 initiatorExtension: 1 targetIOCGuid: 0x00066a01e0000149 targetIOCProfileIdString: "Native IB Storage SRP Driver" targetPortGid: 0xfe8000000000000000066a01e0000149 targetExtension: 0x0000000000000001...
  • Page 199: Notes

    B–SRP Configuration QLogic SRP Configuration roundrobinmode: 0 # set to 1 if you do not want target connectivity verification noverify: 0 description: "SRP Virtual HBA 0" Note the correlation between the output of ib_qlgc_srp_query and qlgc_srp.cfg Target Path(s): HCA 0 Port 1 0x0002c9020026041d -> Target Port GID 0xfe8000000000000000066a11dd000021 HCA 0 Port 2 0x0002c9020026041e ->...
  • Page 200: Additional Details

    B–SRP Configuration OFED SRP Configuration Additional Details • All LUNs found are reported to the Linux SCSI mid-layer. Linux may need the max_scsi_luns (2.4 kernels) or max_luns (2.6 kernels) parameter configured in scsi_mod Troubleshooting For troubleshooting information, refer to “Troubleshooting SRP Issues”...
  • Page 201 B–SRP Configuration OFED SRP Configuration Choose the device you want to use, and run the command again with the -c option (as a root user): # ibsrpdm -c id_ext=200400A0B8114527,ioc_guid=0002c90200402c04,dgid=fe800000000000000002c90200402c05,pkey=ffff,service_id=200400a0b8114527 id_ext=200500A0B8114527,ioc_guid=0002c90200402c0c,dgid=fe800000000000000002c90200402c0d,pkey=ffff,service_id=200500a0b8114527 id_ext=21000001ff040bf6,ioc_guid=21000001ff040bf6,dgid=fe8000000000000021000001ff040bf6,pkey=ffff,service_id=f60b04ff01000021 Find the result that corresponds to the target you want, and echo it into the file:...
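In practice, the chosen line is echoed into the SRP driver's add_target entry in sysfs; a sketch is shown below (the HCA name mthca0 and port number 1 are taken from examples elsewhere in this guide and may differ on your system):
    # echo "id_ext=200400A0B8114527,ioc_guid=0002c90200402c04,dgid=fe800000000000000002c90200402c05,pkey=ffff,service_id=200400a0b8114527" > \
        /sys/class/infiniband_srp/srp-mthca0-1/add_target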
  • Page 202 B–SRP Configuration OFED SRP Configuration Notes B-26 IB0054606-02 A...
  • Page 203: Integration With A Batch Queuing System

    QLogic interconnect. The easiest way to do this is with the fuser command, which is normally installed in /sbin.
  • Page 204: Clean-Up Psm Shared Memory Files

    # /sbin/fuser -v /dev/ipath* lsof can also take the same form: # lsof /dev/ipath* The following command terminates all processes using the QLogic interconnect: # /sbin/fuser -k /dev/ipath For more information, see the man pages for fuser(1) and lsof(8). NOTE...
  • Page 205 C–Integration with a Batch Queuing System Clean-up PSM Shared Memory Files #!/bin/sh files=`/bin/ls /dev/shm/psm_shm.* 2> /dev/null`; for file in $files; do /sbin/fuser $file > /dev/null 2>&1; if [ $? -ne 0 ]; then /bin/rm $file > /dev/null 2>&1; fi; done; When the system is idle, the administrators can remove all of the shared memory files, including stale files, by using the following command: # rm -rf /dev/shm/psm_shm.* IB0054606-02 A...
  • Page 206 C–Integration with a Batch Queuing System Clean-up PSM Shared Memory Files IB0054606-02 A...
  • Page 207: Troubleshooting

    • System Administration Troubleshooting • Performance Issues • Open MPI Troubleshooting Troubleshooting information for hardware installation is found in the QLogic InfiniBand Adapter Hardware Installation Guide and software installation is found in the QLogic InfiniBand® Fabric Software Installation Guide.
  • Page 208: Bios Settings

    D–Troubleshooting BIOS Settings Table D-1. LED Link and Data Indicators (Continued) LED States Indication Green ON Signal detected and the physical link is up. Ready to talk to SM to bring the link fully up. Amber OFF If this state persists, the SM may be missing or the link may not be configured.
  • Page 209: Driver Load Fails Due To Unsupported Kernel

    If you upgrade the kernel, then you must reboot and then rebuild or reinstall the InfiniPath kernel modules (drivers). QLogic recommends using the IFS Software Installation TUI to perform this rebuild or reinstall. Refer to the QLogic Fabric Software Installation Guide for more information.
  • Page 210: Openfabrics Load Errors If Ib_Qib Driver Load Fails

    D–Troubleshooting Kernel and Initialization Issues A zero count in all CPU columns means that no InfiniPath interrupts have been delivered to the processor. The possible causes of this problem are: • Booting the Linux kernel with ACPI disabled on either the boot command line or in the BIOS configuration • ...
  • Page 211: Infinipath Ib_Qib Initialization Failure

    If the driver loaded, but MPI or other programs are not working, check to see if problems were detected during the driver and QLogic hardware initialization with the command: $ dmesg | grep -i ib_qib This command may generate more than one screen of output.
  • Page 212: Mpi Job Failures Due To Initialization Problems

    Managers) and InfiniPath. Stop Infinipath Services Before Stopping/Restarting InfiniPath The following Infinipath services must be stopped before stopping/starting/restarting InfiniPath: • QLogic Fabric Manager • OpenSM Here is a sample command and the corresponding error messages: # /etc/init.d/openibd stop Unloading infiniband modules: sdp cm umad uverbs ipoib sa ipath mad core FATAL: Module ib_umad is in use.
  • Page 213: Manual Shutdown Or Restart May Hang If Nfs In Use

    D–Troubleshooting OpenFabrics and InfiniPath Issues Manual Shutdown or Restart May Hang if NFS in Use If you are using NFS over IPoIB and use the manual /etc/init.d/openibd stop (or restart) command, the shutdown process may silently hang on the fuser command contained within the script. This is because fuser cannot traverse down the tree from the mount point once the mount point has disappeared.
  • Page 214: Ibsrpdm Command Hangs When Two Host Channel

    /etc/sysconfig/network-scripts/ifcfg-eth2 (for RHEL) /etc/sysconfig/network/ifcfg-eth2 (for SLES) QLogic recommends using the IP over IB protocol (IPoIB-CM), included in the standard OpenFabrics software releases, as a replacement for ipath_ether. System Administration Troubleshooting The following sections provide details on locating problems related to system administration.
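If an IPoIB interface still needs to be configured as that replacement, a minimal static ifcfg file might look like the following RHEL-style sketch (the device name ib0 and the addresses are placeholders; connected-mode and other IPoIB options are distribution- and stack-specific and are not shown; see the SLES eioc examples later in this guide for the SLES ifcfg conventions):
    /etc/sysconfig/network-scripts/ifcfg-ib0
    DEVICE=ib0
    BOOTPROTO=static
    IPADDR=192.168.100.10
    NETMASK=255.255.255.0
    ONBOOT=yes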
  • Page 215: Broken Intermediate Link

    See your switch vendor for more information. QLogic recommends using FastFabric to help diagnose this problem. If FastFabric is not installed in the fabric, there are two diagnostic tools, ibhosts and ibtracert, that may also be helpful. The tool ibhosts lists all the IB nodes that the subnet manager recognizes.
  • Page 216: Erratic Performance

    D–Troubleshooting Performance Issues Erratic Performance Sometimes erratic performance is seen on applications that use interrupts. An example is inconsistent SDP latency when running a program such as netperf. This may be seen on AMD-based systems using the QLE7240 or QLE7280 adapters.
  • Page 217: Immediately Change The Processor Affinity Of An Irq

    D–Troubleshooting Performance Issues This method is not the first choice because, on some systems, there may be two rows of ib_qib output, and you will not know which one of the two numbers to choose. However, if you cannot find $my_irq listed under /proc/irq (Method 1), this type of system most likely has only one line for ib_qib listed in /proc/interrupts, so you can use Method 2.
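For illustration, once the IRQ number has been identified by either method, its CPU affinity can be changed through /proc. In this sketch the grep/awk step and the mask value 01 (CPU 0) are illustrative only; if two ib_qib rows are present, pick the correct one as discussed above:
    # my_irq=$(grep ib_qib /proc/interrupts | awk -F: '{print $1}' | tr -d ' ')
    # echo 01 > /proc/irq/$my_irq/smp_affinity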
  • Page 218: Performance Warning If Ib_Qib Shares Interrupts With Eth0

    (for example, not FE80000000000000) based on the Fabric Manager configuration file. The config_generate tool for the Fabric Manager will help generate such files. Refer to the QLogic Fabric Manager User Guide for more information about the config_generate tool. D-12...
  • Page 219: Ulp Troubleshooting

    ULP Troubleshooting Troubleshooting VirtualNIC and VIO Hardware Issues To verify that an IB host can access an Ethernet system through the EVIC, issue a ping command to the Ethernet system from the IB host. Make certain that the route to the Ethernet system is using the VIO hardware by using the Linux route command on the IB host, then verify that the route to the subnet is using one of the virtual Ethernet interfaces (i.e., an EIOC).
  • Page 220: Verify That The Proper Virtualnic Driver Is Running

    E–ULP Troubleshooting Troubleshooting VirtualNIC and VIO Hardware Issues Verify that the proper VirtualNIC driver is running Check that a VirtualNIC driver is running by issuing an lsmod command on the IB host. Make sure that the qlgc_vnic is displayed on the list of modules. Following is an example: st186:~ # lsmod Module...
  • Page 221: Verifying That The Host Can Communicate With The I/O Controllers (Iocs) Of The Vio Hardware

    E–ULP Troubleshooting Troubleshooting VirtualNIC and VIO Hardware Issues Verifying that the host can communicate with the I/O Controllers (IOCs) of the VIO hardware To display the Ethernet VIO cards that the host can see and communicate with, issue the command ib_qlgc_vnic_query. The system returns information similar to the following: IO Unit Info: port LID:...
  • Page 222 E–ULP Troubleshooting Troubleshooting VirtualNIC and VIO Hardware Issues Chassis 0x00066A00010003F2, Slot 1, IOC 3 service entries: 2 service[ 0]: 1000066a00000003 / InfiniNIC.InfiniConSys.Control:03 service[ 1]: 1000066a00000103 / InfiniNIC.InfiniConSys.Data:03 When ib_qlgc_vnic_query is run with -e option, it reports the IOCGUID information. With the -s option it reports the IOCSTRING information for the Virtual I/O hardware IOCs present on the fabric.
  • Page 223 E–ULP Troubleshooting Troubleshooting VirtualNIC and VIO Hardware Issues If the host cannot see applicable IOCs, there are two things to check. First, verify that the adapter port specified in the eioc definition of the /etc/infiniband/qlgc_vnic.cfg file is active. This is done using the ibv_devinfo command on the host, then checking the value of state.
  • Page 224: Checking The Interface Definitions On The Host

    E–ULP Troubleshooting Troubleshooting VirtualNIC and VIO Hardware Issues Another reason why the host might not be able to see the necessary IOCs is that the subnet manager has gone down. Issue an iba_saquery command to make certain that the response shows all of the nodes in the fabric. If an error is returned and the adapter is physically connected to the fabric, then the subnet manager has gone down, and this situation needs to be corrected.
  • Page 225: Verify The Physical Connection Between The Vio Hardware And The Ethernet Network

    E–ULP Troubleshooting Troubleshooting VirtualNIC and VIO Hardware Issues DEVICE=eioc1 BOOTPROTO=static IPADDR=172.26.48.132 BROADCAST=172.26.63.130 NETMASK=255.255.240.0 NETWORK=172.26.48.0 ONBOOT=yes TYPE=Ethernet Example of ifcfg-eiocx setup for SuSE and SLES systems: BOOTPROTO='static' IPADDR='172.26.48.130' BROADCAST='172.26.63.255' NETMASK='255.255.240.0' NETWORK='172.26.48.0' STARTMODE='hotplug' TYPE='Ethernet' Verify the physical connection between the VIO hardware and the Ethernet network If the interface is displayed in an ifconfig and a ping between the IB host and the Ethernet host is still unsuccessful, verify that the VIO hardware Ethernet ports...
  • Page 226 E–ULP Troubleshooting Troubleshooting VirtualNIC and VIO Hardware Issues There are up to 6 IOC GUIDs on each VIO hardware module (6 for the IB/Ethernet Bridge Module, 2 for the EVIC), one for each Ethernet port. If a VIO hardware module can be seen from a host, the ib_qlgc_vnic_query -s file displays information similar to: EVIC in Chassis 0x00066a000300012a, Slot 19, Ioc 1 EVIC in Chassis 0x00066a000300012a, Slot 19, Ioc 2...
  • Page 227: Troubleshooting Srp Issues

    E–ULP Troubleshooting Troubleshooting SRP Issues Troubleshooting SRP Issues ib_qlgc_srp_stats showing session in disconnected state Problem: If the session is part of a multi-session adapter, ib_qlgc_srp_stats will show it to be in the disconnected state. For example: SCSI Host # : 17 | Mode ROUNDROBIN Trgt Adapter Depth : 1000...
  • Page 228 E–ULP Troubleshooting Troubleshooting SRP Issues : 0x0000000000000000 Completed Receives : 0x00000000000002c0 | Receive Errors : 0x0000000000000000 Connect Attempts : 0x0000000000000000 | Test Attempts : 0x0000000000000000 Total SWUs : 0x00000000000003e8 | Available SWUs : 0x00000000000003e8 Busy SWUs : 0x0000000000000000 | SRP Req Limit : 0x00000000000003e8 SRP Max ITIU : 0x0000000000000140 | SRP Max TIIU...
  • Page 229: Session In 'Connection Rejected' State

    E–ULP Troubleshooting Troubleshooting SRP Issues Solution: Perhaps an interswitch cable has been disconnected, or the VIO hardware is offline, or the Chassis/Slot does not contain a VIO hardware card. Instead of looking at this file, use the ib_qlgc_srp_query command to verify that the desired adapter port is in the active state.
  • Page 230 E–ULP Troubleshooting Troubleshooting SRP Issues Following is an example: SCSI Host # : 17 | Mode ROUNDROBIN Trgt Adapter Depth : 1000 | Verify Target : Yes Rqst Adapter Depth : 1000 | Rqst LUN Depth : 16 Tot Adapter Depth : 1000 | Tot LUN Depth : 16...
  • Page 231 E–ULP Troubleshooting Troubleshooting SRP Issues SWUs : 0x00000000000003e8 Busy SWUs : 0x0000000000000000 | SRP Req Limit : 0x00000000000003e8 SRP Max ITIU : 0x0000000000000140 | SRP Max TIIU : 0x0000000000000140 Host Busys : 0x0000000000000000 | SRP Max SG Used : 0x000000000000000f Session : Session 2 | State...
  • Page 232: Attempts To Read Or Write To Disk Are Unsuccessful

    E–ULP Troubleshooting Troubleshooting SRP Issues Solution 1: The host initiator has not been configured as an SRP initiator on the VIO hardware SRP Initiator Discovery screen. Via Chassis Viewer, bring up the SRP Initiator Discovery screen and either Click on 'Add New' to add a wildcarded entry with the initiator extension to match what is in the session entry in the qlgc_srp.cfg file, or Click on the Start button to discover the adapter port GUID, and then click 'Configure' on the row containing the adapter port GUID and give the entry...
  • Page 233: Four Sessions In A Round-Robin Configuration Are Active

    E–ULP Troubleshooting Troubleshooting SRP Issues Solution: This indicates a problem in the path between the VIO hardware and the target storage device. After an SRP host has connected to the VIO hardware successfully, the host sends a “Test Unit Ready” command to the storage device.
  • Page 234: Which Port Does A Port Guid Refer To

    Which port does a port GUID refer to? Solution: A QLogic HCA Port GUID is of the form 00066appa0iiiiii, where pp gives the port number (0 relative) and iiiiii gives the individual id number of the adapter, so 00066a00a0iiiiii is the port GUID of the 1st port of the adapter and 00066a01a0iiiiii is the port GUID of the 2nd port of the adapter.
  • Page 235: How Does The User Find A Hca Port Guid

    E–ULP Troubleshooting Troubleshooting SRP Issues In a failover configuration, if everything is configured correctly, one session will be Active and the rest will be Connected. The transition of a session from Connected to Active will not be attempted until that session needs to become Active, due to the failure of the previously Active session.
  • Page 236 E–ULP Troubleshooting Troubleshooting SRP Issues The system displays information similar to the following: st106:~ # ibv_devinfo -i 1 hca_id: mthca0 fw_ver: 5.1.9301 node_guid: 0006:6a00:9800:6c9f sys_image_guid: 0006:6a00:9800:6c9f vendor_id: 0x066a vendor_part_id: 25218 hw_ver: 0xA0 board_id: SS_0000000005 phys_port_cnt: 2 port: state: PORT_ACTIVE (4) max_mtu: 2048 (4) active_mtu: 2048 (4) sm_lid: 71...
  • Page 237: Need To Determine The Srp Driver Version

    Need to determine the SRP driver version. Solution: To determine the SRP driver version number, enter the command modinfo -d qlgc-srp, which returns information similar to the following: st159:~ # modinfo -d qlgc-srp QLogic Corp. Virtual HBA (SRP) SCSI Driver, version 1.0.0.0.3 IB0054606-02 A E-19...
  • Page 238 E–ULP Troubleshooting Troubleshooting SRP Issues E-20 IB0054606-02 A...
  • Page 239: Write Combining

    Write Combining Introduction Write Combining improves write bandwidth to the QLogic driver by writing multiple words in a single bus transaction (typically 64 bytes). Write combining applies only to x86_64 systems. The x86 Page Attribute Table (PAT) mechanism allocates Write Combining (WC) mappings for the PIO buffers, and is the default mechanism for WC.
  • Page 240: Mtrr Mapping And Write Combining

    Use the ipath_mtrr Script to Fix MTRR Issues QLogic also provides a script, ipath_mtrr, which sets the MTRR registers, enabling maximum performance from the InfiniPath driver. This Python script is available as a part of the InfiniPath software download, and is contained in the infinipath* RPM.
  • Page 241: Verify Write Combining Is Working

    F–Write Combining Verify Write Combining is Working The test results will list any problems, if they exist, and provide suggestions on what to do. To fix the MTRR registers, use: # ipath_mtrr -w Restart the driver after fixing the registers. This script needs to be run after each system reboot.
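Putting those steps together, a typical sequence after a reboot might be the following; restarting via the openibd init script is one way to reload the driver, as referenced elsewhere in this guide:
    # ipath_mtrr -w
    # /etc/init.d/openibd restart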
  • Page 242 F–Write Combining Verify Write Combining is Working Notes IB0054606-02 A...
  • Page 243: Commands And Files

    Use the following items as a checklist for verifying homogeneity. A difference in any one of these items in your cluster may cause problems: • Kernels • Distributions • Versions of the QLogic boards • Runtime and build environments • files from different compilers • Libraries • ...
  • Page 244: Restarting Infinipath

    Scans the system and reports hardware and firmware information about all the HCAs in the system. iba_manage_switch Allows management of externally managed switches (including 12200, 12200-18, and HP BLc QLogic 4X QDR) without the IFS software. iba_packet_capture Enables packet capture and subsequent dump to file...
  • Page 245 This script gathers the same information contained in board- version, status_str, and version. A Python script that sets the MTRR registers. ipath_mtrr Tests the IB link and bandwidth between two QLogic IB adapt- ipath_pkt_test ers, or, using an IB loopback connector, tests within a single QLogic IB adapter...
  • Page 246: Summary And Descriptions Of Commands

    It is useful for checking for initialization problems. You can check to see if problems were detected during the driver and QLogic hardware initialization with the command: $ dmesg | egrep -i 'infinipath|qib' This command may generate more than one screen of output.
  • Page 247 G–Commands and Files Summary and Descriptions of Commands -S/--sgid GID — Source GID. (Can be in GID (“0x########:0x########”) or inet6 format (“##:##:##:##:##:##:##:##”)) -D/--dgid GID — Destination GID. (Can be in GID (“0x########:0x########”) or inet6 format (“##:##:##:##:##:##:##:##”)) -k/--pkey pkey — Partition Key -i/--sid sid —...
  • Page 248 G–Commands and Files Summary and Descriptions of Commands Sample output: # iba_opp_query --slid 0x31 --dlid 0x75 --sid 0x107 Query Parameters: resv1 0x0000000000000107 dgid sgid dlid 0x75 slid 0x31 flow tclass num_path pkey qos_class rate pkt_life preference resv2 resv3 Using HCA qib0 Result: resv1 0x0000000000000107...
  • Page 249 G–Commands and Files Summary and Descriptions of Commands resv2 resv3 Explanation of Sample Output: This is a simple query, specifying the source and destination LIDs and the desired SID. The first half of the output shows the full “query” that will be sent to the Distributed SA.
  • Page 250 G–Commands and Files Summary and Descriptions of Commands Examples: Query by LID and SID: iba_opp_query -s 0x31 -d 0x75 -i 0x107 iba_opp_query --slid 0x31 --dlid 0x75 --sid 0x107 Queries using octal or decimal numbers: iba_opp_query --slid 061 --dlid 0165 --sid 0407 (using octal numbers) iba_opp_query --slid 49 --dlid 113 --sid 263 (using decimal numbers)
  • Page 251: Iba_Hca_Rev

    G–Commands and Files Summary and Descriptions of Commands iba_hca_rev This command scans the system and reports hardware and firmware information about all the HCAs in the system. Running iba_hca_rev -v(as a root user) produces output similar to the following when run from a node on the IB fabric: # iba_hca_rev -v ###################### st2092...
  • Page 252 G–Commands and Files Summary and Descriptions of Commands [ADAPTER] PSID = MT_0D80120009 pcie_gen2_speed_supported = true adapter_dev_id = 0x673c silicon_rev = 0xb0 gpio_mode1 = 0x0 gpio_mode0 = 0x050e070f gpio_default_val = 0x0502010f [HCA] hca_header_device_id = 0x673c hca_header_subsystem_id = 0x0017 dpdp_en = true eth_xfi_en = true mdio_en_port1 = 0 [IB]...
  • Page 253 G–Commands and Files Summary and Descriptions of Commands port1_sd2_ob_preemp_pre_qdr = 0x0 port2_sd2_ob_preemp_pre_qdr = 0x0 port1_sd3_ob_preemp_pre_qdr = 0x0 port2_sd3_ob_preemp_pre_qdr = 0x0 port1_sd0_ob_preemp_post_qdr = 0x6 port2_sd0_ob_preemp_post_qdr = 0x6 port1_sd1_ob_preemp_post_qdr = 0x6 port2_sd1_ob_preemp_post_qdr = 0x6 port1_sd2_ob_preemp_post_qdr = 0x6 port2_sd2_ob_preemp_post_qdr = 0x6 port1_sd3_ob_preemp_post_qdr = 0x6 port2_sd3_ob_preemp_post_qdr = 0x6 port1_sd0_ob_preemp_main_qdr = 0x0 port2_sd0_ob_preemp_main_qdr = 0x0...
  • Page 254 G–Commands and Files Summary and Descriptions of Commands port2_sd3_muxmain_qdr = 0x1f mellanox_qdr_ib_support = true mellanox_ddr_ib_support = true spec1_2_ib_support = true spec1_2_ddr_ib_support = true spec1_2_qdr_ib_support = true auto_qdr_tx_options = 8 auto_qdr_rx_options = 7 auto_ddr_option_0.tx_preemp_pre = 0x2 auto_ddr_option_0.tx_preemp_msb = 0x1 auto_ddr_option_0.tx_preemp_post = 0x0 auto_ddr_option_0.tx_preemp_main = 0x1b auto_ddr_option_1.tx_preemp_pre = 0x8 auto_ddr_option_1.tx_preemp_msb = 0x0...
  • Page 255 G–Commands and Files Summary and Descriptions of Commands auto_ddr_option_4.tx_preemp = 0x0 auto_ddr_option_5.tx_preemp_pre = 0x5 auto_ddr_option_5.tx_preemp_msb = 0x1 auto_ddr_option_5.tx_preemp_post = 0x3 auto_ddr_option_5.tx_preemp_main = 0x13 auto_ddr_option_5.tx_preemp = 0x0 auto_ddr_option_6.tx_preemp_pre = 0x3 auto_ddr_option_6.tx_preemp_msb = 0x1 auto_ddr_option_6.tx_preemp_post = 0x4 auto_ddr_option_6.tx_preemp_main = 0x1f auto_ddr_option_6.tx_preemp = 0x0 auto_ddr_option_7.tx_preemp_pre = 0x8 auto_ddr_option_7.tx_preemp_msb = 0x1 auto_ddr_option_7.tx_preemp_post = 0x3...
  • Page 256 G–Commands and Files Summary and Descriptions of Commands auto_ddr_option_11.tx_preemp_msb = 0x0 auto_ddr_option_11.tx_preemp_post = 0x3 auto_ddr_option_11.tx_preemp_main = 0x19 auto_ddr_option_11.tx_preemp = 0x0 auto_ddr_option_12.tx_preemp_pre = 0xf auto_ddr_option_12.tx_preemp_msb = 0x0 auto_ddr_option_12.tx_preemp_post = 0x3 auto_ddr_option_12.tx_preemp_main = 0x19 auto_ddr_option_12.tx_preemp = 0x0 auto_ddr_option_13.tx_preemp_pre = 0x0 auto_ddr_option_13.tx_preemp_msb = 0x0 auto_ddr_option_13.tx_preemp_post = 0x0 auto_ddr_option_13.tx_preemp_main = 0x5 auto_ddr_option_13.tx_preemp = 0x0...
  • Page 257 G–Commands and Files Summary and Descriptions of Commands auto_ddr_option_6.rx_offs_lowpass_en = 0x0 auto_ddr_option_7.rx_offs_lowpass_en = 0x0 auto_ddr_option_0.rx_offs = 0x0 auto_ddr_option_1.rx_offs = 0x0 auto_ddr_option_2.rx_offs = 0x0 auto_ddr_option_3.rx_offs = 0x0 auto_ddr_option_4.rx_offs = 0x0 auto_ddr_option_5.rx_offs = 0x0 auto_ddr_option_6.rx_offs = 0x0 auto_ddr_option_7.rx_offs = 0x0 auto_ddr_option_0.rx_equal_offs = 0x0 auto_ddr_option_1.rx_equal_offs = 0x0 auto_ddr_option_2.rx_equal_offs = 0x0 auto_ddr_option_3.rx_equal_offs = 0x0...
  • Page 258 G–Commands and Files Summary and Descriptions of Commands auto_ddr_option_5.rx_main = 0xe auto_ddr_option_6.rx_main = 0xf auto_ddr_option_7.rx_main = 0xf auto_ddr_option_0.rx_extra_hs_gain = 0x0 auto_ddr_option_1.rx_extra_hs_gain = 0x3 auto_ddr_option_2.rx_extra_hs_gain = 0x2 auto_ddr_option_3.rx_extra_hs_gain = 0x4 auto_ddr_option_4.rx_extra_hs_gain = 0x1 auto_ddr_option_5.rx_extra_hs_gain = 0x2 auto_ddr_option_6.rx_extra_hs_gain = 0x7 auto_ddr_option_7.rx_extra_hs_gain = 0x0 auto_ddr_option_0.rx_sigdet_th = 0x1 auto_ddr_option_1.rx_sigdet_th = 0x1 auto_ddr_option_2.rx_sigdet_th = 0x1...
  • Page 259 G–Commands and Files Summary and Descriptions of Commands auto_ddr_option_11.rx_muxeq = 0x04 auto_ddr_option_11.rx_muxmain = 0x1f auto_ddr_option_11.rx_main = 0xf auto_ddr_option_11.rx_extra_hs_gain = 0x4 auto_ddr_option_11.rx_equalization = 0x7f auto_ddr_option_12.rx_muxeq = 0x6 auto_ddr_option_12.rx_muxmain = 0x1f auto_ddr_option_12.rx_main = 0xf auto_ddr_option_12.rx_extra_hs_gain = 0x4 auto_ddr_option_12.rx_equalization = 0x7f auto_ddr_option_13.rx_muxeq = 0x0 auto_ddr_option_13.rx_muxmain = 0x1f auto_ddr_option_13.rx_main = 0xf auto_ddr_option_13.rx_extra_hs_gain = 0x3...
  • Page 260 G–Commands and Files Summary and Descriptions of Commands lbist_shift_freq pll_stabilize = 0x13 flash_div = 0x3 lbist_array_bypass = 1 lbist_pat_cnt_lsb = 0x2 core_f = 44 core_r = 27 ddr_6_db_preemp_pre = 0x3 ddr_6_db_preemp_main = 0xe [FW] Firmware Verification: FS2 failsafe image. Start address: 0x0. Chunk size 0x80000: NOTE: The addresses below are contiguous logical addresses.
  • Page 261: Iba_Manage_Switch

    ###################### iba_manage_switch (Switch) Allows management of externally managed switches (including 12200, 12200-18, and HP BLc QLogic 4X QDR) without using the IFS software. It is designed to operate on one switch at a time, taking a mandatory target GUID parameter.
  • Page 262 G–Commands and Files Summary and Descriptions of Commands linkwidth (link width supported) – use -i for integer value (1=1X, 2=4X, 3=1X/4X, 4=8X, 5=1X/8X, 6=4X/8X, 7=1X/4X/8X) vlcreditdist (VL credit distribution) – use -i for integer value (0, 1, 2, 3, or 4) linkspeed (link speed supported) –...
  • Page 263: Iba_Packet_Capture

    G–Commands and Files Summary and Descriptions of Commands Example iba_manage_switch -t 0x00066a00e3001234 -f QLogic_12000_V1_firmware.7.0.0.0.27.emfw fwUpdate iba_manage_switch -t 0x00066a00e3001234 reboot iba_manage_switch -t 0x00066a00e3001234 showFwVersion iba_manage_switch -t 0x00066a00e3001234 -s i12k1234 setIBNodeDesc iba_manage_switch -t 0x00066a00e3001234 -C mtucap -i 4 setConfigValue iba_manage_switch -H The results are recorded in iba_manage_switch.res file in the current directory.
  • Page 264: Ibhosts

    G–Commands and Files Summary and Descriptions of Commands alarm – number of seconds for alarm trigger to dump capture and exit maxblocks – max 64 byte blocks of data to capture in units of Mi (1024*1024) -v – verbose output To stop capture and trigger dump, kill with SIGINT (Ctrl-C) or SIGUSR1 (with the kill command).
  • Page 265: Ibtracert

    G–Commands and Files Summary and Descriptions of Commands Following is a sample output for the DDR adapters: # ibstatus Infiniband device 'qib0' port 1 status: default gid: fe80:0000:0000:0000:0011:7500:0078:a5d2 base lid: sm lid: state: 4: ACTIVE phys state: 5: LinkUp rate: 40 Gb/sec (4X QDR) link_layer: InfiniBand...
  • Page 266: Ident

    0x00 link_layer: ident The ident strings are available in ib_qib.ko. Running ident provides driver information similar to the following. For QLogic RPMs on a SLES distribution, it will look like the following example: ident /lib/modules/OS_version/updates/kernel/drivers/infiniband/hw/qib/ib_qib.ko /lib/modules/OS_version/updates/kernel/drivers/infiniband/hw/qib/ib_qib.ko: $Id: QLogic OFED Release x.x.x $...
  • Page 267: Ipath_Checkout

    G–Commands and Files Summary and Descriptions of Commands NOTE For QLogic RPMs on a RHEL distribution, the drivers folder is in the updates folder instead of the kernels folder as follows: /lib/modules/OS_version/updates/drivers/infiniband/hw/qib/ib_qib.ko If the /lib/modules/OS_version/updates directory is not present, then the driver in use is the one that comes with the core kernel.
  • Page 268: G-2 Ipath_Checkout Options

    G–Commands and Files Summary and Descriptions of Commands NOTE • The hostnames in the nodefile are Ethernet hostnames, not IPv4 addresses. • To create a nodefile, use the ibhosts program. It will generate a list of available nodes that are already connected to the switch. ipath_checkout performs the following seven tests on the cluster: Executes the ping command to all nodes to verify that they all are reachable from the front end.
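A short sketch of a run, assuming a hand-made nodefile with placeholder hostnames (the --run=5 form appears in the tasks table later in this appendix):
    $ cat hostsfile
    node01
    node02
    $ ipath_checkout hostsfile
    $ ipath_checkout --run=5 hostsfile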
  • Page 269: Ipath_Control

    G–Commands and Files Summary and Descriptions of Commands Table G-2. ipath_checkout Options (Continued) Command Meaning -k, --keep This option keeps intermediate files that were created while performing tests and compiling reports. Results are saved in a directory created by mktemp and named infinipath_XXXXXX or in the directory name given to --workdir.
  • Page 270: Ipath_Mtrr

    G–Commands and Files Summary and Descriptions of Commands Here is sample usage and output: % ipath_control -i $Id: QLogic OFED Release x.x.x $ $Date: yyyy-mm-dd-hh:mm $ 0: Version: ChipABI 2.0, InfiniPath_QLE7342, InfiniPath1 6.1, SW Compat 2 0: Serial: RIB0941C00005 LocalBus: PCIe,5000MHz,x8...
  • Page 271: Ipath_Pkt_Test

    G–Commands and Files Summary and Descriptions of Commands MTRR is used by the InfiniPath driver to enable write combining to the QLogic on-chip transmit buffers. This option improves write bandwidth to the QLogic chip by writing multiple words in a single bus transaction (typically 64 bytes). This option applies only to x86_64 systems.
  • Page 272: Ipathstats

    G–Commands and Files Summary and Descriptions of Commands • Test the IB link and bandwidth between two InfiniPath IB adapters. • Using an IB loopback connector, test the link and bandwidth within a single InfiniPath IB adapter. The ipath_pkt_test program runs in either ping-pong mode (send a packet, wait for a reply, repeat) or in stream mode (send packets as quickly as possible, receive responses as they come back).
  • Page 273: Mpirun

    G–Commands and Files Summary and Descriptions of Commands mpirun mpirun determines whether the program is being run against a QLogic or non-QLogic driver. It is installed from the mpi-frontend RPM. Sample commands and results are shown in the following paragraphs.
  • Page 274: Rpm

    G–Commands and Files Common Tasks and Commands This option poisons receive buffers at initialization and after each receive; pre-initialize with random data so that any parts that are not being correctly updated with received data can be observed later. See the mpi_stress(1) man page for more information. To check the contents of an installed RPM, use these commands: $ rpm -qa infinipath\* mpi-\* $ rpm -q --info infinipath # (etc)
  • Page 275: Common Tasks And Commands Summary

    G–Commands and Files Common Tasks and Commands Table G-3. Common Tasks and Commands Summary Function Command Check the system state ipath_checkout [options] hostsfile ipathbug-helper -m hostsfile \ > ipath-info-allhosts mpirun -m hostsfile -ppn 1 \ -np numhosts -nonmpi ipath_control -i Also see the file: /sys/class/infiniband/ipath*/device/status_str...
  • Page 276: Summary And Descriptions Of Useful Files

    G–Commands and Files Summary and Descriptions of Useful Files Table G-3. Common Tasks and Commands Summary (Continued) Function Command Show the status of host IB ipathbug-helper -m hostsfile \ ports > ipath-info-allhosts mpirun -m hostsfile -ppn 1 \ -np numhosts -nonmpi ipath_control -i Verify that the hosts see each ipath_checkout --run=5 hostsfile other...
  • Page 277: Status_Str File Contents

    G–Commands and Files Summary and Descriptions of Useful Files This information is useful for reporting problems to Technical Support. NOTE This file returns information on which form factor adapter is installed. The PCIe half-height, short form factor is referred to as the QLE7140, QLE7240, QLE7280, QLE7340, or QLE7342.
  • Page 278: Version

    You can check the version of the installed InfiniPath software by looking in: /sys/class/infiniband/qib0/device/driver/version QLogic-built drivers have contents similar to: $Id: QLogic OFED Release x.x.x$ $Date: Day mmm dd hh:mm:ss timezone yyyy $ Non-QLogic-built drivers (in this case kernel.org) have contents similar to: $Id: QLogic kernel.org driver $...
  • Page 279: Configuration Files

    G–Commands and Files Summary of Configuration Files Table G-7. Configuration Files Configuration File Name Description /etc/modprobe.conf Specifies options for modules when added or removed by the modprobe command. Also used for creating aliases. The PAT write-combining option is set here. For Red Hat 5.X systems.
  • Page 280 G–Commands and Files Summary of Configuration Files G-38 IB0054606-02 A...
  • Page 281: Recommended Reading

    Recommended Reading Reference material for further reading is provided in this appendix. References for MPI The MPI Standard specification documents are located at: http://www.mpi-forum.org/docs The MPICH implementation of MPI and its documentation are located at: http://www-unix.mcs.anl.gov/mpi/mpich/ The ROMIO distribution and its documentation are located at: http://www.mcs.anl.gov/romio Books for Learning MPI Programming Gropp, William, Ewing Lusk, and Anthony Skjellum, Using MPI, Second Edition,...
  • Page 282: Openfabrics

    H–Recommended Reading OpenFabrics OpenFabrics Information about the OpenFabrics Alliance (OFA) is located at: http://www.openfabrics.org Clusters Gropp, William, Ewing Lusk, and Thomas Sterling, Beowulf Cluster Computing with Linux, Second Edition, 2003, MIT Press, ISBN 0-262-69292-9 Networking The Internet Frequently Asked Questions (FAQ) archives contain an extensive Request for Comments (RFC) section.
  • Page 284 UK | Ireland | Germany | France | India | Japan | China | Hong Kong | Singapore | Taiwan © 2012 QLogic Corporation. Specifications are subject to change without notice. All rights reserved worldwide. QLogic, the QLogic logo, and the Powered by QLogic logo are registered trademarks of QLogic Corporation.
