Fast Fabric Users Guide
D000006-000 Rev. A
QLogic Corporation reserves the right to change product specifications at any time without notice. Applications described in this document for any of these products are for illustrative purposes only. QLogic Corporation makes no representation nor warranty that such applications are suitable for the specified use without further testing or modification. QLogic Corporation assumes no responsibility for any errors that may appear in this document.
Fast Fabric toolset.
License Agreements
Refer to the QLogic Software End User License Agreement for a complete listing of all license agreements affecting this product.
Visit the QLogic support Web site listed in Contact Information for the latest firmware and software updates.
1.3.1 Availability
QLogic Technical Support for products under warranty is available during local standard working hours excluding QLogic Observed Holidays.
1.3.2 Contact Information
Support Headquarters QLogic Corporation...
Section 2 Fast Fabric Overview
Feature Overview
The Fast Fabric Toolset is designed to both simplify and expedite common InfiniBand (IB) cluster management tasks. Fast Fabric can assist in generic management tasks as well as InfiniBand installation, upgrade, configuration and verification tasks.
2 – Fast Fabric Overview Fast Fabric Architecture
Aids in ongoing fabric status and configuration monitoring
❥ Automated fabric health checks and configuration baseline compare
❥ Automated chassis health checks and configuration baseline compare
❥ Automated SM health checks and configuration baseline compare
❥...
Depending on cluster size and design, the IB Management node may also be used as the master node for starting MPI jobs. It may also be used to run a QLogic Host SM and other management software. Consult the QLogic SM documentation for details and what combinations are valid.
the tools to operate with minimal user interaction and hence reduce the time to perform operations against many hosts or chassis. After initial installation, Fast Fabric can be configured to use IPoIB instead of the management network.
Section 3 Getting Started
Before using the Fast Fabric toolset, the Site Implementation Engineer must perform the tasks described in the sections that follow. To help keep track of the steps performed, a checklist is provided (see appendix A). The Fast Fabric configuration files that must be edited or created are described throughout the setup procedure.
3 – Getting Started Set Up the Fabric
Set Up the Fabric
1. (All) The first step in any installation is to physically install the hardware:
❥ Servers
❥ Core and leaf InfiniBand switches, such as the SilverStorm 9024 and 9000
❥...
Typically name resolution is accomplished by configuring a DNS server on the management network with both management network and IPoIB addresses for each host (and QLogic internally managed IB chassis). Alternatively, an /etc/hosts file may be created on the IB Management node. Fast Fabric can then propagate this /etc/hosts file to all the other hosts.
The /etc/hosts file should not have any node-specific data (the following section will step through the task of copying this file to all the nodes).
If using DNS: Consult the documentation for the DNS server being used. Make sure to edit the /etc/resolv.conf configuration on the IB Management Node to use the proper DNS server.
SilverStorm Technologies Inc. InfiniBand 4.1.1.0.15 Software
1) Show Installed Software
2) Reconfigure IP over IB
3) Reconfigure Driver Autostart
4) Update HCA Firmware
5) Generate Supporting Information for Problem Report
6) Host Setup via Fast Fabric
7) Host Admin via Fast Fabric
8) Chassis Admin via Fast Fabric
9) Externally Managed Switch Admin via Fast Fabric...
items have been selected, press P. To unselect all items, press N. Pressing X or ESC will exit this menu and return to the Main Menu. If more than one item is selected, the items will be performed in the order shown in the menu.
a. When Virtual I/O controllers (VIC) are installed in a chassis, each VIC should also be assigned a unique name.
4. (Switch) Configure the administrator password on each SilverStorm Chassis
NOTE: Newer versions of SilverStorm chassis firmware permit SSH keys to be configured within the chassis for secure password-less login.
Fast Fabric will provide the opportunity to enter the chassis password interactively when needed, so it is not necessary to place it within fastfabric.conf. If it is desired to instead keep the QLogic Chassis admin password in fastfabric.conf, it is recommended to change the fastfabric.conf permissions to 0600 (eg.
NOTE: The chassis must be running firmware version 4.0.0.4.3 or later to perform this function. If the chassis is not at this level, it must be updated manually via the chassis GUI. See the SilverStorm 9000 Users Guide for more information.
The steps which follow will require that an SM be operational within the fabric.
Installing and Verifying Firmware on the IB Switches
If the fabric contains SilverStorm 9024FC series externally managed switches, Fast Fabric may be used to aid the installation and configuration of the switches.
review all the settings. Refer to appendix B for more information about fastfabric.conf. When placed in the editor for ibnodes, create the file with a list of the switch node GUIDs and desired switch names, one entry per line, such as:
0x00066a00d9000138,edge1
0x00066a00d9000139,edge2
NOTE:...
If any switch fails to be updated, use the "View ibtest result files" option to review the result files from the update. Refer to the section “Interpreting the ibtest log files” on page 5-68 for more details.
SilverStorm Technologies Inc. IB Host Setup Menu (4.1.1.0.15)
Fast Fabric Host List: /etc/sysconfig/iba/hosts
0) Edit Config and Select/Edit Hosts Files [Perform]
1) Verify Hosts via Ethernet ping [Perform]
2) Verify rsh/rcp Configured [ Skip ]
3) Setup Password-less ssh/scp [Perform]
4) Copy /etc/hosts to all hosts [ Skip...
When placed in the editor for hosts, create the file with a list of the host names (the TCP/IP management network names), except the IB Management node from which Fast Fabric is presently being run, one entry per line, such as:
host1
host2
NOTE:...
NOTE: If DNS is being used, this step should be skipped.
NOTE: Typically, /etc/resolv.conf is set up as part of OS installation for each host. However, if /etc/resolv.conf was not set up on all the hosts during OS installation, the Fast Fabric "Copy a file to all hosts"...
Each time this is executed a Linux shell command (or sequence of commands separated by semicolons) may be specified to be executed against all selected hosts.
NOTE: It is recommended at this time to run the "date" command to verify that the date and time are consistent on all hosts.
mgmthost below for example) and include the hosts file previously created, one entry per line, such as:
mgmthost
include /etc/sysconfig/iba/hosts
For further details about the file format refer to section “Selection of Hosts” on page 5-3.
4.
7. (Host) "Verify Hosts see each other" will verify that each host can see all the others via queries to the Subnet Administrator and that the SA replica on each host has been fully populated.
FF_ANALYSIS_DIR, FF_ALL_ANALYSIS, FF_FABRIC_HEALTH, FF_CHASSIS_CMDS, FF_CHASSIS_HEALTH, and FF_ESM_CMDS. FF_ALL_ANALYSIS should be updated to reflect the type of SM (esm or hostsm).
2. (All) If using Embedded SM(s) in QLogic IB Chassis, create /etc/sysconfig/iba/esm_chassis listing the chassis which are running SMs.
Create the file with a list of the chassis names (the TCP/IP Ethernet management port names assigned above) or IP addresses (use of names is recommended), one entry per line, such as:
Chassis1
Chassis2
For further details about the file format refer to the section “Selection of Chassis”...
At this point the user is ready to move on to full-scale HPL runs. Assorted sample HPL.dat files are provided in /opt/iba/src/mpi_apps/hpl-config. These files are a good starting point for most clusters and should get within 10-20% of the optimal performance for the cluster.
IB Management Nodes should also include the MPI Runtime and MPI Development packages, and if the user desires to rebuild MPI itself, the IB Development package and MPI Source packages will also be required. After completing the install, reboot each of the IB Management Nodes to ensure they are running the new IB software.
NOTE: Do not list any of the IB Management Nodes (i.e., the nodes which have Fast Fabric installed).
NOTE: The file may list the Management Network or IPoIB hostnames for the selected hosts.
5. (Host) "Install/Upgrade InfiniServ Software" will upgrade the IB software on all the selected hosts.
Section 4 Fast Fabric TUI Menu
Fast Fabric is easiest to use from the textual user interface (TUI) menu system. The menu system provides a way to perform all common tasks and presents common options. Additional less common options are available directly via the Command Line Tools documented in the next section.
4 – Fast Fabric TUI Menu
QuickSilver Fabric Access Software Users Guide. Selecting items 1-9 will display the given submenu. Pressing X will exit the menu system. Selection of a Fast Fabric menu (6-9) will present a submenu such as below:
SilverStorm Technologies Inc.
At the top of each Fast Fabric menu, the file listing the components to operate on is shown. For example:
Fast Fabric Host List: /etc/sysconfig/iba/hosts
On each Fast Fabric menu, item 0 will permit a different file to be selected and will permit the editing of the file (using the editor selected via the EDITOR environment variable).
Host Setup via Fast Fabric
0) Edit Config and Select/Edit Hosts Files [ Skip ]
1) Verify Hosts via Ethernet ping [ Skip ]
2) Verify rsh/rcp Configured [ Skip ]
3) Setup Password-less ssh/scp [ Skip ]
4) Copy /etc/hosts to all hosts [ Skip ]
5) Show uname -a for all hosts...
NOTE: It is recommended that SSH be used in place of the check_rsh command.
4.1.4 Setup Password-less SSH/SCP (Linux)
This will run the setup_ssh -i "" command. This will set up secure password-less SSH such that the IB Management Node can securely log in to all the other hosts as root via the management network without requiring a password.
If any hosts fail to be updated, use the View ibtest result files option to review the result files from the update. For more details, see “Interpreting the ibtest log files”...
4.1.13 Run a command on all hosts (Linux)
This will run the cmdall command. A Linux shell command (or sequence of commands separated by semicolons) may be specified to be executed against all selected hosts.
4.1.14 Copy a file to all hosts (Linux)
This will run the scpall command.
Host Admin via Fast Fabric
This menu is focused on verifying hosts and the fabric as well as administration of all the hosts.
SilverStorm Technologies Inc. IB Host Admin Menu (4.1.1.0.15)
Fast Fabric Host List: /etc/sysconfig/iba/allhosts
0) Edit Config and Select/Edit Hosts Files [ Skip...
4.2.3 Summary of Fabric Components (All)
This will run the fabric_info command to provide a brief summary of the counts of components in the fabric including how many switch chips, hosts, and links are in the fabric. It will also indicate if any 1x links were found (that could indicate a poorly seated or bad cable).
4.2.8 Check MPI Performance (Host)
This will run the ibtest mpiperf command to do a quick check of PCI and MPI performance. This displays the MPI latency and bandwidth between pairs of hosts (1-2, 3-4, 5-6, etc).
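The pairing described above (1-2, 3-4, 5-6) can be pictured with a small shell sketch. This is only an illustration of how consecutive entries in a host list pair up; `pair_hosts` is a hypothetical helper and is not part of Fast Fabric, and the handling of an odd host count is an assumption.

```shell
# pair_hosts is a hypothetical helper, not a Fast Fabric command.
# It prints consecutive pairs (1-2, 3-4, ...) from a space-separated host list.
pair_hosts() {
    # Re-split the arguments so that 'a b c d' and a b c d behave the same.
    set -- $*
    while [ "$#" -ge 2 ]; do
        printf '%s <-> %s\n' "$1" "$2"
        shift 2
    done
    # Assumption: with an odd host count the last host has no partner.
    if [ "$#" -eq 1 ]; then
        printf 'unpaired: %s\n' "$1"
    fi
}

pair_hosts 'host1 host2 host3 host4 host5'
```

Ordering the hosts file so that intended partners are adjacent makes it easier to relate each reported pair back to the cabling.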
QLogic IB Chassis Admin via Fast Fabric
This menu is focused on administration of QLogic 9000 series internally managed IB chassis.
SilverStorm Technologies Inc. IB Chassis Admin Menu (4.1.1.0.15)
Fast Fabric Chassis List: /etc/sysconfig/iba/chassis...
4.3.3 Update Chassis Firmware (Switch)
This will run the ibtest -C update command to permit the chassis firmware version to be verified and updated as needed.
NOTE: The chassis must be running firmware version 4.0.0.4.3 or later to perform...
4.3.5 Reboot Chassis (Switch)
This will run the ibtest -C reboot command to reboot all the selected chassis and ensure they go down and come back up (as verified via ping over the management network).
4.3.6 Generate all Chassis Problem Report Information (Switch)
This will run the captureall -C command to collect configuration and...
SilverStorm Externally Managed IB Switch Administration via Fast Fabric
This menu is focused on administration of SilverStorm 9024FC externally managed switches.
SilverStorm Technologies Inc. IB Switch Admin Menu (4.1.1.0.15)
Fast Fabric Externally Managed Switch List: /etc/sysconfig/iba/ibnodes
0) Edit Config and Select/Edit Switch Files...
NOTE: Consult the relevant switch firmware release notes to ensure any prerequisites for the upgrade to the new firmware level have been met prior to performing the upgrade via Fast Fabric.
Prompts will guide the user through options:
❥ select - push firmware to each switch and select it for use on next reboot
❥...
Section 5 Detailed Descriptions of Command Line Tools
Some of the commands are only applicable when Linux is being used. They will be marked with (Linux). Similarly, some of the commands are only applicable when QuickSilver Linux IB software is being used on the hosts. Those will be marked with (Host).
5 – Detailed Descriptions of Command Line Tools Common Tool Options
5.1.3 Prompt for password for admin on chassis.
By default Fast Fabric operations against SilverStorm chassis (such as cmdall, captureall, showallports, and ibtest) obtain the chassis admin password from the FF_CHASSIS_ADMIN_PASSWORD environment variable which may be directly exported or part of fastfabric.conf.
5.1.6 Selection of Hosts
For operations that are performed against a set of hosts, there are multiple ways to specify the hosts on which to operate:
1. Small sets of hosts can be easily specified on the command line via the -h option discussed below.
If a relative path is specified for the -f option or HOSTS_FILE, the current directory will be checked first, followed by /etc/sysconfig/iba/
5.1.6.1.1 Host List File Format
Below is a sample host list file:
# this is a comment
192.168.0.4# host identified by IP address
n001...
1. Small sets of chassis can be easily specified on the command line via the -H option discussed below.
2. When multiple commands will be performed against the same small set of chassis, the environment variable CHASSIS can be used to specify a space-separated list of chassis.
5.1.7.1.1 Chassis List File Format
Below is a sample chassis file:
# this is a comment
192.168.0.5# chassis IP address
edge1 # chassis resolvable TCP/IP name
include /etc/sysconfig/iba/corechassis # included file
Each line of the chassis list file may specify a single chassis, a comment or another chassis list file to include.
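To make the file format concrete, the following is a minimal sketch of how such a list file could be flattened into one entry per line. It is not how Fast Fabric itself reads the file: `expand_list` is a hypothetical helper, it assumes `include` is followed by a space and an absolute path, and it does not detect include loops.

```shell
# expand_list FILE: hypothetical helper that prints one entry per line,
# dropping comments and blank lines and following "include" lines.
expand_list() {
    while IFS= read -r line || [ -n "$line" ]; do
        # Remove a trailing comment, then leading and trailing whitespace.
        line=$(printf '%s\n' "$line" |
            sed -e 's/#.*$//' -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//')
        case "$line" in
            '') ;;  # blank or comment-only line
            'include '*)
                # Assumption: the rest of the line is the path of another list file.
                inc=$(printf '%s\n' "${line#include }" | sed 's/^[[:space:]]*//')
                expand_list "$inc"
                ;;
            *) printf '%s\n' "$line" ;;
        esac
    done < "$1"
}

# Demonstration with temporary files that mirror the sample above.
tmpdir=$(mktemp -d)
printf '%s\n' '# core chassis' 'core1' > "$tmpdir/corechassis"
printf '%s\n' '# this is a comment' \
    '192.168.0.5# chassis IP address' \
    'edge1 # chassis resolvable TCP/IP name' \
    "include $tmpdir/corechassis # included file" > "$tmpdir/chassis"
expand_list "$tmpdir/chassis"
rm -rf "$tmpdir"
```

With the sample files created here, the function prints 192.168.0.5, edge1 and core1, one per line. The same sketch applies to the host list file, which uses the same comment and include conventions.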
For example:
i9k229:0
i9k229:0,1,5
192.168.0.5:0,1,5
NOTE: There must be no spaces within the chassis name and/or slot list.
This format is used by cmdall and chassis firmware update. This format may be used anyplace a chassis name or IP address is valid, such as the -H option, the CHASSIS environment variable or chassis list files.
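As an illustration of the chassis:slot format, the sketch below splits a specification into the chassis and its slot numbers and rejects embedded spaces. `split_chassis_spec` is a hypothetical helper written for this guide; treating a specification without a colon as a plain chassis name is an assumption.

```shell
# split_chassis_spec SPEC: hypothetical helper that prints the chassis and
# then each slot from a chassis:slot,slot,... specification.
split_chassis_spec() {
    case "$1" in
        *' '*)
            echo "error: no spaces allowed in '$1'" >&2
            return 1
            ;;
    esac
    printf 'chassis=%s\n' "${1%%:*}"
    case "$1" in
        # Only print slots when a slot list is present.
        *:*) printf '%s\n' "${1#*:}" | tr ',' '\n' | sed 's/^/slot=/' ;;
    esac
}

split_chassis_spec 'i9k229:0,1,5'
```

For the example above this prints chassis=i9k229 followed by slot=0, slot=1 and slot=5.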
4. IBNODES_FILE environment variable
5. /etc/sysconfig/iba/ibnodes file
For example, if the -N option is used and the IBNODES_FILE environment variable is also exported, the command will operate only on switches specified via the -N option.
It is recommended that a unique node description be specified for each switch. This name should follow typical naming rules and use the characters a-z, A-Z, 0-9, and underscore. No spaces are allowed in the node description. Additionally, names should not start with a digit.
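These naming rules, together with the guid,name layout of the ibnodes file shown in Section 3, can be checked before the file is used. The sketch below is one possible check, not a Fast Fabric tool: `check_ibnodes_line` is a hypothetical helper, the 16-hex-digit GUID length is an assumption based on the sample entries, and the leading-digit recommendation is enforced as if it were a hard rule.

```shell
# check_ibnodes_line ENTRY: hypothetical validator for one "guid,name" entry.
check_ibnodes_line() {
    # Assumption: a node GUID is 0x followed by 16 hex digits, as in the samples.
    # Name: letters, digits and underscore only, not starting with a digit.
    if printf '%s\n' "$1" |
        grep -Eq '^0x[0-9a-fA-F]{16},[A-Za-z_][A-Za-z0-9_]*$'; then
        echo "ok"
    else
        echo "bad entry: $1"
        return 1
    fi
}

check_ibnodes_line '0x00066a00d9000138,edge1'
check_ibnodes_line '0x00066a00d9000139,9edge' || true
```

The first call reports ok; the second is rejected because the name starts with a digit.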
1. -p option
2. PORTS environment variable
3. -t option
4. PORTS_FILE environment variable
5. /etc/sysconfig/iba/ports file
6. default of the first active port on system (0:0 port specification)
For example, if the -p option is used and the PORTS_FILE environment variable is also exported, the command will operate only on ports specified via the -p option.
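The precedence order above can be read as a simple first-match chain. The following sketch only mirrors that list; `resolve_ports` and `DEFAULT_PORTS_FILE` are hypothetical names introduced for illustration, and treating step 5 as "use the file if it exists" is an assumption.

```shell
# resolve_ports OPT_P OPT_T: hypothetical helper that reports which source of
# port selection wins, following the precedence list in the text.
resolve_ports() {
    opt_p=$1   # value given with -p, or empty
    opt_t=$2   # value given with -t, or empty
    # DEFAULT_PORTS_FILE is a hypothetical override so the sketch can be tried anywhere.
    default_file=${DEFAULT_PORTS_FILE:-/etc/sysconfig/iba/ports}
    if [ -n "$opt_p" ]; then
        echo "ports from -p: $opt_p"
    elif [ -n "${PORTS:-}" ]; then
        echo "ports from PORTS: $PORTS"
    elif [ -n "$opt_t" ]; then
        echo "ports file from -t: $opt_t"
    elif [ -n "${PORTS_FILE:-}" ]; then
        echo "ports file from PORTS_FILE: $PORTS_FILE"
    elif [ -f "$default_file" ]; then
        # Assumption: the default file is used only when it exists.
        echo "ports file: $default_file"
    else
        echo "default port: 0:0"
    fi
}

# The case from the text: -p is given while PORTS_FILE is also exported.
( unset PORTS; PORTS_FILE=/tmp/myports; export PORTS_FILE; resolve_ports '1:2' '' )
```

The subshell keeps the demonstration from altering the caller's environment. The host, chassis and switch selections described earlier follow the same pattern with their own options and variables.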
Ports are specified as hca:port. No spaces are permitted. The first HCA is 1 and the first Port is 1. The value 0 for HCA or Port has special meaning. The allowed formats are:
0:0 = 1st active port in system
0:y = port y within system
x:0 = 1st active port on HCA x...
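A short sketch can make the special meaning of 0 easier to see. `describe_port` is a hypothetical helper, not a Fast Fabric command. The list of formats is cut off in this copy of the guide, so the final branch (a nonzero HCA with a nonzero port) is an assumption rather than something the text above states.

```shell
# describe_port SPEC: hypothetical helper that explains an hca:port value.
describe_port() {
    case "$1" in
        # Reject anything other than digits and one colon (this also rejects spaces).
        ''|*[!0-9:]*|*:*:*|:*|*:)
            echo "invalid port spec: '$1'"
            return 1
            ;;
        *:*) ;;
        *)
            echo "invalid port spec: '$1'"
            return 1
            ;;
    esac
    hca=${1%%:*}
    port=${1#*:}
    if [ "$hca" -eq 0 ] && [ "$port" -eq 0 ]; then
        echo "first active port in system"
    elif [ "$hca" -eq 0 ]; then
        echo "port $port within system"
    elif [ "$port" -eq 0 ]; then
        echo "first active port on HCA $hca"
    else
        # Assumption: this case is not visible in the truncated source list.
        echo "HCA $hca, port $port"
    fi
}

for spec in 0:0 0:2 1:0; do
    describe_port "$spec"
done
```

The loop covers the three formats listed above.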
Example:
pingall
pingall -h 'arwen elrond'
HOSTS='arwen elrond' pingall
pingall -C
pingall -C -H 'chassis1 chassis2'
CHASSIS='chassis1 chassis2' pingall -C
Environment Variables:
The following environment variables are also used by this command:
HOSTS, HOSTS_FILE - see discussion on selection of hosts above
CHASSIS, CHASSIS_FILE - see discussion on selection of chassis above
FF_MAX_PARALLEL - when -p option is used maximum number of parallel...
Environment Variables
The following environment variables are also used by this command:
HOSTS, HOSTS_FILE - see discussion on selection of hosts above
5.2.3 setup_ssh
(Linux): creates ssh keys and configures them on all hosts so the system can ssh and scp into all other hosts without a password prompt.
Fast Fabric provides additional flexibility in the translation between IPoIB and management network hostnames. Refer to appendix C for more information.
setup_ssh provides an easy way to create ssh keys and distribute them to the hosts in the cluster.
If hosts have had IP addresses added (for example by installing IB software and enabling IPoIB), or if IP addresses, MAC addresses or other aspects have changed (such as server OS reinstallation), the local host's ssh known_hosts file can be refreshed by running setup_ssh with the -C option.
Host Examples:
cmdall date
cmdall 'uname -a'
cmdall -h 'elrond arwen' date
HOSTS='elrond arwen' cmdall date
Chassis Examples:
cmdall -C 'ismPortStats'
cmdall -C -H 'chassis1 chassis2' ismPortStats
CHASSIS='chassis1 chassis2' cmdall ismPortStats
Environment Variables
The following environment variables are also used by this command:
HOSTS, HOSTS_FILE - see discussion on selection of hosts above...
For operations against chassis, use of the -S option is recommended. This avoids the need to keep the password in configuration files.
5.2.5 captureall
(Switch and Host): Captures supporting information for a problem report from all hosts or SilverStorm IB chassis and uploads to this system
Usage:
captureall [-Cp] [-f hostfile] [-F chassisfile] [-h 'hosts']...
The above example creates a hostcapture directory in ./uploads/<HOSTNAME>/ for each host in /etc/sysconfig/iba/hosts then creates hostcapture.all.tgz.
captureall mycapture
The above example creates a mycapture directory in ./uploads/<HOSTNAME>/ for each host in /etc/sysconfig/iba/hosts then creates mycapture.all.tgz.
For operations against chassis, use of the -S option is recommended. This avoids the need to keep the password in configuration files.
NOTE: The resulting host capture files can require significant amounts of space on the Fast Fabric host.
-f hostfile - file with hosts in cluster, default is /etc/sysconfig/iba/hosts.
-u user - user to perform copy to, default is current user code
source_file: the name of files to copy from this system, relative to the current directory.
5.3.2 uploadall
(Linux): Copies one or more files from a group of hosts to this system. Since the file name will be the same on each host, a separate directory on this system is created for each host and the file is copied to it.
Example:
# upload two files from 2 hosts
uploadall -h 'arwen elrond' capture.tgz /etc/init.d/ipoib.cfg .
# upload two files from all hosts
uploadall capture.tgz /etc/init.d/ipoib.cfg .
# upload network config files from all hosts
uploadall -r -p /etc/sysconfig/network-scripts network-scripts
# upload two files to a specific subdirectory of upload_dir
uploadall capture.tgz /etc/init.d/ipoib.cfg pre-install...
-f hostfile - file with hosts in cluster. The default is /etc/sysconfig/iba/hosts.
-h hosts - the list of hosts to download files to
-u user - the user to perform the copy. The default is current user code
-d download_dir - the directory to download files to.
5.3.4 Simplified Editing of Node-Specific Files
(Linux): The combination of uploadall and downloadall provides a powerful yet simple-to-use mechanism for reviewing and/or editing node-specific files without the need to log in to each node. This is best explained with an example.
5.4.1 Fabric_info
Fabric_info provides a brief summary of the components in the fabric. Fabric_info uses the first active IB port on the given local host to perform its analysis.
Example output:
Fabric_info
Fabric_info has no options and uses no environment variables.
Number of 1x Ports - number of ports in the fabric running at 1x speed. Typically such ports indicate a bad cable connection, a bad cable, a cable that is too long, or faulty hardware on one side of the link.
Fabric_info can be very useful as a quick assessment of the fabric state.
HOSTS - a list of hosts, used if the -h option is not supplied
CHASSIS - a list of chassis, used if the -C is used and the -h option is not supplied
HOSTS_FILE - a file containing the list of hosts, used in absence of -f and -h
CHASSIS_FILE - a file containing the list of chassis, used in absence of -F and -H...
For operations against chassis, use of the -S option is recommended. This avoids the need to keep the password in configuration files. Running showallports against externally managed switches requires an IB-enabled management node with Fast Fabric installed. Typically this will be the Fast Fabric node from which showallports is being run.
output in the same order even if components have been rebooted. This is useful for comparison using simple tools like diff. iba_report permits multiple reports to be requested for a single run (i.e., 1 of each report type). By default iba_report uses the first active port on the local system.
-D/--dest point - destination for trace route
-Q/--quietfocus - do not include focus description in report
Report Types:
comps - summary of all systems and SMs in fabric
brcomps - brief summary of all systems and SMs in fabric
nodes - summary of all node types and SMs in fabric
brnodes - brief summary of all node types and SMs in fabric
ious - summary of all IO units in the fabric...
nodeguid:value1:port:value2 - value1 is numeric node GUID, value2 is port #
iocguid:value - value is numeric IOC GUID
iocguid:value1:port:value2 - value1 is numeric IOC GUID, value2 is port #
systemguid:value - value is numeric system image GUID
systemguid:value1:port:value2 - value1 is numeric system image GUID, value2 is port #...
Examples:
iba_report can generate hundreds of different reports. Following is a list of some commonly generated reports:
Analyze a fabric for bad cables:
iba_report -o slowlinks -o errors
Analyze a fabric for bad cables or misconfigured ports:
iba_report -o slowconfiglinks -o errors
Analyze a fabric for bad cables or misconfigured ports or misconnected ports:
iba_report -o slowconnlinks -o errors...
Identify the routes between this server and another server:
iba_report -o route -D node:goblin
Analyze a single switch for any high error counts:
iba_report -o errors -F 'node:i9k156'
Identify the routes between a server and an IOC:
iba_report -o route -S node:duster -D 'ioc:Chassis 0x00066A005000010C, Slot 2, IOC 2'...
Here is a sample of iba_report for a small fabric:
[root@duster root]# iba_report
Node Type Brief Summary
14 Connected CAs in Fabric:
NodeGUID Type Name
    Port LID PortGUID Width Speed
0x0002c9020020e0d4 CA coyote1
    1 0x000d 0x0002c9020020e0d5 2.5Gb...
    1 0x0011 0x00066a00a000447b 2.5Gb
    2 0x0012 0x00066a01a000447b 2.5Gb
0x00066a0098004a73 CA erik
    1 0x0009 0x00066a00a0004a73 2.5Gb
3 Connected Switches in Fabric:
NodeGUID Type Name
    Port LID PortGUID Width Speed
0x00066a00280002cd SW InfiniCon Systems InfiniFabric (Sw A Dev
    0 0x0013 0x00066a00280002cd Noop Noop 2.5Gb...
    0 0x0010 0x00066a10280002cd Noop Noop 2.5Gb 2.5Gb
1 Connected SMs in Fabric:
State GUID Name
Master 0x00066a00d8000123 InfiniCon Systems InfinIO9024
Each iba_report allows for various levels of detail. Increasing detail is shown as further indentation of the additional information.
permitted. The maximum detail per report varies, but most have fewer than 5 detail levels. For example, the above report when run at detail level 0 outputs:
[root@duster root]# iba_report -d 0
Node Type Brief Summary
14 Connected CAs in Fabric:
3 Connected Switches in Fabric:
1 Connected SMs in Fabric:...
0x00066a00d8000123 SW InfiniCon Systems InfinIO9024
0x00066a10280002cd SW InfiniCon Systems InfiniFabric (Sw A Dev
1 Connected SMs in Fabric:
State GUID Name
Master 0x00066a00d8000123 InfiniCon Systems InfinIO9024
The above examples were all performed with a single report, the brnodes (Brief Nodes) report.
iba_report also has reports that help to analyze the operational characteristics of the fabric and to identify bottlenecks and faulty components in the fabric. To assist in this area, iba_report supports the following reports:
slowlinks - identifies links which are running slower than expected.
5.4.3.2 Topology Verification
iba_report provides a flexible way to identify changes to the fabric or the appropriate reassembly of the fabric after a move (for example after staging and testing the fabric in a remote location before final installation at a customer site). In this mode of operation, all the above reports are available, however the types of information output can be filtered.
When focusing a report, it can sometimes be helpful to also use a detail level of 0 or 1. In this case the report will show only a count of number of matches (for detail 0) and just the highest level of the entity which matches (for detail 1).
5.4.3.3.2 Focus Examples:
Below are some examples of using the focus options:
iba_report -o nodes -F portguid:0x00066a00a000447b
iba_report -o nodes -F nodeguid:0x00066a009800447b:port:1
iba_report -o nodes -F nodeguid:0x00066a009800447b
iba_report -o nodes -F node:duster
iba_report -o nodes -F node:duster:port:1
iba_report -o nodes -F 'nodepat:d*'
iba_report -o nodes -F 'nodepat:d*:port:1'...
5.4.3.4.1 Using iba_report to monitor for fabric changes
iba_report can easily be used in other scripts. For example, the following simple script could be run as a cron job to identify if the fabric has changed as compared to the initial design:
#!/bin/bash
# specify some filenames to use...
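The script from the guide is cut off in this copy, so what follows is an independent minimal sketch of the same idea and not a reconstruction of the original. It assumes a baseline report was saved earlier with the same report type; `check_fabric` and the `REPORT_CMD` variable are hypothetical names, and the brnodes report is used only because it appears elsewhere in this guide.

```shell
# Hypothetical sketch: compare a fresh report with a previously saved baseline.
# REPORT_CMD is an illustrative knob so the function can be tried without a fabric.
REPORT_CMD=${REPORT_CMD:-"iba_report -o brnodes"}

check_fabric() {
    baseline=$1
    latest=$(mktemp) || return 2
    # Word splitting of $REPORT_CMD is intentional: it holds a command line.
    $REPORT_CMD > "$latest" 2>/dev/null
    if diff "$baseline" "$latest" > /dev/null; then
        echo "fabric unchanged"
        rc=0
    else
        echo "fabric changed: compare $baseline with a fresh report"
        rc=1
    fi
    rm -f "$latest"
    return $rc
}
```

A cron job could call `check_fabric /path/to/baseline` (the path is a placeholder) and send a notification when the exit status is nonzero. Because iba_report emits its output in a stable order, a plain diff is usually enough for this comparison.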
5.4.3.5 Sample Output
5.4.3.5.1 Analysis of all ports in fabric for errors, inconsistent connections, bad cables
[root@duster root]# iba_report -o errors -o slowconnlinks
Links running slower than faster port Summary
Links running slower than expected:
20 of 20 Links Checked, 0 Errors found
Links configured to run slower than supported:...
ExcessiveBufferOverrunErrors VL15Dropped
Rate NodeGUID Port Type Name
10g 0x00066a0098000001 1 CA julio
<-> 0x00066a00d8000123 8 SW InfiniCon Systems InfinIO9024
LinkDownedCounter: 5 Exceeds Threshold: 3
10g 0x00066a00980001b8 1 CA orc
<-> 0x00066a00d8000123 10 SW InfiniCon Systems InfinIO9024
LinkDownedCounter: 5 Exceeds Threshold: 3
10g 0x00066a0098000380 1 CA goblin...
5.4.3.5.2 Identification of the route between 2 nodes in the fabric
[root@duster root]# ./iba_report -o route -S node:orc -D node:julio
Routes Summary Between:
Node: 0x00066a00980001b8 CA orc
and Node: 0x00066a0098000001 CA julio
Routes between ports:
0x00066a00980001b8 1 CA orc...
Page 101
5 – Detailed Descriptions of Command LineTools 5.4.3.5.3 Analysis of the route between 2 nodes for errors, inconsistent connections, etc [root@duster root]# ./iba_report -o errors -o slowconnlinks -F route:node:orc:node:julio Links running slower than faster port Summary Focused on: 4 Ports: 1 0x00066a00a00001b8 in Node: 0x00066a00980001b8 CA orc in Node: 0x00066a00d8000123 SW InfiniCon Systems InfinIO9024...
Page 102
5 – Detailed Descriptions of Command LineTools Fabric Analysis Tools 5.4.3.5.4 Obtain very detailed information about nodes NOTE: To shorten the length of the output, the following example focuses on only 1 node. [root@duster root]# iba_report -o nodes -F node:erik -d 5 -s Node Type Summary Focused on: System: 0x00066a0098004a73 Node: 0x00066a00980003a6 CA erik...
Page 103
5 – Detailed Descriptions of Command LineTools Link Error Recovery Link Downed Port Rcv Errors Port Rcv Rmt Phys Err Port Rcv Sw Relay Err Port Xmit Discards Port Xmit Constraint Port Rcv Constraint Local Link Integrity Exc. Buffer Overrun VL15 Dropped Name: erik NodeGUID: 0x00066a0098004a73 Type: CA...
Page 104
5 – Detailed Descriptions of Command LineTools Fabric Analysis Tools 5.4.3.5.5 Obtain very detailed information about IOUs NOTE: To shorten the length of the output, the following example focuses on only 1 IOC. [root@duster root]# iba_report -o ious -F ioc:'Chassis 0x00066A005000010C, Slot 2, IOC 2' -d 5 IOU Summary Focused on: Ioc: 2 0x00066a02300001e0 Chassis 0x00066A005000010C, Slot 2, IOC 2...
Page 105
5 – Detailed Descriptions of Command LineTools 10g 0x00066a00980001b8 1 CA orc <-> 0x00066a00d8000123 10 SW InfiniCon Systems InfinIO9024 10g 0x00066a0098000380 1 CA goblin <-> 0x00066a00d8000123 15 SW InfiniCon Systems InfinIO9024 2.5g 0x00066a0098000384 1 CA cuda <-> 0x00066a00d8000123 2 SW InfiniCon Systems InfinIO9024 10g 0x00066a0098000384 2 CA cuda <->...
Page 106
5 – Detailed Descriptions of Command LineTools Fabric Analysis Tools 5.4.3.5.7 Reverse lookups, translate a LID or GUID into the information about the node or port represented [root@duster root]# iba_report -o nodes -F lid:5 Node Type Summary Focused on: Port: 1 0x00066a00a0000384 in Node: 0x00066a0098000384 CA cuda 13 Connected CAs in Fabric:...
Page 107
5 – Detailed Descriptions of Command LineTools 5.4.3.5.8 Forward lookups - lookup nodes or IOCs by name [root@duster root]# iba_report -o nodes -F node:erik Node Type Summary Focused on: System: 0x00066a0098004a73 Node: 0x00066a00980003a6 CA erik Node: 0x00066a0098004a73 CA erik 13 Connected CAs in Fabric: Name: erik NodeGUID: 0x00066a00980003a6 Type: CA Ports: 2 PartitionCap: 64 SystemImageGuid: 0x00066a0098004a73...
Page 108
5 – Detailed Descriptions of Command LineTools Fabric Analysis Tools 5.4.3.5.9 Generate reports in a "comparable manner" so topology verification can be performed against a known good configuration NOTE: To shorten the length of the output, the following example focuses on only 1 node.
5 – Detailed Descriptions of Command LineTools Fabric Analysis Tools 5.4.4 saquery (All): saquery can perform various queries of the subnet manager/subnet agent and provide detailed fabric information. In many cases iba_report is the more powerful tool; however, saquery is preferred in some cases, especially when dealing with service records and multicast.
Page 111
5 – Detailed Descriptions of Command LineTools -A/--gidlist 'sgid ...;dgid ...': query by a list of Gids -o/--output type: output type for query (default is node) Node Types: ca - channel adapter sw - switch rtr - router GIDs: Specify a 64 bit subnet and 64 bit interface ID as: subnet:interface. For example: 0xfe80000000000000:0x00066a00a0000380 Output Types:...
Page 112
5 – Detailed Descriptions of Command LineTools Fabric Analysis Tools mcfdb: list of switch multicast FDB records trace: list of trace records The following combinations of input (assorted query by options) and output (-o) are permitted: Table 5-1. Input Combinations -o output Input option permitted...
5 – Detailed Descriptions of Command LineTools Advanced Initialization and Verification - ibtest Advanced Initialization and Verification - ibtest (Switch and Host) Ibtest performs a number of multi-step operations. In general, operations performed by ibtest involve a login to one or more target systems (hosts or SilverStorm IB chassis, depending on the options used).
Page 115
5 – Detailed Descriptions of Command LineTools mpi mpidev mpisrc udapl sdp rds, or for a chassis upgrade, filenames/directories of firmware images to install. For directories specified, all .pkg files in directory tree will be used. shell wildcards may also be used within quotes, or for a switch upgrade, filename/directory of firmware image to install.
Page 116
5 – Detailed Descriptions of Command LineTools Advanced Initialization and Verification - ibtest Ibtest provides detailed logging of its results. During each run the following files are produced: test.res: appended with summary results of run test.log: appended with detailed results of run save_tmp/: contains a directory per failed test with detailed logs test_tmp*/: intermediate result files while test is running The -c option will remove all of the above.
5 – Detailed Descriptions of Command LineTools FF_PASSWORD: password to use to login as FF_USERNAME. Used in absence of -S option. FF_ROOTPASS: password to use when su to root (if FF_USERNAME is not root). Used in absence of -S option. FF_LOGIN_METHOD: how to login to hosts (Telnet, RSH or SSH), default is FF_TIMEOUT_MULT: multiplier for response timeouts.
Page 118
5 – Detailed Descriptions of Command LineTools Advanced Initialization and Verification - ibtest to select different install packages, the defaults are iba ipoib mpi (i.e., IB Stack, IPoIB and MPI). The default is the typical configuration for an MPI cluster compute node.
Page 119
5 – Detailed Descriptions of Command LineTools 5.5.1.5 sacache This verifies the given hosts have properly communicated with the SA and cached paths to each other. To run this command, InfiniBand must be installed and running on the given hosts. The subnet manager and switches must be up. If this test fails, cmdall 'cat /proc/driver/ics_dsc/gids' can be run against any problem nodes to see what they have cached.
5 – Detailed Descriptions of Command LineTools Advanced Initialization and Verification - ibtest numbers representative of what most servers can achieve. Some server models may have 10-20% higher results. A result 5-10% below the above numbers is typically not cause for serious alarm, but may reflect limitations in the server design or the chosen BIOS settings.
5 – Detailed Descriptions of Command LineTools 5.5.2.2 reboot This reboots the given chassis and ensures they go down and come back up by pinging them during the reboot process. By selecting the proper FF_MAX_PARALLEL value a rolling reboot or a parallel reboot may be accomplished.
5 – Detailed Descriptions of Command LineTools Advanced Initialization and Verification - ibtest 5.5.4 Interpreting the ibtest log files Each run of ibtest will create test.log and test.res files in the current directory. When ibtest indicates that some or all of the test cases failed, the test.res and test.log files should be reviewed.
5 – Detailed Descriptions of Command LineTools match the intended host names. Also make sure that when IPoIB host names are used, the correct name was formed based on the ibtest -i '<IPOIB SUFFIX>' argument. This applies a suffix to host names to create IPoIB host names.
5 – Detailed Descriptions of Command LineTools Health Check and Baselining Tools Compares present HW/SW/config to baseline ❑ Can be scheduled in hourly cron jobs ❑ As needed rerun "baseline" when expected changes occur ❥ Fabric upgrades ❑ Hardware replacements/changes ❑...
Page 125
5 – Detailed Descriptions of Command LineTools present. Also verify the appropriate links between servers and switches are present. If the fabric is not correctly configured, correct the configuration and rerun the baseline. Once a good baseline has been established, use the tools to compare the present fabric against the baseline and check its health.
5 – Detailed Descriptions of Command LineTools Health Check and Baselining Tools previous baseline which is to be compared to (except when -e option is used). The FF_ANALYSIS_DIR option in fastfabric.conf can be changed to provide a customer specific alternate directory which will be used whenever the -d option is not specified.
Page 127
5 – Detailed Descriptions of Command LineTools fabric components (nodes, links, SMs, systems, and their SMA configuration) ❥ fabric PMA error counters and link speed mismatches ❥ Note that the comparison includes components on the fabric. Therefore operations such as shutting down a server will cause the server to no longer appear on the fabric and will be flagged as a fabric change or failure by fabric_analysis.
Page 128
5 – Detailed Descriptions of Command LineTools Health Check and Baselining Tools options and reports to be used for the health analysis. It also can specify the PMA counter clearing behavior (-i seconds, -C, or none at all). The thresholds for PMA counter analysis default to /etc/sysconfig/iba/iba_mon.conf.
Page 129
5 – Detailed Descriptions of Command LineTools latest/fabric.0:0.links.diff - diff of baseline and latest fabric internal and external links. The .diff files are only created if differences are detected. If the -s option is used and failures are detected, files related to the checks that failed are also copied to the timestamped directory name under FF_ANALYSIS_DIR, such as: FF_ANALYSIS_DIR/2007-11-22-09:53:04...
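The timestamped directory name shown above follows a year-month-day-hour:minute:second pattern. As an illustration, a name in the same shape can be produced with date. The helper name snapshot_dir is hypothetical, and this sketch only mirrors the naming pattern; it does not claim to reproduce how the analysis tools build the name internally.

```shell
# snapshot_dir BASE
# Prints BASE followed by a timestamp shaped like 2007-11-22-09:53:04.
# Hypothetical helper for illustration only.
snapshot_dir() {
    local base=$1
    echo "$base/$(date +%Y-%m-%d-%H:%M:%S)"
}
```

Because the fields run from most to least significant, a plain alphabetical listing of the directory also sorts the snapshots chronologically.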
5 – Detailed Descriptions of Command LineTools Health Check and Baselining Tools 5.6.3.2 IB fabric Items that are also checked during healthcheck Based on iba_report -o errors -o slowlinks: PMA error counters on all IB ports (HCA, switch external and switch internal) ❥...
Page 131
5 – Detailed Descriptions of Command LineTools Environment Variables The following environment variables are also used by this command: CHASSIS, CHASSIS_FILE - see the discussion on the selection of chassis above. FF_TIMEOUT_MULT - multiplier for response timeouts. The default is 2. This typically does not need to be set, but in the event of unexpected timeouts or extremely slow chassis or management network, a larger value can be used.
Page 132
5 – Detailed Descriptions of Command LineTools Health Check and Baselining Tools All files generated by chassis_analysis start with chassis. in the file name. The chassis_analysis tool generates files such as the following within FF_ANALYSIS_DIR. The actual file names reflect the individual chassis commands that have been configured via the FF_CHASSIS_HEALTH and FF_CHASSIS_CMDS parameters: Health Check:...
Page 133
5 – Detailed Descriptions of Command LineTools latest/chassis.fwVersion - the output of the fwVersion command for all selected chassis. latest/chassis.fwVersion.diff - diff of the baseline and latest fwVersion. latest/chassis.ismChassisSet12x - the output of the ismChassisSet12x command for all selected chassis. latest/chassis.ismChassisSet12x.diff - the diff of the baseline and latest ismChassisSet12x.
Page 134
5 – Detailed Descriptions of Command LineTools Health Check and Baselining Tools latest/chassis.timeDSTConf - the output of the timeDSTConf command for all selected chassis. latest/chassis.timeDSTConf.diff - diff of the baseline and latest timeDSTConf. latest/chassis.timeZoneConf - the output of the timeZoneConf command for all selected chassis.
5 – Detailed Descriptions of Command LineTools changes to SNMP persistent configuration within the chassis ❥ The following Chassis items will not be checked against baseline: changes to the chassis configuration on the management LAN (e.g., showChassisIpAddr, showDefaultRoute). ❥ Such changes will typically result in the chassis not responding on the LAN at the expected address, which is detected by failures to perform the other chassis checks.
Page 136
5 – Detailed Descriptions of Command LineTools Health Check and Baselining Tools The hostsm_analysis tool performs analysis against the local server only. It is assumed that both the host SM and Fast Fabric are installed on the same system. Environment Variables The following environment variables are also used by this command: FF_CURTIME - the timestamp to use on the directory created in FF_DIFF_CMD - Linux command to use to compare baseline to latest...
5 – Detailed Descriptions of Command LineTools 5.6.5.1 Host SM items checked against the baseline SM configuration file ❥ The version of the SM rpm installed on the system ❥ 5.6.5.2 Host SM items also checked during healthcheck - The SM is in the running state 5.6.6 esm_analysis (Switch): The esm_analysis command has the following usage:...
Page 138
5 – Detailed Descriptions of Command LineTools Health Check and Baselining Tools or extremely slow chassis or management network, a larger value can be used. FF_CHASSIS_LOGIN_METHOD - how to login to a chassis. Can be SSH or Telnet. FF_CHASSIS_ADMIN_PASSWORD - the password for administrator on all chassis.
Page 139
5 – Detailed Descriptions of Command LineTools Full analysis: latest/esm.smstatus - the output of the smControl status command for all selected chassis. latest/esm.smShowDefBcGroup - the output of the smShowDefBcGroup command for all selected chassis. latest/esm.smShowDefBcGroup.diff - diff of baseline and latest smShowDefBcGroup.
5 – Detailed Descriptions of Command LineTools Note that the all_analysis command has options which are a superset of the options for all other analysis commands. The options will be passed along to the respective tools (e.g., the -c file option will be passed on to fabric_analysis if it is specified in FF_ALL_ANALYSIS).
Page 142
5 – Detailed Descriptions of Command LineTools Health Check and Baselining Tools automated ❥ In both cases the user should follow the initial setup procedure outlined above to create a good baseline of the configuration. In the manual method, the user would run the tools manually when trying to diagnose problems, or when there is a concern or need to validate the configuration and health.
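For the automated method, a small wrapper can run a health check from cron and record failures. The sketch below is an assumption-laden illustration: the function name run_check, the log location, and the wrapper path in the commented crontab line are all hypothetical, and it assumes the analysis command signals a failed check through a non-zero exit status, which should be confirmed for the installed tools.

```shell
# run_check LOGFILE COMMAND [ARGS...]
# Runs COMMAND, appending its output to LOGFILE.
# Returns 0 on success; on failure appends a dated note and returns 1.
# Hypothetical helper; assumes a non-zero exit status means failure.
run_check() {
    local log=$1
    shift
    if "$@" >> "$log" 2>&1; then
        return 0
    fi
    echo "$(date): health check failed: $*" >> "$log"
    return 1
}

# Possible hourly crontab entry (wrapper path is hypothetical):
# 0 * * * * /usr/local/sbin/fabric_check.sh
# where fabric_check.sh sources this function and calls, for example:
# run_check /var/log/fabric_check.log all_analysis
```

Keeping the command as a parameter lets the same wrapper drive any of the analysis tools described in this section.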
Section 6 MPI Sample Applications As part of an MPI Development installation, some sample MPI applications and benchmarks are installed to /opt/iba/src/mpi_apps. These can be used to perform basic tests and performance analysis of MPI. As part of this package the following sample applications are provided: OSU latency (2 versions) ❥...
6 – MPI Sample Applications OSU Latency For all but the mandel and tachyon demos, when MPI applications are run with the scripts provided, the results of the run will be logged to a file in /opt/iba/src/mpi_apps/logs. The file name will include the date and time of the run for uniqueness.
6 – MPI Sample Applications This will run assorted bandwidths from 4K to 4Mbytes. To run a different set of message sizes an optional argument specifying the maximum message size can be provided. This benchmark will only use the first two nodes listed in mpi_hosts. During this benchmark the /opt/iba/src/mpi_apps/mpi.param.pallas config file is used.
Page 146
6 – MPI Sample Applications Pallas Prior to running this application, an HPL.dat file must be installed into /opt/iba/src/mpi_apps/hpl/bin/ICS/HPL.dat on all nodes. The config_hpl script and some sample configurations are included. The config_hpl script can select from one of the assorted HPL.dat files in hpl-config.
Page 147
6 – MPI Sample Applications Pallas has known scalability limitations, especially in its AllToAll phase. This phase can simultaneously perform up to 4MB transfers to-and-from all nodes at once. The downside is a system must have approx 10*NP MB of memory available per process for Pallas data to run this benchmark.
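The memory rule of thumb above (approximately 10*NP MB per process) can be worked through with simple shell arithmetic. The function names and the processes-per-node parameter are hypothetical additions for illustration; the result is only as accurate as the approximation itself.

```shell
# pallas_mem_mb NP
# Approximate MB needed per process for a run with NP processes,
# using the "approx 10*NP MB per process" rule of thumb.
pallas_mem_mb() {
    local np=$1
    echo $(( 10 * np ))
}

# pallas_node_mem_mb NP PPN
# Approximate MB needed per node when PPN processes share one node.
# PPN is an assumed parameter, not something the guide defines.
pallas_node_mem_mb() {
    local np=$1 ppn=$2
    echo $(( 10 * np * ppn ))
}
```

For example, pallas_mem_mb 64 prints 640, so a 64-process run needs roughly 640 MB per process, and pallas_node_mem_mb 64 4 prints 2560 for a node hosting four of those processes.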
Appendix A Fast Fabric Quick Install Checklist The sections below provide a checklist to aid in tracking which steps have been completed for Fabric Setup, Installation and verification. Check off each step as it is performed. Refer to the section “Getting Started” on page 3-1 for a more detailed explanation of each step.
A – Fast Fabric Quick Install Checklist
Installing and verifying Firmware on the IB Chassis
1. All Chassis connected to management network.
2. Unique IP address configured for each chassis.
3. Unique name configured for each chassis.
4....
A – Fast Fabric Quick Install Checklist
5. Set up password-less SSH/SCP.
6. TCP/IP Name Resolution configured on all hosts.
   a. If using /etc/hosts - copy /etc/hosts to all hosts.
   b. If using DNS - /etc/resolv.conf copied or configured on all hosts.
7....
A – Fast Fabric Quick Install Checklist
Configure and initialize health check tools
This procedure should be followed on each IB management node from which the health check tools will be used.
1. Edit fastfabric.conf and review the health check tools parameters.
2....
Appendix B Fast Fabric Configuration Files
The following configuration files are used by Fast Fabric:
Table B-1. Fast Fabric Configuration Files
Configuration File                 Description
/etc/sysconfig/fastfabric.conf     Overall configuration file
/etc/sysconfig/iba/iba_mon.conf    Error thresholds
/etc/sysconfig/iba/allhosts        List of all hosts managed by fast fabric including the localhost
/etc/sysconfig/iba/hosts           List of all hosts managed by fast fabric...
Page 154
B – Fast Fabric Configuration Files fastfabric.conf
NOTE: Do not edit /etc/sysconfig/fastfabric.conf-sample. The use of various configuration variables is discussed in the Environment Variables section for each command.
#!/bin/bash
# [ICS VERSION STRING: unknown]
# This is a bash sourced config file which defines variables used
# fast fabric tools....
Page 155
B – Fast Fabric Configuration Files
# The shell functions below are only defined if no existing function/command
# with given name, hence allowing use of shell functions or creation of a
# command for this operation
# shell Function to convert a basic hostname into an IPoIB hostname
# if FF_IPOIB_SUFFIX is "", this should return $1 unmodified
# such that commands can be used with -i ""...
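The comments above describe two ideas: defining a helper only when nothing with that name already exists, and appending FF_IPOIB_SUFFIX to a base host name. A minimal sketch of both follows. The function name to_ipoib_name is hypothetical and is not the name used in fastfabric.conf, whose actual function body is not shown in this excerpt.

```shell
# Define the helper only if no function or command of that name exists,
# so a site can supply its own replacement beforehand.
# to_ipoib_name is a hypothetical name used for illustration.
if ! type to_ipoib_name >/dev/null 2>&1; then
    to_ipoib_name() {
        # With an empty or unset FF_IPOIB_SUFFIX the name is returned as-is.
        echo "$1${FF_IPOIB_SUFFIX:-}"
    }
fi
```

With FF_IPOIB_SUFFIX set to -ib, to_ipoib_name node1 prints node1-ib; with an empty suffix it prints node1 unchanged, which matches the behavior the comments require for -i "".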
Page 156
B – Fast Fabric Configuration Files fastfabric.conf
# set to 1 to avoid parallel execution
export FF_MAX_PARALLEL=${FF_MAX_PARALLEL:-20}
# If the systems are slow for some reason, this can be used to provide a
# multiplier for all timeouts in ibtest
export FF_TIMEOUT_MULT=${FF_TIMEOUT_MULT:-2}
# InfiniServ product to install during ibtest load and ibtest upgrade...
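The settings above rely on the shell's ${VAR:-default} expansion: a value already present in the environment is kept, and the default applies only when the variable is unset or empty. The short sketch below demonstrates that behavior in isolation; show_parallel is a hypothetical helper and not part of fastfabric.conf.

```shell
# show_parallel
# Prints the effective FF_MAX_PARALLEL using the same expansion as the
# config file: the caller's value if set and non-empty, otherwise 20.
# Hypothetical helper for illustration only.
show_parallel() {
    local value=${FF_MAX_PARALLEL:-20}
    echo "$value"
}
```

Running show_parallel with the variable unset prints 20, while FF_MAX_PARALLEL=1 show_parallel prints 1. This is why a one-off override on the command line takes effect without editing the configuration file.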
Page 157
B – Fast Fabric Configuration Files
export FF_PASSWORD="${FF_PASSWORD:-}"
# if FF_USERNAME is not root, what is the root password needed when
# suing to root
export FF_ROOTPASS="${FF_ROOTPASS:=}"
# How to login to chassis
# can be ssh or telnet
export FF_CHASSIS_LOGIN_METHOD="${FF_CHASSIS_LOGIN_METHOD:-telnet}"...
B – Fast Fabric Configuration Files iba_mon.conf
export FF_FABRIC_HEALTH="${FF_FABRIC_HEALTH:- -s -C -o errors -o slowlinks}"
# list of CLI commands to issue during chassis_analysis
export FF_CHASSIS_CMDS="${FF_CHASSIS_CMDS:-showInventory fwVersion showIBNodeDesc ismShowPStatThresh ismChassisSet12x timeZoneConf timeDSTConf snmpCommunityConf snmpTargetAddr showChassisIpAddr showDefaultRoute}"
# other possible additions (if running newer chassis FW which supports these)
# ismIslSet12x, ismIslSetSpeed
# single CLI command to issue to check overall health during...
Page 159
B – Fast Fabric Configuration Files NOTE: Do not edit /etc/sysconfig/iba/iba_mon.conf-sample. D000006-000 Rev A...
Page 160
B – Fast Fabric Configuration Files iba_mon.conf
# This file controls the iba_mon Port Counter monitoring Thresholds.
# [ICS VERSION STRING: unknown]
# Error Counters are specified in absolute number of errors over Interval.
# All Data Movement thresholds are specified in terms of average data/second
# over the monitoring interval....
B – Fast Fabric Configuration Files Host List Files The /etc/sysconfig/iba/hosts and /etc/sysconfig/iba/allhosts files are used to specify the hosts which Fast Fabric will operate against for many operations. If desired, alternate filenames may be specified in fastfabric.conf, via environment variables or on the command line. Refer to the section “Selection of Hosts”...
B – Fast Fabric Configuration Files Selection of slots within a chassis
# this is a comment
192.168.0.5 # chassis IP address
edge1 # chassis resolvable TCP/IP name
include /etc/sysconfig/iba/corechassis # included file
Each line of the chassis list file may specify a single chassis, a comment or another chassis list file to include....
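The list-file format described above (one entry per line, # comments, and include directives) can be flattened with a short recursive reader. This is a sketch under stated assumptions: expand_list is a hypothetical helper, it is not how the Fast Fabric tools themselves parse these files, and it does not guard against include loops.

```shell
#!/bin/bash
# expand_list FILE
# Prints one entry per line, skipping comments and blank lines and
# following "include OTHERFILE" directives recursively.
# Hypothetical helper; no protection against circular includes.
expand_list() {
    local file=$1 line first second
    while IFS= read -r line || [ -n "$line" ]; do
        line=${line%%#*}                          # strip any trailing comment
        read -r first second _ <<< "$line" || true
        [ -n "$first" ] || continue               # blank or comment-only line
        if [ "$first" = include ] && [ -n "$second" ]; then
            expand_list "$second"                 # descend into the included file
        else
            echo "$first"
        fi
    done < "$file"
}
```

Applied to the sample file above, the function would print 192.168.0.5 and edge1, followed by whatever entries the included corechassis file contains.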
B – Fast Fabric Configuration Files 9000 series chassis, slot 0 is always an alias for the presently active management card for the chassis. For the remainder of slot usages in the chassis, the chassisQuery command can be executed against a given chassis to identify which slots have management, EVIC or FVIC cards.
B – Fast Fabric Configuration Files Port List Files For externally-managed switches, the node GUID can be found on a label on the bottom of the switch. Alternatively, the node GUIDs for switches in the fabric can be found using a command such as: saquery -t sw -o nodeguid NOTE: The above command will report all switch node GUIDs, including those...
Page 165
B – Fast Fabric Configuration Files Comments may be placed on any line by using a # to precede the comment. On lines with a port or include directive, the # must be white-space separated from any preceding port or included file name. D000006-000 Rev A B-13...
Page 167
Appendix C Configuration of IPoIB Name Mapping The Fast Fabric tools support the concept of a management network and an IPoIB network. For some clusters the management network will be a low speed network such as 10/100 Ethernet. For other clusters IPoIB may serve double duty as the host management network.
Appendix D Multi-Subnet Fabrics Fast Fabric is designed primarily to manage a single subnet fabric. However, many powerful functions of Fast Fabric are also available when installing and operating multi-subnet fabrics. When operating a multi-subnet fabric, a subnet manager (SM) is required for each subnet....
Page 170
D – Multi-Subnet Fabrics Primarily Independent Subnets typically with different IP subnets. Consult the QuickSilver Fabric Access Software Users Guide for more information on configuring IPoIB. NOTE: When managing a cluster where compute nodes are not running the QuickSilver host stack or where the IPoIB settings on the compute nodes are incompatible with the IB Management node (e.g., when a 4K MTU is used on the compute nodes), it is recommended not to run IPoIB on the IB management node(s).
D – Multi-Subnet Fabrics (All): However instead it is recommended to run: ❥ iba_report -i 10 -o errors -o slowlinks -h x -p y where x and y specify the applicable HCA and port to select the desired subnet. Repeat for each subnet. (Host): “Verify Hosts see each other”...
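Repeating the recommended report for each subnet can be scripted. The sketch below only prints the command lines, one per HCA and port pair, so they can be reviewed before being run. The function name report_cmds and the hca:port argument notation are assumptions for illustration; the iba_report options are the ones quoted above.

```shell
# report_cmds HCA:PORT [HCA:PORT...]
# Prints one iba_report command line per HCA:PORT pair (dry run only).
# Hypothetical helper; the pair notation is an assumption of this sketch.
report_cmds() {
    local pair
    for pair in "$@"; do
        echo "iba_report -i 10 -o errors -o slowlinks -h ${pair%%:*} -p ${pair##*:}"
    done
}
```

For example, report_cmds 1:1 2:1 prints two command lines, the first with -h 1 -p 1 and the second with -h 2 -p 1, one for each subnet reachable from the management node.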
Page 172
D – Multi-Subnet Fabrics Overlapping Subnets fabric operation. Follow the installation instructions outlined in “Getting Started” on page 3-1 with the following adjustments: From “Design the Fabric” on page 3-1, design the cabling such that the Fast Fabric node will be connected to each IB subnet it will manage. The Fast Fabric node must also have a management network path to all the nodes in all the subnets it will manage.
Page 173
D – Multi-Subnet Fabrics (All): Create the allhosts file per the instructions. Next, create additional files ❥ per subnet that list all the hosts in each subnet including the IB management node. (All): “Verify Hosts via Ethernet ping” on page 4-4 can be performed per the ❥...
Page 174
D – Multi-Subnet Fabrics Overlapping Subnets variable may be used to specify all the HCAs and ports on the IB management node such that all subnets are checked. Similarly, the esm_chassis and chassis files used should list all relevant SilverStorm IB chassis in all subnets. “Running HPL”...