Silicon Graphics Altix ICE 8000 Series Quick Reference Manual


Scali Manage on SGI Altix ICE System Quick Reference Guide
007–5450–001

Summary of Contents for Silicon Graphics Altix ICE 8000 Series

  • Page 1 Scali Manage On SGI Altix ICE System Quick Reference Guide 007–5450–001...
  • Page 2 52.227-14. TRADEMARKS AND ATTRIBUTIONS SGI, the SGI logo, and Altix are registered trademarks and SGI ProPack is a trademark of SGI in the United States and/or other countries worldwide. Altair is a registered trademark and PBS Professional is a trademark of Altair Engineering, Inc. Intel, Xeon, and Itanium are trademarks or registered trademarks of Intel Corporation.
  • Page 3 Record of Revision Version Description April 2008 Original publication. 007–5450–001...
  • Page 5: Table Of Contents

    Related Publications Obtaining Publications Conventions xvii Reader Comments xvii 1. SGI Altix ICE 8000 Series System Overview ..Hardware Overview Basic System Building Blocks InfiniBand Fabric Gigabit Ethernet Network Individual Rack Unit...
  • Page 6 Contents Storage Service Node Networks Networks Overview Gigabit Ethernet (GigE) and 10/100 Ethernet Connections VLANs InfiniBand Fabric Network Interface Naming Conventions Ethernet Networks InfiniBand Networks System Admin Controller Service Nodes Rack Leader Controllers Chassis Management Control (CMC) Blade Compute Nodes 2.
  • Page 7 ® ® Scali Manage On SGI Altix ICE System Quick Reference Guide Configuration Session Example Using the Scali Manage GUI Displaying Cluster Components Scali Manage Troubleshooting Tips Compute Node RPMs Compute Node RPMs on SLES Compute Node RPMs on RHEL 3.
  • Page 9 Figures Figure 1-1 Basic System Building Blocks Figure 1-2 Chassis Manager Cabling Figure 1-3 Service Nodes Figure 1-4 Network Connections In a System With Two IRUs Figure 1-5 Chassis Manager VLAN_GBE and VLAN_BMC Network Connections - IRU View Figure 1-6 Figure 1-7 VLAN_GBE and VLAN_BMC Network Connections –...
  • Page 11 Examples Example A-1 opensm-ib0.conf and opensm-ib.conf Configuration Files 007–5450–001...
  • Page 13 Procedures Procedure A-1 Configuring and Initializing the InfiniBand Fabric Manually 007–5450–001 xiii...
  • Page 15: About This Guide

    About This Guide This guide is a reference document for people who manage the operation of SGI Altix ICE 8000 series systems running SUSE Linux Enterprise Server 10 Service Pack 1 or Red Hat Enterprise Linux 5.1 (RHEL5.1) with SGI ProPack 5 for Linux Service Pack 4 (or later).
  • Page 16: Obtaining Publications

    This library contains the most recent and most comprehensive set of online books, release notes, man pages, and other information. • Online versions of the SGI ProPack 5 for Linux Service Pack 4 Start Here, the SGI ProPack 5 SP4 release notes, which contain the latest information about software...
  • Page 17: Conventions

    Reader Comments If you have comments about the technical accuracy, content, or organization of this publication, contact SGI. Be sure to include the title and document number of the publication with your comments. (Online, the document number is located in the front matter of the publication.
  • Page 18 About This Guide Sunnyvale, CA 94085–4602 SGI values your comments and will respond to them promptly. xviii 007–5450–001...
  • Page 19: Sgi Altix Ice 8000 Series System Overview

    Chapter 1 SGI Altix ICE 8000 Series System Overview An SGI Altix ICE 8000 series system is an integrated blade environment that can scale to thousands of nodes. The Scali Manage management software enables you to provision, install, configure, and manage your system. This chapter provides an overview of the SGI Altix ICE 8000 series system and covers the following topics: •...
  • Page 20: Infiniband Fabric

    1: SGI Altix ICE 8000 Series System Overview [Figure 1-1: Basic System Building Blocks, showing a 42U high rack with a rack leader controller, independent rack units (IRUs), power supplies, an admin server, a chassis manager, and InfiniBand switch blades] This hardware overview section covers the following topics: •...
  • Page 21: Power Supply

    • "Chassis Manager" on page 6 InfiniBand Fabric The SGI Altix ICE 8000 series system topology is based on an InfiniBand interconnect. Internal InfiniBand switch ASICs of the IRU eliminate the need for external InfiniBand switches. The dual high-speed, low-latency double data rate (DDR) InfiniBand backplanes built into the IRUs provide for fast communication between...
  • Page 22 1: SGI Altix ICE 8000 Series System Overview Typically, each process of an MPI job runs exclusively on a processor. Multiple processes can share a single processor, through standard Linux context switching, but this can have a significant effect on application performance. A parallel program can only finish when all of its sub-processes have finished.
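The completion rule above, that a parallel program can only finish when all of its sub-processes have finished, can be sketched in shell as a hedged analogy (this is illustrative, not Scali or MPI code):

```shell
# Hypothetical illustration: launch several background "ranks" and
# block until every one has exited, as an MPI job must before it
# can complete.
out=$(
    for rank in 1 2 3; do
        echo "rank $rank done" &
    done
    wait              # the job cannot complete until all ranks exit
    echo "job complete"
)
echo "$out"
```

The `wait` is the essential step: removing it would let the enclosing "job" report completion while ranks are still running.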
  • Page 23 • Baseboard Management Controller (BMC) – one per compute node, admin node, leader node, and managed service node Unlike traditional, flat clusters, the SGI Altix ICE 8000 series system does not have a head node. The head node is replaced by a hierarchy of nodes that enables system resources to scale as you add processors.
  • Page 24: Chassis Manager

    It is the VLAN logical networks that help prevent network traffic bottlenecks. Note: Understanding the VLAN logical networks is critical to administering an SGI Altix ICE system. For more detailed information, see "VLANs" on page 16 and "Network Interface Naming Conventions"...
  • Page 25: System Nodes

    Figure 1-3 on page 12 shows cabling for a service node and storage service node (NAS cube). System Nodes This section describes the system nodes that are part of SGI Altix ICE 8000 series system and covers the following topics: • "System Admin Controller" on page 8 •...
  • Page 26: Compute Node

    Figure 1-3 on page 12 and Figure 1-4 on page 14. An InfiniBand fabric connects it to the compute nodes within its rack and compute nodes in other racks. The leader node is an appliance node. It always runs software specified by SGI. The rack leader controller (leader node) does the following:...
  • Page 27 Scali Manage On SGI Altix ICE System Quick Reference Guide • Runs the fabric management software to monitor and control the InfiniBand fabric on one or more leader nodes in your Altix ICE system • Monitors, controls, and receives data from the IRUs within its rack •...
  • Page 28 Note: The LCD control panel is not operational for the first release. Individual Rack Unit The individual rack unit (IRU) is one of the basic building blocks of the SGI Altix ICE 8000 series system as shown in Figure 1-1 on page 2. It is described in detail in "Basic System Building Blocks"...
  • Page 29 Scali Manage On SGI Altix ICE System Quick Reference Guide Gateway Service Node The gateway service node is the gateway to services on the public network, such as storage, lightweight directory access protocol (LDAP) services, and file transfer protocol (FTP).
  • Page 30: Networks

    Figure 1-3 Service Nodes Networks This section describes the Gigabit Ethernet (GigE) and 10/100 Ethernet connections and the InfiniBand fabric in an SGI Altix ICE 8000 series system and covers the following topics: • "Networks Overview" on page 13 • "Gigabit Ethernet (GigE) and 10/100 Ethernet Connections" on page 14 •...
  • Page 31: Infiniband Fabric

    • "InfiniBand Fabric" on page 21 Networks Overview This section describes the various network connections in the SGI Altix ICE 8000 series system. Users access the system via a public network through service nodes such as the login node and the batch service node, as shown in Figure 1-4 on page 14.
  • Page 32: Figure 1-4 Network Connections In A System With Two Irus

    Figure 1-4 Network Connections In a System With Two IRUs Gigabit Ethernet (GigE) and 10/100 Ethernet Connections The SGI Altix ICE 8000 series system has several Ethernet networks that facilitate booting and managing the system. These networks are built onto the backplane of each IRU for connection to the compute blades and transverse cables between IRUs and between racks.
  • Page 33: Figure 1-5 Chassis Manager

    ® ® Scali Manage On SGI Altix ICE System Quick Reference Guide BIOS uses the preboot execution environment (PXE) to PXE boot and it is eth0 to the Linux kernel. The 10/100 Ethernet interface is accessible to the management interface (BMC) built onto each compute blade.
  • Page 34 1: SGI Altix ICE 8000 Series System Overview • Local This is a connection to the leader node at the top of the rack in which this CMC is located. Only one CMC (of the possible four) is connected to the leader node, as shown in Figure 1-2 on page 7.
  • Page 35 Scali Manage On SGI Altix ICE System Quick Reference Guide Includes all 1588_left and 1588_right connections, as well as an internal port to the CMC processor. This VLAN carries all of the IEEE 1588 timing traffic. • VLAN_HEAD Includes all leader_local, leader_left, and leader_right connections.
  • Page 36: Figure 1-6 Vlan_Gbe And Vlan_Bmc Network Connections - Iru View

    1: SGI Altix ICE 8000 Series System Overview Figure 1-6 VLAN_GBE and VLAN_BMC Network Connections - IRU View The VLAN_GBE and VLAN_BMC networks connect the leader node in a given rack with the compute nodes (blades). In the case of VLAN_BMC, the network also connects the CMC with the compute blades and rack leader controller (leader node).
  • Page 37: Figure 1-7 Vlan_Gbe And Vlan_Bmc Network Connections - Rack View

    Scali Manage On SGI Altix ICE System Quick Reference Guide [Figure 1-7: VLAN_GBE and VLAN_BMC Network Connections – Rack View, showing the admin node, login node, and the leader nodes of Rack 01 and Rack 02] 007–5450–001
  • Page 38: Figure 1-8 Vlan_Head Network Connections

    [Figure 1-8: VLAN_HEAD Network Connections] In an SGI Altix ICE system with just one IRU, the CMC’s R58 and L58 ports are assigned to VLAN_HEAD by a field configurable setting. This provides two additional Ethernet ports that can be used to connect service nodes to your system.
  • Page 39 Scali Manage On SGI Altix ICE System Quick Reference Guide InfiniBand Fabric The InfiniBand fabric connects the service nodes, leader nodes, and the compute blades. It does not connect to the admin node or the CMCs. The InfiniBand network has two separate network fabrics, ib0 and ib1.
  • Page 40: Network Interface Naming Conventions

    1: SGI Altix ICE 8000 Series System Overview [Figure 1-9: Two InfiniBand Fabrics in a System with Two IRUs, showing leader node HCA connections to the switch blades and InfiniBand connections between IRUs of 16 compute nodes each]
  • Page 41: Ethernet Networks

    ® ® Scali Manage On SGI Altix ICE System Quick Reference Guide • "Service Nodes" on page 24 • "Rack Leader Controllers" on page 25 • "Chassis Management Control (CMC) Blade" on page 25 • "Compute Nodes" on page 25...
  • Page 42: System Admin Controller

    1: SGI Altix ICE 8000 Series System Overview Subnet for network filesystems. System Admin Controller The system admin controller (admin node) is the Scali Manage server. The networks implemented are as follows: • BMC Connected to the corporate network. You set the IP address and subnet mask.
  • Page 43: Rack Leader Controllers

    ® ® Scali Manage On SGI Altix ICE System Quick Reference Guide Rack Leader Controllers These are Scali Manage Gateways. Hostname is rXXlead. The rack leader controller (leader node) networks implemented are, as follows: • BMC Connected to the head BMC network (IP 172.17.XX.[1-255]) •...
  • Page 44 1: SGI Altix ICE 8000 Series System Overview • eth0 Connected to the rack network (IP 192.168.0.[11-74], name r[01-xx]i[01-04]n[01-16]-eth0) • ib0 Connected to IB subnet1. (IP 10.0.XX.[11-74], name r[01-xx]i[01-04]n[01-16]) • ib1 Connected to IB subnet2. (IP 10.1.XX.[11-74], name r[01-xx]i[01-04]n[01-16]-ib1) 007–5450–001...
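The r[01-xx]i[01-04]n[01-16] naming convention above can be sketched in shell. The rack, IRU, and node numbers here are illustrative, and the mapping of nodes to the 11-74 host range of the ib0 subnet (node N of IRU I maps to 10 + (I-1)*16 + N) is an assumption inferred from the 64-node rack, not stated in the guide:

```shell
# Hedged sketch: derive the conventional hostname and ib0 address for
# one compute node, following the r[01-xx]i[01-04]n[01-16] pattern.
rack=1; iru=1; node=2
host=$(printf 'r%02di%02dn%02d' "$rack" "$iru" "$node")
# ib0 subnet is 10.0.XX.[11-74]; the host-part formula below is an
# assumed mapping, not taken from the guide.
hostpart=$(( 10 + (iru - 1) * 16 + node ))
ib0=$(printf '10.0.%d.%d' "$rack" "$hostpart")
echo "$host $ib0"
```

For rack 1, IRU 1, node 2 this prints `r01i01n02 10.0.1.12`, matching the hostname style used throughout the examples in this guide.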
  • Page 45: Getting Started With Scali Manage

    They are NOT for configuring the initially delivered cluster. Installing or Updating Software Scali Manage offers a mechanism to upload and install software across the SGI Altix ICE system. This upload and installation process requires that the software installation be in RPM format. Tarball software distributions cannot be installed across a cluster with this mechanism.
  • Page 46: Administrative Tips

    Scali GUI are covered in Chapter 3 of the Scali Manage User’s Guide. Customers with support contracts needing BIOS or Firmware updates, should check the SGI Supportfolio Web Page at: https://support.sgi.com/login Administrative Tips This section describes some useful administrative tips and covers these topics: •...
  • Page 47 Scali, ibhost, Intel compilers, MPI runtime libraries, /Factory- ipmitool, and so on Install/Apps CD ISO images of the base OS for installing Scali /Factory- Cluster Manage software Install/ISO Cluster documentation manuals (Scali, PBS /Factory- Professional, Voltaire, SMC, SGI) Install/Docs 007–5450–001...
  • Page 48: Scali Manage Command Cli Help

    Install/Scripts Scali Manage Command CLI Help You can get a help statement for the Scali Manage command line interface (CLI) as shown in the following example: system-1:~ # scalimanage-cli help SGI ---- SGI Altix ICE commands ---- List of commands:...
  • Page 49: Configuring The Scali Manage Server

    Naming Conventions" on page 22) • Add the eth1 and eth1:headbmc interfaces with preset IP addresses • Load the SGI ProPack software stack Defining New Racks or Service Nodes To add one or more racks of compute nodes, perform the following: scalimanage-cli definealtixicerack <racknumbers>...
  • Page 50: Discovering Service And Leader Nodes

    2: Getting Started with Scali Manage • A rack subnet and a rack BMC subnet per rack • Four chassis management controllers (CMCs) per rack • 16 compute blades per CMC per rack Optionally, the number of CMCs per rack and the number of blades per CMC can be specified to define partial Altix ICE rack configurations.
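The default geometry above implies 64 compute blades per full rack. A trivial sketch (the per-rack counts are from the text; the arithmetic is illustrative):

```shell
# Default Altix ICE rack geometry as described above; a partial rack
# configuration would override these illustrative values.
cmcs_per_rack=4
blades_per_cmc=16
blades_per_rack=$(( cmcs_per_rack * blades_per_cmc ))
echo "blades per full rack: $blades_per_rack"
```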
  • Page 51: Discovering Cmcs And Compute Nodes

    • Discover MAC addresses of blades through CMCs Installing Compute Nodes As described in Chapter 1, “SGI Altix ICE 8200 System Overview”, on SGI Altix ICE systems, the InfiniBand network ib1 is to be used for storage traffic, the InfiniBand network ib0 is to be used for MPI traffic, and the Ethernet network is...
  • Page 52 2: Getting Started with Scali Manage 1. The installation of the compute node can either be a direct installation using packages, or can be an installation using an installation image created from another node, such as a service node. If the former method is used, it is possible to use a compute node specific installation template.
  • Page 53 Scali Manage On SGI Altix ICE System Quick Reference Guide Arguments: systemnames - system(s) {[..]} imagename - os image to set To set nodes r01i01n02...r01i01n06 to use the image created from r01i01n02, perform the following: scalimanage-cli setdiskless r01i01n0[2-6] r01i01n02-image1 Run scalimanage-cli reconfigure all to propagate the Scali Manage changes.
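The node-range syntax used by the scalimanage-cli examples above, such as r01i01n0[2-6], denotes a set of individual nodes. A hedged sketch of what that range covers (the expansion itself is done inside scalimanage-cli; this just spells the names out):

```shell
# Illustrative expansion of the range r01i01n0[2-6] into the
# individual node names it denotes.
nodes=$(printf 'r01i01n0%d ' 2 3 4 5 6)
echo "$nodes"
```

This prints `r01i01n02 r01i01n03 r01i01n04 r01i01n05 r01i01n06`, the same five nodes targeted by the setdiskless example.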
  • Page 54 2: Getting Started with Scali Manage You can confirm that Scali Manage knows about the mounts, as follows: scalimanage-cli listremotefs r01i01n0[2-6] 6. Power off or confirm that the compute nodes are powered off. Currently, powering a node off or on from the Scali Manage GUI does not work correctly from the admin node.
  • Page 55: Configuration Session Example

    On SGI Altix ICE System Quick Reference Guide Starting with the nodes powered off, install the compute nodes from the SGI Altix ICE admin node, as follows: scalimanage-cli install r01i01n0[2-6] DHCP requests can be followed on the rack leader nodes, as follows:...
  • Page 56: Displaying Cluster Components

    2: Getting Started with Scali Manage Log in to the Scali Manage interface as root; the factory password is sgisgi. Use your system name and log in as root as shown in Figure 2-1 on page 38. Figure 2-1 Example Starting Screen for the Scali Manage GUI Displaying Cluster Components Cluster components are shown in Figure 2-2 on page 39.
  • Page 57: Scali Manage Troubleshooting Tips

    ® ® Scali Manage On SGI Altix ICE System Quick Reference Guide Figure 2-2 Cluster Components Selection Screen Example Scali Manage Troubleshooting Tips This section describes some general guidelines as well as emergency procedures. Whenever a Scali cluster parameter is changed, it is necessary to apply the configuration.
  • Page 58: Compute Node Rpms

    2: Getting Started with Scali Manage There are situations when the GUI does not reflect the cluster configuration properly. Restarting the GUI may solve this problem. In rare cases the Scali product enters an inconsistent state. In this state it shows abnormal behavior and refuses to take any input.
  • Page 59: Compute Node Rpms On Rhel

    ® ® Scali Manage On SGI Altix ICE System Quick Reference Guide libibcm libibcommon libibmad libibumad libibverbs libibverbs-devel libibverbs-utils libmthca libopensm libosmcomp libosmvendor librdmacm librdmacm-utils lkSGI mpitests_mpt msr-tool mstflint numatools ofed-docs ofed-scripts openib-diags Compute Node RPMs on RHEL The following RPMs reside on the compute node when you run Scali Manage on top...
  • Page 60 2: Getting Started with Scali Manage libibcm libibcommon libibmad libibumad libibverbs libibverbs-devel libibverbs-utils libmthca libopensm libosmcomp libosmvendor librdmacm librdmacm-utils lkSGI mpitests_mpt msr-tool mstflint numatools ofed-docs ofed-scripts openib-diags pcp-open perftest rds-tools sgi-arraysvcs sgi-mpt sgi-procset sgi-release sgi-support-tools tvflash xpmem 007–5450–001...
  • Page 61: System Fabric Management

    For background information on OFED, see http://www.openfabrics.org. Fabric management on SGI Altix ICE 8000 series systems uses the OFED 1.2 OpenSM software package. The InfiniBand fabric connects the service nodes, rack leader controllers (leader nodes), and the compute nodes. It does not connect to the system admin controller (admin node) or the chassis management control (CMC) blades.
  • Page 62 3: System Fabric Management • Coherency of the fabric database is handled by sldd-ib[01].sh. You must make sure OSM_HOSTS is configured correctly in the /etc/opensm-ib0.conf or /etc/opensm-ib1.conf configuration files. Note: Currently, the InfiniBand fabric ib0 is reserved for MPI or interprocess communication traffic and the InfiniBand fabric ib1 is reserved for storage.
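A minimal sketch of the /etc/opensm-ib0.conf settings discussed above (the IP addresses are the leader-node examples used elsewhere in this guide, not values for a real installation; an actual file contains many more options):

```shell
# Sketch of /etc/opensm-ib0.conf: OSM_HOSTS must list every subnet
# manager's IP address for the handover mechanism to work.
OSM_HOSTS="172.16.0.2 172.16.0.3 172.16.0.4 172.16.0.5"
OSM_CACHE_DIR="/var/cache/osm/ib0"
ONBOOT=yes            # start OpenSM automatically at boot
MULTI_FABRIC=yes      # allow a second OpenSM copy for ib1 on the same host
```

The matching /etc/opensm-ib1.conf would use its own cache directory and the ib1 fabric's subnet manager list.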
  • Page 63: Appendix A. Infiniband Fabric Details

    files located in the /etc directory. Note: SGI highly recommends that you do NOT change this variable. If an SM detects a change in the fabric during a light sweep, such as the addition or deletion of a node, it performs a heavy sweep. The heavy sweep actually changes the fabric configuration to reflect the current state of the system.
  • Page 64 A: InfiniBand Fabric Details # LMC This option specifies the subnet’s LMC value. The number of LIDs assigned to each port is 2^LMC. The LMC value must be in the range 0-7. LMC values > 0 allow multiple paths between ports. LMC values >...
  • Page 65 Scali Manage On SGI Altix ICE System Quick Reference Guide TIMEOUT=200 # OSM_LOG This option defines the log to be the given file. By default the log goes to /tmp/osm.log. For the log to go to standard output use OSM_LOG=stdout.
  • Page 66 A: InfiniBand Fabric Details # OSM_HOSTS The list of all SM’s IP addresses in InfiniBand subnet Used to handover mechanism example OSM_HOSTS="128.162.246.221 128.162.246.42" OSM_HOSTS="none" # OSM_CACHE_DIR OSM_CACHE_DIR="/var/cache/osm/ib0" # CACHE_OPTIONS Cache the given command line options into the file /var/cache/osm/opensm-ib0.opts for use next invocation The cache directory can be changed by the environment variable OSM_CACHE_DIR Set to ’--cache-options’...
  • Page 67 Scali Manage On SGI Altix ICE System Quick Reference Guide # PORT_NUM This option defines the HCA port number to which OpenSM should bind PORT_NUM=1 # ONBOOT To start OpenSM automatically set ONBOOT=yes ONBOOT=yes # MULTI_FABRIC # Allow multiple fabrics (and copies of OpenSM) on the same SM host MULTI_FABRIC=yes Each fabric is addressed by a global unique identifier (GUID) and unique HCA port
  • Page 68: Figure A-1 Two Infiniband Fabrics In A System With Two Irus

    A: InfiniBand Fabric Details [Figure A-1: Two InfiniBand Fabrics in a System with Two IRUs, showing leader node HCA connections to the switch blades and InfiniBand connections between IRUs of 16 compute nodes each] With Scali Manage, the routing engine is chosen automatically based on the number of racks in the system.
  • Page 69: Configuring And Initializing The Infiniband Fabric Manually

    # ping -c 1 r01lead PING r01lead.ice.americas.sgi.com (172.16.0.2) 56(84) bytes of data. 64 bytes from r01lead.ice.americas.sgi.com (172.16.0.2): icmp_seq=1 ttl=64 time=0.127 ms --- r01lead.ice.americas.sgi.com ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 0.127/0.127/0.127/0.000 ms # ping -c 1 r2lead PING r2lead.ice.americas.sgi.com (172.16.0.3) 56(84) bytes of data.
  • Page 70 A: InfiniBand Fabric Details 64 bytes from r3lead.ice.americas.sgi.com (172.16.0.4): icmp_seq=1 ttl=64 time=0.129 ms --- r3lead.ice.americas.sgi.com ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 0.129/0.129/0.129/0.000 ms # ping -c 1 r4lead PING r4lead.ice.americas.sgi.com (172.16.0.5) 56(84) bytes of data.
  • Page 71 (handover) mechanism on the leader nodes by adding the IP addresses recorded in step 2 to the OSM_HOSTS variable, as follows: OSM_HOSTS="172.16.0.2 172.16.0.3 172.16.0.4 172.16.0.5" 8. For systems with five or more racks, SGI recommends you change the ROUTING_ENGINE variable in both configuration files to lash, as follows: ROUTING_ENGINE="lash"...
  • Page 72 A: InfiniBand Fabric Details 11. Use the ibnetdiscover command to verify the fabric, as follows: r01lead:/ # ibnetdiscover -l Switch : 0x08006900000000dc ports 24 devid 0xb924 vendid 0x2c9 "MT47396 Infiniscale-III Mellanox Technologies" Switch : 0x08006900000000a4 ports 24 devid 0xb924 vendid 0x2c9 "MT47396 Infiniscale-III Mellanox Technologies" : 0x0030487aa7940000 ports 1 devid 0x6274 vendid 0x2c9 "...
  • Page 73: Useful Utilities And Diagnostics

    Appendix B InfiniBand Fabric Troubleshooting This appendix describes some useful utilities and diagnostics for troubleshooting the InfiniBand fabric. Useful Utilities and Diagnostics The openib-diags package contains useful tools and diagnostic software for Open Fabrics Enterprise Distribution (OFED). This section describes some of these tools. These tools reside on the rack leader controller (leader node) in the /usr/bin directory, as follows: r01lead:~ # cd /usr/bin
  • Page 74: Ibstat Command

    Capability mask: 0x02510a68 Port GUID: 0x0008f104039881aa The following shows output from the ibstat command after the fabric management software has been started: r01lead:/opt/sgi/sbin # ibstat CA ’mthca0’ CA type: MT25208 (MT23108 compat mode) Number of ports: 2 Firmware version: 4.7.600 Hardware version: a0 007–5450–001...
  • Page 75: Ibstatus Command

    Capability mask: 0x02510a6a Port GUID: 0x0008f104039881aa ibstatus Command You can use the ibstatus (less verbose than ibstat) command to show the link rate, as follows: r01lead:/opt/sgi/sbin # ibstatus Infiniband device ’mthca0’ port 1 status: default gid: fe80:0000:0000:0000:0008:f104:0398:81a9 base lid: sm lid:
  • Page 76: Perfquery Command

    HCAs and switch ports. You can also use perfquery to reset HCA and switch port counters. To see a usage statement for the perfquery command, perform the following: r01lead:/opt/sgi/sbin # perfquery --help Usage: perfquery [-d(ebug) -G(uid) -a(ll_ports) -r(eset_after_read) -C ca_name -P ca_port -R(eset_only) -t(imeout) timeout_ms -V(ersion) -h(elp)] [<lid|guid> [[port] [reset_mask]]]...
  • Page 77: Ibnetdiscover Command

    The ibnetdiscover command allows you to discover the IB fabric. To see a usage statement for the ibnetdiscover command, perform the following: r01lead:/opt/sgi/sbin # ibnetdiscover --help Usage: ibnetdiscover [-d(ebug)] -e(rr_show) -v(erbose) -s(how) -l(ist) -g(rouping) -H(ca_list) -S(witch_list) -V(ersion) -C ca_name -P ca_port -t(imeout) timeout_ms --switch-map switch-map] [<topology-file>]...
  • Page 78: Ibdiagnet Command

    Switch : 0x08006900000000a4 ports 24 devid 0xb924 vendid 0x2c9 "MT47396 Infiniscale-III Mellanox Technologies" r01lead:/opt/sgi/sbin # ibnetdiscover -H (HCA’s) : 0x0030487aa7940000 ports 1 devid 0x6274 vendid 0x2c9 "MT25204 InfiniHostLx Mellanox Technologies" : 0x0030487aa78c0000 ports 1 devid 0x6274 vendid 0x2c9 "r1i0n8-ib0 HCA-1"...
  • Page 79 ® ® Scali Manage On SGI Altix ICE System Quick Reference Guide ibdiagnet.lst - List of all the nodes, ports and links in the fabric ibdiagnet.fdbs - A dump of the unicast forwarding tables of the fabric switches ibdiagnet.mcfdbs - A dump of the multicast forwarding tables of the fabric switches ibdiagnet.masks...
  • Page 80 5 - Failed to use Topology File 6 - Failed to load required Package Output which shows no errors means the system is operating correctly: r01lead:/opt/sgi/sbin # ibdiagnet Loading IBDIAGNET from: /usr/lib64/ibdiagnet1.2 Loading IBDM from: /usr/lib64/ibdm1.2 -W- Topology file is not specified.
  • Page 81 -I- No bad link were found -I- Done. Run time was 0 seconds. You can use ibdiagnet to load the fabric to test it. like this r01lead:/opt/sgi/sbin # ibdiagnet -c 5000 Loading IBDIAGNET from: /usr/lib64/ibdiagnet1.2 Loading IBDM from: /usr/lib64/ibdm1.2 -W- Topology file is not specified.
  • Page 82 B: InfiniBand Fabric Troubleshooting -I--------------------------------------------------- -I- No bad Links (with logical state = INIT) were found -I--------------------------------------------------- -I- PM Counters Info -I--------------------------------------------------- -I- No illegal PM counters values were found -I--------------------------------------------------- -I- Bad Links Info -I--------------------------------------------------- -I- No bad link were found -I- Done.
  • Page 83 Index (alphabetical index of topics with page references, including compute node RPMs on RHEL and SLES, default configuration, gateway service node, Gigabit Ethernet (GigE) and 10/100 Ethernet, hardware hierarchy, hardware overview, main power, network interface naming conventions, and networks)
  • Page 84 Index, continued (including rack leader controller, restarting the InfiniBand fabric after a system reboot, Scali Manage management software, setting up a serial over LAN connection, storage service node, system admin controller, and system overview)
