Contractor/manufacturer is Silicon Graphics, Inc., 1140 East Arques Avenue, Sunnyvale, CA 94085–4602.

TRADEMARKS AND ATTRIBUTIONS

Silicon Graphics, SGI, Altix and the SGI logo are registered trademarks of SGI in the United States and/or other countries worldwide. Voltaire is a registered trademark of Voltaire Inc.
Record of Revision

Version   Description
-001      March 2007. First publication.
-002      April 2007. Updated Scali Manage information to version 5.4.
-003      December 2007. Updated Scali Manage information to version 5.5 and rewrote the NIC1 IP address change process.
-004      March 2008. Updates for Altix XE250 head nodes and XE250 or XE320 compute nodes, plus Scali Manage version 5.6 information.
Contents

SGI Altix XE1300 Cluster Quick-reference
Overview
Site Plan Verification
Unpacking and Installing a Cluster Rack
Booting the XE1300 Cluster
SGI Altix XE250 Head Node Front Controls and Indicators
SGI Altix XE240 Head Node Front Controls and Indicators
Related Publications
Third-Party Clustering Documents
Customer Service and Removing Parts
Contacting the SGI Customer Service Center
Cluster Administration Training from SGI
Administrative Tips and Adding a Node
Sensor commands
Displaying all Objects in SDR
Displaying all Sensors in the System
Displaying an Individual Sensor
Chassis Commands
Chassis Identify
Controlling System Power
(upgrade) or troubleshoot the SGI Altix XE1300 cluster. The SGI Altix XE1300 cluster is a set of SGI Altix 1U-high servers (compute nodes) and one or more SGI Altix 2U-high servers (head nodes), networked together, that can run parallel programs using a message-passing tool such as the Message Passing Interface (MPI).
1: SGI Altix XE1300 Cluster Quick-reference

systems have multiple processors (and/or multiple cores) that share memory, but the basic rule is that a process is run on a dedicated processor core. These are the primary hardware component types in the rackmounted cluster: •...
Unpacking and Installing a Cluster Rack

When your system is housed in a single rack, the cluster components come rackmounted and cabled together, and a document describing how to unpack and install the rack should be included with the system.
• HDD: Channel activity for the hard disk drive (HDD). This light indicates drive activity on the node board when flashing.
• NIC1/NIC2: Indicates network activity on the LAN1 or LAN2 interconnect when flashing.
Booting the XE1300 Cluster

Table 1-1 (continued) SGI Altix XE240 Head Node Controls and Indicators

Callout   Feature           Description
          Power/Sleep LED   Constant green light indicates the system has power applied to it. Blinking green indicates the system is in S1 sleep state. No light indicates the power is off or the system is in ACPI S4 or S5 state.
Compute Node Controls and Indicators

[Figure: compute node control panels for node board 1 and node board 2. Labeled items: RESET, Power, Power LED, Overheat/Fan fail LED, HDD activity LED, NIC 2 activity LED, NIC 1 activity LED...]
Cluster Configuration Overview

The following four figures represent the general types of cluster configurations used with SGI XE1300 systems.

Note: These configuration drawings are for informational purposes only and are not meant to represent any specific cluster system.

Figure 1-4 on page 8 diagrams a basic Gigabit Ethernet configuration using a single Ethernet switch for node-to-node communication.
[Figure 1-4 Basic Cluster Configuration Example Using a Single Ethernet Switch. Labeled components: base Gigabit Ethernet switch for Admin., compute nodes, head node, standard RJ-45 twisted-pair cable, remote workstation monitor, 1U slide out console, customer Ethernet.]
[Figure 1-5 Dual-Ethernet Switch Based Cluster Example. Labeled components: base Gigabit Ethernet switch for Admin., base Gigabit Ethernet switch (MPI), compute nodes, head node, GigE PCI card, standard RJ-45 twisted-pair cable, remote workstation monitor, 1U slide out console, customer Ethernet.]
007-4979-004
[Figure: cluster configuration with an InfiniBand switch. Labeled components: InfiniBand switch (MPI), base Gigabit Ethernet switch for Admin., compute nodes, head node, InfiniBand PCI card, InfiniBand cables, standard RJ-45 twisted-pair cable, 1U slide out console, remote workstation monitor, customer Ethernet...]
[Figure: cluster configuration with InfiniBand and NAS switches. Labeled components: InfiniBand switch (MPI), Gigabit Ethernet switch for NAS, base Gigabit Ethernet switch for Admin., compute nodes, head node, InfiniBand cables, standard RJ-45 twisted-pair cable, 1U slide out console, remote workstation monitor, InfiniBand...]
Power Down the Cluster

Note: You can also use the baseboard management controller (BMC) interface to perform power management and other administrative functions. Refer to the Altix XE310 User’s Guide, publication number 007-4960-00x, for more information about the BMC interface. See the SGI Altix XE320 User’s Guide, publication number 007-5466-00x, for information on its BMC.
Powering Off Manually

To power off your cluster system manually, follow these steps:

Caution: If you power off the cluster before you halt the operating system, you can lose data.

1. Shut down the operating system by entering the following command:
   # init 0
2.
Ethernet Network Interface Card (NIC) Guidelines

While Ethernet ports are potentially variable in a cluster, the following rules generally apply to the cluster head node:
• The server motherboard’s nic1 is always a public IP in the head node.
Changing the NIC1 (Customer Domain) IP Address

Table 1-3 Head Node Ethernet Address Listings

Head node   Internal management   (GigEnet) MPI NAS/SAN   Infiniband IP   Baseboard Management Control
number      IP address nic2       option nic3             address         or IPMI address nic1
            10.0.10.1             172.16.10.1             192.168.10.1    10.0.30.1
            10.0.10.2             172.16.10.2...
– Click the “IP Address” box for device eth0 and change the IP address
– Click the “Subnet” box for each network and select (arrow) the new subnet
9. Click in the “Default Gateway” tab. Click on the “Gateway IP Address” and change it to your network address
10.
Cluster Compute Node IP Addresses

The cluster system can have multiple compute nodes that each use up to three IP address points (plus the Infiniband IP address). As with the head nodes, each fourth octet number in an address iterates by one number as a compute node is added to the list.
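The fourth-octet rule can be sketched in shell. This is only an illustration: the helper name nth_node_ip is hypothetical (it is not part of Scali Manage or any SGI tool), and the base address used below is a sample taken from the head node listing, so substitute the base addresses listed for your own cluster.

```shell
#!/bin/sh
# Hypothetical helper: print the address that is OFFSET nodes after
# BASE_IP by adding OFFSET to the fourth octet.
# Assumes the resulting fourth octet must stay within 1-254.
nth_node_ip() {
    base_ip=$1
    offset=$2
    prefix=${base_ip%.*}      # first three octets, e.g. 10.0.10
    last=${base_ip##*.}       # fourth octet, e.g. 1
    next=$((last + offset))
    if [ "$next" -lt 1 ] || [ "$next" -gt 254 ]; then
        echo "nth_node_ip: fourth octet $next out of range" >&2
        return 1
    fi
    echo "$prefix.$next"
}

# Sample base address only; use the value listed for your cluster.
nth_node_ip 10.0.10.1 0
nth_node_ip 10.0.10.1 1
nth_node_ip 10.0.10.1 2
```

Keeping the first three octets fixed and varying only the last one mirrors how the address tables are laid out, which makes it easy to predict the address of a newly added node.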
Web or Telnet Access to Maintenance Port on the Gigabit Ethernet Switch

Your switch setup is configured at the factory before shipment and should be accessible via telnet or a web browser. The switch can be a single switch or a stacked master/slave combination.
Switch Connect and IP Address

Serial Access to the SMC Switch

A serial interface to the switch should be needed only if the factory-assigned IP address for the switch has been deleted, altered or corrupted. Otherwise, use of the web or telnet access procedure is recommended.
InfiniBand Switch Connect and IP Address

The subsection “Web or Telnet Access to the InfiniBand Switch” on page 20 lists the factory IP address settings for your InfiniBand switch or switch “stack” used with the cluster. For clusters with more than 288 network ports, consult SGI Professional Services for specific IP address configuration information.
Serial Access to the Switch

For serial access, connect the Voltaire serial cable (either DV-9 to DB-9 or DB-9 to DB-9) that comes with the 24-port switch from a PC/laptop directly to the switch. A serial interface to the switch should be needed only if the factory-assigned IP address for the switch has been deleted, altered or corrupted.
3. Set up the network for your InfiniBand switch cluster configuration using the following information and the IP reference provided in “Web or Telnet Access to the InfiniBand Switch” on page 20. Enter the following commands to set up the network:...
Installing or Updating Software

Scali Manage offers a mechanism to upload and install software across the cluster. This upload and installation process requires that the software installation be in RPM format. Tarball software distributions can also be installed across a cluster; see the Scali scarcp (cluster remote copy) and scash (cluster remote shell) commands in the Scali Manage User’s Guide.
Note: The DEL key and F2 key work only if the proper ASCII terminal settings are in place. Many Linux distributions default to varied ASCII settings. In the case of the SGI Altix XE310 or XE320 compute node, or the Altix XE250 head node, the DEL key should always generate an “ASCII DEL”.
NFS Quick Reference Points

In rare cases the Scali product enters an inconsistent state, in which it behaves abnormally and refuses to take any input. In this case, try to reinitialize the head node with /etc/init.d/scance restart. This command must be run on the head node. If this does not change Scali’s state, reboot the head node.
• SGI Altix XE320 User’s Guide (P/N 007-5466-00x)
  This guide covers general operation, configuration, and servicing of the SGI Altix XE320 compute modules within the SGI Altix XE1300 cluster.
• SGI Altix XE310 User’s Guide (P/N 007-4960-00x)
  This guide covers general operation, configuration, and servicing of the SGI Altix XE310 compute modules within the SGI Altix XE1300 cluster.
Related Publications

• SGI Altix® Systems Dual-Port Gigabit Ethernet Board User's Guide, Publication Number 007-4326-00x
  This guide describes the two versions of the optional SGI dual-port Gigabit Ethernet board, shows you how to connect the boards to an Ethernet network, and explains how to operate the boards.
Third-Party Clustering Documents

The SGI Altix XE1300 Cluster is provided in different configurations, and not all the third-party documents listed here apply to every system. Note that Linux is the only operating system supported with the SGI Altix XE1300 cluster.
Customer Service and Removing Parts

• SMC® TigerStack™ II Gigabit Ethernet Switch Management Guide
  Use this guide to manage the operations of your SMC8824M 24-port switch or SMC8848M 48-port switch.
• Scali Manage™ User’s Guide
  This document provides an overview of a Scali system along with instructions for building a Scali system.
Contacting the SGI Customer Service Center

To contact the SGI Customer Service Center, call 1-800-800-4SGI, or visit: http://www.sgi.com/support/customerservice.html

From outside the United States, contact your local SGI sales office.

To reach SGI for other purposes, use the following contact information:
SGI Corporate Office
1140 E.
Chapter 2 Administrative Tips and Adding a Node

This chapter provides general administrative information and information on starting and using the Scali Manage GUI to add a node in a Scali-managed cluster. For information on using the Scali Manage command line interface to add a node, refer to the Scali Manage User’s Guide. Basic information on starting Scali Manage, administrative passwords, and factory-installed files and scripts is covered in the first section of this chapter, “Administrative Tips”...
Administrative Tips

Root password and administrative information includes:
• Root password = sgisgi (head node and compute nodes)
• Ipmitool user/password info: User = admin, Password = admin

Refer to Table 1-3 on page 15 and Table 1-4 on page 17 for listings of the IPMI IP addresses for nodes.
The Scali Manage installer directory (/usr/local/Scali###) is the location of the code used to install Scali Cluster management Software. The Factory-Install directory is located on the head node server at /usr/local/Factory-Install. The /Factory-Install directory contains software files that support the cluster integration and many files and scripts that may be helpful, including:

Under /usr/local/:
/Factory-Install/Apps    Scali, ibhost, Intel compilers, MPI runtime libraries, ipmitool, etc.
Start the Scali Manage GUI

Log in to the Scali Manage interface as root; the factory password is sgisgi. Use your system name and log in as root. Refer to Figure 2-1 for an example.

Figure 2-1 Example Starting Screen for the Scali Manage GUI
Head Node Information Screen

You can view and confirm the head node information from the main GUI screen. Click on the node icon (cl1n001 in the example below) for name and subnet information on your cluster head node.

Figure 2-2 Head Node Information Screen Example
Adding a Node Starting from the Main GUI Screen

Add a node when you need to upgrade. To add a cluster node, open the Clusters tree by clicking the right mouse button. Move your cursor over the cluster tree (cluster cl1 in the example screen), and click the right mouse button.
Adding a Cluster Compute Node

These steps should only be taken if the cluster needs to be upgraded or re-created. Select the option “Extend existing cluster” and provide the number of new servers (2 in the example). Then select the “Cluster Name”...
Selecting the Server Type

Click on “Edit” to bring up the “Node Hardware Configuration” network panel. Scroll down the menu and select the server type you are adding. Then enter the BMC user ID (admin) and the password (admin).
Network BMC Configuration

Click on the “Edit” button. Assign the new BMC IP address, stepping and BMC host name. Click OK when the appropriate information is entered. Click “Next” to move to the following screen.

Figure 2-6 BMC Network Configuration Screen Example
Select Preferred Operating System

Click on the option to select the new node’s operating system. Enter the sgisgi factory password or whatever new password may have been assigned. Click “Next” to move to the following screen.

Figure 2-7 Preferred Operating System Screen Selection Example
Node Network Configuration Screen

Use this screen to assign Ethernet 0 (eth0) as your network interface port. Fill in the additional information as it applies to your local network. Click “OK” to continue.

Figure 2-8 Node Network (Ethernet 0) Screen Example
Enter the default gateway information (refer to Figure 2-9) and select “Next” to continue.

Figure 2-9 Default Gateway Example Screen
DNS and NTP Configuration Screen

This screen extracts the name server numbers for use with the system configuration files. In this example, the domain name is engr.sgi.com with NTP enabled. Click “Next” when complete.

Figure 2-10 DNS and NTP Configuration Screen Example
NIS Configuration Screen

This screen allows you to specify, enable or disable a Network Information Service (NIS) for the new node. Assign your domain name (see Figure 2-11 for an example) and click “Next” to go to the following screen.
Scali Manage Options Screen

This screen provides the options shown, including installation of MPI, your software version, monitor options and more. Click “Next” to move to the following screen.

Figure 2-12 Scali Manage Options Screen Example
Configuration Setup Complete Screen

This screen allows you to install the operating system and Scali Manage immediately, or store the configuration for later use. Click “Finish” after you make your selection.

Figure 2-13 Configuration Setup Complete Screen Example
Checking the Log File Entries (Optional)

You can check the log file entries during configuration of the new node(s) to confirm that a log file has been created and to view the entries.

Figure 2-14 Optional Log File Screen Example
Setting a Node Failure Alarm on Scali Manage

This section shows how to create an alarm using a “Node Down” alarm as an example:

1. Start the GUI. Refer to “Start the Scali Manage GUI” on page 34 if needed.
2.
6. At this time you must enter the criteria that trigger the alarm. Click on “Add Criteria” (refer to Figure 2-16).

Figure 2-16 Add Criteria Screen Example

7. Another popup appears. For this example we picked a “Filter” criterion for the node status.
Figure 2-17 Define Chart Data Popup Example (Filter Selected)

Next we need to choose the priority for this alarm. The example assigns a critical priority for the “Node Down” alarm. We want this alarm to be triggered at most once. Therefore we leave the “Re-Trigger”...
email to a system administrator or e-mail alias. You must pick the appropriate action and supply the e-mail address.

Figure 2-18 Applying the Alarm Example Screen
To illustrate how an alarm makes its appearance, we have intentionally brought down the node. A few seconds thereafter the GUI indicates a node failure by changing the node icon in the cluster tree; refer to Figure 2-19.
Chapter 3 IPMI Commands Overview

This chapter provides a set of example IPMI commands and is not meant to be a comprehensive guide to the use of ipmitool. Its purpose is to briefly describe some of the commonly used IPMI commands to help you get started with your cluster administration.
User Administration

The BMC supports multiple users; a username and password are required for remote connections. The cluster is shipped with a factory username and password set on user ID 2:
Username = admin
Password = admin

Typical ipmitool Command Line

ipmitool -I lanplus -o <oemtype> ... <bmc_ip_address> ...
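A dry-run sketch of how such a command line could be assembled follows. It prints the command instead of running it. The helper name ipmi_cmd and the sample BMC address are illustrative assumptions, and because the command line above is truncated in this copy, the -H, -U and -P options used here are ipmitool's general host, user and password options rather than a confirmed reproduction of the original line. The oemtype value is left as a placeholder.

```shell
#!/bin/sh
# Dry-run sketch: prints ipmitool command lines instead of running them.
# OEMTYPE is a placeholder; the factory user and password come from
# the "User Administration" text in this chapter.
BMC_USER=${BMC_USER:-admin}
BMC_PASS=${BMC_PASS:-admin}
OEMTYPE=${OEMTYPE:-"<oemtype>"}

# Hypothetical helper: ipmi_cmd BMC_IP SUBCOMMAND...
ipmi_cmd() {
    bmc_ip=$1
    shift
    echo "ipmitool -I lanplus -o $OEMTYPE -H $bmc_ip -U $BMC_USER -P $BMC_PASS $*"
}

# Sample BMC address only; use the IPMI address listed for your node.
ipmi_cmd 10.0.30.1 chassis power status
```

Printing the command first makes it easy to check the options before pointing them at a live BMC; once the line looks right it can be pasted into a shell on the head node.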
ipmitool <opts> lan set 1 netmask x.x.x.x
ipmitool <opts> lan set 1 arp respond on
ipmitool <opts> lan set 1 arp generate on

To check your LAN settings:

ipmitool <opts> lan print 1

Serial-over-lan Commands

Serial-Over-Lan (SOL) comes preconfigured and enabled on each node of your cluster.
Connecting to Node Console via SOL

ipmitool <opts> sol activate

Deactivating an SOL Connection

In certain cases, when using the Scali Manage GUI to access a console, you may need to deactivate the SOL connection from the command line to free up the SOL session.

ipmitool <opts> ...
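One way to combine the two SOL operations is sketched below as a dry run. It relies on ipmitool's sol activate and sol deactivate subcommands; the helper names run and sol_console are hypothetical, and IPMI_OPTS stands in for the <opts> string used in this chapter.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands rather than executing them.
# IPMI_OPTS stands for the <opts> string used in this chapter
# (placeholder value by default).
IPMI_OPTS=${IPMI_OPTS:-"<opts>"}

# Hypothetical helper: print one ipmitool invocation.
run() {
    echo "ipmitool $IPMI_OPTS $*"
}

# Hypothetical helper: free a possibly stale SOL session first,
# then open the console.
sol_console() {
    run sol deactivate
    run sol activate
}

sol_console
```

Deactivating before activating is a defensive habit for the situation described above, where a session left open elsewhere blocks a new console connection.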
Chassis Commands

Use the following chassis commands to administer the cluster. Note that you can also use the BMC interface to perform chassis power commands on cluster nodes.

Chassis Identify

Note: The following command works only on SGI Altix XE240 head nodes.

ipmitool chassis identify
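When the same chassis command is needed on several nodes, a small loop can generate the command lines. The following is a dry-run sketch under stated assumptions: chassis_all is a hypothetical helper, the BMC addresses are samples, and the -H, -U and -P options are ipmitool's general host, user and password options. Keep in mind the note above that chassis identify applies only to XE240 head nodes.

```shell
#!/bin/sh
# Dry-run sketch: prints one chassis command per BMC address.
# IPMI_BASE holds the common options (factory user/password assumed).
IPMI_BASE=${IPMI_BASE:-"ipmitool -I lanplus -U admin -P admin"}

# Hypothetical helper: chassis_all ACTION BMC_IP...
# ACTION is an ipmitool chassis subcommand such as "power status".
chassis_all() {
    action=$1
    shift
    for bmc_ip in "$@"; do
        echo "$IPMI_BASE -H $bmc_ip chassis $action"
    done
}

# Sample BMC addresses only; use the IPMI addresses listed for your nodes.
chassis_all "power status" 10.0.30.1 10.0.30.2
```

Starting with a read-only action such as power status is a low-risk way to confirm that every BMC address and credential is correct before issuing commands that change power state.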