Acknowledgments
The Red Hat Cluster Manager software was originally based on the open source Kimberlite cluster project (http://oss.missioncriticallinux.com/kimberlite/), which was developed by Mission Critical Linux, Inc. Subsequent to its inception based on Kimberlite, developers at Red Hat have made a large number of enhancements and modifications.

Chapter 5 Database Services 83
    Setting Up an Oracle Service 83
    Tuning Oracle Services

Appendix A Supplementary Hardware Information 151
    Setting Up Power Switches 151
    SCSI Bus Configuration Requirements 160
    SCSI Bus Termination
1 Introduction to Red Hat Cluster Manager
The Red Hat Cluster Manager is a collection of technologies working together to provide data integrity and the ability to maintain application availability in the event of a failure. Using redundant hardware, shared disk storage, power management, and robust cluster communication and application failover mechanisms, a cluster can meet the needs of the enterprise market.

Figure 1–1 Example Cluster
Figure 1–1, Example Cluster shows an example of a cluster in an active-active configuration. If a hardware or software failure occurs, the cluster will automatically restart the failed system's services on the functional cluster system.

1.2 Cluster Features
A cluster includes the following features:
• No-single-point-of-failure hardware configuration
Clusters can include a dual-controller RAID array, multiple network and serial communication channels, and redundant uninterruptible power supply (UPS) systems to ensure that no single failure results in application downtime or loss of data.
the two systems from simultaneously accessing the same data and corrupting it. Although not required, it is recommended that power switches are used to guarantee data integrity under all failure conditions. Watchdog timers are an optional variety of power control to ensure correct operation of service failover.

Figure 1–2 Cluster Communication Mechanisms
Figure 1–2, Cluster Communication Mechanisms shows how systems communicate in a cluster configuration. Note that the terminal server used to access system consoles via serial ports is not a required cluster component.

• Manual service relocation capability
In addition to automatic service failover, a cluster enables administrators to cleanly stop services on one cluster system and restart them on the other system. This allows administrators to perform planned maintenance on a cluster system, while providing application and data availability.
2 Hardware Installation and Operating System Configuration
To set up the hardware configuration and install the Linux distribution, follow these steps:
• Choose a cluster hardware configuration that meets the needs of applications and users; see Section 2.1, Choosing a Hardware Configuration.

Cost restrictions
The hardware configuration chosen must meet budget requirements. For example, systems with multiple I/O ports usually cost more than low-end systems with fewer expansion capabilities.
Availability requirements
If a computing environment requires the highest degree of availability, such as a production environment, then a cluster hardware configuration that protects against all single points of failure, including disk, storage interconnect, heartbeat channel, and power failures, is recommended.
RAID units are based on parallel SCSI buses. These products typically do not allow for online repair of a failed system. No host RAID adapters are currently certified with Red Hat Cluster Manager. Refer to the Red Hat web site at http://www.redhat.com for the most up-to-date supported hardware matrix.
Problem: Power source failure
Solution: Redundant uninterruptible power supply (UPS) systems
Problem: Data corruption under all failure conditions
Solution: Power switches or hardware-based watchdog timers

A no-single-point-of-failure hardware configuration that guarantees data integrity under all failure conditions can include the following components:
2.1.3 Choosing the Type of Power Controller
The Red Hat Cluster Manager implementation consists of a generic power management layer and a set of device-specific modules which accommodate a range of power management types. When selecting the appropriate type of power controller to deploy in the cluster, it is important to recognize the implications of specific device types.
with a power controller type of "None" is useful for simple evaluation purposes, but because it affords the weakest data integrity provisions, it is not recommended for usage in a production environment. Ultimately, the right type of power controller deployed in a cluster environment depends on the data integrity requirements weighed against the cost and availability of external power switches.
The complete set of qualified cluster hardware components changes over time. Consequently, the table below may be incomplete. For the most up-to-date itemization of supported hardware components, refer to the Red Hat documentation website at http://www.redhat.com/docs.
Table 2–3 Cluster System Hardware Table
Table 2–4 Power Switch Hardware Table
Hardware: Serial power switches
Description: Power switches enable each cluster system to power-cycle the other cluster system. See Section 2.4.2, Configuring Power Switches for information about using power switches in a cluster.
Required: Strongly recommended for data integrity

Hardware: Network power switch
Description: Network attached power switches enable each cluster member to power cycle all others. Refer to Section 2.4.2, Configuring Power Switches for information about using network attached power switches, as well as caveats.
Required: Strongly recommended for data integrity under all failure conditions
Table 2–5 Shared Disk Storage Hardware Table
Hardware: External disk storage enclosure
Description: Use Fibre Channel or single-initiator parallel SCSI to connect the cluster systems to a single or dual-controller RAID array. To use single-initiator buses, a RAID controller must have multiple host ports and provide simultaneous access to all the logical units on the host ports.
At the time of publication, there were no fully tested host bus adapter based RAID cards. Refer to http://www.redhat.com for the latest hardware information.

Hardware: SCSI cable
Description: SCSI cables with 68 pins connect each host bus adapter to a storage enclosure port.
Required: Only for parallel SCSI configurations

Hardware: SCSI terminator
Description: For a RAID storage enclosure that uses "out" ports (such as FlashDisk RAID Disk Array) and is connected to single-initiator SCSI buses, connect terminators to the "out" ports.
Required: Only for parallel SCSI configurations
Table 2–7 Point-To-Point Ethernet Heartbeat Channel Hardware Table
Hardware: Network interface
Quantity: Two for each channel
Description: Each Ethernet heartbeat channel requires a network interface installed in both cluster systems.

Hardware: Network crossover cable
Quantity: One for each channel
Description: A network crossover cable connects a network interface on one cluster system to a network interface on the other system.
Required: Only for a

Table 2–8 Point-To-Point Serial Heartbeat Channel Hardware Table
Hardware: Serial card
Quantity: Two for each channel
Description: Each serial heartbeat channel requires a serial port on both cluster systems. To expand your serial port capacity, you can use multi-port serial PCI cards.

Table 2–10 UPS System Hardware Table
Hardware: UPS system
Quantity: One or two
Description: Uninterruptible power supply (UPS) systems protect against downtime if a power outage occurs. UPS systems are highly recommended for cluster operation.
Required: Strongly recommended for availability

RAID storage enclosure: The RAID storage enclosure contains one controller with at least two host ports.
Two HD68 SCSI cables: Each cable connects one HBA to one port on the RAID controller, creating two single-initiator SCSI buses.
One network crossover cable: A network crossover cable connects a network interface on one cluster system to a network interface on the other system, creating a point-to-point Ethernet heartbeat channel.
Two RPS-10 power switches: Power switches enable each cluster system to power-cycle the other system before restarting its services.

Figure 2–1 No-Single-Point-Of-Failure Configuration Example
2.2 Steps for Setting Up the Cluster Systems
After identifying the cluster hardware components described in Section 2.1, Choosing a Hardware Configuration, set up the basic cluster system hardware and connect the systems to the optional console switch and network switch or hub.
2.2.1 Installing the Basic System Hardware
Cluster systems must provide the CPU processing power and memory required by applications. It is recommended that each system have a minimum of 450 MHz CPU speed and 256 MB of memory.
In addition, cluster systems must be able to accommodate the SCSI or FC adapters, network interfaces, and serial ports that the hardware configuration requires.

devices on one channel and the shared disks on the other channel. Using multiple SCSI cards is also possible.
See the system documentation supplied by the vendor for detailed installation information. See Appendix A, Supplementary Hardware Information for hardware-specific information about using host bus adapters in a cluster.

Set up the console switch according to the documentation provided by the vendor. After the console switch has been set up, connect it to each cluster system. The cables used depend on the type of console switch.
• Use the cat /proc/devices command to display the devices configured in the kernel. See Section 2.3.5, Displaying Devices Configured in the Kernel for more information about performing this task.
8. Verify that the cluster systems can communicate over all the network interfaces by using the ping command to send test packets from one system to the other.
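A quick check, assuming hypothetical member host names cluster2 and cluster3 and heartbeat interface names ecluster2 and ecluster3 (substitute the names used at your site), might look like the following when run from one member:

ping -c 4 cluster3
ping -c 4 ecluster3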
• Do not place local file systems, such as /, /etc, /tmp, and /var on shared disks or on the same SCSI bus as shared disks. This helps prevent the other cluster member from accidentally mounting these file systems, and also reserves the limited number of SCSI identification numbers on a bus for cluster disks.
point heartbeat connection on each cluster system (ecluster2 and ecluster3) as well as the IP alias clusteralias used for remote cluster monitoring.
Verify correct formatting of the local host entry in the /etc/hosts file to ensure that it does not include non-local systems in the entry for the local host.
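A minimal sketch of such an /etc/hosts file follows; the IP addresses and the public host names cluster2 and cluster3 are placeholders, while ecluster2, ecluster3, and clusteralias match the names above. Note that the local host entry contains only the loopback names:

127.0.0.1     localhost.localdomain localhost
10.0.0.2      cluster2.example.com cluster2
10.0.0.3      cluster3.example.com cluster3
192.168.1.2   ecluster2
192.168.1.3   ecluster3
10.0.0.10     clusteralias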
To modify the kernel boot timeout limit for a cluster system, edit the /etc/lilo.conf file and specify the desired value (in tenths of a second) for the timeout parameter. The following example sets the timeout limit to three seconds:
timeout = 30
To apply any changes made to the /etc/lilo.conf file, invoke the /sbin/lilo command.

May 22 14:02:11 storage3 kernel: Detected scsi disk sde at scsi1, channel 0, id 3, lun 0
May 22 14:02:11 storage3 kernel: Vendor: SEAGATE Model: ST318203LC Rev: 0001
May 22 14:02:11 storage3 kernel: Detected scsi disk sdf at scsi1, channel 0, id 8, lun 0
May 22 14:02:11 storage3 kernel: Vendor: SEAGATE Model: ST318203LC Rev: 0001
May 22 14:02:11 storage3 kernel: Detected scsi disk sdg at scsi1, channel 0, id 9, lun 0

 19 ttyC
 20 cub
128 ptm
136 pts
162 raw
Block devices:
  2 fd
  3 ide0
  8 sd
 65 sd
The previous example shows:
• Onboard serial ports (ttyS)
• Serial expansion card (ttyC)
4. Set up the shared disk storage according to the vendor instructions and connect the cluster systems to the external storage enclosure. See Section 2.4.4, Configuring Shared Disk Storage for more information about performing this task. In addition, it is recommended to connect the storage enclosure to redundant UPS systems.

To set up a redundant Ethernet heartbeat channel, use a network crossover cable to connect a network interface on one cluster system to a network interface on the other cluster system.
To set up a serial heartbeat channel, use a null modem cable to connect a serial port on one cluster system to a serial port on the other cluster system.
If power switches are not used in the cluster, and a cluster system determines that a hung system is down, it will set the status of the failed system to DOWN on the quorum partitions, and then restart the hung system's services.
It is not recommended to use a large UPS infrastructure as the sole source of power for the cluster. A UPS solution dedicated to the cluster itself allows for more flexibility in terms of manageability and availability.

Figure 2–4 Single UPS System Configuration
Many vendor-supplied UPS systems include Linux applications that monitor the operational status of the UPS system through a serial port connection. If the battery power is low, the monitoring software will initiate a clean system shutdown.
Multi-initiator SCSI configurations are not supported due to the difficulty in obtaining proper bus termination.
• The Linux device name for each shared storage device must be the same on each cluster system. For example, a device named /dev/sdc on one cluster system must be named /dev/sdc on the other cluster system.

two single-initiator SCSI buses to connect each cluster system to the RAID array is possible. If a logical unit can fail over from one controller to the other, the process must be transparent to the operating system.
Figure 2–6 Single-Controller RAID Array Connected to Single-Initiator SCSI Buses
Figure 2–7 Dual-Controller RAID Array Connected to Single-Initiator SCSI Buses
Setting Up a Fibre Channel Interconnect
Fibre Channel can be used in either single-initiator or multi-initiator configurations.
A single-initiator Fibre Channel interconnect has only one cluster system connected to it. This may provide better host isolation and better performance than a multi-initiator bus. Single-initiator interconnects ensure that each cluster system is protected from disruptions due to the workload, initialization, or repair of the other cluster system.
Figure 2–9 Dual-Controller RAID Array Connected to Single-Initiator Fibre Channel Interconnects
If a dual-controller RAID array with two host ports on each controller is used, a Fibre Channel hub or switch is required to connect each host bus adapter to one port on both controllers, as shown in Figure 2–9, Dual-Controller RAID Array Connected to Single-Initiator Fibre Channel Interconnects.
partition. Data consistency is maintained through checksums and any inconsistencies between the partitions are automatically corrected.
If a system is unable to write to both quorum partitions at startup time, it will not be allowed to join the cluster.
1. Invoke the interactive fdisk command, specifying an available shared disk device. At the prompt, specify the p command to display the current partition table.
# fdisk /dev/sde
Command (m for help): p
Disk /dev/sde: 255 heads, 63 sectors, 2213 cylinders
Units = cylinders of 16065 * 512 bytes
Device
Syncing disks.
7. If a partition was added while both cluster systems are powered on and connected to the shared storage, reboot the other cluster system in order for it to recognize the new partition.
After partitioning a disk, format the partition for use in the cluster.
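For example, to place an ext2 file system on a newly created partition, mke2fs can be used; the device name /dev/sde1 is a placeholder for the partition created above, and adding the -j option would create an ext3 journal instead:

# mke2fs /dev/sde1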
# service rawdevices restart
Query all the raw devices by using the command raw -aq:
# raw -aq
/dev/raw/raw1 bound to major 8, minor 17
/dev/raw/raw2 bound to major 8, minor 18
Note that, for raw devices, there is no cache coherency between the raw device and the block device.
3 Cluster Software Installation and Configuration
After installing and configuring the cluster hardware, the cluster system software can be installed. The following sections describe installing and initializing the cluster software, checking the cluster configuration, configuring syslog event logging, and using the cluadmin utility.
• Number of heartbeat connections (channels), both Ethernet and serial
• Device special file for each heartbeat serial line connection (for example, /dev/ttyS1)
• IP host name associated with each heartbeat Ethernet interface
• IP address for remote cluster monitoring, also referred to as the "cluster alias"
3.1.1 Editing the rawdevices File
The /etc/sysconfig/rawdevices file is used to map the raw devices for the quorum partitions each time a cluster system boots. As part of the cluster software installation procedure, edit the rawdevices file on each cluster system and specify the raw character devices and block devices for the primary and backup quorum partitions.
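A minimal sketch of the relevant entries follows; the block device names /dev/sdb1 and /dev/sdb2 are placeholders for the actual shared partitions chosen for the primary and backup quorum partitions:

/dev/raw/raw1    /dev/sdb1
/dev/raw/raw2    /dev/sdb2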
While running cluconfig, you will be prompted as to whether or not you wish to configure a cluster alias. This appears as the following prompt:
Enter IP address for cluster alias [NONE]: 172.16.33.105
As shown above, the default value is set to NONE, which means that there is no cluster alias, but the user overrides this default and configures an alias using an IP address of 172.16.33.105.

/sbin/cluconfig
Red Hat Cluster Manager Configuration Utility (running on storage0)
- Configuration file exists already. Would you like to use those prior settings as defaults? (yes/no) [yes]: yes
Enter cluster name [Development Cluster]:
Enter IP address for cluster alias [10.0.0.154]: 10.0.0.154
--------------------------------
Information for Cluster Member 0
Enter hostname of the cluster member on heartbeat channel 0 \
[storage1]: storage1
Looking for host storage1 (may take a few seconds)...
Information about Quorum Partitions
Enter Primary Quorum Partition [/dev/raw/raw1]: /dev/raw/raw1
Enter Shadow Quorum Partition [/dev/raw/raw2]: /dev/raw/raw2
Information About the Power Switch That Power Cycles Member 'storage1'
Primary quorum partition: /dev/raw/raw1
Shadow quorum partition: /dev/raw/raw2
Heartbeat channels: 1
Channel type: net, Name: storage1
Power switch IP address or hostname: storage1
Identifier on power controller for member storage1: storage1
--------------------------
Power Switch 0 Information
--------------------------

3.2 Checking the Cluster Configuration
To ensure that the cluster software has been correctly configured, use the following tools located in the /sbin directory:
• Test the quorum partitions and ensure that they are accessible. Invoke the cludiskutil utility with the -t option to test the accessibility of the quorum partitions.
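For example, the test can be run on each member as follows:

/sbin/cludiskutil -t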
/sbin/cludiskutil -p
----- Shared State Header ------
Magic# = 0x39119fcd
Version = 1
Updated on Thu Sep 14 05:43:18 2000
Updated by node 0
--------------------------------
The Magic# and Version fields will be the same for all cluster configurations. The last two lines of output indicate the date that the quorum partitions were initialized with cludiskutil -I, and the
invoked, it checks the status of the cluster software. If the cluster software is running, the command exits with a message to stop the cluster software.
The format of the clustonith command is as follows:
clustonith [-sSlLvr] [-t devicetype] [-F options-file] \
           [-p stonith-parameters]
Options:

– Verify that the network connection to network-based switches is operational. Most switches have a link light that indicates connectivity.
– It should be possible to ping the network switch; if not, then the switch may not be properly configured for its network parameters.
The importance of an event determines the severity level of the log entry. Important events should be investigated before they affect cluster availability. The cluster can log messages with the following severity levels, listed in order of severity:

After configuring the cluster software, optionally edit the /etc/syslog.conf file to enable the cluster to log events to a file that is different from the default log file, /var/log/messages. The cluster utilities and daemons log their messages using a syslog tag called local4. Using a cluster-specific log file facilitates cluster monitoring and problem solving.
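A minimal sketch of such an entry in /etc/syslog.conf follows; the log file path /var/log/cluster is a placeholder, and syslogd must be restarted (for example, with service syslog restart) for the change to take effect:

local4.*                /var/log/cluster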
If another user holds the lock, a warning will be displayed indicating that there is already a lock on the database. The cluster software allows for the option of taking the lock. If the lock is taken by the current requesting user, the previous holder of the lock can no longer modify the cluster database.

Table 3–1 cluadmin Commands
cluadmin Command: help, Subcommand: None
Description: Displays help for the specified cluadmin command or subcommand.
Example: help service add

cluadmin Command: cluster, Subcommand: status
Description: Displays a snapshot of the current cluster status.
Example: cluster status
cluadmin Command: cluster, Subcommand: restore
Description: Restores the cluster configuration database from the backup copy in the /etc/cluster.conf.bak file. See Section 8.5, Backing Up and Restoring the Cluster Database for information.
Example: cluster restore

cluadmin Command: service, Subcommand: relocate
Description: Causes a service to be stopped on the cluster member it is currently running on and restarted on the other. Refer to Section 4.6, Relocating a Service for more information.
Example: service relocate nfs1
While using the cluadmin utility, press the [Tab] key to help identify cluadmin commands. For example, pressing the [Tab] key at the cluadmin> prompt displays a list of all the commands. Entering a letter at the prompt and then pressing the [Tab] key displays the commands that begin with the specified letter.

4 Service Configuration and Administration
The following sections describe how to configure, display, enable/disable, modify, relocate, and delete a service, as well as how to handle services which fail to start.
4.1 Configuring a Service
The cluster systems must be prepared before any attempts to configure a service. For example, set up disk storage or applications used in the services.

• Section 5.4, Setting Up a DB2 Service
• Section 6.1, Setting Up an NFS Service
• Section 6.2, Setting Up a High Availability Samba Service
• Section 7.1, Setting Up an Apache Service
4.1.1 Gathering Service Information
Before creating a service, gather all available information about the service resources and properties.
Service Property: IP address
Description: One or more Internet protocol (IP) addresses may be assigned to a service. This IP address (sometimes called a "floating" IP address) is different from the IP address associated with a cluster system's host name and Ethernet interface, because it is automatically relocated along with the service resources when failover occurs.
Service Property: Service Check Interval
Description: Specifies the frequency (in seconds) that the system will check the health of the application associated with the service. For example, it will verify that the necessary NFS or Samba daemons are running. For additional service types, the monitoring consists of examining the return status when calling the "status"

The /usr/share/cluster/doc/services/examples directory contains a template that can be used to create service scripts, in addition to examples of scripts. See Section 5.1, Setting Up an Oracle Service, Section 5.3, Setting Up a MySQL Service, Section 7.1, Setting Up an Apache Service, and Section 5.4, Setting Up a DB2 Service for sample scripts.
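Such scripts are ordinary shell scripts that accept start, stop, and status arguments. The following is a minimal sketch only, not the template shipped with the product; the application commands and paths are placeholders:

#!/bin/sh
#
# Minimal cluster service script sketch.
case "$1" in
  start)
        # Placeholder: command that starts the application
        /usr/local/myapp/bin/myapp-start
        ;;
  stop)
        # Placeholder: command that stops the application
        /usr/local/myapp/bin/myapp-stop
        ;;
  status)
        # Return 0 if the application is healthy, non-zero otherwise.
        pidof myapp > /dev/null
        ;;
  *)
        echo "usage: $0 start|stop|status"
        exit 1
        ;;
esac
exit $?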
• Whether the service was disabled after it was added
• Preferred member system
• Whether the service will relocate to its preferred member when it joins the cluster
• Service monitoring interval
• Service start script location
• IP addresses

NFS export 0: /mnt/users/engineering/brown
Client 0: brown, rw
cluadmin>
If the name of the service is known, it can be specified with the service show config service_name command.
4.3 Disabling a Service
A running service can be disabled in order to stop the service and make it unavailable. Once disabled, a service can then be re-enabled.
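For example, assuming a service named nfs1 (a placeholder) and the service subcommand names suggested by these section titles, the service could be disabled and later re-enabled from the cluadmin prompt as follows:

cluadmin> service disable nfs1
cluadmin> service enable nfs1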
4.5 Modifying a Service
All properties that were specified when a service was created can be modified. For example, specified IP addresses can be changed. More resources can also be added to a service (for example, more file systems).

4.7 Deleting a Service
A cluster service can be deleted. Note that the cluster database should be backed up before deleting a service. See Section 8.5, Backing Up and Restoring the Cluster Database for information. To delete a service by using the cluadmin utility, follow these steps:
2. Use the cluadmin utility to attempt to enable or disable the service on the cluster system that owns the service. See Section 4.3, Disabling a Service and Section 4.4, Enabling a Service for more information.

5 Database Services
This chapter contains instructions for configuring Red Hat Linux Advanced Server to make database services highly available.
Note
The following descriptions present example database configuration instructions. Be aware that differences may exist in newer versions of each database product.
start and stop a Web application that has been written using Perl scripts and modules and is used to interact with the Oracle database. Note that there are many ways for an application to interact with an Oracle database.
# ORACLE_SID
# Specifies the Oracle system identifier or "sid", which is the name of
# the Oracle Server instance.
########################################################################
export ORACLE_SID=TESTDB
########################################################################
# ORACLE_BASE
# Specifies the directory at the top of the Oracle software product and
# administrative file structure.
########################################################################
# Verify that the user's search path includes $ORACLE_HOME/bin
########################################################################
export PATH=$PATH:/u01/app/oracle/product/${ORACLE_RELEASE}/bin
########################################################################
# This does the actual work.
# The oracle server manager is used to start the Oracle Server instance
# based on the initSID.ora initialization parameters file specified.
########################################################################
/u01/app/oracle/product/${ORACLE_RELEASE}/bin/svrmgrl << EOF
# Specifies the Oracle system identifier or "sid", which is the name
# of the Oracle Server instance.
######################################################################
export ORACLE_SID=TESTDB
######################################################################
# ORACLE_BASE
# Specifies the directory at the top of the Oracle software product
# and administrative file structure.
######################################################################
######################################################################
export PATH=$PATH:/u01/app/oracle/product/${ORACLE_RELEASE}/bin
######################################################################
# This does the actual work.
# The oracle server manager is used to STOP the Oracle Server instance
# in a tidy fashion.
######################################################################
/u01/app/oracle/product/${ORACLE_RELEASE}/bin/svrmgrl << EOF
spool /home/oracle/stopdb.log
connect internal;
shutdown abort;
spool off
EOF
exit 0

The following is an example of the startdbi script, which is used to start a networking DBI proxy daemon:
#!/bin/sh
# This line does the real work.
/usr/bin/dbiproxy --logfile /home/oracle/dbiproxy.log --localport 1100 &
exit 0

The following is an example of the stopdbi script, which is used to stop a networking DBI proxy daemon:
#!/bin/sh
###################################################################
# Our Web Server application (perl scripts) work in a distributed
c - Cancel and return to the top-level cluadmin command
r - Restart to the initial prompt while keeping previous responses
p - Proceed with the next prompt
Preferred member [None]: ministor0
Relocate when the preferred member joins the cluster (yes/no/?) \
[no]: yes
User script (e.g., /usr/foo/script or None) \
[None]: /home/oracle/oracle

in the cluster environment. This will ensure that failover is transparent to database client application programs and does not require programs to reconnect.
5.3 Setting Up a MySQL Service
A database service can serve highly-available data to a MySQL database application. The application can then provide network access to database client systems, such as Web servers.
# based systems) and linked to /etc/rc3.d/S99mysql. When this is done
# the mysql server will be started when the machine is started.
# Comments to support chkconfig on RedHat Linux
# chkconfig: 2345 90 90
# description: A very fast and reliable SQL database engine.
else
  if test -d "$datadir"
  then
    pid_file=$datadir/`hostname`.pid
if grep "^basedir" $conf > /dev/null
then
  basedir=`grep "^basedir" $conf | cut -f 2 -d= | tr -d ' '`
  bindir=$basedir/bin
if grep "^bindir" $conf > /dev/null
then
  bindir=`grep "^bindir" $conf | cut -f 2 -d= | tr -d ' '`
# Safeguard (relative paths, core dumps..)
cd $basedir
case "$mode"
      echo "No mysqld pid file found. Looked for $pid_file."
      ;;
  *)
      # usage
      echo "usage: $0 start|stop"
      exit 1
      ;;
esac

The following example shows how to use cluadmin to add a MySQL service.
cluadmin> service add
The user interface will prompt you for information about the service.

Broadcast (e.g. X.Y.Z.255 or None) [None]: [Return]
Do you want to (a)dd, (m)odify, (d)elete or (s)how an IP address,
or are you (f)inished adding IP addresses: f
Do you want to add a disk device to the service (yes/no/?): yes
Disk Device Information
Device special file (e.g., /dev/sda1): /dev/sda1
Filesystem type (e.g., ext2, reiserfs, ext3 or None): ext2
1. On both cluster systems, log in as root and add the IP address and host name that will be used to access the DB2 service to the /etc/hosts file. For example:
10.1.16.182   ibmdb2.class.cluster.com   ibmdb2
2>/dev/null &
14. Check for errors during the installation by examining the installation log file. Every step in the installation must be marked as SUCCESS, except for the following:
DB2 Instance Creation                        FAILURE
Update DBM configuration file for TCP/IP     CANCEL
Update parameter DB2COMM                     CANCEL
esac
17. Modify the /usr/IBMdb2/V6.1/instance/db2ishut file on both cluster systems to forcefully disconnect active applications before stopping the database. For example:
for DB2INST in ${DB2INSTLIST?}; do
    echo "Stopping DB2 Instance "${DB2INST?}"..." >> ${LOGFILE?}
    find_homedir ${DB2INST?}
    INSTHOME="${USERHOME?}"
    su ${DB2INST?} -c " \
        source ${INSTHOME?}/sqllib/db2cshrc 1>
To test the database from the DB2 client system, invoke the following commands:
# db2 connect to db2 user db2inst1 using ibmdb2
# db2 select tabname from syscat.tables
# db2 connect reset

6 Network File Sharing Services
This chapter contains instructions for configuring Red Hat Linux Advanced Server to make network file sharing services through NFS and Samba highly available.
6.1 Setting Up an NFS Service
Highly available network file system (NFS) services are one of the key strengths of the clustering infrastructure.
NFS services will not start unless the following NFS daemons are running: nfsd, rpc.mountd, and rpc.statd.
• Filesystem mounts and their associated exports for clustered NFS services should not be included in /etc/fstab or /etc/exports. Rather, for clustered NFS services, the parameters describing mounts and exports are entered via the cluadmin configuration utility.
– Mount options — The mount information also designates the mount options. Note: by default, the Linux NFS server does not guarantee that all write operations are synchronously written to disk. In order to ensure synchronous writes, specify the sync mount option. Specifying the sync mount option favors data integrity at the expense of performance.
The following are the service configuration parameters which will be used, as well as some descriptive commentary.
Note
Prior to configuring an NFS service using cluadmin, it is required that the cluster daemons are running.
Service name: nfs_accounting
Preferred member [None]: clu4
Relocate when the preferred member joins the cluster (yes/no/?) \
[no]: yes
Status check interval [0]: 30
User script (e.g., /usr/foo/script or None) [None]:
Do you want to add an IP address to the service (yes/no/?) [no]: yes
IP Address Information
IP address: 10.0.0.10
Netmask (e.g.
are you (f)inished adding CLIENTS [f]: a
Export client name [*]: dwalsh
Export client options [None]: rw
Do you want to (a)dd, (m)odify, (d)elete or (s)how NFS CLIENTS, or
are you (f)inished adding CLIENTS [f]: f
Do you want to (a)dd, (m)odify, (d)elete or (s)how NFS EXPORTS, or
are you (f)inished adding EXPORTS [f]:
Do you want to (a)dd, (m)odify, (d)elete or (s)how DEVICES,
6.1.5 Active-Active NFS Configuration
In the previous section, an example configuration of a simple NFS service was discussed. This section describes how to set up a more complex NFS service.
The example in this section involves configuring a pair of highly available NFS services. In this example, suppose two separate teams of users will be accessing NFS filesystems served by the cluster.
Service name: nfs_engineering
Preferred member [None]: clu3
Relocate when the preferred member joins the cluster (yes/no/?) [no]: yes
Status check interval [0]: 30
User script (e.g., /usr/foo/script or None) [None]:
Do you want to add an IP address to the service (yes/no/?) [no]: yes
IP Address Information
IP address: 10.0.0.11
Netmask (e.g.
Do you want to (a)dd, (m)odify, (d)elete or (s)how NFS CLIENTS, or
are you (f)inished adding CLIENTS [f]:
Do you want to (a)dd, (m)odify, (d)elete or (s)how NFS EXPORTS, or
are you (f)inished adding EXPORTS [f]: a
Export directory name: /mnt/users/engineering/brown
Authorized NFS clients
Export client name [*]: brown

Avoid using exportfs -r
File systems being NFS exported by cluster members do not get specified in the conventional /etc/exports file. Rather, the NFS exports associated with cluster services are specified in the cluster configuration file (as established by cluadmin). The command exportfs -r removes any exports which are not explicitly specified in the /etc/exports file.
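To inspect the exports currently active on a member without modifying them, exportfs can be run in its read-only listing form, for example:

# exportfs -v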
Note
A complete explanation of Samba configuration is beyond the scope of this document. Rather, this documentation highlights aspects which are crucial for clustered operation. Refer to The Official Red Hat Linux Customization Guide for more details on Samba configuration.
the specified Windows clients. It also designates access permissions and other mapping capabilities. In the single system model, a single instance of each of the smbd and nmbd daemons is automatically started by the /etc/rc.d/init.d/smb runlevel script. In order to implement high availability Samba services, rather than having a single /etc/samba/smb.conf file;
6.2.3 Gathering Samba Service Configuration Parameters
When preparing to configure Samba services, determine configuration information such as which filesystems will be presented as shares to Windows based clients. The following information is required in order to configure Samba services:
– Forced unmount — As part of the mount information, you will be prompted as to whether forced unmount should be enabled or not. When forced unmount is enabled, if any applications running on the cluster server have the designated filesystem mounted when the service is being disabled or relocated, then that application will be killed off to allow the unmount to proceed.
6.2.4 Example Samba Service Configuration
In order to illustrate the configuration process for a Samba service, an example configuration is described in this section. This example consists of setting up a single Samba share which houses the home directories of four members of the accounting team.
Service name: samba_acct
Preferred member [None]: clu4
Relocate when the preferred member joins the cluster (yes/no/?) [no]: yes
User script (e.g., /usr/foo/script or None) [None]:
Status check interval [0]: 90
Do you want to add an IP address to the service (yes/no/?) [no]: yes
IP Address Information
IP address: 10.0.0.10
Netmask (e.g.
relocate: yes
user script: None
monitor interval: 90
IP address 0: 10.0.0.10
netmask 0: None
broadcast 0: None
device 0: /dev/sdb12
mount point, device 0: /mnt/users/accounting
mount fstype, device 0: ext2
mount options, device 0: rw,nosuid,sync
force unmount, device 0: yes
samba share, device 0: acct
Add samba_acct service as shown? (yes/no/?) yes
workgroup = RHCLUSTER
lock directory = /var/cache/samba/acct
log file = /var/log/samba/%m.log
encrypt passwords = yes
bind interfaces only = yes
interfaces = 10.0.0.10
[acct]
comment = High Availability Samba Service
browsable = yes
writable = no
public = yes
path = /mnt/service12
The following are descriptions of the most relevant fields, from a clustering perspective,

writable
By default, the share access permissions are conservatively set as non-writable. Tune this parameter according to your site-specific preferences.
path
Defaults to the first filesystem mount point specified within the service configuration. This should be adjusted to match the specific directory or subdirectory intended to be available as a share to Windows clients.
measures to respond to the lack of immediate response from the Samba server. In the case of a planned service relocation or a true failover scenario, there is a period of time where the Windows clients will not get immediate response from the Samba server.

7 Apache Services
This chapter contains instructions for configuring Red Hat Linux Advanced Server to make the Apache Web server highly available.
7.1 Setting Up an Apache Service
This section provides an example of setting up a cluster service that will fail over an Apache Web server.
1. On a shared disk, use the interactive fdisk utility to create a partition that will be used for the Apache document root directory. Note that it is possible to create multiple document root directories on different disk partitions. See Partitioning Disks in Section 2.4.4 for more information.
• If the script directory resides in a non-standard location, specify the directory that will contain the CGI programs. For example:
ScriptAlias /cgi-bin/ "/mnt/apacheservice/cgi-bin/"
• Specify the path that was used in the previous step, and set the access permissions to default to that directory.
Before the Apache service is added to the cluster database, ensure that the Apache directories are not mounted. Then, on one cluster system, add the service. Specify an IP address, which the cluster infrastructure will bind to the network interface on the cluster system that runs the Apache service. The following is an example of using the cluadmin utility to add an Apache service.
Do you want to (a)dd, (m)odify, (d)elete or (s)how devices, or
are you (f)inished adding device information: f
Disable service (yes/no/?) [no]: no
name: apache
disabled: no
preferred node: node1
relocate: yes
user script: /etc/rc.d/init/httpd
IP address 0: 10.1.16.150
netmask 0: 255.255.255.0
broadcast 0: 10.1.16.255

8 Cluster Administration
The following chapter describes the various administrative tasks involved in maintaining a cluster after it has been installed and configured.
8.1 Displaying Cluster and Service Status
Monitoring cluster and service status can help identify and resolve problems in the cluster environment.
Table 8–2 Power Switch Status
Power Switch Status: Description
The power switch is operating properly.
Could not obtain power switch status. A failure or error has occurred.
Good: The power switch is operating properly.
Unknown: The other cluster member is DOWN.
Timeout: The power switch is not responding to power daemon commands,
To display a snapshot of the current cluster status, invoke the clustat utility. For example:
clustat
Cluster Status Monitor (Fileserver Test Cluster)    07:46:05
Cluster alias: clu1alias.boston.redhat.com
===================== M e m b e r   S t a t u s =======================
Member

clu2          Good
=================== H e a r t b e a t   S t a t u s ===================
Name                             Type         Status
------------------------------   ----------   ------------
clu1 <--> clu2                   network      ONLINE
=================== S e r v i c e   S t a t u s =======================
                                 Last         Monitor
When the system is able to rejoin the cluster, use the following command:
/sbin/chkconfig --add cluster
Then reboot the system or run the cluster start command located in the System V init directory.

2. On the remaining cluster system, invoke the cluadmin utility and restore the cluster database. To restore the database from the /etc/cluster.conf.bak file, specify the cluster restore command. To restore the database from a different file, specify the cluster restorefrom file_name command.

8.7 Updating the Cluster Software
Before upgrading Red Hat Cluster Manager, be sure to install all of the required software, as described in Section 2.3.1, Kernel Requirements. The cluster software can be updated while preserving the existing cluster database.

cluconfig --init=/dev/raw/raw1
9. Start the cluster software on the second cluster system by invoking the cluster start command located in the System V init directory. For example:
/sbin/service cluster start
8.8 Reloading the Cluster Database
Invoke the cluadmin utility and use the cluster reload command to force the cluster to re-read the cluster database.

/sbin/cluconfig --init=/dev/raw/raw1
6. Start the cluster daemons by invoking the cluster start command located in the System V init directory on both cluster systems. For example:
/sbin/service cluster start
8.11 Disabling the Cluster Software
It may become necessary to temporarily disable the cluster software on a member system.
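A minimal sketch of the steps, assuming the same System V service and chkconfig conventions used elsewhere in this chapter (verify the service name against your init scripts), is to stop the cluster service and remove it from the boot sequence:

/sbin/service cluster stop
/sbin/chkconfig --del cluster

To re-enable the software, run /sbin/chkconfig --add cluster followed by /sbin/service cluster start.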
Table 8–5 Diagnosing and Correcting Problems in a Cluster
Problem: SCSI bus not terminated
Symptom: SCSI errors appear in the log file
Solution: Each SCSI bus must be terminated only at the beginning and end of the bus. Depending on the bus configuration, it might be necessary to enable or disable termination in host
Problem: SCSI identification numbers not unique
Symptom: SCSI errors appear in the log file
Solution: Each device on a SCSI bus must have a unique identification number. See Section A.5, SCSI Identification Numbers for more information.
Problem: Mounted quorum partition
Symptom: Messages indicating checksum errors on a quorum partition appear in the log file
Solution: Be sure that the quorum partition raw devices are used only for cluster state information. They cannot be used for cluster services or for non-cluster purposes, and cannot contain a file system.

Problem: Quorum partitions not set up correctly
Symptom: Messages indicating that a quorum partition cannot be accessed appear in the log file
Solution: Run the cludiskutil -t command to check that the quorum partitions are accessible.
Problem: Cluster service stop fails because a file system cannot be unmounted
Symptom: Messages indicating the operation failed appear on the console or in the log file
Solution: Use the fuser and ps commands to identify the processes that are accessing the file system.

Problem: Loose cable connection to power switch
Symptom: Power switch status is Timeout
Solution: Check the serial cable connection.

Problem: Power switch serial port incorrectly specified in the cluster database
Symptom: Power switch status indicates a problem
Solution: Examine the current settings and modify the cluster configuration
9 Configuring and Using the Red Hat Cluster Manager GUI
Red Hat Cluster Manager includes a graphical user interface (GUI) which allows an administrator to graphically monitor cluster status. The GUI does not allow configuration changes or management of the cluster, however.

9.1.2 Setting up the Sun JRE
If the cluster GUI is to be installed on a non-cluster member, it may be necessary to download and install the JRE. The JRE can be obtained from Sun's java.sun.com site. For example, at the time of publication, the specific page is http://java.sun.com/j2se/1.3/jre/download-linux.html. After downloading the JRE, run the downloaded program (for example, j2re-1_3_1_02-linux-i386-rpm.bin) and confirm the license agreement.

Do you wish to enable monitoring, both locally and remotely, via \
the Cluster GUI? yes/no [yes]:
Answering no disables Cluster GUI access completely.
9.3 Enabling the Web Server
In order to enable usage of the Cluster Manager GUI, all cluster members must be running a web server.
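For example, on a member running the Apache packages shipped with the distribution, the web server can typically be enabled with the standard chkconfig and service commands (the httpd service name is an assumption; confirm it matches your installation):

/sbin/chkconfig httpd on
/sbin/service httpd start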
Figure 9–1 Red Hat Cluster Manager GUI Splashscreen
By double-clicking on the cluster name within the tree view, the right side of the GUI will then fill with cluster statistics, as shown in Figure 9–2, Red Hat Cluster Manager GUI Main Screen. These statistics depict the status of the cluster members, the services running on each member, and the heartbeat channel status.

Figure 9–2 Red Hat Cluster Manager GUI Main Screen
By default, the cluster statistics will be refreshed every 5 seconds. Clicking the right mouse button on the cluster name within the tree view will load a dialog allowing modification of the default update interval.
Figure 9–3 Red Hat Cluster Manager GUI Configuration Details Screen
In Figure 9–3, Red Hat Cluster Manager GUI Configuration Details Screen, notice that the detailed device information appears after clicking on the individual device parameters. In addition to obtaining detailed configuration information related to cluster services, it is also possible to view the configuration of individual cluster members and heartbeat channels by double-clicking within the relevant section of the GUI.

A Supplementary Hardware Information
The information in the following sections can help you set up a cluster hardware configuration. In some cases, the information is vendor specific.
A.1 Setting Up Power Switches
A.1.1 Setting up RPS-10 Power Switches
If an RPS-10 Series power switch is used as a part of a cluster, be sure of the following:
Figure A–1 RPS-10 Power Switch Hardware Configuration
See the RPS-10 documentation supplied by the vendor for additional installation information. Note that the information provided in this document supersedes the vendor information.
A.1.2 Setting up WTI NPS Power Switches
The WTI NPS-115 and NPS-230 power switches are network attached devices.
• Assign system names to the Plug Parameters (for example, clu1 to plug 1, clu2 to plug 2, assuming these are the cluster member names).
When running cluconfig to specify power switch parameters:
• Specify a switch type of WTI_NPS
A.1.3 Setting up Baytech Power Switches
The following information pertains to the RPC-3 and RPC-5 power switches. The Baytech power switch is a network attached device. Essentially, it is a power strip with network connectivity enabling power cycling of individual outlets. Only one Baytech switch is needed within the cluster (unlike the RPS-10 model, where a separate switch per cluster member is required).

• When prompted for the plug/port number, specify the same name as assigned in Step 4 in the prior section.
The following is example screen output from configuring the Baytech switch, which shows that the outlets have been named according to the example cluster names clu1 and clu2.
Configuring the Software Watchdog Timer
Any cluster system can utilize the software watchdog timer as a data integrity provision, as no dedicated hardware components are required. If you have specified a power switch type of SW_WATCHDOG while using the cluconfig utility, the cluster software will automatically load the corresponding loadable kernel module called softdog.
Note
There may be other server types that support NMI watchdog timers aside from ones with Intel-based SMP system boards. Unfortunately, there is no simple way to test for this functionality other than simple trial and error.
The NMI watchdog is enabled on supported systems by adding nmi_watchdog=1 to the kernel's command line.
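On a system booted with LILO, the boot loader used elsewhere in this manual, this is typically done by adding an append line to the kernel stanza of /etc/lilo.conf; the image path and label below are placeholders:

image=/boot/vmlinuz
    label=linux
    append="nmi_watchdog=1"

After saving the change, run /sbin/lilo to make it take effect, then reboot.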
In order to determine if the server supports the NMI watchdog timer, first try adding "nmi_watchdog=1" to the kernel command line as described above. After the system has booted, log in as root and type:
cat /proc/interrupts
The output should appear similar to the following:
CPU0

Note
It has been observed that the Master Switch may become unresponsive when placed on networks which have high occurrences of broadcast or multicast packets. In these cases, isolate the power switch to a private subnet.
• Buses must be terminated at each end. See Section A.3, SCSI Bus Termination for more information.
• Buses must not extend beyond the maximum length restriction for the bus type. Internal cabling must be included in the length of the SCSI bus. See Section A.4, SCSI Bus Length for more information.

• To disconnect a host bus adapter from a single-initiator bus, you must disconnect the SCSI cable first from the RAID controller and then from the adapter. This ensures that the RAID controller is not exposed to any erroneous input.

The previous order specifies that 7 is the highest priority, and 8 is the lowest priority. The default SCSI identification number for a host bus adapter is 7, because adapters are usually assigned the highest priority.
Table A–3 Host Bus Adapter Features and Configuration Requirements
Host Bus Adapter: Adaptec 2940U2W
Features: Ultra2, wide, LVD. HD68 external connector. One channel, with two bus segments.
Single-Initiator Configuration: Set the onboard termination to automatic (the default). Use the internal SCSI connector for private (non-cluster) storage.
Host Bus Adapter: Tekram DC-390U2W
Features: Ultra2, wide, LVD. HD68 external connector. One channel, two segments. Onboard termination for a bus segment is disabled if internal and external cables are connected to the segment.
Single-Initiator Configuration: Use the internal SCSI connector for private (non-cluster) storage.
Host Bus Adapter: Adaptec 29160LP
Features: Ultra160. VHDCI external connector. One channel. Set the onboard termination by using the BIOS utility.
Single-Initiator Configuration: Set the onboard termination to automatic (the default). Use the internal SCSI connector for private (non-cluster) storage.
Host Bus Adapter: LSI Logic SYM22915
Features: Ultra160. Two VHDCI external connectors. Two channels. Set the onboard termination by using the BIOS utility.
Single-Initiator Configuration: Set onboard termination to automatic (the default). Use the internal SCSI connectors for private (non-cluster) storage.
Table A–4 QLA2200 Features and Configuration Requirements
Host Bus Adapter: QLA2200 (minimum driver: QLA2x00 V2.23)
Features: Fibre Channel arbitrated loop and fabric
Single-Initiator Configuration: Can be implemented with point-to-point links from the adapter to a multi-ported ...
Multi-Initiator Configuration: Can be implemented with FC hubs or switches ...
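Because the table specifies a minimum driver revision, it can be useful to confirm which qla2x00 driver is loaded before configuring shared storage. The commands below are only a rough sketch; the module name is taken from the driver name in the table, and the exact way the revision is reported varies between driver releases, so consult the driver documentation for the authoritative method:
/sbin/lsmod | grep qla2x00
/sbin/modinfo qla2x00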
A.7 Tuning the Failover Interval
sameTimeNetdown
The number of intervals that must elapse before concluding that a cluster member has failed when the cluhbd heartbeat daemon is unable to communicate with the other cluster member.
sameTimeNetup
The number of intervals that must elapse before concluding that a cluster member has failed when the cluhbd heartbeat daemon is able to communicate with the other cluster member.
B Supplementary Software Information
The information in the following sections can assist in the management of the cluster software configuration.
B.1 Cluster Communication Mechanisms
A cluster uses several intra-cluster communication mechanisms to ensure data integrity and correct cluster behavior when a failure occurs.
The complete failure of the heartbeat communication mechanism does not automatically result in a failover. If a cluster system determines that the quorum timestamp from the other cluster system is not up-to-date, it will check the heartbeat status. If heartbeats to the system are still operating, the cluster will take no action at this time.
B.3 Failover and Recovery Scenarios
Understanding cluster behavior when significant events occur can assist in the proper management of a cluster. Note that cluster behavior depends on whether power switches are employed in the configuration.
B.3.2 System Panic
A system panic (crash) is a controlled response to a software-detected error. A panic attempts to return the system to a consistent state by shutting down the system. If a cluster system panics, the following occurs:
1.
• All the heartbeat network cables are disconnected from a system.
• All the serial connections and network interfaces used for heartbeat communication fail.
If a total network connection failure occurs, both systems detect the problem, but they also detect that the SCSI disk connections are still active.
If a quorum daemon fails, and power switches are used in the cluster, the following occurs:
1. The functional cluster system detects that the cluster system whose quorum daemon has failed is not updating its timestamp on the quorum partitions, although the system is still communicating over the heartbeat channels.
/sbin/service cluster stop
Then, to restart the cluster software, perform the following:
/sbin/service cluster start
B.3.10 Monitoring Daemon Failure
If the cluster monitoring daemon (clumibd) fails, it is not possible to use the cluster GUI to monitor status.
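Even with the GUI unavailable, cluster and service status can still be checked from a member's command line. This is a minimal sketch that assumes a command-line status utility named clustat is shipped with this release; if the command is not found, consult the installed cluster packages or the cluster log files instead:
clustat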
B.4 Cluster Database Fields
id = id
name = system_name
Specifies the identification number (either 0 or 1) for the cluster system and the name that is returned by the hostname command (for example, storage0).
powerSerialPort = serial_port
Specifies the device special file for the serial port to which the power switches are connected, if any (for example, /dev/ttyS0).
start device0
name = device_file
Specifies the special device file, if any, that is used in the service (for example, /dev/sda1). Note that it is possible to specify multiple device files for a service.
start mount
name = mount_point
fstype = file_system_type ...
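Pieced together, the member and service fields above might look roughly like the fragment below. This is only an illustrative sketch: the mount point and file system type are invented values, the indentation is added for readability, and the real database is normally generated and maintained by the cluster utilities rather than edited by hand:
id = 0
name = storage0
powerSerialPort = /dev/ttyS0
start device0
    name = /dev/sda1
    start mount
        name = /mnt/data
        fstype = ext2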
Figure B–1 Cluster in an LVS Environment
In a Piranha configuration, client systems issue requests on the World Wide Web. For security reasons, these requests enter a Web site through a firewall, which can be a Linux system serving in that capacity or a dedicated firewall device.
For example, the figure could represent an e-commerce site used for online merchandise ordering through a URL. Client requests to the URL pass through the firewall to the active Piranha load-balancing system, which then forwards the requests to one of the three Web servers. The Red Hat Cluster Manager systems serve dynamic data to the Web servers, which forward the data to the requesting client system.