CHALLENGE® RAID Owner's Guide
Document Number 007-2532-004

Summary of Contents for Silicon Graphics CHALLENGE RAID

  • Page 1 ® CHALLENGE RAID Owner’s Guide Document Number 007-2532-004...
  • Page 2 Mountain View, CA 94043-1389. Silicon Graphics, CHALLENGE, and IRIS are registered trademarks and IRIX, XFS, IRIS FailSafe, POWER CHALLENGE, and POWER Channel are trademarks of Silicon Graphics, Inc. Oracle Parallel Server and OPS are trademarks of Oracle Corporation.
  • Page 3: Table Of Contents

    Contents
    List of Figures vii
    List of Tables ix
    About This Guide xi
    Features of the CHALLENGE RAID Storage System 1
    Storage System Components 5
    SCSI-2 Interface 5
    CHALLENGE RAID Storage-Control Processor
    Storage System Chassis 8
    Disk Modules
    Data Availability and Performance 11...
  • Page 4

    Getting General System Information 39
    Getting Information About Disks 39
    Getting Information About Other Components 42
    Displaying the CHALLENGE RAID Unsolicited Event Log 43
    Shutting Down the CHALLENGE RAID Storage System 44
    Restarting the CHALLENGE RAID Storage System 45
    Configuring Disks 49...
  • Page 5

    Diagnosing SP Failure 81
    Using the Auto-Reassign Capability 82
    Caching 85
    Setting Cache Parameters 86
    Viewing Cache Statistics
    Upgrading CHALLENGE RAID to Support Caching 90
    Changing Unit Caching Parameters 91
    Technical Specifications 93
    The raid5 Command Line Interface 97
    bind 99...
  • Page 7: List Of Figures

    Figure 2-3 Dual-Bus/Dual-Initiator Configuration Example 33
    Figure 3-1 CHALLENGE RAID Indicator Lights 36
    Figure 3-2 Disk Module Locations 40
    Figure 3-3 Turning Off (On) Power (Back of CHALLENGE RAID Chassis) 45
    Figure 3-4 CHALLENGE RAID Indicator Lights 45
    Figure 3-5 Unlocking the Fan Module 46
    Enabling an SP’s Power 47...
  • Page 8

    Figure 5-3 Attaching the ESD Clip to the ESD Bracket on a Deskside Storage System 66
    Figure 5-4 Attaching the ESD Clip to the ESD Bracket on a Rack Storage System 66
    Figure 5-5 Pulling Out a Disk Module 67
    Figure 5-6 Removing a Disk Module 68...
  • Page 9

    Table 5-2 Ordering Add-On Disk Module Sets 72
    Table 6-1 Field-Replaceable Units 79
    Table 7-1 Output of raid5 getcache 88
    Table A-1 CHALLENGE RAID Deskside Chassis Specifications 93
    Table A-2 CHALLENGE RAID Rack Specifications 94
    Table B-1 raid5 Parameters 97...
  • Page 11: About This Guide

    RAID rack can be connected to one or more SCSI buses on CHALLENGE servers separately or in combination. RAID levels 0, 1, 1_0 (0+1), and 5 are supported, as well as disks configured as hot spares. In addition, a basic CHALLENGE RAID storage system provides storage-system caching. Structure of This Guide This guide contains the following chapters: •...
  • Page 12 Chapter 7, “Caching,” explains how to determine, set up, and change caching parameters. • Appendix A, “Technical Specifications,” summarizes technical information for the CHALLENGE RAID deskside storage system. • Appendix B, “The raid5 Command Line Interface,” lists and explains all parameters of the raid5 command.
  • Page 13 Compliance Statements This section lists various domestic and international hardware compliance statements that pertain to the system. FCC Warning This equipment has been tested and found compliant with the limits for a Class A digital device, pursuant to Part 15 of the FCC rules. These limits are designed to provide reasonable protection against harmful interference when the equipment is operated in a commercial environment.
  • Page 14 Attention: This digital apparatus does not emit radio noise exceeding the limits applicable to Class A digital apparatus prescribed in the Radio Interference Regulations established by the Canadian Department of Communications. Japanese Compliance Statement...
  • Page 15: Features Of The Challenge Raid Storage System

    University of California, Berkeley, in their 1987 paper, “A Case for Redundant Arrays of Inexpensive Disks (RAID)” (University of California, Berkeley, Report No. UCB/CSD/87/391). That paper defines various levels of RAID. This chapter introduces the CHALLENGE RAID disk-array storage system. It explains: • CHALLENGE RAID storage system components •...
  • Page 16: Figure 1-1 Challenge Raid Storage System, Deskside Version

    Note: In Figure 1-1, the front cover is removed for clarity.
    Figure 1-2 is an external view of the CHALLENGE RAID rack, with the maximum of four chassis assemblies installed. Each chassis assembly in a CHALLENGE RAID rack corresponds to one deskside CHALLENGE RAID...
  • Page 17: Figure 1-2 Challenge Raid Rack

    Figure 1-2 CHALLENGE RAID Rack...
  • Page 18 The CHALLENGE RAID storage system connects by a small computer system interface (SCSI-2) differential bus to a SCSI-2 interface in a CHALLENGE server.
  • Page 19: Storage System Components

    SCSI mezzanine card, the IO4B board, or both. Storage System Components The CHALLENGE RAID deskside storage system, or each chassis assembly in the CHALLENGE RAID rack, consists of these components: • one or more host SCSI-2 interfaces that are either –...
  • Page 20: Challenge Raid Storage-Control Processor

    SCSI-2 bus. An SP has five internal fast/narrow SCSI buses, each supporting four disk modules, for a total of 20 disk modules. Figure 1-4 diagrams a CHALLENGE RAID storage system with one SP. CHALLENGE server SCSI-2 bus...
  • Page 21: Figure 1-5 Sps Connected To The Same Challenge Chassis

    Figure 1-5 SPs Connected to the Same CHALLENGE Chassis (diagram labels: CHALLENGE server, SCSI-2 interface, SCSI-2 bus, SP A, SP B, CHALLENGE RAID)
    (Labels of the following diagram: SCSI-2 interface, SCSI-2 bus, SP A, second CHALLENGE server, SP B...)
  • Page 22: Storage System Chassis

    Storage System Chassis
    The CHALLENGE RAID storage system chassis contains compartments for disk modules, SPs, the fan module, power supplies, and the battery backup unit. The disk modules face front, and the SP(s), power supplies, battery backup unit, and fan module are accessible from the back.
  • Page 23: Figure 1-8 Scsi-2 Bus And Internal Buses (Front View)

    Through the SP, the SCSI-2 bus is split into five internal fast/narrow SCSI buses (A, B, C, D, and E) that connect the slots for the disk modules. For example, internal bus A connects the modules in slots A0, A1, A2, and A3, in that order.
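The slot naming described above can be sketched in a few lines of Python. This is an illustration only: the names `slots_on_bus` and `all_slots` are hypothetical and are not part of any CHALLENGE RAID software. The sketch assumes the five internal buses (A through E) with four slots each, as stated in the text.

```python
BUSES = "ABCDE"      # the five internal SCSI buses named in the text
SLOTS_PER_BUS = 4    # each bus connects four disk-module slots


def slots_on_bus(bus: str) -> list[str]:
    """Return the slot IDs on one internal bus, in order (for example A0..A3)."""
    bus = bus.upper()
    if len(bus) != 1 or bus not in BUSES:
        raise ValueError(f"unknown bus {bus!r}; expected one of {BUSES}")
    return [f"{bus}{number}" for number in range(SLOTS_PER_BUS)]


def all_slots() -> list[str]:
    """Return every slot ID in the chassis, bus by bus."""
    return [slot for bus in BUSES for slot in slots_on_bus(bus)]
```

Calling `slots_on_bus("A")` gives the four slots on bus A, and `all_slots()` lists the 20 slots of a fully populated chassis.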
  • Page 24: Disk Modules

    A disk module, also called a disk drive module, consists of a disk drive, a power regulator board, internal cabling, and a plastic carrier. The carrier has a handle for inserting and removing the module. Figure 1-9 indicates disk modules in the CHALLENGE RAID chassis and their status lights. Deskside Rack...
  • Page 25: Data Availability And Performance

    Data redundancy varies for the different RAID levels supported by CHALLENGE RAID: RAID-0, RAID-1, RAID-1_0, and RAID-5. Because the CHALLENGE RAID storage system has five internal SCSI-2 buses, RAID-5 provides redundancy for up to five groups of disk modules.
  • Page 26: Enhanced Performance: Disk Striping

    4 the stripe element size of 128 to yield a stripe size of 512 sectors. Enhanced Performance: Storage System Caching Caching is available for CHALLENGE RAID storage systems that have two SPs, each with at least 8 MB of memory, a battery backup unit, and disk modules in slots A0 through E0.
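The stripe-size arithmetic at the start of this passage (4 and a stripe element size of 128 giving 512 sectors) can be written out as a short Python sketch. The function names are hypothetical, the reading of "4" as the number of disks the stripe spans is an assumption (the sentence is cut off in this excerpt), and the 512-byte sector size is also an assumption used only to show a byte figure.

```python
def stripe_size_sectors(disks_in_stripe: int, stripe_element_sectors: int) -> int:
    """Stripe size in sectors: number of disks in the stripe times the element size."""
    if disks_in_stripe < 1 or stripe_element_sectors < 1:
        raise ValueError("both values must be positive")
    return disks_in_stripe * stripe_element_sectors


def stripe_size_bytes(disks_in_stripe: int, stripe_element_sectors: int,
                      bytes_per_sector: int = 512) -> int:
    """Same figure in bytes; the 512-byte sector is an assumption, not from the guide."""
    return stripe_size_sectors(disks_in_stripe, stripe_element_sectors) * bytes_per_sector
```

With the values from the text, `stripe_size_sectors(4, 128)` reproduces the 512-sector stripe.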
  • Page 27: Data Reconstruction And Rebuilding After Disk Module Failure

    Note: The optional battery backup unit must be present in the CHALLENGE RAID chassis for systems using cache to ensure that data is committed to disk in the event of a power failure.
    Data Reconstruction and Rebuilding After Disk Module...
  • Page 28: Raid Levels

    RAID-1_0 group: mirrored RAID-0 group • RAID-5 group: individual access array Caution: Use only CHALLENGE RAID disk modules to replace failed disk modules. CHALLENGE RAID disk modules contain proprietary firmware that the storage system requires for correct functioning. Using any other disks, including those from other Silicon Graphics systems, can cause failure of the storage system.
  • Page 29: Raid-1: Mirrored Pair

    RAID-1: Mirrored Pair
    In the RAID-1 configuration, two disk modules can be bound as a mirrored pair. In this disk configuration, the SP duplicates (mirrors) the data records and stores them separately on each disk module in the pair. A RAID-1 pair cannot be split into its individual disk units (as can a software mirror composed of two individual disk units).
  • Page 30: Raid-1_0 Group: Mirrored Raid-0 Group

    Chapter 1: Features of the CHALLENGE RAID Storage System RAID-1_0 Group: Mirrored RAID-0 Group A RAID-1_0 configuration mirrors a RAID-0 group, creating a primary RAID-0 image and a secondary RAID-0 image for user data. This arrangement consists of four, six, eight, ten, twelve, fourteen, or sixteen disk modules.
  • Page 31: Figure 1-11 Distribution Of User Data In A Raid-1_0 Group

    Figure 1-11 Distribution of User Data in a RAID-1_0 Group (diagram). Labels: stripe, stripe element size, stripe size, user data (primary image), user data (secondary image). Blocks shown for the primary image:
    First module of primary image: 0-127, 384-511, 768-895, 1152-1279, 1536-1663
    Second module of primary image: 128-255, 512-639, 896-1023, 1280-1407, 1664-1791
    Third module of primary image: 256-383, 640-767...
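The block ranges in Figure 1-11 follow a simple round-robin pattern, which the following Python sketch reproduces. It assumes what the figure shows: a primary image of three modules and a stripe element of 128 blocks. The function names are hypothetical, and the sketch describes only the layout of the primary image, not how the SP implements it.

```python
def locate_block(block: int, modules: int = 3, element: int = 128) -> tuple[int, int]:
    """Return (module index, stripe number) for a user-data block.

    Module index 0 is the first module of the primary image.
    """
    if block < 0:
        raise ValueError("block number must not be negative")
    element_index = block // element          # which stripe element holds the block
    return element_index % modules, element_index // modules


def module_ranges(module: int, stripes: int, modules: int = 3,
                  element: int = 128) -> list[tuple[int, int]]:
    """Return the (first, last) block of each stripe element stored on one module."""
    ranges = []
    for stripe in range(stripes):
        first = (stripe * modules + module) * element
        ranges.append((first, first + element - 1))
    return ranges
```

For example, `module_ranges(0, 5)` yields the five ranges the figure lists for the first module, starting at 0-127 and ending at 1536-1663.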
  • Page 32: Raid-5: Individual Access Array

    This configuration usually consists of five disk modules (but can have three to sixteen) bound as a RAID-5 group. Because there are five internal SCSI-2 buses in the CHALLENGE RAID system, an array of five disk modules (or fewer) provides the greatest level of data redundancy.
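One reading of the statement above is that a group is best protected when each of its disk modules sits on a different internal bus, so that a problem on one bus touches at most one module of the group. Under that assumption (it is an interpretation, not a quotation from the guide), a planned group can be checked with a sketch like this one. The function name is hypothetical, and disk positions are written as a bus letter followed by a device number, for example a0.

```python
def one_module_per_bus(disk_positions: list[str]) -> bool:
    """Return True if no two disk positions share an internal bus.

    Each position is a bus letter followed by a device number, such as "a0".
    """
    buses = [position[0].lower() for position in disk_positions]
    return len(set(buses)) == len(buses)
```

With only five buses available, any group of six or more modules necessarily places two modules on the same bus, so the check returns False for it.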
  • Page 33: Figure 1-12 Distribution Of User And Parity Data In A Raid-5 Group

    Figure 1-12 Distribution of User and Parity Data in a RAID-5 Group
    For each write operation to a RAID-5 group, the CHALLENGE RAID storage system must perform the following steps:
    1. Read data from the sectors being written and parity data for those sectors.
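The remaining steps are cut off in this excerpt. As a generic illustration of why the old data and old parity are read first, the sketch below shows the usual RAID-5 read-modify-write rule, in which the new parity is the old parity XOR the old data XOR the new data. This is the textbook technique, not a description of the SP firmware, and the function names are hypothetical.

```python
def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    if not blocks:
        raise ValueError("at least one block is required")
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must have the same length")
    result = bytearray(length)
    for block in blocks:
        for index, byte in enumerate(block):
            result[index] ^= byte
    return bytes(result)


def update_parity(old_data: bytes, old_parity: bytes, new_data: bytes) -> bytes:
    """New parity after overwriting old_data with new_data on one disk."""
    return xor_blocks(old_parity, old_data, new_data)
```

The result equals the parity recomputed from scratch over all data blocks, which is why the other disks in the group do not need to be read for a small write.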
  • Page 34: Raid Hot Spare

    Chapter 1: Features of the CHALLENGE RAID Storage System RAID Hot Spare A hot spare is a dedicated replacement disk unit on which users cannot store information. The capacity of a disk module that you bind as a hot spare must be at least as great as the capacity of the largest disk module it might replace.
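The capacity rule for hot spares can be expressed as a one-line check. The sketch below is illustrative; the function name is hypothetical, and capacities may be in any unit as long as the same unit is used throughout.

```python
def can_serve_as_hot_spare(spare_capacity: float,
                           protected_capacities: list[float]) -> bool:
    """True if the spare is at least as large as the largest module it might replace."""
    if not protected_capacities:
        raise ValueError("list the capacities of the modules the spare protects")
    return spare_capacity >= max(protected_capacities)
```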
  • Page 35: Using The Challenge Raid Command Line Interface

    Using the CHALLENGE RAID Command Line Interface When you replace the failed module in D0, the SP starts copying the structure on A2 to D0. When it finishes, the RAID-5 group once again consists of modules A0-E0 and the hot spare becomes available for use if any other module fails.
  • Page 36

    • get information on all CRUs (customer-replaceable units)
    • display status information on disks
    • display the SP log
    • display information about a group of disks
    • perform housekeeping operations, such as clearing the error log or updating firmware...
  • Page 37: Storage System Configurations

    Chapter 2 Storage System Configurations This chapter explains the various CHALLENGE RAID configurations. Use it to plan your storage system or whenever you contemplate changes in your storage system or physical disk configuration. A CHALLENGE RAID storage system is configured on two levels: •...
  • Page 38: Table 2-1 Challenge Raid Configurations

    If one host, SCSI-2 adapter, or SP fails, the other host can take over the failed host’s disk units with system operator intervention. This configuration is required for Silicon Graphics FailSafe™ and Oracle Parallel Server™ (OPS™).
  • Page 39: Basic Configuration

    Basic Configuration Basic Configuration The basic configuration has one host with one SCSI-2 interface connected by a SCSI-2 bus to the SP in the storage system. The system can survive failure of a disk module within a redundant RAID group, but it cannot continue after failure of a SCSI-2 interface or SP. Table 2-2 lists the error recovery features of the basic configuration.
  • Page 40: Dual-Interface/Dual-Processor Configuration

    SP and transfer control of disk units to the replacement SP. Fan module Applications continue running. Silicon Graphics SSE or other authorized service provider replaces module. Power supply If redundant power supply module is present, applications continue running;...
  • Page 41: Figure 2-1 Dual-Interface/Dual-Processor Configuration Example

    When convenient, the Silicon Graphics SSE or other authorized service provider can replace the interface and the system operator can transfer control of disk units to the replacement SP.
  • Page 42: Split-Bus Configuration

    SP’s disk units to the working SP, shut down the host, power off and on the storage system, and reboot the host. Silicon Graphics SSE or other authorized service provider replaces the SP and transfers control of disk units to the replacement SP.
  • Page 43 SP’s disk units to the SP on the interface in the other host, shut down the other host, power off and on the storage system, and reboot the other host. Silicon Graphics SSE or other authorized service provider replaces the interface. SCSI-2 cable I/O operations fail to storage system disk units owned by the SP attached to the failed cable.
  • Page 44: Figure 2-2 Split-Bus Configuration Example

    If one SP fails or if the SCSI-2 connection from one host is broken, that host does not have access to the CHALLENGE RAID storage system until the SP is replaced or the SCSI-2 connection is repaired. The host using the remaining SCSI-2 connection and remaining operational SP still has full access to its own data.
  • Page 45: Dual-Bus/Dual-Initiator Configuration

    SP, shut down the host, power the storage system off and on, and reboot both hosts. When convenient, Silicon Graphics SSE or other authorized service provider replaces the SP, and the system operator can transfer control of the disk units to the replacement SP.
  • Page 46: Table 2-5 Error Recovery: Dual-Bus/Dual-Initiator

    When convenient, the Silicon Graphics SSE or other authorized service provider can replace the interface, and the system operator can transfer control of disk units to the replacement SP.
  • Page 47: Figure 2-3 Dual-Bus/Dual-Initiator Configuration Example

    Figure 2-3 Dual-Bus/Dual-Initiator Configuration Example (diagram). Labels: CHALLENGE 1, CHALLENGE 2, SCSI-2 interface 1 (twice), SCSI-2 interface 2 (twice), SCSI-2 bus (twice), SP A, SP B, In and Out connectors, CHALLENGE RAID.
    Host 2: Accounts on 6 disks bound as RAID-1_0
    Host 1: Mirrored pair for user directories;...
  • Page 49: Operating The Storage System

    Chapter 3 Operating the Storage System This chapter describes how to run CHALLENGE RAID after you have configured it. The chapter explains: • checking storage system status • shutting down the CHALLENGE RAID storage system • restarting the CHALLENGE RAID storage system This chapter introduces the /usr/raid5/raid5 command (command line interface, or CLI).
  • Page 50: Checking Challenge Raid Storage System Status

    The amber service light comes on when • an SP is reseated • the CHALLENGE RAID is powered off and on • the battery backup unit has not finished recharging (if battery backup unit is present in the system) If the service light is lit, look for a disk-module fault light that is lit. Then you...
  • Page 51: Using The Raid5 Command

    The raid5 command sends storage management and configuration requests to an application programming interface (API) on the CHALLENGE server. For the raid5 command to function, the agent—an interpreter between the command line interface and the CHALLENGE RAID storage system—must be running. The synopsis of the raid5 command is...
  • Page 52: Getting Device Names With Getagent

    Getting Device Names With getagent
    Use the getagent parameter with raid5 to display information on devices controlled by the API:
    raid5 getagent
    Following is a sample output for one device; normally, the output would give information on all devices.
  • Page 53: Getting General System Information

    Checking CHALLENGE RAID Storage System Status Table 3-1 (continued) Output of raid5 getagent Entry Meaning SP Memory Amount of DRAM present on the SP. Serial No 12-digit ASCII string that uniquely identifies this subsystem. Getting General System Information To get general system information, use...
  • Page 54: Figure 3-2 Disk Module Locations

    In this command, diskposition has the format bd, where b is the bus the disk is located on (a through e; be sure to use lowercase) and d is the device number (0 through 3). Figure 3-2 diagrams disk module locations.
    Deskside Chassis assembly in rack A0 B0 C0 D0...
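A disk position can be validated before it is passed to the command. The following Python sketch assumes the format stated in this guide: one lowercase bus letter (a through e) followed by one device number (0 through 3). The function name `parse_disk_position` is hypothetical.

```python
import re

# One lowercase bus letter a-e followed by one device number 0-3.
_DISK_POSITION = re.compile(r"[a-e][0-3]")


def parse_disk_position(text: str) -> tuple[str, int]:
    """Split a disk position such as "a0" into (bus letter, device number)."""
    if _DISK_POSITION.fullmatch(text) is None:
        raise ValueError(
            f"invalid disk position {text!r}; expected a lowercase bus letter "
            "a through e followed by a device number 0 through 3"
        )
    return text[0], int(text[1])
```

The check deliberately rejects uppercase input such as "A0", because the guide asks for lowercase bus letters.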
  • Page 55: Table 3-2 Output Of Raid5 Getdisk

    A0 Hard Write Errors: 0
    A0 Soft Read Errors: 0
    A0 Soft Write Errors: 0
    A0 Read Retries: 0
    A0 Write Retries: 0
    A0 Remapped Sectors: 0
    A0 Number of Reads: 1007602
    A0 Number of Writes: 1152057
    Table 3-2 interprets items in this output.
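Counter lines like those in the sample output can be collected into a dictionary for monitoring scripts. The sketch below is a minimal parser that assumes each entry is on its own line in the form of a disk ID, a counter name, a colon, and an integer, as in the sample above. The function name is hypothetical, and lines that do not fit this pattern are skipped rather than interpreted.

```python
def parse_getdisk(output: str) -> dict[str, dict[str, int]]:
    """Collect "<disk> <counter name>: <integer>" lines into nested dictionaries."""
    stats: dict[str, dict[str, int]] = {}
    for line in output.splitlines():
        left, separator, value = line.strip().rpartition(":")
        if not separator:
            continue                      # no colon on this line
        disk, _, name = left.partition(" ")
        value = value.strip()
        if not name or not value.isdigit():
            continue                      # not a numeric counter line
        stats.setdefault(disk, {})[name] = int(value)
    return stats


sample = """A0 Hard Write Errors: 0
A0 Read Retries: 0
A0 Number of Reads: 1007602
A0 Number of Writes: 1152057"""
counters = parse_getdisk(sample)
```

After this runs, `counters["A0"]["Number of Reads"]` holds the read count from the sample.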
  • Page 56: Getting Information About Other Components

    Number of Writes Number of writes this disk has seen
    Getting Information About Other Components
    For state information on other components (field-replaceable units) in the CHALLENGE RAID storage system besides disk modules, use
    raid5 -d device getcrus
    A sample output of this command follows:...
  • Page 57: Displaying The Challenge Raid Unsolicited Event Log

    If the battery backup unit takes longer than an hour to charge, it shuts itself off and transitions to the “Faulted” state. Displaying the CHALLENGE RAID Unsolicited Event Log The storage-control processor maintains a log of event messages in processor memory.
  • Page 58: Shutting Down The Challenge Raid Storage System

    If necessary, disable it with raid5 setcache disable, as explained in Chapter 7. 2. Turn off the power switch on the back of the CHALLENGE RAID storage system, as shown in Figure 3-3. You do not need to disable the power for the SP(s).
  • Page 59: Restarting The Challenge Raid Storage System

    Turning Off (On) Power (Back of CHALLENGE RAID Chassis) Restarting the CHALLENGE RAID Storage System To start the CHALLENGE RAID storage system, follow these steps: 1. Turn on the storage system’s power; see Figure 3-3. The green power light on the front of the storage system turns on (see Figure 3-4) and the fans rotate.
  • Page 60: Figure 3-5 Unlocking The Fan Module

    2. If none of the drive modules’ busy lights comes on, make sure that the power for each SP is enabled. Move the fan module’s latch to the UNLOCK position, as indicated in Figure 3-5.
    Deskside Rack Unlocking the Fan Module...
  • Page 61: Figure 3-6 Enabling An Sp's Power

    4. Move the SP’s power switch to the enable position, as shown in Figure 3-6.
    Figure 3-6 Enabling an SP’s Power (diagram labels: Deskside, Rack, SP A, SP B)
    5. Close the fan module and move its latch to the LOCK position.
  • Page 63: Configuring Disks

    Chapter 4 Configuring Disks This chapter explains • binding disks into RAID units • getting disk group (LUN) information • changing LUN parameters The chapter concludes with information on dual processors, load balancing, and device names. Binding Disks Into RAID Units The physical disk unit number is also known as the logical unit number, or LUN.
  • Page 64

    raid-type
        Choices are
        • r0: RAID-0
        • r1: RAID-1
        • r1_0: RAID-1_0
        • r5: RAID-5
        • hs: hot spare
    lun-number
        Logical unit number to assign the unit (a hexadecimal number between 0 and F).
    disk-names
        Indicates which physical disks to bind, in the format bd, where b is the physical bus name (a through e;...
  • Page 65 Binding Disks Into RAID Units The optional arguments are as follows: -r rebuild-time Maximum time in hours to rebuild a replacement disk. Default is 4 hours; legal values are any number greater than or equal to 0. A rebuild time of 2 hours rebuilds the disk more quickly but degrades response time slightly.
  • Page 66 Note: Although bind returns immediate status for a RAID device, the bind itself does not complete for 45 to 60 minutes, depending on system traffic. Use getlun to monitor the progress of the bind; getlun returns the percent bound.
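Because a bind takes so long to complete, it can be worth checking the arguments before issuing the command. The sketch below validates only the two values whose legal ranges are listed for bind in this guide: the raid-type code (r0, r1, r1_0, r5, or hs) and the LUN number (one hexadecimal digit, 0 through F). The function name is hypothetical, and the sketch does not build or run a raid5 command line, because the full bind synopsis is not part of this excerpt.

```python
RAID_TYPES = {
    "r0": "RAID-0",
    "r1": "RAID-1",
    "r1_0": "RAID-1_0",
    "r5": "RAID-5",
    "hs": "hot spare",
}


def check_bind_args(raid_type: str, lun_number: str) -> tuple[str, int]:
    """Validate a raid-type code and a LUN number; return (description, LUN as int)."""
    if raid_type not in RAID_TYPES:
        raise ValueError(f"unknown raid-type {raid_type!r}; choose from {sorted(RAID_TYPES)}")
    if len(lun_number) != 1 or lun_number.upper() not in "0123456789ABCDEF":
        raise ValueError(f"lun-number {lun_number!r} must be one hexadecimal digit, 0 through F")
    return RAID_TYPES[raid_type], int(lun_number, 16)
```

For instance, `check_bind_args("r5", "3")` returns the description "RAID-5" together with LUN 3.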
  • Page 67: Getting Disk Group (Lun) Information

    Getting Disk Group (LUN) Information
    To display information on a logical unit and the components in it, use the getlun parameter:
    raid5 getlun lun-number
    The following example displays information about LUN 3.
    raid5 -d sc4d2l0 getlun 3
    Following is truncated output for a RAID-5 group of five disks.
  • Page 68: Table 4-1 Output Of Raid5 Getlun

    Chapter 4: Configuring Disks A0 Prct Idle: 100 A0 Prct Busy: 0 A0 Remapped Sectors: 0 A0 Read Retries: 50 A0 Write Retries: 0 B0 Enabled [etc.] C0 Enabled [etc.] D0 Enabled [etc.] E0 Enabled [etc.] Table 4-1 summarizes entries in the raid5 getlun output. Table 4-1 Output of raid5 getlun Entry...
  • Page 69 Getting Disk Group (LUN) Information Table 4-1 (continued) Output of raid5 getlun Entry Meaning Write Aside Size Smallest write-request size in blocks that can bypass the cache and go directly to the disk; set with chglun Default Owner YES if this SP is the default owner (not necessarily current owner) of this LUN, otherwise, NO Rebuild Time Amount of time in hours in which a rebuild should be performed.
  • Page 70: Changing Lun Parameters

    Chapter 4: Configuring Disks Table 4-1 (continued) Output of raid5 getlun Entry Meaning Diskname Read Retries Number of read retries that have occurred on this disk Diskname Write Retries Number of write retries that have occurred on this disk Changing LUN Parameters To change parameters for a logical unit, use raid5 -d device chglun -l lun [ -c cache-flags] [-d default-owner] [-r rebuild-time] [-i idle-thresh] [-t idle-delay-time] [-w write-aside]...
  • Page 71: Dual Interfaces, Load Balancing, And Device Names

    Dual Interfaces, Load Balancing, and Device Names -i idle-thresh Maximum number of I/Os that can be outstanding to a LUN and have the LUN still be considered idle. Used to determine cache flush start time. Legal values are any number greater than or equal to 0. -t idle-delay-time Amount of time in 100-ms intervals that a unit must be below idle-thresh to be considered idle.
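Since idle-delay-time is expressed in 100-ms intervals, a value in seconds has to be multiplied by ten before it is passed to chglun. The small sketch below shows the conversion in both directions; the function names are hypothetical.

```python
def seconds_to_idle_delay(seconds: float) -> int:
    """Convert seconds to the number of 100-ms intervals used by idle-delay-time."""
    if seconds < 0:
        raise ValueError("idle delay must not be negative")
    return round(seconds * 10)


def idle_delay_to_seconds(intervals: int) -> float:
    """Convert a count of 100-ms intervals back to seconds."""
    if intervals < 0:
        raise ValueError("idle delay must not be negative")
    return intervals / 10
```

A two-second idle delay, for example, corresponds to an idle-delay-time of 20.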
  • Page 73: Maintaining Disk Modules

    Chapter 3 in this guide, you can replace the defective module and rebuild your data without powering off the CHALLENGE RAID storage system or interrupting user applications.
  • Page 74: Figure 5-1 Disk Module Status Lights

    Figure 5-1 Disk Module Status Lights
    2. Determine the failed module’s ID; use Figure 5-2.
    Caution: Use only CHALLENGE RAID disk modules to replace failed disk modules. Order them from the Silicon Graphics hotline: 1-800-800-4SGI (1-800-800-4744). CHALLENGE RAID disk modules contain proprietary firmware that the storage system requires for correct functioning.
  • Page 75: Figure 5-2 Disk Module Locations

    Identifying and Verifying a Failed Disk Module Deskside Chassis assembly in rack A0 B0 C0 D0 E0 A2 B2 C2 D2 A1 B1 C1 D1 E1 A3 B3 C3 D3 E3 5 to 20 disk modules in groups of 5 Figure 5-2 Disk Module Locations 3.
  • Page 76: Setting Up The Workplace For Replacing Or Installing Disk Modules

    Never replace more than one disk module at a time; use only correct disk modules available from Silicon Graphics, Inc. Setting Up the Workplace for Replacing or Installing Disk Modules...
  • Page 77: Replacing A Disk Module

    Caution: Use only CHALLENGE RAID disk modules as replacements; only they contain the correct device firmware. Other disk modules, even those from other Silicon Graphics equipment, will not work. Do not mix disk modules of different capacities within one array.
  • Page 78: Unbinding The Disk

    Chapter 5: Maintaining Disk Modules Unbinding the Disk When you change a physical disk configuration, you change the bound configuration of a physical disk unit. Physical disk unit configuration changes when you add or remove a disk module, or physically move one or more disk modules to different slots in the chassis.
  • Page 79: Removing A Failed Disk Module

    Replacing a Disk Module Removing a Failed Disk Module You can replace a failed disk module while the storage system is powered on. If necessary, you can also replace a disk module that has not failed, such as a module that has reported many “soft” errors. When replacing a module that has not failed, you must do so while the storage system is powered up so that the SP knows the module is being replaced.
  • Page 80: Figure 5-4 Attaching The Esd Clip To The Esd Bracket On A Rack

    Chapter 5: Maintaining Disk Modules Clip and wire ESD bracket of ESD band Figure 5-3 Attaching the ESD Clip to the ESD Bracket on a Deskside Storage System Figure 5-4 shows where to attach the clip on a rack storage system. ESD bracket Clip and wire of ESD band...
  • Page 81: Figure 5-5 Pulling Out A Disk Module

    Warning: When removing a disk module from an upper chassis assembly in a CHALLENGE RAID rack system, make sure that you adequately balance the weight of the disk module. 9. Supporting the disk module with your free hand, pull it all the way out...
  • Page 82: Figure 5-6 Removing A Disk Module

    Figure 5-6 Removing a Disk Module (deskside and rack views)
    Caution: When removed from the chassis, the disk modules are extremely sensitive to shock and vibration. Even a slight jar can severely damage them.
    10. If the label on the side of the disk module does not show the ID number for the compartment from which you removed the drive, write it on the label;...
  • Page 83: Installing A Replacement Disk Module

    Replacing a Disk Module Installing a Replacement Disk Module To install the replacement disk module, follow these steps: 1. Touch the new disk module’s antistatic packaging to discharge it and the drive module. Remove the new disk module from its packaging. Caution: The disk module is extremely sensitive to shock and vibration.
  • Page 84: Figure 5-8 Engaging The Disk Module Guide

    Chapter 5: Maintaining Disk Modules Guide slot Disk module’s guide Guide slot Deskside Rack Disk module’s guide Figure 5-8 Engaging the Disk Module Guide 5. Insert the disk module, as shown in Figure 5-9. Make sure it is completely seated in the slot. Deskside Rack ESD wrist band...
  • Page 85: Updating The Disk Module Firmware

    Updating the Disk Module Firmware After replacing a failed unbound disk module (A0, B0, C0, or A3), update the firmware on the CHALLENGE RAID SP. Follow these steps: 1. Quiesce the bus, disabling all applications. Make sure that only the RAID agent is running.
  • Page 86: Installing An Add-On Disk Module Array

    • creating device nodes and binding the disks Ordering Add-On Disk Module Arrays Call the Silicon Graphics, Inc., hotline to order add-on disk module arrays: 1-800-800-4SGI (1-800-800-4744) Use Table 5-2 as a guide to ordering add-on disk module arrays. Table 5-2...
  • Page 87: Installing Add-On Disk Modules

    Caution: Use only CHALLENGE RAID disk modules as replacements; only they contain the correct device firmware. Other disk modules, even those from other Silicon Graphics equipment, will not work. Do not mix disk modules of different capacities within one array. Do not remove disk modules from bus 0 (slots A0, B0, C0, D0, and E0) for use in other disk module positions.
  • Page 88: Figure 5-10 Marking The Label For Disk Module A0

    Chapter 5: Maintaining Disk Modules 6. Touch the new disk module’s antistatic packaging to discharge it and the drive module. Remove the new disk module from its packaging. 7. On the label on the side of the disk module, write the ID number for the compartment into which the drive is going.
  • Page 89: Figure 5-11 Disk Drive Locations

    Installing an Add-On Disk Module Array Deskside Chassis assembly in rack A0 B0 C0 D0 E0 A2 B2 C2 D2 A1 B1 C1 D1 E1 A3 B3 C3 D3 E3 5 to 20 disk modules in groups of 5 Figure 5-11 Disk Drive Locations 8.
  • Page 90: Figure 5-13 Engaging The Disk Module Guide

    Chapter 5: Maintaining Disk Modules 9. Engage the disk module’s guide in the chassis guide slot, as shown in Figure 5-13. Guide slot Disk module’s guide Guide slot Deskside Rack Disk module’s guide Figure 5-13 Engaging the Disk Module Guide 10.
  • Page 91: Creating Device Nodes And Binding The Disks

    Installing an Add-On Disk Module Array 11. Repeat steps 4 through 10 until all add-on modules are installed. 12. When you are finished installing add-on modules, remove and store the ESD wrist band, if you are using one. Creating Device Nodes and Binding the Disks If you are adding disk arrays to a storage system that already has at least one LUN configured, the SPs must be made aware of the new disks.
  • Page 93: Identifying Failed System Components

    Chapter 5 provides instructions. Call the Silicon Graphics hotline to order a replacement module: 1-800-800-4SGI (1-800-800-4744) Table 6-1 lists Silicon Graphics marketing codes for replacement units for the CHALLENGE RAID storage system. Field-Replaceable Units Table 6-1...
  • Page 94: Power Supply

    Power Supply The CHALLENGE RAID storage system has two or three redundant power supplies, or VSCs (voltage semi-regulated converters): VSC A, VSC B, and, optionally, VSC C. If the storage system has three power supplies, it can recover from power supply component failures and provide uninterrupted service while the defective component is replaced.
  • Page 95: Battery Backup Unit

    If the fault light comes on or if the battery backup unit state is shown as “Faulted” in the raid5 getcrus command output, have the battery backup unit replaced as soon as possible by a Silicon Graphics System Service Engineer. Storage-Control Processor This section discusses •...
  • Page 96: Using The Auto-Reassign Capability

    Using the Auto-Reassign Capability If one of two SPs becomes physically unavailable to the system, that is, if an SP fails or becomes disconnected, the CHALLENGE RAID storage system can automatically reassign the LUNs it controls to the remaining SP.
  • Page 97: Figure 6-1 Unlocking The Fan Module

    Storage-Control Processor 1. On the back of the CHALLENGE RAID storage system, move the fan module’s latch to the UNLOCK position, as shown in Figure 6-1. Deskside Rack Figure 6-1 Unlocking the Fan Module 2. Swing open the fan module, as shown in Figure 6-2.
  • Page 98: Figure 6-3 Disabling An Sp's Power

    Chapter 6: Identifying Failed System Components 3. Disable the failed SP’s power by sliding the switch to the Disable position, as shown in Figure 6-3. Deskside Rack SP A SP B SP B SP A Figure 6-3 Disabling an SP’s Power Caution: Do not power off the SP under any circumstances other than to enable auto-reassign.
  • Page 99: Caching

    A0, B0, C0, D0, and E0 as a fast repository for cached data Caching cannot occur unless all these conditions are met. This chapter explains • setting cache parameters • viewing cache statistics • upgrading CHALLENGE RAID to support caching • changing cache unit parameters...
  • Page 100: Setting Cache Parameters

    Chapter 7: Caching Setting Cache Parameters The cache parameters you specify for the entire storage system are the cache size of 8 or 64 MB, depending on the amount of memory the SPs have, and the cache page size, as 2, 4, 8, or 16 KB. To set up caching, use the raid5 setcache command: raid5 -d device setcache enable | disable [-u usable] [-p page] [-l low] [-h high] In this syntax, variables mean:...
  • Page 101: Viewing Cache Statistics

    Viewing Cache Statistics You can change the cache size, the cache page size values, or the type of caching for any physical disk unit without affecting the information stored on it. Follow these steps: 1. Disable the cache: raid5 -d device setcache 0 2.
  • Page 102: Table 7-1 Output Of Raid5 Getcache

    Chapter 7: Caching Read Hit Ratio: 82 Write Hit Ratio: 74 Prct Dirty Cache Pages = 63 Prct Cache Pages Owned = 50 Prct Read Flushes = 10 Prct Write Flushes = 12 Table 7-1 summarizes entries in the raid5 getcache output. Table 7-1 Output of raid5 getcache Entry...
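The getcache entries shown above mix two separators, a colon and an equal sign. As a rough illustration, the Python sketch below reads such a report into a dictionary. It assumes the command prints one entry per line, which this excerpt does not show, and the function name parse_report is hypothetical.

```python
def parse_report(text):
    """Parse 'Name: value' or 'Name = value' lines into a dict of strings."""
    entries = {}
    for line in text.splitlines():
        # Try the colon first, then the equal sign; skip lines with neither.
        for separator in (":", "="):
            if separator in line:
                name, _, value = line.partition(separator)
                entries[name.strip()] = value.strip()
                break
    return entries


# Sample values copied from the output above, one entry per line (assumed layout).
sample = """Read Hit Ratio: 82
Write Hit Ratio: 74
Prct Dirty Cache Pages = 63
Prct Cache Pages Owned = 50
Prct Read Flushes = 10
Prct Write Flushes = 12"""

stats = parse_report(sample)
```

Values are kept as strings so that the caller decides how to interpret each entry.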
  • Page 103 Viewing Cache Statistics Table 7-1 (continued) Output of raid5 getcache Entry Meaning Low Watermark Percentage of dirty cache pages that, when reached during flush operations, will cause the SP to cease flushing the cache. Valid values are 0 through 100; the default is 50, regardless of whether caching is enabled or disabled.
  • Page 104: Upgrading Challenge Raid To Support Caching

    Chapter 7: Caching Upgrading CHALLENGE RAID to Support Caching Please note these points before you enable caching: • Because disk modules are required in slots A0, B0, C0, D0, and E0 for caching, you might need to add disk modules to your storage system.
  • Page 105: Changing Unit Caching Parameters

    Changing Unit Caching Parameters Changing Unit Caching Parameters This section explains how to change cache parameters. Note: If caching is enabled, you must disable it before you can change any parameter. Disabling caching can affect storage system performance; do it only when system activity is relatively low.
  • Page 107: Technical Specifications

    Appendix A Technical Specifications This appendix lists the technical specifications for the CHALLENGE RAID deskside disk-array storage systems. Table A-1 summarizes specifications for the deskside storage system. Table A-1 CHALLENGE RAID Deskside Chassis Specifications Classification Specification Value AC power Voltage...
  • Page 108: Table A-2 Challenge Raid Rack Specifications

    Appendix A: Technical Specifications Table A-1 (continued) CHALLENGE RAID Deskside Chassis Specifications Classification Specification Value Physical Dimensions Height: 62.9 cm (24.75 in) Width: 35.6 cm (14.0 in) Depth: 76.2 cm (30.0 in) Minimum chassis weight Chassis with 5 disk modules, 1 SP, 2 VSCs, without packaging: 51.6 kg...
  • Page 109 Table A-2 (continued) CHALLENGE RAID Rack Specifications Classification Specification Value Operating limits Ambient temperature 10 degrees C to 38 degrees C (50 degrees F to 100 degrees F) Relative humidity 20 to 80% noncondensing Elevation 2439 m (8000 ft) Heat dissipation...
  • Page 111: The Raid5 Command Line Interface

    Reset statistics logging on the RAID storage-control processor (SP), setting all log counters to 0 firmware Update the firmware on the CHALLENGE RAID SP getagent Get names and descriptions of devices controlled by the SP getcache Get information about the storage system caching environment...
  • Page 112 Deconfigure physical disks from their current logical configuration, destroying all data on the logical unit (group) The raid5 command sends CHALLENGE RAID storage management and configuration requests to an application programming interface (API) on the CHALLENGE server. For the raid5 command to function, the agent—an interpreter between the command line interface and the CHALLENGE RAID storage system—must be running.
  • Page 113: Bind

    bind The environment variable RaidAgentDevice is the default value for the device when none is specified with the -d flag. If RaidAgentDevice is not set and no -d switch is present, an error is generated on all commands that need device information.
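The device lookup order described above (the -d flag first, then RaidAgentDevice, otherwise an error) can be mirrored in a wrapper script. This is a minimal Python sketch; the function name resolve_device and the error text are hypothetical, and the raid5 command itself performs its own lookup.

```python
import os


def resolve_device(cli_device=None, environ=None):
    """Pick the device the way the guide describes: -d wins, then RaidAgentDevice."""
    env = os.environ if environ is None else environ
    if cli_device:
        return cli_device
    device = env.get("RaidAgentDevice")
    if not device:
        # Mirrors the documented behavior: no -d and no variable is an error.
        raise ValueError("no device given: use -d or set RaidAgentDevice")
    return device
```

Passing the environment as a parameter keeps the function easy to exercise without changing the real process environment.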
  • Page 114 Appendix B: The raid5 Command Line Interface • r1_0: RAID-1_0 • r5: RAID-5 • hs: hot spare lun-number Logical unit number to assign the unit (a hexadecimal number between 0 and F). disk-names Indicates which physical disks to bind, in the format bd, where b is the physical bus name (a through e) and d is the device number on the bus (0 through 3).
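The disk-name and LUN rules above are simple enough to check in a script before calling bind. The Python sketch below assumes only what this page states: a disk name is a bus letter a through e followed by a device number 0 through 3, and a LUN is a hexadecimal number between 0 and F. The function names are hypothetical, and accepting uppercase letters is a convenience of this sketch.

```python
import re

# One bus letter (a-e) followed by one device digit (0-3).
DISK_NAME = re.compile(r"[a-e][0-3]")


def parse_disk_name(name):
    """Split a disk name such as 'a2' into (bus, device number)."""
    lowered = name.lower()
    if not DISK_NAME.fullmatch(lowered):
        raise ValueError(f"invalid disk name: {name!r}")
    return lowered[0], int(lowered[1])


def parse_lun(text):
    """Convert a hexadecimal LUN string to an integer between 0 and 15."""
    value = int(text, 16)
    if not 0 <= value <= 0xF:
        raise ValueError(f"LUN out of range: {text!r}")
    return value
```

For example, parse_disk_name("A2") gives the bus "a" and device 2, and parse_lun("f") gives 15.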
  • Page 115 bind A rebuild time of 2 hours rebuilds the disk more quickly but degrades response time slightly. A rebuild time of 0 hours rebuilds as quickly as possible but degrades performance significantly. If your site requires fast response time and you want to minimize degradation to normal I/O activity, you can extend the rebuilding process over a longer period of time, such as 24 hours.
  • Page 116: Chglun

    Appendix B: The raid5 Command Line Interface The following example binds A2 and B2 into a RAID-1 logical unit with a LUN number of 2 and a four-hour maximum rebuild time, with read cache enabled. raid5 -d sc4d2l0 bind r1 2 a2 b2 -r 4 -c read The following example binds disks A1, B1, C1, and D1 into a RAID-1_0 logical unit with a LUN number of 1, a four-hour maximum rebuild time, and a 128-block stripe size per physical disk, with read cache enabled.
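The first example above can be assembled programmatically. This Python sketch builds the argument list for that RAID-1 bind and nothing more; the helper name bind_args is hypothetical, only the -r and -c options shown in the example are used, and writing LUN values above 9 as lowercase hexadecimal digits is an assumption, because the guide does not show such a value on a command line.

```python
def bind_args(device, raid_type, lun, disks, rebuild_hours, cache):
    """Build an argument list shaped like the guide's first bind example.

    Hypothetical helper: it formats arguments only and does not run raid5.
    """
    if not 0 <= lun <= 0xF:
        raise ValueError("LUN must be between 0 and F (hexadecimal)")
    return [
        "raid5", "-d", device, "bind",
        raid_type,
        format(lun, "x"),          # assumption: lowercase hex for LUN values
        *disks,
        "-r", str(rebuild_hours),  # maximum rebuild time in hours
        "-c", cache,               # cache setting, for example "read"
    ]


args = bind_args("sc4d2l0", "r1", 2, ["a2", "b2"], rebuild_hours=4, cache="read")
```

Joining args with spaces reproduces the first example command; a list is used so that it can be handed to a process launcher without shell quoting.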
  • Page 117 chglun In this syntax, variables mean the following: -l lun Logical unit number to be changed -c cache-flags Values are: • none: no caching • read: read caching • write: write caching • rw: read and write caching The default is none. -d default-owner Values are •...
  • Page 118: Clearlog

    You must be root to use this parameter. Note: This command has no output. firmware To update the firmware on the CHALLENGE RAID SP, type as root raid5 -d device firmware /usr/raid5/flarecode.bin Note: The bus must be quiesced, with all applications disabled and only the RAID agent running.
  • Page 119: Getagent

    getagent This command has no output. Note: Once the microcode has been downloaded, each SP in the cabinet reboots. The reboot process takes several minutes, during which time no command line interface commands are accepted except getagent. When the reboot is complete, getagent returns the new firmware version and all command line interface commands are accepted.
  • Page 120: Getcache

    Appendix B: The raid5 Command Line Interface Table B-2 (continued) Output of raid5 getagent Entry Meaning Node The /dev/scsi entry that the agent uses as a path to the actual SCSI device; this value must be entered by the user for every CLI command (except getagent) Signature Unique 32-bit identifier for the SP being accessed through Node...
  • Page 121: Table B-3 Output Of Raid5 Getcache

    getcache Write Hit Ratio: 74 Prct Dirty Cache Pages = 63 Prct Cache Pages Owned = 50 Prct Read Flushes = 10 Prct Write Flushes = 12 Table B-3 summarizes entries in the raid5 getcache output. Table B-3 Output of raid5 getcache Entry Meaning Usable Cache...
  • Page 122 Appendix B: The raid5 Command Line Interface Table B-3 (continued) Output of raid5 getcache Entry Meaning Low Watermark Percentage of dirty cache pages that, when reached during flush operations, will cause the SP to cease flushing the cache. Valid values are 0 through 100; the default is 50, regardless of whether caching is enabled or disabled.
  • Page 123: Getcontrol

    getcontrol getcontrol To get general system information, use raid5 -d device getcontrol A sample output of this command follows: System Fault LED: OFF Statistics Logging: ON System Cache: ON Max Requests: 23 Average Requests: 5 Hard errors: 0 Total Reads: 18345 Total Writes: 1304 Prct Busy: 25 Prct Idle: 75...
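A script that polls getcontrol might flag the two entries that most directly indicate trouble. The Python sketch below is illustrative only: it assumes the output is printed one entry per line, the function names are hypothetical, and the choice of which entries to check (the fault LED and the hard error count) is this sketch's, not the guide's.

```python
def parse_entries(text):
    """Read 'Name: value' lines into a dict, ignoring lines without a colon."""
    pairs = (line.partition(":") for line in text.splitlines())
    return {name.strip(): value.strip() for name, sep, value in pairs if sep}


def problems(entries):
    """Return a list of human-readable findings; an empty list means none."""
    found = []
    if entries.get("System Fault LED", "OFF") != "OFF":
        found.append("fault LED is not OFF")
    if int(entries.get("Hard errors", "0")) > 0:
        found.append("hard errors reported")
    return found


# Sample values copied from the output above, one entry per line (assumed layout).
sample = """System Fault LED: OFF
Statistics Logging: ON
System Cache: ON
Max Requests: 23
Average Requests: 5
Hard errors: 0
Total Reads: 18345
Total Writes: 1304
Prct Busy: 25
Prct Idle: 75"""

findings = problems(parse_entries(sample))
```

With the sample values, no findings are produced because the fault LED is OFF and the hard error count is 0.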
  • Page 124: Getdisk

    Appendix B: The raid5 Command Line Interface VSCC State: Present SPA State: Present SPB State: Present BBU State: Present Table B-4 interprets items in this output. Values for all entries of the output are Present or Not Present, except as noted. Table B-4 Output of raid5 getcrus Output...
  • Page 125: Figure B-1 Disk Module Locations

    getdisk Figure B-1 Disk Module Locations (deskside chassis and chassis assembly in rack; 5 to 20 disk modules in groups of 5, in slots A0 through E3) For example, the following command gets information about disk A2: raid5 -d sc4d2l0 getdisk a2...
  • Page 126: Table B-5 Output Of Raid5 Getdisk

    Appendix B: The raid5 Command Line Interface A sample output of this command follows. A0 Vendor Id: SEAGATE A0 Product Id: ST15150N A0 Lun: 0 A0 State: Bound and Not Assigned A0 Hot Spare: NO A0 Prct Rebuilt: 100 A0 Prct Bound: 100 A0 Serial Number: 00019699 A0 Capacity: 0x000f42a8 A0 Private: 0x00009000...
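Each getdisk entry starts with the disk name, and some values are hexadecimal. As a rough sketch, the Python below groups entries by disk and shows how a hexadecimal value can be read as an integer. It assumes one entry per line, the function name parse_getdisk is hypothetical, and no unit is implied for the Capacity value because this excerpt does not state one.

```python
def parse_getdisk(text):
    """Group 'A0 Name: value' lines into {disk: {name: value}}."""
    disks = {}
    for line in text.splitlines():
        label, separator, value = line.partition(":")
        if not separator:
            continue  # not an entry line
        # The first word of the label is the disk name; the rest is the entry name.
        disk, _, name = label.strip().partition(" ")
        disks.setdefault(disk.upper(), {})[name.strip()] = value.strip()
    return disks


# A few entries copied from the sample output, one per line (assumed layout).
sample = """A0 Vendor Id: SEAGATE
A0 Lun: 0
A0 State: Bound and Not Assigned
A0 Capacity: 0x000f42a8"""

info = parse_getdisk(sample)
# int() with base 16 accepts the 0x prefix used in the output.
capacity = int(info["A0"]["Capacity"], 16)
```

Here capacity is 1000104 in decimal, the same number the output shows as 0x000f42a8.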
  • Page 127 getdisk Table B-5 (continued) Output of raid5 getdisk Output Meaning State Removed: disk is physically not present in the chassis or has been powered off Off: disk is physically present in the chassis but is not spinning Powering Up: disk is spinning and diagnostics are being run on it Unbound: disk is healthy but is not part of a LUN Bound and Not Assigned: disk is healthy, part of a LUN, but not being used by this SP...
  • Page 128: Getlog

    Appendix B: The raid5 Command Line Interface getlog The SP maintains a log of event messages in its memory. These events include hard errors, startups, and shutdowns involving disk modules, fans, SPs, power supplies, and the battery backup unit. Periodically, the SP writes this log to disk to maintain it when SP power is off.
  • Page 129 getlog Table B-6 (continued) getlog Error Codes Error Code Explanation Too few parameters Too many parameters Invalid bind type Invalid LUN number Invalid rebuild time Invalid number of disks in bind command Valid disk names are of format a0, b1, ... e3, etc. Invalid stripe size Invalid disk name Invalid cache flags...
  • Page 130: Getlun

    Appendix B: The raid5 Command Line Interface Table B-6 (continued) getlog Error Codes Error Code Explanation LUN does not exist LUN already exists Cannot get current working directory for firmware command Agent encountered an error during SCSI execution Agent encountered an error during serial port execution Agent returned an operating system error Agent returned an internal agent error code Cannot communicate with agent...
  • Page 131 getlun Type: RAID5 Stripe size: 128 Capacity: 0x10000 Current owner: YES Auto-trespass: Disabled Auto-assign: Enabled Write cache: Disabled Read cache: Disabled Idle Threshold: 0 Idle Delay Time: 20 Write Aside Size: 2048 Default Owner: YES Rebuild Time: 0 Read Hit Ratio: 0 Write Hit Ratio: 0 Prct Reads Forced Flushed: 0 Prct Writes Forced Flushed: 0...
  • Page 132: Table B-7 Output Of Raid5 Getlun

    Appendix B: The raid5 Command Line Interface D0 Enabled D0 Reads: 68558 [etc.] E0 Enabled E0 Reads: 69721 [etc.] Table B-7 summarizes entries in the raid5 getlun output. Table B-7 Output of raid5 getlun Entry Meaning Type RAID0, RAID1, RAID10, RAID5, or SPARE Stripe Size Sectors per disk per stripe with which the unit was bound Capacity...
  • Page 133 getlun Table B-7 (continued) Output of raid5 getlun Entry Meaning Read Hit Ratio Percentage of read requests to the controller that can be satisfied from the cache without requiring disk access Write Hit Ratio Percentage of write requests to the cache that can be satisfied from the cache without requiring disk access Prct Reads Forced Flushed Percentage of read requests that flushed the cache...
  • Page 134: Setcache

    Appendix B: The raid5 Command Line Interface setcache The cache parameters you specify for the entire storage system are the cache size of 8 or 64 MB, depending on the amount of memory the SP has, and the cache page size, as 2, 4, 8, or 16 KB. To set up caching, use the raid5 setcache command: raid5 -d device setcache enable | disable [-u usable] [-p page] [-l low] [-h high] In this syntax, variables mean:...
  • Page 135: Unbind

    unbind You can change the cache size, the cache page size values, or the type of caching for any physical disk unit without affecting the information stored on it. Follow these steps: 1. Disable the cache: raid5 -d device setcache 0 2.
  • Page 136 Appendix B: The raid5 Command Line Interface In this syntax, variables and options mean: lun-number Number of the logical unit (LUN) to deconfigure. When raid5 unbind is entered, a prompt appears asking the user for verification before the unbind is issued. This flag disables this prompt.
  • Page 137: Index

    Index CHALLENGE server system xi, 4 chassis agent, see getagent assemblies, on one SCSI bus auto-reassign 82-84 front view chglun 56-57, 102-104 clearlog 44, 104 clearstats CLI, see command line interface basic configuration command line interface 21-22, 97-122 battery backup unit component 5-10 replacing...
  • Page 138 Index device 37, 49, 98 name node, creating electrostatic discharge damage (ESD), avoiding disk module environment variable A0, A3, B0, C0, D0, E0 error codes 114-116 moving 14, 90 event log A0, B0, C0, D0, E0 clearing failed displaying 43-44 failed, and caching removing required for caching...
  • Page 139 53-56 rebuild time 71, 100 number, assigning bind chglun 56, 103 restarting system 45-47 model number for disk module multiple CHALLENGE RAID storage systems SCSI-2 and disk modules length limit operation 35-47 interface service light setcache 86-87, 91, 120-121 shutting down system...
  • Page 140 Index SP, see storage-control processor specifications 93-95 split-bus configuration 28-30, 57 status checking light storage-control processor failed and caching powering off replacing 81-84 status striping default size RAID-1_0 16-17 RAID-5 18-19 size 51, 101 system information unbind 64, 121 VSC, see power supply...
  • Page 142 Tell Us About This Manual As a user of Silicon Graphics documentation, your comments are important to us. They help us to better understand your needs and to improve the quality of our documentation. Any information that you provide will be useful. Here is a list of suggested topics to comment on: •...
