Digital Equipment StorageWorks HSZ50 Service Manual

Digital storageworks array controller
DIGITAL StorageWorks
HSZ50 Array Controller
HSOF Version 5.1
Service Manual
Part Number: EK-HSZ50-SV.C01
March 1997
Software Version:
HSOF Version 5.1
Digital Equipment Corporation
Maynard, Massachusetts

Summary of Contents for Digital Equipment StorageWorks HSZ50

  • Page 1 DIGITAL StorageWorks HSZ50 Array Controller HSOF Version 5.1 Service Manual Part Number: EK-HSZ50-SV.C01 March 1997 Software Version: HSOF Version 5.1 Digital Equipment Corporation Maynard, Massachusetts...
  • Page 2 March, 1997 While Digital Equipment Corporation believes the information included in this manual is correct as of the date of publication, it is subject to change without notice. DIGITAL makes no representations that the interconnection of its products in the manner...
  • Page 3 Warning! This is a Class A product. In a domestic environment this product may cause radio interference in which case the user may be required to take adequate measures. Achtung! Dieses ist ein Gerät der Funkstörgrenzwertklasse A. In Wohnbereichen können bei Betrieb dieses Gerätes Rundfunkstörungen auftreten, in welchen Fällen der Benutzer für entsprechende Gegenmaßnahmen verantwortlich ist.
  • Page 5: Table Of Contents

    1 Troubleshooting Introduction ......................1–2 Interpreting controller LED codes................1–2 Troubleshooting HSZ50 controllers ................ 1–7 Troubleshooting when you cannot access host units ......... 1–7 Troubleshooting on a DIGITAL UNIX system ..........1–8 Using the DIGITAL UNIX file utility ............. 1–9 OpenVMS host troubleshooting..............
  • Page 6 DILX data patterns..................1–56 Monitoring system performance with the VTDPY utility ........1–57 How to Run VTDPY..................1–57 Using the VTDPY Control Keys ..............1–58 Using the VTDPY Command Line..............1–58 How to Interpret the VTDPY Display Fields ..........1–60 SCSI Host port Characteristics..............
  • Page 7 Preparing to replace the second ECB ............ 2–33 Replacing the second ECB..............2–33 Reinstalling the modules ............... 2–34 Restarting the subsystem............... 2–36 Replacing ECBs using the off-line method............. 2–37 Replacing power supplies..................2–39 Cold-swap...................... 2–39 Removing the power supply..............2–39 Installing the new power supply ............
  • Page 8 Running the CLCP utility ..............3–16 The dual-redundant, sequential upgrade method ..........3–19 Special considerations for the sequential code load upgrade method .................. 3–19 Sequential upgrade procedure ................ 3–21 The dual-redundant concurrent code load upgrade method ......................3–21 Considerations for the concurrent code load upgrade method ......................
  • Page 9 4 Moving storagesets and devices Precautions for retaining data.................. 4–2 Moving storagesets ....................4–3 Moving storageset members..................4–6 Moving a single disk-drive unit................4–8 Moving a tape drive, CD-ROM drive, or tape loader..........4–9 5 Removing Removing a patch ....................5–2 Removing a controller and cache module..............
  • Page 10 Figure 2–9 Removing the power supply............. 2–38 Figure 2–10 Power supply fault indicators ............2–39 Figure 2–11 Removing a disk drive ..............2–41 Figure 2–12 Default indicators for 3.5- and 5.25-inch SBBs ....... 2–42 Figure 2–13 OCP LED patterns ................2–43 Figure 2–14 Removing the CD-ROM drive............
  • Page 11 Table 2–2 ECB status indicators ................. 2–16 Table 2–3 ECB status indicators ................. 2–26 Table 2–4 ECB status indicators ................. 2–36 Table 3–1 Abort codes ..................3–39 Table 3–2 SCSI ID Slots ..................3–43 Table 3–3 ECB status indicators ................. 3–46 Table 3–4 Adding cache memory capacity............
  • Page 12 Table A–24 Format and device code load utility (HSUTIL) last failure codes....................A-89 Table A–25 Code load/code patch utility (CLCP) last failure codes ....................... A-90 Table A–26 Induce controller crash utility (CRASH) last failure codes ....................... A-90 Table A–27 Repair action codes ................. A-91...
  • Page 13: Related Documents

    Related documents
    The following table lists documents that contain information related to this product.
    Document title: Part number
    DECevent Installation Guide: AA–Q73JA–TE
    StorageWorks BA350–MA Controller Shelf User's Guide: EK–350MA–UG
    StorageWorks Configuration Manager for DEC OSF/1 Installation Guide: AA–QC38A–TE
    StorageWorks Configuration Manager for DEC OSF/1 System Manager's Guide for HSZterm: AA–QC39A–TE
    StorageWorks Solutions Configuration Guide...
  • Page 15: Troubleshooting

    Troubleshooting Interpreting controller LED codes Troubleshooting controllers Using FMU to describe event log codes Testing disk drives Monitoring subsystem performance HSZ50 Array Controller Service Manual...
  • Page 16 Introduction This chapter is designed to help you quickly isolate the source of any problems you might encounter when you service the StorageWorks HSZ50 controllers, and take the necessary steps to correct the problems. Interpreting controller LED codes This section provides information on how to interpret controller LED codes.
  • Page 17: Table 1-1 Solid Controller Led Codes

    ...initialization completed (corrective action: Reset the controller). O P P P P P P: No program card seen (corrective action: Try the card in another module. If the problem follows the card, replace the card. Otherwise, replace the controller).
  • Page 18: Table 1-2 Flashing Controller Led Codes

    O P M P P M P: The controller DRAB or DRAC chip does not arbitrate correctly (corrective action: Replace controller module).
  • Page 19 ...DRAC chip did not interrupt the controller processor when expected (corrective action: ...module). O P M M M P P: The controller DRAB or DRAC chip did not report an NXM error when nonexistent memory was accessed (corrective action: Replace controller module).
  • Page 20 ...that the cache module does not exist, but access to that cache module did not cause an error (corrective action: Replace controller shelf backplane). O M M P P P P: The journal SRAM battery is... (corrective action: Replace controller module).
  • Page 21: Troubleshooting Hsz50 Controllers

    O M M M M M M: An illegal process was activated during initialization (corrective action: Replace controller module). Troubleshooting HSZ50 controllers This section covers the following topics: • Troubleshooting when you cannot access HSZ units. • Troubleshooting on DIGITAL UNIX •...
  • Page 22: Troubleshooting On A Digital Unix System

    HSZ console. (If this is a dual controller configuration, the command must be executed on both controllers.) 1. To determine if the unit is on-line to a controller: HSZ50> SHOW UNITS FULL 2. Check the following: –...
  • Page 23: Using The Digital Unix File Utility

    The host system should display the following output after the file command is issued (the output displays on one line): /dev/rrzb17a character special (8/mmmm) SCSI # n HSZ50 disk #xxx (SCSI ID #t) The output values have the following meanings: –...
  • Page 24: Openvms Host Troubleshooting

    • t - target ID as used in the HSZ50 unit DTZL; the “T” in the DTZL HSZ50 unit matches the “t” from the file command. • xxx - the disk number 4. If an error occurs, use the information in the following table to...
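The one-line `file` output described above can be picked apart with a regular expression. The following is a minimal sketch, not part of the manual: the field names (`minor`, `bus`, `disk`, `target`) are labels chosen here, reading `n` as the SCSI bus number is an assumption, and the sample line uses made-up numbers.

```python
import re

# Pattern for the one-line output format quoted in the text:
#   /dev/rrzb17a character special (8/mmmm) SCSI # n HSZ50 disk #xxx (SCSI ID #t)
# Group names are labels chosen for this sketch, not names from the manual.
FILE_LINE = re.compile(
    r"^(?P<device>\S+)\s+character special\s+\((?P<major>\d+)/(?P<minor>\d+)\)\s+"
    r"SCSI\s*#\s*(?P<bus>\d+)\s+HSZ50\s+disk\s+#(?P<disk>\d+)\s+"
    r"\(SCSI ID\s*#(?P<target>\d+)\)\s*$"
)


def parse_file_output(line):
    """Return the fields of a `file` output line as a dict, or None when the
    line does not match the expected HSZ50 format."""
    match = FILE_LINE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    for key in ("major", "minor", "bus", "disk", "target"):
        fields[key] = int(fields[key])
    return fields


# Hypothetical sample line; the numeric values are invented for illustration.
sample = "/dev/rrzb17a character special (8/33857) SCSI # 2 HSZ50 disk #137 (SCSI ID #1)"
info = parse_file_output(sample)
print(info)
```

A return value of `None` means the line did not have the expected shape, which is the case the troubleshooting table is meant to cover.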
  • Page 25: Troubleshooting Application Errors

    This section contains an example of a DECevent error log for a device event or error. You should be able to locate the following important details in the DECevent error log when a device event...
  • Page 26 #dia -i ios -t s:03-oct-1995, 10:47 e:03-oct-1995, 10:48 DECevent Log Example - Locating a Device Error *************************ENTRY 4************************** Logging OS 2. DIGITAL UNIX System Architecture 2. Alpha Event sequence number Timestamp of occurrence 03-OCT-1995 10:47:59 Host name testsys
  • Page 27 ------- CAM Data ------- Class Disk Subsystem Disk Number of Packets ------ Packet Type ------ 258. Module Name String Routine Name cdisk_bbr_done ------ Packet Type ------ 256. Generic String cdisk_bbr: BBR disabled bad block number: 230262
  • Page 28 1. SCSI I/O Request CCB(CCB_SCSIIO) Packet Revision CCB Address xFFFFFC0007F9BB28 CCB Length x00C0 XPT Function Code Execute requested SCSI I/O Cam Status CCB Request Completed WITH Error Autosense Data Valid for Target Path ID Target ID Target LUN
  • Page 29 (CDB) Command & Data Buf 15--<-12 11--<-08 07--<-04 03--<-00 :Byte Order 0000: 00000000 00000010 7083030A *...p..* Timeout Value x0000003C *msg_ptr x0000000000000000 Message Length Vendor Unique Flags x4000 Tag Queue Actions Tag for Simple Queue
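The CDB row in this log can be decoded by hand, which helps tie the command to the bad block reported earlier in the same entry. The sketch below makes two assumptions not stated in the manual: the three words shown are the lowest-addressed ones (the row header lists four columns), and the command uses the standard SCSI 6-byte CDB layout. The function names are hypothetical.

```python
def dump_words_to_bytes(words):
    """Convert the 32-bit hex words of a DECevent dump row to bytes in address order.

    The row header reads 15..12 11..08 07..04 03..00, so the rightmost word
    holds bytes 3..0 and each word is printed most significant byte first.
    """
    out = bytearray()
    for word in reversed(words):
        out += int(word, 16).to_bytes(4, "little")
    return bytes(out)


def decode_cdb6(cdb):
    """Decode a 6-byte CDB using the standard READ(6)/WRITE(6) field layout."""
    opcode = cdb[0]
    # Bits 4..0 of byte 1 are the top bits of the 21-bit logical block address.
    lba = ((cdb[1] & 0x1F) << 16) | (cdb[2] << 8) | cdb[3]
    # A transfer length of 0 means 256 blocks in a 6-byte CDB.
    blocks = cdb[4] if cdb[4] != 0 else 256
    return {"opcode": opcode, "lba": lba, "blocks": blocks, "control": cdb[5]}


# Words exactly as printed in the log row above.
row = ["00000000", "00000010", "7083030A"]
decoded = decode_cdb6(dump_words_to_bytes(row)[:6])
print(hex(decoded["opcode"]), decoded["lba"], decoded["blocks"])
# prints: 0xa 230256 16
```

Under these assumptions the command is opcode 0x0A (WRITE(6) in standard SCSI) for 16 blocks starting at block 230256, a range that contains the bad block number 230262 logged by cdisk_bbr.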
  • Page 30 Disk Transfer Error. Template Flags HCE = 1, Event occurred during Host Command Execution. Ctrl Serial # ZG41800293 Ctrl Software Revision V20Z RAIDSET State NORMAL. All members present and reconstructed, IF LUN is configured as a RAIDSET.
  • Page 31 UWEUO = 0, not defined. MSBD = 0, not defined. FBW = 0, not defined. IDSD = 0, Valid Device Sense Data fields. DSSD = 1, Device Sense Data fields supplied by Physical Device.
  • Page 32 No Additional Sense Information FRU Code Sense Key Specific Byte 0 Sense Key Data NOT Valid Byte 1 Byte 2 -- Device Sense Data -- Error Code Current Error Information Bytes are Valid Segment #
  • Page 33: Controller Generated Event

    No device ASC or ASCQ information displays for this type of error. The following important information is highlighted in the example: • Unit Information, Port-Target-LUN • CAM Status • SCSI Status • Command Information • Actual Error
  • Page 34 Event severity 3. High Priority Entry type 199. CAM SCSI Event Type ------- Unit Info ------- Bus Number Unit Number x0080 Target = LUN = ------- CAM Data ------- Class Disk Subsystem Disk Number of Packets
  • Page 35 256. Generic String Active CCB at time of error ------ Packet Type ------ 256. Generic String CCB request completed with an error ------ Packet Type ------ 1. SCSI I/O Request CCB(CCB_SCSIIO) Packet Revision CCB Address xFFFFFC00071D2328
  • Page 36 Auotsense Byte Length 160. CDB Length Scatter/Gather Entry Cnt SCSI Status Check Condition Autosense Residue Length Transfer Residue Length x00000000 (CDB) Command & Data Buf 15--<-12 11--<-08 07--<-04 03--<-00 :Byte Order 0000: 00000000 00000001 00000008 *...*
  • Page 37 Re-writing the disk block will clear the forced error condition. The Device Sense Data Information Bytes contain the block number of the first block in error.
  • Page 38 Next Most Recent ASCQ Device Locator x000403 Port Target Command Opcode Read (6 byte) Original CDB 15--<-12 11--<-08 07--<-04 03--<-00 :Byte Order 0000: 00070000 00000001 00000008 * ..B* SCSI Host ID Drive Software Revision 427H
  • Page 39 Current Error Information Bytes are Valid Segment # Information Byte 3 Byte 2 Byte 1 Byte 0 Sense Key Medium Error Additional Sense Length CMD Specific Info Byte 3 Byte 2 Byte 1 Byte 0
  • Page 40: Locating A Host Bus Error

    No Sense Data available DECevent Log Example - Command Timeout ************************* ENTRY 390 ************************* Logging OS 2. DIGITAL UNIX System Architecture 2. Alpha Event sequence number 118. Timestamp of occurrence 29-MAY-1996 20:02:09 Host name tgonzo
  • Page 41 Class Disk Subsystem Disk Number of Packets ------ Packet Type ------ 258. Module Name String Routine Name cdisk_complete ------ Packet Type ------ 256. Generic String Retries Exhausted ------ Packet Type ------ 260. Hardware Error String
  • Page 42 Execute requested SCSI I/O Cam Status Command Timeout Path ID Target ID Target LUN Cam Flags x00000482 SIM Queue Actions are Enabled Data Direction (10: DATA OUT) Disable the SIM Queue Frozen State *pdrv_ptr xFFFFFC002B420C28
  • Page 43 (CDB) Command & Data Buf 15--<-12 11--<-08 07--<-04 03--<-00 :Byte Order 0000: 00000000 0000F05A F200002A *...Z..* Timeout Value x0000003C *msg_ptr x0000000000000000 Message Length Vendor Unique Flags x4000 Tag Queue Actions Tag for Simple Queue
  • Page 44: Select Timeout (Scsi Protocol Timeout)

    Event severity 3. High Priority Entry type 199. CAM SCSI Event Type ------- Unit Info ------- Bus Number Unit Number x0088 Target = LUN = ------- CAM Data ------- Class Disk Subsystem Disk Number of Packets
  • Page 45 ------ Packet Type ------ 256. Generic String Active CCB at time of error ------ Packet Type ------ 256. Generic String Target selection timeout ------ Packet Type ------ 1. SCSI I/O Request CCB(CCB_SCSIIO) Packet Revision CCB Address xFFFFFC0005997F28
  • Page 46 CDB Length Scatter/Gather Entry Cnt SCSI Status Good Condition Autosense Residue Length Transfer Residue Length x00000000 (CDB) Command & Data Buf 15--<-12 11--<-08 07--<-04 03--<-00 :Byte Order 0000: 00000000 00000010 00D4010A ..* Timeout Value x0000003C
  • Page 47: Identifying Unit Attention Errors

    DECevent Log Example - Unit Attention Error (OpenVMS) ************************* ENTRY 1 ************************* Logging OS 1. OpenVMS System Architecture 2. Alpha OS version V6.2-1H2 Event sequence number 639. Timestamp of occurrence 03-APR-1996 16:50:17 Time since reboot 0 Day(s) 0:53:17 Host name TGONZO HSZ50 Array Controller Service Manual...
  • Page 48 VMS SCSI Error Type 5. Extended Sense Data from Device !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! This is unit 201 at port 5 target 1 SCSI ID SCSI LUN SCSI SUBLUN Port Status x00000001 Success Command Opcode Write (6 byte) Command Data Service Manual HSZ50 Array Controller...
  • Page 49 Ctrl Software Revision V51Z RAIDSET State NORMAL. All members present and reconstructed, IF LUN is configured as a RAIDSET. Error Code Current Error Sense Key Unit Attention ASC & ASCQ x3F85 x003F ASCQ = x0085 HSZ50 Array Controller Service Manual...
  • Page 50 512. Byte Page Offset IRP, x_BCNT 8192. Transfer Size In Byte(s) UCB, x_ERRCNT 4. Errors This Unit UCB, L_OPCNT 337624. QIO's This Unit ORB, L_OWNER x00010004 Owners UIC UCB, L_DEVCHAR1 x1C4D4008 Directory Structured File Oriented Service Manual HSZ50 Array Controller...
  • Page 51: Digital Unix Unit Attention

    Logging OS 2. DIGITAL UNIX System Architecture 2. Alpha Event sequence number Timestamp of occurrence 24-JAN-1996 17:19:01 Host name tgonzo System type register x00000004 DEC 3000 Number of CPUs (mpnum) x00000001 CPU logging event (mperr) x00000000
  • Page 52 ------ Packet Type ------ 256. Generic String Event - Unit Attention ------ Packet Type ------ 261. Soft Error String Error Type Soft Error Detected (recovered) ------ Packet Type ------ 257. Device Name String Device Name HSZ5
  • Page 53 Autosense Data Valid for Target Path ID Target ID Target LUN Cam Flags x00000442 SIM Queue Actions are Enabled Data Direction (01: DATA Disable the SIM Queue Frozen State *pdrv_ptr xFFFFFC0004F83828 *next_ccb x0000000000000000 *req_map xFFFFFC0007F8C200
  • Page 54 Tag Queue Actions Tag for Simple Queue ------ Packet Type ------ 256. Generic String Error, exception, or abnormal condition ------ Packet Type ------ 256. Generic String UNIT ATTENTION - Medium changed or target reset
  • Page 55 NORMAL. All members present and reconstructed, IF LUN is configured as a RAIDSET. Error Code Current Error Sense Key Unit Attention ASC & ASCQ xD203 x00D2 ASCQ = x0003 Device services had to reset the bus.
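In both unit attention examples the combined "ASC & ASCQ" word is followed by its two halves: x3F85 splits into x003F and x0085, and xD203 into x00D2 and x0003. A small sketch of that split follows; the function name is hypothetical, and the high-byte/low-byte reading is inferred from those two log excerpts.

```python
def split_asc_ascq(value):
    """Split a 16-bit combined ASC/ASCQ word into (ASC, ASCQ).

    The high byte is the Additional Sense Code and the low byte is the
    Additional Sense Code Qualifier, matching the two log excerpts.
    """
    asc = (value >> 8) & 0xFF
    ascq = value & 0xFF
    return asc, ascq


for raw in (0x3F85, 0xD203):
    asc, ascq = split_asc_ascq(raw)
    print(f"x{raw:04X} -> ASC x{asc:04X}, ASCQ x{ascq:04X}")
# prints:
# x3F85 -> ASC x003F, ASCQ x0085
# xD203 -> ASC x00D2, ASCQ x0003
```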
  • Page 56: Using Fmu To Describe Event Log Codes

    • LAST_FAILURE_CODE • ASC_ASCQ_CODE • COMPONENT_CODE • CONTROLLER_UNIQUE_ASC_ASCQ_CODE • DEVICE_TYPE_CODE • EVENT_THRESHOLD_CODE • RESTART_TYPE • SCSI_COMMAND_OPERATION_CODE • SENSE_DATA_QUALIFIERS • SENSE_KEY_CODE • TEMPLATE_CODE To translate a code: 1. Start FMU from the CLI: HSZ50> RUN FMU
  • Page 57 FMU> DESCRIBE code-type code-number [additional numbers] The following is an example of how to use the DESCRIBE command and a sample display: HSZ50> RUN FMU Fault Management Utility FMU> DESCRIBE INSTANCE_CODE 030C4002 Instance Code: 030C4002 Description: A Drive failed because a Test Unit Ready command or a Read Capacity command failed.
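When many codes from an event log have to be translated, it can help to generate the DESCRIBE lines ahead of time and paste them into the FMU session. The sketch below only formats text; it does not talk to the controller. The helper name is hypothetical, and the set of code types is copied from the list above plus INSTANCE_CODE from the example command (the visible list may be incomplete).

```python
# Code types visible in the manual text. INSTANCE_CODE comes from the example
# command; the list on the previous page is truncated, so this set is an
# assumption and may be missing entries.
CODE_TYPES = {
    "INSTANCE_CODE",
    "LAST_FAILURE_CODE",
    "ASC_ASCQ_CODE",
    "COMPONENT_CODE",
    "CONTROLLER_UNIQUE_ASC_ASCQ_CODE",
    "DEVICE_TYPE_CODE",
    "EVENT_THRESHOLD_CODE",
    "RESTART_TYPE",
    "SCSI_COMMAND_OPERATION_CODE",
    "SENSE_DATA_QUALIFIERS",
    "SENSE_KEY_CODE",
    "TEMPLATE_CODE",
}


def describe_command(code_type, code_number, *additional_numbers):
    """Format `DESCRIBE code-type code-number [additional numbers]`."""
    code_type = code_type.upper()
    if code_type not in CODE_TYPES:
        raise ValueError(f"unknown code type: {code_type}")
    parts = ["DESCRIBE", code_type, code_number, *additional_numbers]
    return " ".join(parts)


print(describe_command("instance_code", "030C4002"))
# prints: DESCRIBE INSTANCE_CODE 030C4002
```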
  • Page 58: Fmu Command Example

    Use the following procedure to view a last failure or memory system failure code: 1. Start FMU from the CLI: HSZ50> RUN FMU 2. To see all of the stored last failure or memory system failure events: FMU> SHOW LAST_FAILURE ALL FMU>...
  • Page 59: Fmu Output Example

    FMU> SHOW LAST_FAILURE n FMU> SHOW MEMORY_SYSTEM_FAILURE n where n is the stored event number, from 1 to 4. FMU Output Example HSZ50> RUN FMU Fault Management Utility FMU> SHOW LAST_FAILURE MOST_RECENT Last Failure Entry: 1. Flags: 000FF301 Template: 1.(01) Description: Last Failure Event...
  • Page 60: Testing Disks (Dilx)

    This is a 10-minute read-only test that uses the default DILX settings. 1. Start DILX from the CLI prompt: HSZ50> RUN DILX 2. Skip the auto-configure option so you can specify which disk drives to test:...
  • Page 61: Running An Initial Test On All Disks

    10-minute cycle consisting of 8 minutes of random I/O and 2 minutes of data-intensive transfers. You can set the duration of the test. 1. Start DILX from the CLI prompt:
  • Page 62 HSZ50> RUN DILX 2. Choose the auto-configure option to test all single-disk units: Do you wish to perform an Auto-configure (y/n) [n]? y 3. Choose option 1 (test all disks) if you have a single-controller system; choose option 2 (test half of the disks) if you have a...
  • Page 63: Running A Disk Basic Function Test

    1. Start DILX from the CLI prompt: HSZ50> RUN DILX 2. Skip the auto-configure option to get to the basic function test: Do you wish to perform an Auto-configure (y/n) ? n 3.
  • Page 64 Enter test number (1:2) [1] ? 1 Caution: If you choose to write-enable disks during the test, make sure that the disks do not contain customer data. 1. Set the test as read-only or read/write:
  • Page 65 7. The system displays a list of all single-disk units (by unit number) you can choose for DILX testing. Select the first disk that you want to test. Do not include the letter “D” in the unit number. Enter unit number to be tested? 350
  • Page 66: Running An Advanced Disk Test

    2. Skip the auto-configure option to get to the user-defined test: Do you wish to perform an Auto-configure (y/n) ? n 3. Do not accept the default settings: Use all defaults and run in read only mode (y/n)? n
  • Page 67 *** Available tests are: 1. Basic Function 2. User Defined Use the Basic Function test 99.9% of the time. The User Defined test is for special problems only. Enter test number (1:2) [1] ? 2
  • Page 68 DILX testing started at <date> <time> Test will run for <nn> minutes 6. DILX will run for the amount of time that you selected and then display the results of the testing. If you want to interrupt the test early:
  • Page 69: Dilx Error Codes

    DILX read a legal data pattern from the disk at a place where DILX wrote to the disk, but DILX does not have any write buffers that correspond to the data pattern. Thus, the data has been corrupted.
  • Page 70: Dilx Data Patterns

    7, alternating 1s, 0s 0000, 0000, 0000, FFFF, FFFF, FFFF, 0000, 0000, FFFF, FFFF, 0000, FFFF, 0000, FFFF, 0000, FFFF B6D9 5555, 5555, 5555, AAAA, AAAA, AAAA, 5555, 5555, AAAA, AAAA, 5555, AAAA, 5555, AAAA, 5555, AAAA, 5555 Service Manual HSZ50 Array Controller...
  • Page 71: Monitoring System Performance With The Vtdpy Utility

    You can run only one VTDPY session on each controller at a time. Before running VTDPY, set the terminal to NOWRAP mode to prevent the top line of the display from scrolling off the screen.
  • Page 72: Using The Vtdpy Control Keys

    VTDPY contains a command line interpreter that you can invoke by entering Ctrl/C any time after starting the program. The command line interpreter is used to modify the characteristics of the VTDPY display. Table 1–5 lists the VTDPY commands.
  • Page 73: Table 1-5 Vtdpy Commands

    If an error occurs in the command, the user prompts for command expansion help, or the HELP command is entered, the command line interpreter prompts for an additional command instead of returning to the display.
  • Page 74: How To Interpret The Vtdpy Display Fields

    – Host port – SCSI bus configuration – SCSI termination – SCSI cables – Async indicates communication between this target and all initiators is being done in asynchronous mode. This is the...
  • Page 75: Device Scsi Status

    – P indicates pass-through device support (i.e., tape or media loader). – A period (.) indicates the device type is unknown. – A space indicates there is no device configured at this location.
  • Page 76: Unit Status (Abbreviated)

    The availability state is indicated using the following letters: – a — Available. The available state indicates a problem. HSZ units will show on-line if a problem does not exist.
  • Page 77 > — For disks, this symbol indicates the device is spinning up. For tapes, it indicates the tape is loading. – < — For disks, this symbol indicates the device is spinning down. For tapes, it indicates the tape is unloading.
  • Page 78 This data is contained only in the DEFAULT display for disk and tape device types. HT% — This column indicates the cache hit percentage for data transferred between the host and the unit.
  • Page 79: Unit Status (Full)

    For HSZ controllers, on-line in this column means that the unit is on-line to the HSZ controller only. It does not indicate that the unit is mounted by the host.
  • Page 80 — Off-line, No Volume Mounted. The device does not contain media. – x — On-line to other controller. Not available for use by this controller. – A space in this column indicates the availability is unknown.
  • Page 81 Wr% — This column indicates what percentage of data transferred between the host and the unit was written to the unit. This data is only contained in the DEFAULT display for disk and tape device types.
  • Page 82: Device Status

    BlHit — This column shows the number of cached data blocks “hit” in the last update interval. Device Status ASWF Rq/S RdKB/S WrKB/S D100 D120 D140 D210 D230 D300 D310 D320 D400 D410 D420 D430 D440 D450 D500 D510 D520 D530
  • Page 83 < — For disks, this symbol indicates the device is spinning down. For tapes, it indicates the tape is unloading. – v — For disks, this symbol indicates the device is stopped. For tapes, it indicates the tape is unloaded.
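The spin-state symbols read differently for disks and tapes, so a small lookup keyed by symbol and device kind can make captured VTDPY screens easier to annotate. This is an illustrative sketch only: the table holds just the three symbols described in this manual text, and the function name and the "disk"/"tape" labels are chosen here.

```python
# Spin-state symbols as described for the VTDPY status columns.
# Each entry maps a symbol to (meaning for disks, meaning for tapes).
SPIN_STATES = {
    ">": ("spinning up", "loading"),
    "<": ("spinning down", "unloading"),
    "v": ("stopped", "unloaded"),
}


def describe_spin_state(symbol, device_kind):
    """Return the meaning of a spin-state symbol for a 'disk' or a 'tape'."""
    if device_kind not in ("disk", "tape"):
        raise ValueError("device_kind must be 'disk' or 'tape'")
    meanings = SPIN_STATES.get(symbol)
    if meanings is None:
        return "symbol not covered by this sketch"
    return meanings[0] if device_kind == "disk" else meanings[1]


print(describe_spin_state("<", "disk"))  # prints: spinning down
print(describe_spin_state("v", "tape"))  # prints: unloaded
```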
  • Page 84 VTDPY was started. BR — This column indicates the number of SCSI bus resets that occurred since VTDPY was started. TR — This column indicates the number of SCSI target resets that occurred since VTDPY was started.
  • Page 85: Device Scsi Port Performance

    TR — This column indicates the number of SCSI target resets that occurred since VTDPY was started. Help Example VTDPY> HELP Available VTDPY commands: ^C - Prompt for commands ^G or ^Z - Update screen ^O - Pause/Resume screen updates ^Y - Terminate program
  • Page 86 HELP - Display this help message REFRESH - Refresh the current display QUIT - Terminate program (same as EXIT) UPDATE - Update screen display VTDPY> Description This is the sample output from executing the HELP command.
  • Page 87: Replacing Field-Replaceable Units

    Replacing dual-redundant controllers and cache modules using the off-line method Replacing external cache batteries (ECBs) Replacing power supplies Replacing disk drives Replacing tape drives Replacing solid state disks and CD-ROM drives Replacing host and device cables
  • Page 88: Electrostatic Discharge

    Introduction and precautions This chapter describes the procedures for replacing HSZ50 field-replaceable units. The following sections provide important information to prevent damage to system components you must handle during replacement procedures, and to ensure you have the tools you need to replace system components.
  • Page 89: Handling Controller Host-Port Cables

    ...SWAP running utilities and disable all other terminals. This section describes the replacement procedures for the HSZ50 controllers and cache modules using the C_SWAP (warm swap) procedure. Note: Use the C_SWAP procedure when you cannot shut down the system and only in dual-redundant configurations.
  • Page 90: Preparing The Subsystem

    To PC H8571-J BC16E-XX To terminal CXO-5293A-MC 2. Enter the following command at the CLI: HSZ50> SHOW THIS_CONTROLLER Record the preferred IDs and the host port SCSI target IDs to use later in this procedure.
  • Page 91 Prefer all target IDs to this controller by entering the following command: HSZ50> SET THIS_CONTROLLER PREFERRED_ID=(n,n,n,n) where n,n,n,n are all of the host port SCSI target IDs noted in Step 3. 5. Enter the following command at the CLI: HSZ50>...
  • Page 92: Figure 2-2 Disconnecting The Trilink Connector

    Do not remove the module yet. 13. If you are removing the cache module, loosen the captive retaining screws on the cache module’s front bezel. 14. Start the C_SWAP program by entering the following command: HSZ50> RUN C_SWAP
  • Page 93: Removing The Controller And Cache Modules

    Use the following procedure to remove the controller and cache modules: 1. When the controller prompts you with the following question: Do you wish to remove the other HSZ50 Y/N [N] ? Enter “Y” for YES and press Return. Do not remove the controller module yet.
  • Page 94: Figure 2-3 Removing The Program Card

    – If you are replacing the controller, remove the program card and save it for use in the replacement controller. Figure 2–3 Removing the program card cover Eject button PCMCIA card CXO-5302A-MC
  • Page 95 ECB from the cache module. If you are removing the cache module, disconnect the battery from the cache side only. 8. Disable the ECB by pressing the battery disable switch. See Figure 2–4.
  • Page 96: Figure 2-4 Disconnecting The Battery Cable And Disabling The Ecb

    Figure 2–4 Disconnecting the battery cable and disabling the ECB (figure label: Battery disable switch; drawing CXO-5360A-MC) 9. Slide the defective controller out of the shelf and note its location. See Figure 2–5.
  • Page 97: Figure 2-5 Removing Controllers And Cache Modules

    You may remove the cache module before or after port activity has restarted. Do not proceed with the procedures for reinstalling the controller and cache modules until you see the message in Step 10.
  • Page 98: Reinstalling The Controller Subsystem Components

    (both replacement and existing modules). Press Return. 2. The following question displays: ***Sequence to INSERT the other HSZ has begun.*** Do you wish to INSERT the other HSZ [N] ? Enter “Y” for YES.
  • Page 99 See Figure 2–6. 6. Insert the controller module by sliding it straight in along the rails and then push firmly to seat it in the backplane. See Figure 2–6.
  • Page 100: Figure 2-6 Installing Controllers And Cache Modules

    Figure 2–6 Installing controllers and cache modules Cache module Controller CXO-5283A-MC
  • Page 101 . _________________________________________ Controller Warm Swap terminated. The configuration has two controllers. To restart the other HSZ50. 1) Enter the command RESTART OTHER_CONTROLLER. 2) Press and hold in the Reset (//) button while inserting the program card.
  • Page 102: Restarting The Subsystem

    3. Release the Reset button to initialize the controller. Wait for the CLI prompt (HSZ50>) to appear at the terminal. You will see a “Controllers misconfigured” message, which you can ignore. 4. Enter the following command at the CLI: HSZ50>...
  • Page 103: Table 2-2 Ecb Status Indicators

    If the battery status is low, you may want to set the cache policy. Refer to the procedure documented in the HSZ50 Array Controller HSOF 5.1 CLI Reference Manual. 11. Verify that all controller settings are correct by entering the following commands: HSZ50> SHOW THIS_CONTROLLER...
  • Page 104: Replacing A Controller And Cache Module In A Single Controller Configuration

    15. If you wish to balance the I/O load as it was before the controller replacement, enter the following command: HSZ50> SET OTHER_CONTROLLER PREFERRED_ID=(n,n) where n,n are the preferred IDs that were shown on the controller that did NOT require service.
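Because the preferred IDs recorded at the start of the procedure are typed back in later, formatting the SET command from the recorded list can reduce transcription mistakes. The following sketch only builds the command string in the form shown in this manual; the function name is hypothetical, and the duplicate check is a convenience added here, not a documented controller rule.

```python
def preferred_id_command(controller, target_ids):
    """Format `SET <controller> PREFERRED_ID=(n,n,...)` from recorded target IDs."""
    if controller not in ("THIS_CONTROLLER", "OTHER_CONTROLLER"):
        raise ValueError("controller must be THIS_CONTROLLER or OTHER_CONTROLLER")
    ids = list(target_ids)
    if not ids:
        raise ValueError("at least one target ID is required")
    if len(set(ids)) != len(ids):
        # Convenience check added for this sketch: a repeated ID is probably a typo.
        raise ValueError("duplicate target ID in list")
    id_list = ",".join(str(int(i)) for i in ids)
    return f"SET {controller} PREFERRED_ID=({id_list})"


# Hypothetical recorded values, for illustration only.
print(preferred_id_command("OTHER_CONTROLLER", [0, 1]))
# prints: SET OTHER_CONTROLLER PREFERRED_ID=(0,1)
```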
  • Page 105 4. Record all instance and failure codes, and note their order. Exit the FMU utility. 5. Take the controller out of service. HSZ50> SHUTDOWN THIS_CONTROLLER To ensure that the controller has shut down cleanly, check for the following indications on the controller’s operator control panel (OCP): –...
  • Page 106: Figure 2-7 Disabling The Ecb

    Remove the program card and save it for the replacement controller. See Figure 2–3. 11. Loosen the captive screws on the trilink connector and remove the trilink. See Figure 2–2. 12. Loosen the captive retaining screws on the controller’s front bezel.
  • Page 107: Reinstalling Controller Subsystem Components

    6. Attach a maintenance terminal to the new controller. 7. Press and hold the controller’s green reset (//) button, while inserting the program card. The program card eject button will extend when the card is fully inserted. See Figure 2–8. HSZ50 Array Controller Service Manual...
  • Page 108: Figure 2-8 Installing The Program Card

    Figure 2–8 Installing the program card (callouts: cover, eject button, PCMCIA card)
  • Page 109 11. At the CLI prompt type: HSZ50> SHOW THIS_CONTROLLER The controller displays the following information (this is a sample only): Controller: HSZ50-AX ZG34901786 Firmware V51Z Hardware AX11 Not configured for dual-redundancy SCSI address 7 Time: 04 FEB-1997 16:32:54 Host port:...
  • Page 110: Replacing Dual-Redundant Controllers And Cache Modules

    1. In dual-redundant mode, when one controller fails, connect a maintenance terminal to the surviving controller. 2. Enter the following command at the CLI: HSZ50> SHOW THIS_CONTROLLER Record the preferred IDs and the host port SCSI target IDs to use later in this procedure.
  • Page 111: Reinstalling Subsystem Components

    Make sure you use the correct slot. Slide the new controller module into the shelf using the same rails from which you removed the module. See Figure 2–6. HSZ50 Array Controller Service Manual...
  • Page 112 Enter the following command at the CLI: HSZ50>SHOW THIS_CONTROLLER Look for invalid cache errors. To clear the errors, first use the following command: HSZ50> CLEAR_ERRORS THIS_CONTROLLER INVALID_CACHE NODESTROY_UNFLUSHED_DATA. If there are still invalid cache errors, use the following command to clear the errors: HSZ50>CLEAR_ERRORS THIS_CONTROLLER...
  • Page 113: Table 2-3 Ecb Status Indicators

    If the battery status is low, you may want to set cache policy. Refer to the procedure documented in the HSZ50 Array Controller HSOF 5.1 CLI Reference Manual. 14. Verify that all controller settings are correct by entering the following commands: HSZ50>SHOW THIS_CONTROLLER...
  • Page 114: Replacing External Cache Batteries (Ecbs)

    18. If you wish to balance the I/O load, as it was before the controller replacement, enter the following command: HSZ50> SET OTHER_CONTROLLER PREFERRED_ID =(n,n) Where n = preferred IDs that were shown on the controller that did NOT require service.
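As a rough illustration of the load-balancing step, the sketch below builds the `SET OTHER_CONTROLLER PREFERRED_ID` line from the preferred IDs recorded earlier in the procedure. The function name `preferred_id_command` is hypothetical, and the choice to print a single ID without parentheses simply mirrors the `PREFERRED_ID=0` form used elsewhere in this manual; check the CLI Reference Manual for the accepted syntax.

```python
def preferred_id_command(target_ids):
    """Return the CLI line that prefers the given target IDs to the other controller.

    Hypothetical helper: only the command keywords come from the manual.
    """
    ids = sorted(set(target_ids))
    if not ids:
        raise ValueError("at least one preferred ID is required")
    if any((not isinstance(i, int)) or i < 0 for i in ids):
        raise ValueError("preferred IDs must be non-negative integers")
    if len(ids) == 1:
        # Single ID: bare value, as in "PREFERRED_ID=0".
        return f"SET OTHER_CONTROLLER PREFERRED_ID={ids[0]}"
    # Several IDs: parenthesized, comma-separated list, as in "(n,n)".
    joined = ",".join(str(i) for i in ids)
    return f"SET OTHER_CONTROLLER PREFERRED_ID=({joined})"


if __name__ == "__main__":
    print(preferred_id_command([2, 1]))
```

The helper only formats text; the command still has to be typed at the maintenance terminal.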
  • Page 115: Replacing The Failed Ecb

    HSZ50> RUN C_SWAP Replacing the failed ECB When the controller prompts you, answer the question: Do you wish to remove the other HSZ50 y/n [n] ? 2. Enter “Y” for YES. 3. Answer the question: Will its cache module also be removed Y/N [n] ? 4.
  • Page 116 ECB. See Figure 2–4. Until you are ready to install the SBB containing the new ECB in the cabinet, you can put the SBB containing the new ECB anywhere the cable will reach. Service Manual HSZ50 Array Controller...
  • Page 117: Reinstalling The Modules

    3. Answer the question: ***Sequence to INSERT the other HSZ50 has begun.*** Do you wish to INSERT the other HSZ50 [N] ? 4. Enter Y for YES. 5. Wait for the following text to appear on the operating controller’s console: Attempting to quiesce all ports.
  • Page 118: Restarting The Subsystem

    Port 5 restarted. Port 6 restarted. Controller Warm Swap terminated. The configuration has two controllers. To restart the other HSZ50: 1) Enter the command RESTART OTHER_CONTROLLER. 2) Press and hold in the Reset (//) button while inserting the program card.
  • Page 119: Preparing To Replace The Second Ecb

    HSZ50> RUN C_SWAP Replacing the second ECB When the controller prompts you, answer the question: Do you wish to remove the other HSZ50 y/n [n] ? 2. Enter “Y” for YES. 3. Answer the question: Will its cache module also be removed Y/N [n] ? 4.
  • Page 120 You may remove the cache module before or after port activity has restarted. __________________________________________ 10. Pull the cache module out of the shelf far enough to disconnect it from the backplane. It is not necessary to remove the cache module completely from the shelf. Service Manual HSZ50 Array Controller...
  • Page 121: Reinstalling The Modules

    3. Answer the question: ***Sequence to INSERT the other HSZ50 has begun.*** Do you wish to INSERT the other HSZ50 [N] ? 4. Enter “Y” for YES. 5. Wait for the following text to appear on the operating controller’s console: Attempting to quiesce all ports.
  • Page 122: Restarting The Subsystem

    Port 5 restarted. Port 6 restarted. Controller Warm Swap terminated. The configuration has two controllers. To restart the other HSZ50. 1) Enter the command RESTART OTHER_CONTROLLER. 2) Press and hold in the Reset (//) button while inserting the program card.
  • Page 123: Replacing Ecbs Using The Off-Line Method

    Entering the following command: HSZ50> SET NOFAILOVER 6. Place the controllers into dual-redundant mode: HSZ50> SET FAILOVER COPY=OTHER_CONTROLLER Controller B will restart. 7. Ensure that the ECB cable connections are secure. 8. Remove the old ECB SBB from the device shelf and replace it with the new operating SBB.
  • Page 124: Table 2-4 Ecb Status Indicators

    System power is off and the ECB is not supplying power to the cache. If the battery status is low, you may want to set the cache policy. Refer to the procedure documented in the HSZ50 Array Controller HSOF 5.1 CLI Reference Manual. Service Manual...
  • Page 125: Replacing Power Supplies

    Connect a maintenance terminal to one of the controllers. Since you are in dual-redundant mode, enter the following command from the CLI of one controller: HSZ50> SHUTDOWN OTHER_CONTROLLER 3. From the CLI on the same controller, enter: HSZ50> SHUTDOWN THIS_CONTROLLER To ensure the controller has shut down cleanly, check for the following indications on the controller’s operator control panel...
  • Page 126: Installing The New Power Supply

    Figure 2–9 Removing the power supply CXO-5228A-MC Installing the new power supply Firmly push the power supply into the shelf until the mounting tabs snap into place. 2. Reconnect the power cord to the power supply. Service Manual HSZ50 Array Controller...
  • Page 127: Asynchronous Swap Method

    1. Remove the failed power supply using steps 4, 5, and 6 of the cold-swap method. 2. Install a new power supply using the same procedure you used for the cold-swap method.
  • Page 128: Replacing Storage Devices

    1. Make sure the device is not an active device in any storageset. 2. Do not remove any device unless a knowledgeable person approves of the removal. 3. Determine the disk name (DISK100, DISK200, and so forth). 4. Enter the following command: HSZ50>SHOW DISK_NAME Service Manual HSZ50 Array Controller...
  • Page 129: Figure 2-11 Removing A Disk Drive

    13. Observe the status LED for the following indications. See Figure 2–12. – The device activity (green) LED is either on, flashing, or off. – The device fault (amber) LED is off. HSZ50 Array Controller Service Manual...
  • Page 130: Replacing Tape Drives

    (Amber) CXO-4654B-MC 14. If you replaced a single disk drive or a disk from a stripeset, follow the procedure described in HSZ50 Array Controller HSOF 5.1 Configuration Manual to initialize the device. Replacing tape drives Use the warm-swap method to replace tape drives. When you use this method the OCP (operator control panel) buttons are used to quiesce the bus that corresponds to the replacement device.
  • Page 131: Replacing Solid-State Disk And Cd-Rom Drives

    2. Connect a maintenance terminal to one of the controllers. 3. At the CLI prompt, enter: HSZ50> SHUTDOWN OTHER_CONTROLLER HSZ50> SHUTDOWN THIS_CONTROLLER 4. Remove the power cords from the shelf that contains the failed solid-state disk drive. If the device is in an SW300 cabinet, you must power down the whole cabinet.
  • Page 132: Figure 2-14 Removing The Cd-Rom Drive

    8. Reconnect the power cords to the shelf power supply or power up the SW300 cabinet. 9. Observe the status LED for the following indication: – The device fault (amber) LED is off. Service Manual HSZ50 Array Controller...
  • Page 133: Replacing Scsi Host Cables

    3. Disconnect the failed SCSI host cable from the host or other device. 4. Shut down the controller/controllers. 5. Loosen the captive screws on the trilink connector at the controller’s front bezel. Disconnect the cable from the trilink connector. See Figure 2–15. HSZ50 Array Controller Service Manual...
  • Page 134: Figure 2-15 Disconnecting The Scsi Host Cable

    Tighten the captive screws on the SCSI host cable connector. 10. Connect the other end of the host cable to the appropriate device on the bus. 11. Restart the controller/controllers. Service Manual HSZ50 Array Controller...
  • Page 135: Replacing Scsi Device Port Cables

    “Replacing a Controller and Cache Module in a Single Controller Configuration” in this chapter. 3. Loosen the two captive screws on each side of the volume shield and remove the shield. See Figure 2–16. HSZ50 Array Controller Service Manual...
  • Page 136: Figure 2-16 Removing The Volume Shield

    4. Remove the failed cable from the controller shelf backplane by pinching the cable connector side clips and disconnecting the cable. __________________Caution_________________ Digital recommends that you label all devices before you remove them from the device shelf. Note the PTL for each device. __________________________________________ Service Manual HSZ50 Array Controller...
  • Page 137: Figure 2-17 Access To The Scsi Cables

    6. Remove any SBBs necessary to gain access to the SCSI cable. See Figure 2–17. Figure 2–17 Access to the SCSI cables
  • Page 138 11. Replace the volume shield in the controller shelf and lightly tighten the captive screws using a flat-head screwdriver. 12. Replace the cache modules and the controller modules following the same procedure you used to replace these modules in a single controller configuration. Service Manual HSZ50 Array Controller...
  • Page 139: Installing And Upgrading

    Installing new firmware on a device Installing a controller and cache module (single controller configuration) Installing a second controller and cache module Installing a cache module Adding cache memory Installing power supplies Installing storage building blocks HSZ50 Array Controller Service Manual...
  • Page 140 Installing and Upgrading Introduction This chapter describes various installation and upgrade procedures you will perform while servicing the HSZ50 subsystem. As you perform these procedures, refer to Chapter 2, “Replacing Field Replaceable Units”, for important precaution information and required tools.
  • Page 141: Program Card Upgrade (Single Controller Configuration)

    1. Halt all I/O activity to the controller using the appropriate procedures for your operating system. 2. Connect a maintenance terminal to the controller. Take the controller out of service: HSZ50> SHUTDOWN THIS_CONTROLLER To ensure the controller has shut down cleanly, check for the following indications on the controller’s OCP: –...
  • Page 142: Program Card Upgrade (Dual-Redundant Configuration)

    When the controllers initialize correctly, the green Reset (//) LED will flash once every second. Replace the ESD covers over both program cards. Service Manual HSZ50 Array Controller...
  • Page 143: Upgrading Controller Software Using The Clcp Utility

    Invoking the CLCP utility To invoke the CLCP utility enter the following command at the CLI prompt: HSZ50> RUN CLCP The CLCP utility menu is displayed: Select an option from the following list: Code Load & Code Patch local program Main Menu...
  • Page 144: Single Controller Upgrade Method

    45 minutes (for a download performed via the maintenance terminal port). The only time the code load process interrupts device service is for a period of about 4 minutes, while the program card is written and the controller initializes with the new software. Service Manual HSZ50 Array Controller...
  • Page 145: Host Port Upgrade

    The user invokes the CLCP utility via the CLI, and when prompted, instructs the host to download the binary software image to the controller using the download script. The controller rewrites the software in its program card using the downloaded software image. HSZ50 Array Controller Service Manual...
  • Page 146: Host Download Script Requirements

    ______________________Note _____________________ Upgrade instructions for your system may vary, depending upon the platform, operating system, and application environment of your external processor. The instructions presented in this document are provided as a general guide. ________________________________________________ Service Manual HSZ50 Array Controller...
  • Page 147: Setting Up The Host

    1. Locate the program card on the controller module. 2. Locate the write-protect switch on the outer edge of the card. 3. With a small pointed object, carefully slide the switch lever away from the eject button. See Figure 3–3. HSZ50 Array Controller Service Manual...
  • Page 148: Running The Clcp Utility

    ENABLED PROTECTED CXO-4825A-MC Running the CLCP utility 1. Invoke the CLCP utility: HSZ50> RUN CLCP Select an option from the following list: Code Load & Code Patch local program Main Menu 0: Exit 1: Enter Code LOAD local program 2: Enter Code PATCH local program Enter option number (0..2) [0] ? 1...
  • Page 149 No user action is required. Ignore the “Last fail code” reported. The failcode is the indication the controller has restarted because of a successful code load operation. HSZ50 Array Controller Service Manual...
  • Page 150 Copyright Digital Equipment Corporation 1993, 1997. All rights reserved. HSZ50 Firmware version V5.1, Hardware version AXYY Last fail code: 86000020 Press " ?" at any time for help. The CLI will take 60 seconds to initialize.
  • Page 151: Maintenance Terminal Port Upgrade

    Figure 3–4 Terminal port code load operation
  • Page 152: System Setup

    Set the connector location to the serial communications port you are using on your external processor. When the terminal emulator is configured, close the menu window. 7. Configure your terminal as follows: – Baud Rate 19200 – Data Bits Service Manual HSZ50 Array Controller...
  • Page 153: Figure 3-5 Binary Transfer Protocol Selection

    Xon/Xoff 8. Press the Enter key to obtain a CLI prompt. The controller should respond with a prompt such as “HSZ50>”. If it does not respond, check your communications connection and terminal emulator configuration. Make sure the emulator and CLI communications settings match.
  • Page 154: Write Enable The Program Card In The Controller

    3. With a small pointed object, carefully slide the switch lever away from the eject button. Running the CLCP utility Invoke the CLCP utility HSZ50> RUN CLCP The CLCP main menu is displayed: Select an option from the following list: Code Load & Code Patch local program Main Menu...
  • Page 155 Enter “Y” and press the Return key to continue with the code load operation. The program prompts you with “Start KERMIT now..”. 6. Open the Transfers menu on the terminal emulator menu bar and select the Send Binary File option. The Send Binary File menu is displayed. HSZ50 Array Controller Service Manual...
  • Page 156 When the green RESET button begins flashing about once each second, the card rewrite operation is complete. No user interaction is required to restart the controller with the newly- installed software. Service Manual HSZ50 Array Controller...
  • Page 157: The Dual-Redundant, Sequential Upgrade Method

    You must invoke CLCP separately for each controller in a dual- redundant configuration. CLCP does not automatically load both controllers. • To avoid extended downtime, always upgrade both controllers when you perform a software upgrade. HSZ50 Array Controller Service Manual...
  • Page 158: Figure 3-6 The Sequential Upgrade Method

    Figure 3–6 The sequential upgrade method (flowchart)
  • Page 159: Sequential Upgrade Procedure

    3. Connect a maintenance terminal to controller A. 4. At the CLI prompt, enter: HSZ50> SHUTDOWN THIS_CONTROLLER 5. Move the maintenance terminal to controller B. 6. If you wish to use the host port to load your software, perform the single controller host port upgrade procedure.
  • Page 160: Considerations For The Concurrent Code Load Upgrade Method

    Always upgrade both of your controllers when you do a software upgrade. Do not run your controllers at different revision levels, except for the short amount of time this may happen during the upgrade process. Service Manual HSZ50 Array Controller...
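One way to check the "same revision level" rule is to compare the firmware field reported by `SHOW THIS_CONTROLLER` on each controller. The sketch below is an assumption-laden illustration: the function names are hypothetical, and the regular expression is based only on the sample output lines shown in this manual (for example `Firmware V51Z-3, Hardware 0000`), so other firmware releases may format the line differently.

```python
import re


def firmware_version(show_output):
    """Extract the token that follows 'Firmware' in SHOW THIS_CONTROLLER output."""
    match = re.search(r"Firmware\s+([^\s,]+)", show_output)
    if match is None:
        raise ValueError("no Firmware field found in the supplied text")
    return match.group(1).upper()


def same_revision(output_a, output_b):
    """Return True when both controllers report the same firmware string."""
    return firmware_version(output_a) == firmware_version(output_b)


if __name__ == "__main__":
    a = "Controller: HSZ50-AX ZG34901786 Firmware V51Z, Hardware AX11"
    b = "Controller: HSZ50-AX ZG51301100 Firmware V51Z, Hardware AX11"
    print(same_revision(a, b))
```

A mismatch is expected briefly while an upgrade is in progress; the check is only meaningful once both controllers have restarted.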
  • Page 161: Figure 3-7 The Concurrent Upgrade Method

    Figure 3–7 The concurrent upgrade method (flowchart)
  • Page 162: Concurrent Code Load Upgrade Procedure

    6. At the CLI prompt, enter: HSZ50> SHOW THIS_CONTROLLER 7. The controller displays the following information (this is a sample only): Controller: HSZ50 ZG34901786 Firmware V05.1-0, Hardware F01 Configured for dual-redundancy with ZG51301100 In dual-redundant configuration SCSI address 7 Time: 05 FEB-1997 16:32:54...
  • Page 163: Patching Controller Software

    9. In order to upgrade the software in both controllers from the host port, at least one target must be preferred to each controller. At the CLI prompt, enter: HSZ50> SET OTHER_CONTROLLER PREFERRED_ID=0 10. Both controllers are now configured for software upgrade using the host port method.
  • Page 164: Code Patch Considerations

    Following is an example of the List Patches option and its output: Connect a maintenance terminal to the controller. Invoke the CLCP utility: HSZ50> RUN CLCP The CLCP main menu is displayed: Select an option from the following list: Code Load & Code Patch local program Main Menu...
  • Page 165 “dash number” following the software version. In the following example, software Version 5.1 has had up to three patches applied to the current software. 5. At the CLI prompt, enter: HSZ50> SHOW THIS_CONTROLLER Controller: HSZ50 ZG33400026 Firmware V51Z-3, Hardware 0000...
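The "dash number" convention can be read mechanically. The following sketch splits a firmware field such as `V51Z-3` into the base version and the patch level. It assumes the field looks like the samples in this manual, and it treats a missing dash number as zero patches, which is an assumption rather than something the manual states. The function name `patch_level` is hypothetical.

```python
import re


def patch_level(show_output):
    """Return (base_version, dash_number) from SHOW THIS_CONTROLLER output.

    A firmware field without a dash number is reported as 0 (assumption).
    """
    match = re.search(r"Firmware\s+([^\s,\-]+)(?:-(\d+))?", show_output)
    if match is None:
        raise ValueError("no Firmware field found in the supplied text")
    base = match.group(1)
    dash = int(match.group(2)) if match.group(2) is not None else 0
    return base, dash


if __name__ == "__main__":
    sample = "Controller: HSZ50 ZG33400026 Firmware V51Z-3, Hardware 0000"
    print(patch_level(sample))
```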
  • Page 166: Installing A Patch

    Following is an example of the use of the patch entry option: 1. Obtain the appropriate patch data for your controller's software version from your Digital Equipment Corporation representative. 2. Connect a maintenance terminal to the controller. 3. At the CLI prompt, enter: HSZ50>...
  • Page 167 Type ^Y or ^C (then RETURN) at any time to abort Code Patch. Do you wish to continue (y/n) [y] ? 6. Enter “Y” to continue. HSZ50 Array Controller Service Manual...
  • Page 168: Code Patch Messages

    Message: Firmware Version x does not have any patches to delete. Explanation: You cannot delete a patch because the software (firmware) version entered does not have any patches entered. Service Manual HSZ50 Array Controller...
  • Page 169 You may enter Ctrl/Z followed by Return at any prompt to choose the default for the remaining entries. HSZ50 Array Controller Service Manual...
  • Page 170: Formatting Disk Drives

    Suspend all I/O to the buses that service the target disk drives. ________________________________________________ To format one or more disk drives: 1. Start HSUTIL. HSZ50> RUN HSUTIL 2. Enter 1 to select the FORMAT function. HSUTIL finds and displays all of the unformatted disk drives attached to the controller.
  • Page 171: Considerations For Formatting Disk Drives

    Do not invoke any CLI command or run any local program that might reference the target disk drive while HSUTIL is active. Also, do not reinitialize either controller in the dual-redundant configuration. Example HSZ50> RUN HSUTIL *** Available functions are: 0. EXIT 1. FORMAT 2.
  • Page 172 Do you want to continue (y/n) [n] ? Y HSUTIL started at: 14-AUG-1996 15:00:31 Format of DISK100 finished at 14-FEB-1997 16:40:12 Format of DISK200 finished at 14-FEB-1997 17:15:31 Format of DISK210 finished at 14-FEB-1997 16:30:43 HSUTIL - Normal Termination at 14-FEB-1997 16:31:09 Service Manual HSZ50 Array Controller...
  • Page 173: Installing New Firmware On A Device

    Figure 3–8. First, copy the new firmware from your host to a disk drive in your subsystem, then use HSUTIL to distribute the firmware to devices in your subsystem. Figure 3–8 Installing new firmware on a disk or tape drive
  • Page 174: Considerations For Installing New Device Firmware

    • Some devices may not reflect the new firmware version number, and so forth, when viewed from another controller (in dual-redundant configurations). If you experience this, simply reinitialize the device from either controller.
  • Page 175: Hsutil Abort Codes

    Message: Unable to change operation mode to maintenance for unit unit_number Explanation: HSUTIL was unable to put the source single disk drive unit into maintenance mode to enable formatting or code load. HSZ50 Array Controller Service Manual...
  • Page 176 SET THIS PREFERRED ID=(unit’s target ID). SET OTHER NOPREFERRED_ID. Explanation: The device shown is still under the control of the companion controller. Follow the recommended steps to run HSUTIL. Service Manual HSZ50 Array Controller...
  • Page 177 Explanation: The RUN/NORUN unit indicator for the unit shown is set to NORUN. The disk is not spun up. Message: No available unattached devices. Explanation: The program could find no unattached devices to list.
  • Page 178 This message is displayed if HSUTIL detects that an unsupported device has been selected as the target device. You must indicate whether to download the firmware image to the device in one or more contiguous blocks, each corresponding to one SCSI Write Buffer command. Service Manual HSZ50 Array Controller...
  • Page 179: Installing A Controller And Cache Module In A Single Controller Configuration

    Connect a maintenance terminal to the controller. 3. Install an external cache battery SBB into a convenient device slot. See Figure 3–9. 4. Install the controller power supplies into the controller shelf. See Figure 3–10. HSZ50 Array Controller Service Manual...
  • Page 180: Figure 3-9 Installing An Sbb Battery Module

    Figure 3–9 Installing an SBB battery module. Figure 3–10 Installing controller power supplies.
  • Page 181: Figure 3-11 Installing A Single Controller (Sw800 Cabinet)

    8. Tighten the screws on each end of the ECB cable. 9. While pushing and holding down the operator control panel (OCP) Reset (//) button on the controller, eject and remove the program card. 10. Connect the power cords to the controller power supplies. HSZ50 Array Controller Service Manual...
  • Page 182: Table 3-3 Ecb Status Indicators

    System power is off and the ECB is not supplying power to the cache. If the battery status is low, you may want to set the cache policy. Refer to the procedure documented in the HSZ50 Array Controller HSOF 5.1 CLI Reference Manual. Service Manual...
  • Page 183: Installing A Second Controller And Cache Module

    2. At the existing controller’s terminal, enter: HSZ50> SHOW THIS_CONTROLLER The controller displays the following information (this is a sample only): Controller: HSZ50-AX ZG34901786 Firmware V51Z, Hardware AX11 Not configured for dual-redundancy SCSI address 7 Time: 04 FEB-1997 16:32:54 Host port:...
  • Page 184 At the CLI prompt, enter: HSZ50> SHUTDOWN THIS_CONTROLLER When you enter the command, do not specify any SHUTDOWN optional qualifiers. The default qualifiers do not allow the controller to shut down until data is completely and successfully stored on the appropriate storage devices.
  • Page 185 HSZ50> SHOW THIS_CONTROLLER 22. If there are any invalid cache errors, enter the following command to clear the errors: HSZ50> CLEAR INVALID_CACHE THIS_CONTROLLER NODESTROY_UNFLUSHED_DATA 23. Set the new controller to nofailover with the following command: HSZ50> SET NOFAILOVER 24.
  • Page 186: Installing A Write-Back Cache Module

    2. Halt all host I/O activity using the appropriate procedure for your operating system. 3. Take the controller out of service: HSZ50> SHUTDOWN THIS_CONTROLLER To ensure the controller has shut down cleanly, check for the following indications on the controller’s OCP: –...
  • Page 187: Installing The Write-Back Cache Module

    10. When the Reset (//) LED on the controller flashes at a rate of once every second, the initialization process is complete. 11. Snap the ESD covers into place over the program card. Push the pins inward to lock the covers in place. HSZ50 Array Controller Service Manual...
  • Page 188: Adding Cache Memory

    Connect a maintenance terminal to the controller. 2. Take the single controller out of service: HSZ50> SHUTDOWN THIS_CONTROLLER 3. If you are working with a dual redundant configuration, take both controllers out of service: HSZ50> SHUTDOWN OTHER_CONTROLLER HSZ50>...
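The difference between the single-controller and dual-redundant shutdown steps can be summarized in a few lines. This is only a sketch: `shutdown_commands` is a hypothetical helper, and the ordering for the dual-redundant case (other controller first, then this one) follows the other dual-redundant procedures in this manual.

```python
def shutdown_commands(dual_redundant):
    """Return the CLI shutdown commands, in order, for the given configuration."""
    commands = []
    if dual_redundant:
        # In dual-redundant mode the companion controller is shut down first.
        commands.append("SHUTDOWN OTHER_CONTROLLER")
    commands.append("SHUTDOWN THIS_CONTROLLER")
    return commands


if __name__ == "__main__":
    for line in shutdown_commands(dual_redundant=True):
        print("HSZ50>", line)
```

After issuing the commands, the OCP indications described in the procedure remain the authoritative check that each controller shut down cleanly.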
  • Page 189: Table 3-4 Adding Cache Memory Capacity

    13. Reinstall the controller modules into their original slots. Use a gentle rocking motion to help seat the module. If you are using a single controller configuration, use the slot that is designated SCSI ID 7. 14. Reconnect the ECB cable to the cache module. HSZ50 Array Controller Service Manual...
  • Page 190 22. To check cache capacity of the cache modules, attach a maintenance terminal to one of the controllers. At the CLI prompt type: HSZ50> SHOW THIS_CONTROLLER The controller will report the following information: Controller: HSZ50-AX ZG34901786 Firmware V51z, Hardware AX11 Configured for dual-redundancy with ZG51301100 In dual-redundant configuration SCSI address 7...
  • Page 191: Installing Power Supplies

    CONTROLLER 24. Enable the new write-back cache on specific units by issuing the following command: HSZ50> SET unit-name WRITEBACK_CACHE Installing power supplies This section describes how to install a power supply into an SBB shelf or into a controller shelf.
  • Page 192: Table 3-6 Shelf And Single Power Supply Status Indicators

    Described in the Replace Section. LED on = LED off = ______________________Note _____________________ The status indicators will operate ONLY if the power supplies and the shelf blowers are present. The failure must be an electrical or mechanical failure. ________________________________________________ Service Manual HSZ50 Array Controller...
  • Page 193: Table 3-7 Shelf And Dual Power Supply Status Indicators

    Replace PS 2. Shelf LED PS 2 is operational. Power supply LED Replace PS 1. Shelf LED Possible PS 1 and PS 2 fault or input power problem. Power supply LED LED on = LED off = HSZ50 Array Controller Service Manual...
  • Page 194: Power Supply Installation Procedure

    If the status indicators are not on, refer to the Status indicator tables and take appropriate service action. 4. Repeat the above steps to add a second power supply for redundancy. After connecting the power cord, observe the status indicators and ensure that they are both on. Service Manual HSZ50 Array Controller...
  • Page 195: Installing Storage Building Blocks

    The storage device building blocks (SBBs) come in 3 1/2-inch or 5 1/4-inch form factors. The HSZ50 controller supports the following devices: • 3.5-inch and 5.25-inch disk drives • CD-ROM drives in 5 1/4-inch StorageWorks building blocks •...
  • Page 196: Sbb Activity And Fault Indicators

    The upper LED (green) is the device activity indicator and is on or flashing when the SBB is active. The lower LED (amber) is the device fault indicator and indicates an error condition or a configuration problem when it is on or flashing. See Table 3–8. Service Manual HSZ50 Array Controller...
  • Page 197: Table 3-8 Storage Sbb Status Indicators

    Fault status SBB is active and is spinning down because Device fault of a fault. Device activity Fault status SBB has been identified by the controller as failed. Device fault Replace the SBB. LED on = LED off = LED flashing = HSZ50 Array Controller Service Manual...
  • Page 198: Installing Sbbs (Except Solid State Disk And Cd-Rom)

    The lower LED of each configured device will flash at a rate of once every second. To turn off the lower LED, use the LOCATE CANCEL command. Refer to the HSZ50 Array Controller HSOF 5.1 CLI Reference Manual for further details of the LOCATE command...
  • Page 199 2. Connect a maintenance terminal to one of the controllers. 3. At the CLI prompt, enter: HSZ50> SHUTDOWN OTHER_CONTROLLER HSZ50> SHUTDOWN THIS_CONTROLLER To ensure that the controller has shut down cleanly, check for the following indications on the controller’s operator control panel (OCP): –...
  • Page 201: Moving Storagesets And Devices

    Moving storagesets and devices Moving storagesets Moving storageset members Moving single disk-drive units Moving devices HSZ50 Array Controller Service Manual...
  • Page 202: Precautions For Retaining Data

    Wait until the CLI prompt appears on your local or remote terminal before inserting or removing any device. • Wait about one minute after inserting each device before you insert another. • Do not insert or remove a device during failover or failback. Service Manual HSZ50 Array Controller...
  • Page 203: Moving Storagesets

    1. Show the details for the storageset you want to move: HSZ50> SHOW storageset-name 2. Label each member with its name and PTL location. (If you do not have a storageset map for your subsystem, you can use utility to find each member’s PTL location.):...
  • Page 204 6. Remove the disk drives and move them to their new PTL locations. 7. Add each disk drive back to the controller’s list of valid devices. HSZ50> ADD DISK disk-name PTL-location HSZ50> ADD DISK disk-name PTL-location HSZ50> ADD DISK disk-name PTL-location 8.
  • Page 205 (...move disk drives to their new location...) HSZ50> ADD DISK DISK100 1 0 0 HSZ50> ADD DISK DISK300 3 0 0 HSZ50> ADD DISK DISK400 4 0 0 HSZ50> ADD RAIDSET R3 DISK100 DISK300 DISK400 REDUCED HSZ50> ADD UNIT D100 R3 HSZ50 Array Controller Service Manual...
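The re-add sequence in this example follows a regular pattern, which the sketch below reproduces. It assumes the naming convention seen in the examples, where a disk named `DISKptl` sits at port `p`, target `t`, LUN `l` (each a single digit); names that do not fit that pattern are rejected. The function names and the `reduced` parameter are hypothetical, and the generated lines should be reviewed before they are typed at the CLI.

```python
import re


def add_disk_command(disk_name):
    """Build 'ADD DISK name p t l' from a DISKptl-style name (assumed convention)."""
    name = disk_name.upper()
    match = re.fullmatch(r"DISK(\d)(\d)(\d)", name)
    if match is None:
        raise ValueError(f"{disk_name!r} does not follow the DISKptl pattern")
    port, target, lun = match.groups()
    return f"ADD DISK {name} {port} {target} {lun}"


def rebuild_raidset_commands(raidset, unit, disks, reduced=False):
    """Return the CLI lines that re-add the disks, the RAIDset, and its unit."""
    commands = [add_disk_command(d) for d in disks]
    members = " ".join(d.upper() for d in disks)
    raid_line = f"ADD RAIDSET {raidset} {members}"
    if reduced:
        raid_line += " REDUCED"
    commands.append(raid_line)
    commands.append(f"ADD UNIT {unit} {raidset}")
    return commands


if __name__ == "__main__":
    for line in rebuild_raidset_commands(
        "R3", "D100", ["DISK100", "DISK300", "DISK400"], reduced=True
    ):
        print("HSZ50>", line)
```

Run as shown, the script prints the same five commands as the example above.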
  • Page 206: Moving Storageset Members

    ________________________________________________ 1. Delete the unit-number of the storageset that contains the disk drive you want to move: HSZ50> DELETE unit-number 2. Delete the storageset that contains the disk drive you want to move: HSZ50> DELETE storageset-name 3. Delete each disk drive—one at a time—that was contained by the storageset: HSZ50>...
  • Page 207 RAIDset “RAID99” that comprises members 200, 210, and 400.) HSZ50> DELETE D100 HSZ50> DELETE RAID99 HSZ50> DELETE DISK210 (...move disk210 to PTL location 300...) HSZ50> ADD DISK DISK300 3 0 0 HSZ50> ADD RAIDSET RAID99 DISK200 DISK300 DISK400 HSZ50> ADD UNIT D100 RAID99 HSZ50 Array Controller Service Manual...
  • Page 208: Moving A Single Disk-Drive Unit

    The following example moves D507 to PTL location 100. (Its new name will be DISK100 to correspond to its new PTL location.) HSZ50> SHOW D507 HSZ50> DELETE D507 HSZ50> DELETE DISK100 HSZ50> ADD DISK DISK100 1 0 0 HSZ50> ADD UNIT D507 DISK100
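The relationship between a device name and its PTL location can be checked with a few lines of code. This is a minimal Python sketch under an assumption: device names end in exactly three digits that stand for port, target, and LUN, as DISK100 does for PTL 1 0 0. The function name is hypothetical.

```python
import re


def name_to_ptl(name):
    """Return (port, target, lun) for a name such as DISK100.

    Assumes the trailing three digits are the PTL digits; names that do
    not follow that pattern are rejected rather than guessed at.
    """
    match = re.fullmatch(r"[A-Z]+(\d)(\d)(\d)", name.upper())
    if match is None:
        raise ValueError(f"name does not follow the assumed PTL pattern: {name!r}")
    return tuple(int(digit) for digit in match.groups())


if __name__ == "__main__":
    print(name_to_ptl("DISK100"))
```

A name that was assigned by hand and does not follow the convention raises `ValueError`, which is a useful prompt to check the device with the SHOW command instead.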
  • Page 209: Moving A Tape Drive, Cd-Rom Drive, Or Tape Loader

    5. Remove the device and move it to its new PTL location. 6. Add the device back to the controller’s list of valid devices: HSZ50> ADD DEVICE device-name PTL-location 7. If you are moving a tape loader, recreate the passthrough device that represents the loader: HSZ50>...
  • Page 210 T108 HSZ50> DELETE T108 HSZ50> DELETE TAPE100 (...move tape100 to its new location...) HSZ50> ADD TAPE TAPE600 6 0 0 HSZ50> ADD UNIT T600 TAPE600 The following example moves tape LOADER120 from p3 to p1: HSZ50> SHOW PASSTHROUGH LOADER NAME...
  • Page 211: Removing

    Removing Removing a patch Removing a controller and cache module Removing storage devices HSZ50 Array Controller Service Manual...
  • Page 212 To remove a patch: 1. Connect a maintenance terminal to one of the controllers. 2. Start the CLCP utility: HSZ50> RUN CLCP The CLCP main menu is displayed. Select an option from the following list: Code Load & Code Patch Utility Main Menu...
  • Page 213 Do you wish to continue (y/n) [y] 7. Enter Y to continue. The patch you have just deleted is currently applied, but will not be applied when the controller is restarted. Code Patch Main Menu 0: Exit 1: Enter a Patch
  • Page 214 2: Delete Patches 3: List Patches Enter option number (0..3) [0] The following patches are currently stored in the patch area: Firmware Version - Patch number(s) V123 1, 2 Currently, 95% of the patch area is free. Service Manual HSZ50 Array Controller...
  • Page 215 5. Loosen the captive screws on the controller’s front bezel and slide the controller out of the shelf. Loosen the captive screws on the cache module’s front bezel and slide the cache module out of the shelf. Remove the ECB from its slot. HSZ50 Array Controller Service Manual...
  • Page 216: Removing Disk Drives

    Use the following procedure to remove 3 1/2-inch and 5 1/4-inch disk drives: 1. Show the details for the unit you want to move: HSZ50> SHOW unit-number 2. Delete the unit-number shown in the “Used by” column of the SHOW unit-number command: HSZ50>...
  • Page 217: Removing Solid State Disks And Cd-Rom Drives

    Halt all host I/O activity using the appropriate procedures for your operating system. 3. Take the controller out of service: HSZ50> SHUTDOWN THIS_CONTROLLER 4. If you are working in a dual-redundant configuration, take both controllers out of service: HSZ50> SHUTDOWN OTHER_CONTROLLER HSZ50>...
  • Page 218: Removing Tape Drives

    When the port has quiesced, remove the tape drive by pressing the two mounting tabs together to release it from the shelf. Using both hands, pull the tape drive out of the device shelf.
  • Page 219 Appendix A Instance, codes Last failure codes Repair action codes HSZ50 Array Controller Service Manual...
  • Page 220: Instance, Codes And Definitions

    The CACHEA0 DRAB unexpectedly reported a Cache Time-out condition. 012B3702 The CACHEA1 DRAB unexpectedly reported a Cache Time-out condition. 012C3702 The CACHEB0 DRAB unexpectedly reported a Cache Time-out condition. 012D3702 The CACHEB1 DRAB unexpectedly reported a Cache Time-out condition. Service Manual HSZ50 Array Controller...
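The eight-digit instance codes listed in this appendix can be split into byte fields for sorting or lookup. The sketch below is an assumption-laden illustration in Python: this excerpt does not state the field layout, so the code assumes that the four bytes are, from most to least significant, a component ID, an event number, a repair action code, and a threshold value. The class and function names are hypothetical.

```python
import string
from typing import NamedTuple


class InstanceCode(NamedTuple):
    component_id: int   # assumed: most significant byte
    event_number: int   # assumed: second byte
    repair_action: int  # assumed: third byte, read as a repair action code
    threshold: int      # assumed: least significant byte


def decode_instance_code(code):
    """Split an 8-hex-digit instance code into four byte fields."""
    if len(code) != 8 or any(c not in string.hexdigits for c in code):
        raise ValueError(f"expected 8 hexadecimal digits, got {code!r}")
    value = int(code, 16)
    return InstanceCode(
        component_id=(value >> 24) & 0xFF,
        event_number=(value >> 16) & 0xFF,
        repair_action=(value >> 8) & 0xFF,
        threshold=value & 0xFF,
    )


if __name__ == "__main__":
    fields = decode_instance_code("012B3702")
    print(f"repair action {fields.repair_action:02X}")
```

If the assumed layout holds, the third byte can be formatted as two hexadecimal digits and looked up in the repair action code list.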
  • Page 221 The Master DRAB detected a Multiple Bit ECC error during a host port attempt to read buffer memory. 013B2802 The Master DRAB detected a Multiple Bit ECC error during a Device port attempt to read buffer memory. HSZ50 Array Controller Service Manual...
  • Page 222 The CACHEB1 DRAB detected a Multiple Bit ECC error during an FX attempt to read CACHEB1 memory. 014A2A02 The CACHEB1 DRAB detected a Multiple Bit ECC error during a host port attempt to read CACHEB1 memory. Service Manual HSZ50 Array Controller...
  • Page 223 The Master DRAB detected a Nonexistent Memory Error condition during a host port attempt to write buffer memory. 015B2C02 The Master DRAB detected a Nonexistent Memory Error condition during a Host port attempt to write a byte to buffer memory.
  • Page 224 The CACHEA0 DRAB detected a Nonexistent Memory Error condition during a host port attempt to read CACHEA0 memory. 01692D02 The CACHEA0 DRAB detected a Nonexistent Memory Error condition during a Device port attempt to write CACHEA0 memory. Service Manual HSZ50 Array Controller...
  • Page 225 Nonexistent Memory Error condition during a Device port attempt to write CACHEA1 memory. 01762D02 The CACHEA1 DRAB detected a Nonexistent Memory Error condition during a Device port attempt to write a byte to CACHEA1 memory. HSZ50 Array Controller Service Manual...
  • Page 226 The CACHEB0 DRAB detected a Nonexistent Memory Error condition during a Device port attempt to read CACHEB0 memory. 01842E02 The CACHEB0 DRAB detected a Nonexistent Memory Error condition during an I960 attempt to write CACHEB0 memory. Service Manual HSZ50 Array Controller...
  • Page 227 CACHEB1 memory. 01922E02 The CACHEB1 DRAB detected a Nonexistent Memory Error condition during an I960 attempt to read CACHEB1 memory. 01933702 The Master DRAB unexpectedly reported a Nonexistent Memory Error condition. HSZ50 Array Controller Service Manual...
  • Page 228 The CACHEA0 DRAB detected an Address Parity error during a Host port attempt to read CACHEA0 memory. 01A33002 The CACHEA0 DRAB detected an Address Parity error during a Device port attempt to read CACHEA0 memory. Service Manual HSZ50 Array Controller...
  • Page 229 Parity error during an I960 attempt to read CACHEB1 memory. 01B13702 The Master DRAB unexpectedly reported an Address Parity error. 01B23702 The CACHEA0 DRAB unexpectedly reported an Address Parity error. 01B33702 The CACHEA1 DRAB unexpectedly reported an Address Parity error. HSZ50 Array Controller Service Manual...
  • Page 230 The Master DRAB detected a Write Data Parity error during an FX attempt to write buffer memory. 01C32F02 The Master DRAB detected a Write Data Parity error during an FX attempt to write a byte to buffer memory. Service Manual HSZ50 Array Controller...
  • Page 231 The CACHEA0 DRAB detected a Write Data Parity error during an I960 attempt to write a byte to CACHEA0 memory. 01D23002 The CACHEA1 DRAB detected a Write Data Parity error during an FX attempt to write CACHEA1 memory. HSZ50 Array Controller Service Manual...
  • Page 232 The CACHEB0 DRAB detected a Write Data Parity error during an I960 attempt to write CACHEB0 memory. 01E13102 The CACHEB0 DRAB detected a Write Data Parity error during an I960 attempt to write a byte to CACHEB0 memory. Service Manual HSZ50 Array Controller...
  • Page 233 02032001 Journal SRAM backup battery failure; detected during system restart. The Memory Address field contains the starting physical address of the Journal SRAM. HSZ50 Array Controller Service Manual...
  • Page 234 The dirty data is lost. The Memory Address field contains the starting physical address of the CACHEA0 memory. 020C2201 Cache diagnostics have declared the cache bad during testing. The Memory Address field contains the starting physical address of the CACHEA0 memory.
  • Page 235 The unit has been marked inoperative or UNKNOWN. In either case, the unit is not available. 02150064 The Unit State Block unit status associated with this I/O has changed to the UNKNOWN state. Therefore, the I/O was aborted.
  • Page 236 Byte Count, DRAB register, and Diagnostic register fields are undefined. 021E0064 The device specified in the Device Locator field has been added to the RAIDset associated with the logical unit. The RAIDset is now in Reconstructing state. Service Manual HSZ50 Array Controller...
  • Page 237 Copying state. 02290064 The device specified in the Device Locator field has been removed from the mirrorset associated with the logical unit. The removed device is now in the Failedset. HSZ50 Array Controller Service Manual...
  • Page 238 023C0064 The device specified in the Device Locator field had a read error. Attempts to repair the error with data from another mirrorset member failed because no alternate error-free data source was available.
  • Page 239 Position Error on a tape unit, the recovery failed to start because resources required for the recovery were not available. 02480064 When attempting to recover a Write Append Position Error on a tape unit, an error occurred during the recovery. HSZ50 Array Controller Service Manual...
  • Page 240 Device Sense Data contains the block number of the first block in error. 0255000A The controller was unable to successfully transfer data to target unit. 0256000A The write operation failed because the unit is data safety write protected. Service Manual HSZ50 Array Controller...
  • Page 241 Associated Additional Sense Code Qualifier fields are undefined. 03022002 A SCSI interface chip command time-out occurred during disk operation. Note that in this instance, the Associated Additional Sense Code and Associated Additional Sense Code Qualifier fields are undefined. HSZ50 Array Controller Service Manual...
  • Page 242 Unrecovered Read or Write error. 03104002 No response from one or more drives. 0311430A Nonvolatile memory and drive metadata indicate conflicting drive configurations. 0312430A The Synchronous Transfer Value differs between drives in the same storageset. Service Manual HSZ50 Array Controller...
  • Page 243 SCSI bus selection time-out. 03330002 Device power on reset. 03344002 Target assertion of REQ after WAIT DISCONNECT. 03354002 During device initialization a Test Unit Ready command or a Read Capacity command to the drive failed. HSZ50 Array Controller Service Manual...
  • Page 244 Unrecovered Read or Write error. 036B4002 No response from one or more drives. 036C430A Nonvolatile memory and drive metadata indicate conflicting drive configurations. 036D430A The Synchronous Transfer Value differs between drives in the same storageset. Service Manual HSZ50 Array Controller...
  • Page 245 Code and Associated Additional Sense Code Qualifier fields are undefined. 03844002 Byte transfer time-out during tape operation. Note that in this instance, the Associated Additional Sense Code and Associated Additional Sense Code Qualifier fields are undefined. HSZ50 Array Controller Service Manual...
  • Page 246 The drive was failed by a Mode Select command received from the host. 039B4002 The drive failed due to a deferred error reported by drive. 039C4002 Unrecovered Read or Write error. 039D4002 No response from one or more drives. Service Manual HSZ50 Array Controller...
  • Page 247 Code and Associated Additional Sense Code Qualifier fields are undefined. 03B52002 SCSI interface chip command time-out during media loader operation. Note that in this instance, the Associated Additional Sense Code and Associated Additional Sense Code Qualifier fields are undefined. HSZ50 Array Controller Service Manual...
  • Page 248 No command control structures available for operation to a device which is unknown to the controller. Note that in this instance, the Associated Additional Sense Code and Associated Additional Sense Code Qualifier fields are undefined. Service Manual HSZ50 Array Controller...
  • Page 249 Test Unit Ready or Read Capacity command to a device. The device type is unknown to the controller. Note that in this instance, the Associated Additional Sense Code and Associated Additional Sense Code Qualifier fields are undefined. HSZ50 Array Controller Service Manual...
  • Page 250 SCSI Sense Key HARDWARE ERROR. This indicates that the target detected a non-recoverable hardware failure (for example, controller failure, device failure, parity error, etc.) while performing the command or during a self test. Service Manual HSZ50 Array Controller...
  • Page 251 During device initialization, the device reported the SCSI Sense Key COPY ABORTED. This indicates a COPY, COMPARE, or COPY AND VERIFY command was aborted due to an error condition on the source device, the destination device, or both. HSZ50 Array Controller Service Manual...
  • Page 252 Additional Sense Code Qualifier fields are undefined. 03E90E02 The EMU has detected an external air sense fault. Note that in this instance, the Associated Target, Associated Additional Sense Code, and Associated Additional Sense Code Qualifier fields are undefined.
  • Page 253 03F20064 The SWAP interrupts have been cleared and re-enabled for all shelves. Note that in this instance, the Associated Port, Associated Target, Associated Additional Sense Code, and Associated Additional Sense Code Qualifier fields are undefined. HSZ50 Array Controller Service Manual...
  • Page 254 Failover Control received a Last Gasp message from the other controller. The other controller is expected to restart itself within a given time period. If it does not, it will be held reset with the Kill line. Service Manual HSZ50 Array Controller...
  • Page 255 Failed Controller Target Number and Other Controller Board Serial Number sense data fields, is again operational and that the controller reporting the event is willing to relinquish control of the units identified in the affected LUNs sense data field. HSZ50 Array Controller Service Manual...
  • Page 256 Test. This will cause the console to be unusable. This will cause failover communications to fail. 82072002 An unrecoverable error was detected during execution of the FX Subsystem Test. 82082002 An unrecoverable error was detected during execution of the nbuss init Test. Service Manual HSZ50 Array Controller...
  • Page 257: Last Fail Codes

    Last Failure Parameter[0] contains the PC value. Last Failure Parameter[1] contains the AC value. Last Failure Parameter[2] contains the fault type and subtype values. Last Failure Parameter[3] contains the address of the faulting instruction. 01070100 Timer chip setup failed. HSZ50 Array Controller Service Manual...
  • Page 258 (needs charging). 010C2380 A processor interrupt was generated by the CACHEB Dynamic Ram controller and ArBitration engine (DRAB) with an indication that the CACHE backup battery has failed or is low (needs charging). Service Manual HSZ50 Array Controller...
  • Page 259 Last Failure code. 01100100 Non-maskable interrupt entered but no Non-maskable interrupt pending. This is typically caused by an indirect call to address 0.
  • Page 260 SIP last failure parameter value. Last Failure Parameter[4] contains the SIP last failure code value. Last Failure Parameter[5] contains the EXEC$BUGCHECK call last failure code value. 018000A0 A powerfail interrupt occurred.
  • Page 261: Table A-3 Value-Added Services Last Failure Codes

    Unable to allocate memory necessary for data buffers. 02050100 Unable to allocate memory for the Free Buffer Array. 02080100 A call to EXEC$ALLOCATE_MEM_ZEROED failed to return memory when populating the disk read DWD stack.
  • Page 262 021E0100 Unable to allocate memory for the Free Strip Node Array. 021F0100 Unable to allocate memory for WARPs and RMDs. 02210100 Invalid parameters in CACHE$OFFER_META call. 02220100 No buffer found for CACHE$MARK_META_DIRTY call.
  • Page 263 Unrecognized state supplied to FOC$SEND callback routine va_dap_snd_cmd_complete. Last Failure Parameter[0] contains the unrecognized value. 02370102 Unsupported return from HIS$GET_CONN_INFO routine. Last Failure Parameter[0] contains the DD address. Last Failure Parameter[1] contains the invalid status.
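Last failure codes can be decoded in a similar byte-wise way. The Python sketch below rests on an assumed layout that this excerpt does not spell out: component ID, error number, and repair action in the top three bytes, and a low byte made up of a hardware flag (bit 7), a restart code (bits 6 through 4), and a parameter count (bits 3 through 0). The parameter-count reading is at least consistent with code 02370102, which is listed with two Last Failure Parameters. All names are hypothetical.

```python
import string
from typing import NamedTuple


class LastFailureCode(NamedTuple):
    component_id: int
    error_number: int
    repair_action: int
    hardware_flag: bool   # assumed: bit 7 of the low byte
    restart_code: int     # assumed: bits 6..4 of the low byte
    parameter_count: int  # assumed: bits 3..0 of the low byte


def decode_last_failure_code(code):
    """Split an 8-hex-digit last failure code into its assumed fields."""
    if len(code) != 8 or any(c not in string.hexdigits for c in code):
        raise ValueError(f"expected 8 hexadecimal digits, got {code!r}")
    value = int(code, 16)
    low = value & 0xFF
    return LastFailureCode(
        component_id=(value >> 24) & 0xFF,
        error_number=(value >> 16) & 0xFF,
        repair_action=(value >> 8) & 0xFF,
        hardware_flag=bool(low & 0x80),
        restart_code=(low >> 4) & 0x7,
        parameter_count=low & 0xF,
    )


if __name__ == "__main__":
    print(decode_last_failure_code("02370102"))
```

Knowing the parameter count up front tells you how many Last Failure Parameter fields in the report carry meaning for that code.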
  • Page 264 Last Failure Parameter[0] contains the DD address. Last Failure Parameter[1] contains the invalid status. 02560102 An invalid status was returned from CACHE$LOOKUP_LOCK(). Last Failure Parameter[0] contains the DD address. Last Failure Parameter[1] contains the invalid status.
  • Page 265 An invalid status was returned from CACHE$OFFER_WRITE_DATA(). Last Failure Parameter[0] contains the DD address. Last Failure Parameter[1] contains the invalid status. 02730100 A request was made to write a device metadata block with an invalid block type.
  • Page 266 02880100 Invalid FOC Message in cmfoc_snd_cmd. 02890100 Invalid FOC Message in cmfoc_rcv_cmd. 028A0100 Invalid return status from DIAG$CACHE_MEMORY_TEST. 028B0100 Invalid return status from DIAG$CACHE_MEMORY_TEST. 028C0100 Invalid error status given to cache_fail.
  • Page 267 02A00100 VA change state is trying to change device affinity and the cache has data for this device. 02A10100 Pubs not one when transportable. 02A20100 Pubs not one when transportable. HSZ50 Array Controller Service Manual...
  • Page 268 The FX detected a compare error for data that was identical. This error has always previously occurred due to a hardware problem. 02AE0100 The mirrorset member count and individual member states are inconsistent. Discovered during a mirrorset write or erase. Service Manual HSZ50 Array Controller...
  • Page 269 Copy_buff_on_this routine expected the given page to be marked bad and it wasn’t. 02C10100 Copy_buff_on_other routine expected the given page to be marked bad and it wasn’t. 02C60100 Mirroring transfer found CLD with writeback state OFF. HSZ50 Array Controller Service Manual...
  • Page 270 An invalid storage set type was specified for metadata initialization. 02D72390 Forced failover of devices due to a cache battery failure. This was initiated because the dual partner was operational with a good battery and there is no host failover assistance. Service Manual HSZ50 Array Controller...
  • Page 271: Table A-4 Device Services Last Failure Codes

    Last Failure Parameter[0] contains the SCSI command opcode. 03080101 Invalid SCSI OPTICAL MEMORY device opcode in misc command DWD. Last Failure Parameter[0] contains the SCSI command opcode. 030A0100 Error DWD not found in port in_proc_q. HSZ50 Array Controller Service Manual...
  • Page 272 NULL Physical Unit Block (PUB) pointer 03320101 An invalid code was passed to the error recovery thread in the error_stat field of the PCB. Last Failure Parameter[0] contains the PCB error_stat code. Service Manual HSZ50 Array Controller...
  • Page 273 Last Failure Parameter[5] contains the PCB copy of the device port DSPS register. Last Failure Parameter[6] contains the PCB copies of the device port SSTAT2/SSTAT1/SSTAT0/DSTAT registers. Last Failure Parameter[7] contains the PCB copies of the device port LCRC/RESERVED/ISTAT/DFIFO registers.
  • Page 274 DSPS register. Last Failure Parameter[6] contains the PCB copies of the device port SSTAT2/SSTAT1/SSTAT0/DSTAT registers. Last Failure Parameter[7] contains the PCB copies of the device port LCRC/RESERVED/ISTAT/DFIFO registers.
  • Page 275 DSPS register. Last Failure Parameter[6] contains the PCB copies of the device port SSTAT2/SSTAT1/SSTAT0/DSTAT registers. Last Failure Parameter[7] contains the PCB copies of the device port LCRC/RESERVED/ISTAT/DFIFO registers.
  • Page 276 Last Failure Parameter[7] contains the PCB copies of the device port LCRC/RESERVED/ISTAT/DFIFO registers. 033C0101 An invalid code was seen by the error recovery thread in the er_funct_step field of the PCB. Last Failure Parameter[0] contains the PCB er_funct_step code.
  • Page 277 DSPS register. Last Failure Parameter[6] contains the PCB copies of the device port SSTAT2/SSTAT1/SSTAT0/DSTAT registers. Last Failure Parameter[7] contains the PCB copies of the device port LCRC/RESERVED/ISTAT/DFIFO registers.
  • Page 278 Insufficient memory available for static structure allocation. 034D0100 DS init DWDs exhausted. 034E2080 Diagnostics report all device ports are broken. 03500100 Insufficient memory available for command disk allocation. 03510100 Insufficient resources available for command disk data region. Service Manual HSZ50 Array Controller...
  • Page 279: Table A-5 Fault Manager Last Failure Codes

    04030102 The USB index supplied in the EIP is larger than the maximum number of USBs. Last Failure Parameter[0] contains the instance, code value. Last Failure Parameter[1] contains the USB index value. HSZ50 Array Controller Service Manual...
  • Page 280 Last Failure Parameter[1] contains the instance, code value. 04090100 The caller of FM$CANCEL_EVENT_NOTIFICATION passed an address of an event notification routine which does not match the address of any routines for which event notification is enabled.
  • Page 281 Last Failure Parameter[0] contains the unexpected template value. 04110101 Unexpected instance, code found during fmu_memerr_report processing. Last Failure Parameter[0] contains the unexpected instance, code value. 04120101 CLIB$SDD_FAO call failed. Last Failure Parameter[0] contains the failure status code value.
  • Page 282: Table A-6 Common Library Last Failure Codes

    Table A–7 DUART services last failure codes Last Fail Code Explanation Repair Action Code 06010100 The DUART was unable to allocate enough memory to establish a connection to the CLI. Service Manual HSZ50 Array Controller...
  • Page 283: Table A-8 Failover Control Last Failure Codes

    The other controller killed this controller, but could not assert the kill line because nindy was on or in debug. It killed this controller now. 07080000 The other controller crashed, so this one must crash too. HSZ50 Array Controller Service Manual...
  • Page 284: Table A-9 Nonvolatile Parameter Memory Failover Control Last Failure Codes

    080E0101 An out-of-range receiver ID was received by the NVFOC communication utility (master send to slave send ACK). Last Failure Parameter[0] contains the bad id value. Service Manual HSZ50 Array Controller...
  • Page 285 Last Failure Parameter[0] contains the id type value that was received on the NVFOC remote work queue. 081C0101 Bad member management work received. Last Failure Parameter[0] contains the bad member management value that was detected. HSZ50 Array Controller Service Manual...
  • Page 286: Table A-10 Facility Lock Manager Last Failure Codes

    Remote FLM detected that the other controller has a facility lock manager at an incompatible revision level with this controller. Last Failure Parameter[0] contains this controller’s FLM revision. Last Failure Parameter[1] contains the other controller’s FLM revision. Service Manual HSZ50 Array Controller...
  • Page 287: Table A-11 Integrated Logging Facility Last Failure Codes

    This controller requested this controller to restart. 20090010 This controller requested this controller to shut down. 200A0000 This controller requested this controller to selftest. 200B0100 Could not get enough memory for FCBs to receive information from the other controller.
  • Page 288 This is how this controller is restarted in COPY=OTHER. 20160100 Unable to allocate resources needed for the CLI local program. 20180010 User requested this controller’s parameters to be set to initial configuration state. Service Manual HSZ50 Array Controller...
  • Page 289: Table A-13 Host Interconnect Services Last Failure Codes

    Last Failure Parameter[0] contains the S_ci_max_nodes value. 402E0101 S_max_node not set to valid value (8, 16, 32, 64, 128, 256). Last Failure Parameter[0] contains the S_ci_max_nodes value. 402F0100 Failure to allocate a HIS EIP structure. HSZ50 Array Controller Service Manual...
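The set of valid values named in code 402E0101 can be expressed as a simple membership check. This is an illustrative Python sketch only; the constant and function names are hypothetical and do not correspond to anything in the controller firmware.

```python
# Valid values taken from the 402E0101 explanation above.
VALID_MAX_NODES = (8, 16, 32, 64, 128, 256)


def is_valid_max_nodes(value):
    """Return True if value is one of the node counts listed as valid."""
    return value in VALID_MAX_NODES


if __name__ == "__main__":
    for candidate in (16, 24, 256, 512):
        print(candidate, is_valid_max_nodes(candidate))
```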
  • Page 290: Table A-14 Scsi Host Interconnect Services Last Failure Codes

    Table A–14 SCSI host interconnect services last failure codes Last Failure Explanation Repair Code Action Code 41000100 Encountered an unexpected structure type on S_shis_ctl.scsi_q. 41020100 Unable to allocate the necessary number of HTBS in shis_init(). Service Manual HSZ50 Array Controller...
  • Page 291: Table A-15 Host Interconnect Port Services Last Failure Codes

    Cannot start timer. 42030100 Cannot restart work timer. 42040100 Host port buffer allocation macro found an error allocating free buffers. The free buffer was NULLPTR. , DEBUG conditional. 42060100 HP_INIT could not allocate initial buffers. HSZ50 Array Controller Service Manual...
  • Page 292 Scan packet queue found bad path select case for DSSI. 427A6601 Host port found that the controller has exceeded the maximum number of user-specified host VCs. Last Failure Parameter[0] is a 32-bit mask of the open VCs the controller sees to host nodes.
  • Page 293: Table A-16 Disk And Tape Mscp Server Last Failure Codes

    602D0100 The VA$CHANGE_STATE service did not set the Software write protect as requested (for disk). 602E0100 The VA$CHANGE_STATE service did not set the Software write protect as requested (for tape).
  • Page 294: Table A-17 Diagnostics And Utilities Protocol Server Last Failure Codes

    This last_failure code was removed from HSOF firmware at Version 2.7. 610C0100 HIS has reported a connection event that should not be possible.
  • Page 295: Table A-18 System Communication Services Directory Last Failure Code

    DILX tried to change the USB unit state from MAINTENANCE_MODE to NORMAL but DILX never received notification of a successful state change. 80060100 DILX tried to switch the unit state from MAINTENANCE_MODE to NORMAL but was not successful.
  • Page 296: Table A-21 Tape Inline Exerciser (Tilx) Last Failure Codes

    Last Fail Code Explanation Repair Action Code 81010100 An HTB was not available to issue an I/O when it should have been 81020100 A unit could not be dropped from testing because an available cmd failed Service Manual HSZ50 Array Controller...
  • Page 297 TILX calculated an illegal position type value while trying to generate a cmd for the position intensive phase of the Basic Function test 81140100 While trying to print an Event Information Packet, TILX discovered an unsupported MSCP error log format HSZ50 Array Controller Service Manual...
  • Page 298: Table A-22 Device Configuration Utilities (Config/Cfmenu) Last Failure Codes

    CONFIG utility completed within the timeout interval 83050100 An unsupported message type or terminal request was received by the CFMENU utility code from the CLI 83060100 Not all alter_device requests from the CFMENU utility completed within the timeout interval Service Manual HSZ50 Array Controller...
  • Page 299: Table A-23 Clone Unit Utility (Clone) Last Failure Codes

    Table A–25 Code load/code patch utility (CLCP) last failure codes Last Fail Code Explanation Repair Action Code 86000020 Controller was forced to restart in order for new code load or patch to take effect. HSZ50 Array Controller Service Manual...
  • Page 300: Table A-26 Induce Controller Crash Utility (Crash) Last Failure Codes

    Table A–26 Induce controller crash utility (CRASH) last failure codes Last Fail Code Explanation Repair Action Code 88000000 Controller was forced to restart due to the execution of the CRASH utility. Service Manual HSZ50 Array Controller...
  • Page 301: Repair Action Codes

    Determine which blower failed and replace it. Replace the power supply. Replace the cable. Refer to the specific device documentation. Determine power failure cause. Restore on-disk configuration information to original state. Determine which SBB has a failed connector and replace it. HSZ50 Array Controller Service Manual...
  • Page 302 The EIP is used to notify that the repair was successful. Replace the controller module. Replace the indicated cache module, or the appropriate memory SIMMs located on the indicated cache module. Replace the indicated write cache battery. Caution: BATTERY REPLACEMENT MAY CAUSE INJURY. Service Manual HSZ50 Array Controller...
  • Page 303 If Master DRAB DSR register bit 14 is set, the failure was reported via the NMI. If Master DRAB DSR register bit 14 is clear, the failure was reported via the DRAB_INT. Follow repair action 36. HSZ50 Array Controller Service Manual...
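The bit test described in this repair action can be written out as a one-line check. The Python sketch below assumes that bit 0 is the least significant bit of the DSR register value, which this excerpt does not state; the function name is hypothetical.

```python
DSR_REPORT_BIT = 14  # bit number quoted in the repair action text


def drab_report_path(dsr_value):
    """Return how the failure was reported, based on Master DRAB DSR bit 14.

    Assumes bit 0 is the least significant bit of the register value.
    """
    if dsr_value & (1 << DSR_REPORT_BIT):
        return "NMI"
    return "DRAB_INT"


if __name__ == "__main__":
    print(drab_report_path(0x4000))
    print(drab_report_path(0x0000))
```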
  • Page 304 If Master DRAB DSR register bit 14 is set, the failure was reported via the NMI. If Master DRAB DSR register bit 14 is clear, the failure was reported via the DRAB_INT. Follow repair action 34. Service Manual HSZ50 Array Controller...
  • Page 305 5 and WDR1 register bit 30 is clear. Master DRAB CSR register bits 10 through 12 contain the value 6 and WDR1 register bit 31 is clear. If none of the above conditions were true, follow repair action 36.
  • Page 306 If Master DRAB DSR register bit 14 is set, the failure was reported via the NMI. If Master DRAB DSR register bit 14 is clear, the failure was reported via the DRAB_INT. Follow repair action 36. Service Manual HSZ50 Array Controller...
  • Page 307 For Write Data Parity Error conditions Bits 0 through 3 of the Master DRAB CSR register identify the byte in error. For Address Parity Error conditions follow repair action 34. For Write Data Parity Error conditions follow repair action HSZ50 Array Controller Service Manual...
  • Page 308 For Write Data Parity Error conditions bits 0 through 3 of the CACHEAn DRAB CSR register identify the byte in error. For Address Parity Error conditions follow repair action 34. For Write Data Parity Error conditions follow repair action Service Manual HSZ50 Array Controller...
  • Page 309 14 is clear, the failure was reported via the DRAB_INT. If bits 20 through 23 of the Master DRAB DCSR register contain a non-zero value, a firmware fault is indicated; follow repair action 01, otherwise, follow repair action 36. HSZ50 Array Controller Service Manual...
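The decision on bits 20 through 23 of the Master DRAB DCSR register can be sketched as a small field extraction. This Python example is illustrative only: it assumes bit 0 is the least significant bit, and the function name is hypothetical. It returns the repair action number as the two-digit string used in the text.

```python
def repair_action_from_dcsr(dcsr_value):
    """Pick the repair action based on Master DRAB DCSR bits 20 through 23.

    A non-zero field indicates a firmware fault (repair action 01);
    otherwise the text directs you to repair action 36.
    Assumes bit 0 is the least significant bit.
    """
    field = (dcsr_value >> 20) & 0xF  # extract the 4-bit field at bits 20..23
    return "01" if field != 0 else "36"


if __name__ == "__main__":
    print(repair_action_from_dcsr(0x00100000))
    print(repair_action_from_dcsr(0x000FFFFF))
```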
  • Page 310 Excessive VC closures are occurring. Perform repair action 61 on both sets of path cables. If the problem persists, perform repair action 63. Polling failed to complete in a timely manner. Perform repair action 61 on all path cables. Service Manual HSZ50 Array Controller...
  • Page 311 Increase the maximum number of hosts allowed value. Perform repair action 61. If the problem persists, perform repair action 20. The external cache battery cable might have been disconnected. HSZ50 Array Controller Service Manual...
  • Page 313: Glossary

    Glossary HSZ50 Array Controller Service Manual...
  • Page 314 Cable distribution unit. The power entry device for StorageWorks cabinets. The unit provides the connections necessary to distribute ac power to cabinet shelves and fans. Command line interpreter. Operator command line interface for the HS family controller firmware. Service Manual HSZ50 Array Controller...
  • Page 315 Two controllers in one controller shelf providing the ability for one controller to take over the work of the other controller in the event of a failure of the other controller. HSZ50 Array Controller Service Manual...
  • Page 316 A group of disk drives that have been removed from RAIDsets due to a failure or a manual removal. Disk drives in the failedset should be considered defective and should be tested, repaired, and then placed into the spareset. Service Manual HSZ50 Array Controller...
  • Page 317 A method of replacing a device whereby the system that contains the device remains online and active during replacement. The device being replaced is the only device that cannot perform operations during a hot swap.
  • Page 318 Data written on the physical disk that is not visible to the host/customer and that allows the HS array controller to maintain a high integrity of customer data. mirrorset: Two or more physical disks configured to present one highly reliable virtual unit to the host.
  • Page 319 A device that has been fully tested in an approved StorageWorks configuration (that is, shelf, cabinet, power supply, cabling, and so forth) and is in complete compliance with country-specific standards (for example, FCC, TUV, and so forth) and with all Digital standards.
  • Page 320 System communication services. A delivery protocol for packets of information (commands or data) to or from the host. SCSI: Small computer system interface. An ANSI interface defining the physical and electrical parameters of a parallel I/O bus used to connect initiators to a...
  • Page 321 A storage unit can be any entity that is capable of storing data, whether it is a physical device or a group of physical devices.
  • Page 322 RAIDset. unwritten cached data: Data in the write-back cache that has not yet been written to the physical device, although the user has been notified that the data has been written. VAXcluster console system.
  • Page 323 This operation may update, invalidate, or delete data from the cache memory accordingly, to ensure that the cache does not contain obsolete data. The user sees the operation as complete only after the backup storage device has been updated.
  • Page 325: Index

    Allocation class, G-2 removing, 5–5 Application error Cache modules controller generated event, 1–19 handling for ESD, 2–2 Application errors installing in an HSZ50 controller, device event, 1–11 3–48 overview, 1–11 replacing battery cells, 2–28 Array controller, G-2 CD-ROM drive, replacing, 2–45 Asynchronous swap, 2–41...
  • Page 326 EDC, G-4 DDL, G-3 Electrostatic discharge. See ESD DECevent log error example, 1–12, 1–20, 1–26 host adapter bad, 1–7 examples, 1–33 host SCSI bus bad, 1–7 Deleting HSZ bad, 1–8 cache modules, 5–5 ESD, G-4
  • Page 327 guidelines, 2–2 Installation installing a cache module, 3–48 installing a controller into a shelf, 3–41 Failedset, G-4 installing a second controller, 3– Failover, G-5 Fault Management Utility. See FMU installing power supplies into a File utility shelf, 3–53 output, 1–9 installing SBBs, 3–57...
  • Page 328 5–6 storagesets, 4–3 Removing controllers, 2–18 Repair action codes, A-91 Replacement procedures Non-redundant configuration, G-7 battery cells, 2–28 Normal member, G-7 CD−ROM drives, 2–44 NV, G-7 controllers, 2–3 drives, 2–44 power supplies, 2–39
  • Page 329 SCSI device port cables, 2–49 Storage device building blocks. See SCSI host cables, 2–47 SBBs solid state disk drives, 2–45 Storage devices storage devices, 2–42 replacing, 2–42 tape drives, 2–44 Storage unit, G-10 write-back cache battery cells, 2– Storageset members, 4–6 Reserved CDB fields, 3–8...
  • Page 330 2–44 default display, 1–60 Write hole, G-11 device SCSI port performance, 1– Write protection, 3–9 Write-back cache device SCSI status, 1–61 replacing battery cells, 2–28 device status, 1–69 Write-through cache, G-12 help, 1–71