VAXft Systems Model 810 Service Information Order Number: EK-VXFTA-SI.A01 June 1993 This manual is intended for use by trained personnel responsible for maintaining VAXft Model 810 systems. Digital Equipment Corporation...
June 1993 The information in this document is subject to change without notice and should not be construed as a commitment by Digital Equipment Corporation. Digital Equipment Corporation assumes no responsibility for any errors that may appear in this document.
Documentation Map Hardware Information (VAXft Systems) Overview Software Models Model Information Operating System Information (VAXft Systems) 110, 410, 610, 612 (VMS) (VAXft System Services) Cover Software Product Before You Configuring Configuration Letter Description Install Letter the Model 810 Guide Release Notes Site Prep and Release Notes Installation...
Cabinet and Component Descriptions 1.1 In This Chapter This chapter includes descriptions of the: • CPU and expansion cabinets • Zone control panel • Power modules • Domestic power distribution box • International power distribution box 1.2 CPU and Expansion Cabinets Figure 1–1 shows the front layout of an expanded system.
Table 1–1 Key to Figure 1–1, Cabinet Layout, Front View Item Component Description Zone A Complete computer with enough elements to run an operating system. Zone B Complete computer with enough elements to run an operating system. Fan assembly Cooling device. Disk drawer Optional SF35 disk drive(s).
Table 1–2 Key to Figure 1–2, Cabinet Layout, Rear View Item Component Description Zone A Complete computer with enough elements to run an operating system. Zone B Complete computer with enough elements to run an operating system. Fan assembly Cooling device. Blank panel Not used.
1.3 Zone Control Panel Figure 1–3 shows the layout of the zone control panel. Table 1–3 describes the functions of the zone control panel controls and indicators. Figure 1–3 Zone Control Panel MR−0514−92RAGS
Table 1–3 Key to Figure 1–3, Zone Control Panel Item Control/Indicator Function Logic Power - OFF Two switches with amber indicators. Pressing the two switches removes 48 V power and disables the zone. Pressing one switch has no effect on the operation of the zone. (CPU cabinet disk power is not affected when logic power is removed by pressing these switches.) Logic Power - ON...
1.4 Power Modules Figure 1–4 shows the location of the power module controls and indicators. Table 1–4 describes their functions. Figure 1–4 Power Module Controls and Indicators MR−0483−92RAGS
Table 1–4 Key to Figure 1–4, Power Module Controls and Indicators Item Control/Indicator Function AC Circuit Breaker FEU Failure When on, indicates the dc output voltages for the FEU are below the specified minimum. FEU OK When on, indicates the dc output voltages for the FEU are above the specified minimum.
Figure 1–5 Domestic Power Distribution Box MR-0498-92DG Table 1–5 Key to Figure 1–5, Domestic Power Distribution Box Item Component Description Three-phase power cord Connects the power distribution box to ac power. The power cord may be repositioned by moving the locking arm.
Figure 1–6 International Power Distribution Box MR-0499-92DG Table 1–6 Key to Figure 1–6, International Power Distribution Box Item Component Description Single-phase power cord Connects the power distribution box to ac power. Circuit breaker When set to on, ac power is applied to the distribution box.
Console Operations 2.1 In This Chapter This chapter describes the console, console operating modes and commands, and booting information. This chapter includes: • Console description • Console operating modes • Console control characters • Console command language syntax • Bootstrap procedures •...
Figure 2–1 System Components MR−0486−92RAGS Table 2–1 Key to Figure 2–1, System Components Number Component CPU cabinet Zone (A or B) CPU module To memory Primary NCIO module Cross-link cable Local console terminal Remote console terminal (optional)
Table 2–2 describes the function of each console component. Table 2–2 Function of the Console Components Part Function Local console terminal Terminal located with the system that is used for console input and display output. Remote console port One remote port is available in each zone. The port may be connected to a remote console terminal through a modem.
2.3.1 Entering CIO Mode The CIO mode is entered when you turn on system power if: • The Zone Halt Enable switch is pressed • A STOP/ZONE instruction is executed • A severe processor condition occurs • An external halt is detected Once entered, the console prompt >>>...
2.4 Console Control Characters The ASCII control characters and function keys listed in Table 2–3 have special meanings when typed on a console terminal.
Table 2–3 Console Control Characters and Function Keys
Character/Key   Function
Break           In CIO mode, acts like Ctrl/C. In PIO mode, causes the processor to halt and begin running the console program.
2.5 Console Command Language Syntax The console commands accept qualifiers. Qualifiers specify a numerical value or select an option from a list of options. Command elements may be abbreviated and any extra tabs or spaces are ignored. Unless otherwise noted, numerical values must be given in hexadecimal notation.
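The syntax rules above can be illustrated with a small Python sketch. It is not the console firmware's parser. The unique-prefix abbreviation rule, the function names, and the command subset are assumptions made for illustration; the manual states only that command elements may be abbreviated, that extra tabs and spaces are ignored, and that numerical values default to hexadecimal.

```python
import re

# Illustrative subset of the CIO command names listed in Section 2.8.
COMMANDS = ("BOOT", "CONTINUE", "DEPOSIT", "EXAMINE", "INITIALIZE",
            "REPEAT", "SET", "SHOW", "START", "TEST")

def expand(word, candidates=COMMANDS):
    """Expand an abbreviation to the single candidate it prefixes.

    Unique-prefix matching is an assumption, not the documented rule.
    """
    matches = [c for c in candidates if c.startswith(word.upper())]
    if len(matches) != 1:
        raise ValueError(f"unknown or ambiguous command element: {word!r}")
    return matches[0]

def parse(line):
    """Split a console line into (command, qualifiers, arguments)."""
    command, qualifiers, arguments = None, {}, []
    for token in line.split():          # split() discards extra tabs and spaces
        first, *quals = token.split("/")
        if first:
            if command is None:
                command = expand(first)
            else:
                arguments.append(first)
        for qual in quals:
            if not qual:
                continue
            # Qualifier values follow ':' or '=' (both forms appear in the manual).
            parts = re.split(r"[:=]", qual, maxsplit=1)
            qualifiers[parts[0].upper()] = parts[1] if len(parts) > 1 else None
    if command is None:
        raise ValueError("no command given")
    return command, qualifiers, arguments

def hex_value(text):
    """Interpret a numerical value as hexadecimal, the console default."""
    return int(text, 16)
```

For example, `parse("  b   /R5:10 ")` yields the command `BOOT` with qualifier `R5` set to the string `"10"`, which `hex_value` reads as 16 decimal. Qualifiers documented as decimal, such as /PASSCOUNT:n, would need `int(text, 10)` instead.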
2.6 Bootstrap Procedures The BOOT command initializes the system and then loads and starts the virtual memory bootstrap (VMB) program from read-only memory (ROM). The VMB program, in turn, loads and starts the operating system from the specified boot device. Figure 2–3 shows the steps in the boot procedure. Figure 2–3 Boot Procedure Enter BOOT command at the >>>...
2.7 Entering CIO Mode To recognize and process CIO commands: • The System Halt Enable switch on both zone control panels must be pressed • The operating software must be halted • The processor must be running the console firmware The example below shows how to use the Break key to enter CIO mode from PIO...
2.8 CIO Mode Console Commands This section describes the CIO mode console commands. The console commands are listed below with command abbreviations shown in bold capital letters. Boot HElp SHow CLEAR Initialize Start Continue Move Test Deposit MATCH_ZONES X(transfer) Repeat Examine !(comment) Find...
Table 2–6 VMB Program /R5:<flag> Values Hex Value Function Action Conversational Returns to the SYSBOOT> prompt. boot Debug Maps the XDELTA program into the system page table. Initial Operating system issues a breakpoint after breakpoint turning on memory management. Secondary boot Boots from boot block specified in /R4:n.
2.8.3 CONTINUE CONTINUE exits the CIO mode and returns operation to the PIO mode. Caution Use CONTINUE to continue from a system halt. Use START/ZONE to continue from a zone halt. The CONTINUE syntax is: CONTINUE 2.8.4 DEPOSIT DEPOSIT stores the specified data in the specified address. When the system is initialized or when any transition from a running to a halted state occurs, the defaults are physical address space 0 and data size longword.
Table 2–9 Address-Spec Symbolic Addresses Symbolic Address Description R<n> General purpose register number n, where n is a decimal number 0 to 15. Frame pointer. Argument pointer. Stack pointer. Program counter. Program status longword. A location following the last location accessed by an EXAMINE or DEPOSIT.
2.8.5 DUP DUP connects to the DSSI DUP service on a selected node. DUP is used to examine and modify the parameters of a DSSI device. DUP syntax is: DUP[/PATH:<path-number>] node-id [/TASK:task] The node-id identifies the node number (0 to 7) of a DSSI device attached to the console.
Table 2–11 Qualifiers for EXAMINE Qualifier Function Sets the data size to byte. Sets the data size to word. Sets the data size to longword. Sets the data size to quadword. Sets general purpose register address space R0 through PC. Sets internal processor register (IPR) address space accessed by the MTPR and MFPR instructions.
2.8.7 FIND FIND searches the main memory beginning at physical address space 0 for either a page-aligned 512-Kbyte segment of memory, or a restart parameter block (RPB). When FIND is successful, it saves the address plus the segment of memory (or RPB) in the stack pointer.
2.8.9 INITIALIZE INITIALIZE performs the steps shown in Table 2–14. Table 2–14 INITIALIZE Steps Step Action Do hard reset of zone (the cross-link state is set to off). Do hard reset of all available ATMs. Initialize hardware. Reconfigure the zone and update the device configuration block (DCB) to reflect the zone status.
2.8.12 REPEAT REPEAT continuously executes the specified command. REPEAT applies to the following commands only. • DEPOSIT • EXAMINE REPEAT can be aborted by pressing Ctrl/C at the console keyboard. The REPEAT syntax is: REPEAT command 2.8.13 SET SET modifies the value of the specified variable. The SET syntax is: SET variable value [value] Note...
2.8.13.1 SET BOOT SET BOOT saves the values of boot-specs. Space for nine boot-specs is available on the CPU module EEPROM. The first space is reserved for the default boot- spec. The other eight spaces are available to the user. The SET BOOT syntax is: SET BOOT DEFAULT value SET BOOT boot-spec value...
Table 2–16 (Cont.) SHOW Variables Variable Description Acceptable Values DSSI/PATH=path- Specifies the zone and number slot number of an adapter connecting to a DSSI device. The path-number format is zss, where: z is the zone ID (A or B). ss is the slot number (10 to 17, 20 to 27) of an adapter connecting to a DSSI device.
2.8.16 TEST TEST enables the user to test: • The system • A zone • The CPU and memory Use TEST only when the cross-link state is set to off. The TEST syntax is: TEST [qualifier(s)] Tables 2–17 and 2–18 describe the TEST selection and control qualifiers. Table 2–17 Qualifiers for TEST Selection Qualifier Function...
Table 2–18 Qualifiers for TEST Control
Qualifier       Function
/PASSCOUNT:n    n is a decimal number from 0 to MAXINT. When n is 0, the passcount is infinite.
/NOTRACE        Disables the test traces.
/COE            Continues on error.
/NOCONFIRM      Disables the test confirmation on destructive tests.
/EXTENDED       Enables extended error reports.
2.8.18 Z Z connects to the firmware of another module in the system. The Z syntax is: Z[/PATH=path-number] Table 2–19 describes the qualifier. Table 2–19 Qualifier for Z Qualifier Function /PATH=path-number Specifies the zone and slot number of a module. The path- number format is zss, where: z is the zone ID (A or B).
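The zss path-number format used by the /PATH qualifier (zone ID A or B, followed by a slot number from 10 to 17 or 20 to 27, as given for DSSI/PATH in Table 2–16) can be checked with a short Python sketch. The function name is hypothetical, and treating the slot digits as decimal and accepting lowercase zone letters are assumptions.

```python
def parse_path_number(text):
    """Validate a zss path-number and return (zone, slot).

    z  is the zone ID, A or B.
    ss is the slot number, 10 to 17 or 20 to 27 (read here as decimal).
    """
    text = text.strip().upper()
    if len(text) != 3:
        raise ValueError(f"path-number must have the form zss: {text!r}")
    zone, slot_text = text[0], text[1:]
    if zone not in ("A", "B"):
        raise ValueError(f"zone ID must be A or B: {zone!r}")
    if not slot_text.isdecimal():
        raise ValueError(f"slot number must be two digits: {slot_text!r}")
    slot = int(slot_text)
    if not (10 <= slot <= 17 or 20 <= slot <= 27):
        raise ValueError(f"slot number out of range: {slot}")
    return zone, slot
```

A value such as `A10` is accepted and returned as zone `A`, slot 10, while `A18` or `C10` is rejected with a `ValueError`.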
System Maintenance 3.1 In This Chapter This chapter includes: • Maintenance strategy • Operating rules and cautions • General troubleshooting procedure • Module fault LEDs • Power system overview • Power system maintenance • Device status and fault indicators • ROM-based diagnostics 3.2 Maintenance Strategy When a hardware component fails, the Model 810 system uses self-diagnosis...
3.3 Operating Rules and Cautions Table 3–1, Table 3–2, and Table 3–3 contain operating rules for use during a service call. Table 3–4 provides cautions. Table 3–1 Before Stopping a Zone Step Action Do not depend on the accuracy of a zone ID label. Issue SHOW ZONE before STOP/ZONE to check the states of both zones.
Table 3–3 Before Leaving the Site Step Action Issue SHOW DEVICE D to make sure that all disks are either shadow set members or in the process of being copied. Issue SHOW DEVICE E to make sure that all EP/EF drivers are on line. Use FTSS$FSM to show the failover set status: MCR FTSS$FSM Return...
Table 3–4 Cautions Do not press ZONE HALT ENABLE and the Break key to stop a running zone. Use STOP/ZONE. If ZONE HALT ENABLE is used, CONTINUE will not resume zone operation. Do not press the Break key or cycle power during the power on or RBD tests. This action may corrupt the EEPROM.
Table 3–5 (Cont.) General Troubleshooting Procedure Step Action If the replaced FRU did not correct the problem, open the system cabinet front door. Check all module and disk drawer fault LEDs. If any fault LED is on, replace the associated module or device. (See Chapter 5, FRU Removal and Replacement Procedures.) If no module or disk fault LED is on, open the system cabinet rear door.
Table 3–5 (Cont.) General Troubleshooting Procedure Step Action If the problem cannot be isolated and repaired, the service call should be escalated to the Customer Service Center for further action. 3.5 Module Fault LEDs Figure 3–1 shows all module fault LED locations. Table 3–6 identifies each module.
Table 3–6 Key to Figure 3–1, Module Fault LEDs Module CPU module ATM module System Fault (zone control panel) Front end unit DC3 converter DC5 converter Power system controller Console module CAMP module DSSI and Ethernet interface modules 3.6 Power System Overview The following sections describe the power distribution and power components.
Figure 3–2 Power System Block Diagram (1 of 2) UTILITY POWER INPUT 120 Vac, 60 Hz Optional Uninterruptible 240 Vac, 50 Hz Power System AC POWER OUTPUT AND DISTRIBUTION With UPS: AC Power Distributed to Power System and Expansion Cabinets Distribution Without UPS: AC Power Distributed Boxes...
Table 3–7 Power System Functional Summary Functional Summary Local Disk Converter (LDC) An LDC is located in each in-zone disk drawer. It provides +12 Vdc with fast transient response and tolerance to short-term loading during disk spinup. Also provides +5 Vdc for power logic, and EMI filtering for the 48 V bus.
Table 3–7 (Cont.) Power System Functional Summary Functional Summary DC5 H7179-AA DC to dc converter which provides +5 Vdc to the CPU, MMB, SIMMs, I/O ATM, interface and console extender modules, as well as +5 Vdc to the I/O ATM internal +5 Vdc to +3.3 Vdc converter for the SOC.
Figure 3–4 Power Module Controls and Indicators CAMP MR−0483−92RAGS Table 3–9 Key to Figure 3–4, Power Module Controls and Indicators Item Control/Indicator Function Repair Action AC Circuit Breaker FEU Failure When on, indicates the Replace the FEU. See dc output voltages for the Chapter 5.
Table 3–9 (Cont.) Key to Figure 3–4, Power Module Controls and Indicators Item Control/Indicator Function Repair Action DC3 OK When on, indicates that the output voltages are within the specified tolerances. AC Present When on, indicates ac power If ac power is present, is present at the ac input check the power source and connector, regardless of the...
Table 3–9 (Cont.) Key to Figure 3–4, Power Module Controls and Indicators Item Control/Indicator Function Repair Action Fault ID Display Displays the power subsystem fault codes. PSC Reset Button When out, indicates a PSC Press in to reset. fault condition. CAMP Fan Fault When on, indicates that a fan Replace the fan.
Table 3–11 FEU Error Codes
Error Code   Failure Error Description
E200         48V_SWITCHED OK before enabling
E201         Fan converter operating before enabling
E202         HVDC is OK, but POWER is not OK (contradictory status)
E203         The ac current is not OK (in idle state/loop)
E204         48V_DIRECT is not OK and POWER is OK (IRQ18)
E205...
Table 3–15 3 V DC to DC Converter Error Codes
Error Code   Fault Error Description
E120         Out of regulation low
E121         Out of regulation high
E122         Undervoltage
E123         Overvoltage
E124         Voltage present when disabled
E125         Did not turn off
Table 3–16 5 V DC to DC Converter Error Codes
Error Code   Fault...
Figure 3–5 RF35 Disk Drawer Controls and Indicators D1 D2 FAULT WRITE PROT LINE 0−1 ON/OFF SET UP D4 D5 FAULT WRITE PROT LINE 0−1 ON/OFF SET UP MR−0436−92RAGS Table 3–18 RF35 Disk Drawer Controls and Indicators Control/Indicator Color State Operating Condition Fault Drive is faulty.
3.8.2 SF35 Storage Array Figure 3–6 shows the operator control panel. Table 3–19 describes the functions of its controls and indicators. Figure 3–7 shows the rear of the storage array. Table 3–20 describes the functions of the controls and indicator located at the rear of the storage array. Figure 3–6 SF35 Operator Control Panel Operator Control...
Table 3–19 SF35 Operator Control Panel Description Control/Indicator Function Ready Push-to-set switch with green indicator. Brings the integrated storage element (ISE) on-line in about 10 seconds. The indicator remains on while the ISE is on-line. Write Protect Push-to-set switch with amber indicator. Write protects the data on the ISE.
Figure 3–7 SF35 Rear Panel (callouts: Fault Indicator, DSSI Connectors, Power Supply Fault Indicator (Behind Panel), Line Voltage Selector Switch (Behind Panel), AC Power Switch)
3.8.3 SF73 Storage Array Figure 3–8 shows the SF73 storage array status and fault indicators. Table 3–21 describes their functions. Figure 3–9 shows the controls and indicator located at the rear of the storage array. Figure 3–8 Location of SF73 Storage Array LEDs and Switchpacks Write DSSI...
Figure 3–9 Rear of the SF73 Storage Array MR-0422-92DG (callouts: DSSI Connectors, Power Supply Fault Indicator (Behind Panel), Line Voltage Selector Switch (Behind Panel), AC Power Switch)
3.8.4 TF85C Tape Drive Table 3–22 may help you define and correct TF85C tape drive problems.
Table 3–22 TF85C Tape Drive Problems
Problem                                Possible Solution
Correctable failure during operation   If the TF85C drive fails during operation, reset the drive, then rewind, unload, and remove the cartridge.
Table 3–23 TF85C Cartridge Tape Drive Indicators Indicator Color State Operating Condition Write Protected Orange Tape is write-protected. Tape is write-enabled. Tape in Use Yellow Blinking Tape is moving. Tape is loaded; ready for use. Use Cleaning Orange Drive head needs cleaning or tape is bad. Tape If it remains on after Then the cleaning was not completed because the...
Figure 3–11 TF857 Operator Control Panel Operator Control Panel Eject Button Load/Unload Indicator Mode Select Key Area Slot Select Disabled Automatic OCP Label...
Table 3–24 (Cont.) TF857 OCP Controls and Indicators Control/Indicator Color Function Load/Unload button – Loads the currently selected cartridge into the drive, or unloads the cartridge from the drive to the magazine. If the Loader Fault or Magazine Fault indicators are on, can also be used to reset the subsystem.
3.9.1 TEST TEST enables the user to test: • The system • A zone • The CPU and memory Use TEST only when the cross-link state is set to off. The TEST syntax is: TEST [qualifier(s)] Tables 3–25 and 3–26 describe the TEST selection and control qualifiers. Table 3–25 Qualifiers for TEST Selection Qualifier Description...
3.9.2 Z Z connects to the firmware of another module in the system. It is also used to initiate I/O ROM-based diagnostics. The Z syntax is: Z[/PATH=path-number] Table 3–27 describes the qualifier. Table 3–27 Qualifier for Z Qualifier Function /PATH=path-number Specifies the zone and slot number of a module.
Table 3–28 (Cont.) CPU ROM-Based Diagnostic Descriptions Group Test Subtest Description G: 0 T: 1 S: 0 P-CACHE Register Bit Test G: 0 T: 1 S: 1 P-CACHE Tag Integrity Test G: 0 T: 1 S: 2 P-CACHE Data Integrity Test G: 0 T: 1 S: 3...
Table 3–28 (Cont.) CPU ROM-Based Diagnostic Descriptions
Group Test Subtest   Description
G: 0 T: 7 S: 5       DMA Sub-Transfer Length Test
G: 0 T: 7 S: 6       DMA I/O Byte Alignment Test
G: 0 T: 7 S: 7       DMA Memory Byte Alignment Test
G: 0 T: 7 S: 8...
Error Handling and Analysis 4.1 In This Chapter This chapter includes: • Error handling services overview • Field replaceable units • OpenVMS error log • Module NVRAM status and LED indicators • FTSS error reporting interface • Firmware interfaces • Firmware and OpenVMS interface data structures •...
EHS error notification is described in Table 4–1. Table 4–1 EHS Error Notification Step Action Entries are made into the system error log. Status information is written to the module ID NVRAM and the DCB, where applicable. The LED indicator associated with a failed module is set. A call is issued to the error reporting interface (ERI) which reports the event to the FTSS$SERVER.
Table 4–2 Error Handling Flowchart Definitions Event Definition Hardware reports error through a high-level interrupt and control is transferred to the EHS. The EHS examines system registers to determine the type of failure which has occurred. The EHS identifies the FRU that is the source of the error. FRU isolation is generally accomplished at the module level.
Figure 4–2 EHS Architectural Position Error Handling Services Functions System Utilities Error Reporting Interface Error Event Notification System Error Log IZC Routines Serial Interrupts Remote Zone Zone Available Interface Serial Transmit/Receive Resets VAXELN and Diagnostics Firmware Interface Status Console and Diagnostics Registers Hardware Interface System Hardware...
Table 4–3 (Cont.) System Operating Modes Mode Definition Duplex The memories in both zones are identical and both CPUs are running in lockstep. The I/O subsystems of both zones are available and in use. The cross-link state in both zones is Duplex. The system can be booted in this mode, or can transition to this mode as the result of the synchronization process from either Simplex or Degraded Duplex modes.
Table 4–4 (Cont.) Error Types Error Type Definition Double-Bit Hardware reports a double-bit error (DBE) when the ECC checkers detect memory this condition on a read from a main memory location. This read can occur errors during a DMA or CPU cycle, with two possible error causes: a memory failure or a programming error.
Table 4–4 (Cont.) Error Types Error Type Definition Single-Bit Single-Bit Errors (SBEs) can be detected by either the JXD during a DMA memory read cycle which reads from main memory or the CPU during a memory errors read. Software action varies depending upon the system operating mode and where the error detection occurs.
Table 4–4 (Cont.) Error Types Error Type Definition Power If a zone loses power in a non-Simplex configuration, hardware generates failures an interrupt to report the event to the EHS. In a non-Duplex mode, software will detect this error only when the slave zone loses power. In this case, the slave zone is removed from the configuration and the system continues to run in Simplex mode.
Table 4–4 (Cont.) Error Types Error Type Definition Halt errors A halt error occurs when the system is operating in Duplex mode, the Zone Halt Enable switch on the zone control panel is pressed, and the Break key is pressed on one of the system consoles, or one zone experiences errors on its halt lines.
Table 4–4 (Cont.) Error Types Error Type Definition I/O errors The ATM module contains a series of checkers that verify consistency between the dual rails of the system during I/O accesses. When discrepancies are detected, the hardware generates an interrupt, invoking the EHS.
Table 4–5 describes the VAXELN error classes and the actions taken by the EHS. Table 4–5 VAXELN Error Classes Error Class Description EHS Actions VAXELN Kernel This error is reported when the The FRU is the I/O expansion Fatal VAXELN kernel detects a fatal module.
4.3 Field Replaceable Units (FRUs) After analyzing error information and determining the error type, the EHS isolates the source of the error to a FRU. If the error is solid, the system is deconfigured to remove the FRU from service. If the error is transient, it is compared against a threshold for the error type and FRU.
4.3.2 Deconfiguration This section describes the actions taken by the EHS when a FRU is identified as the source of a solid error or transient errors which exceed the FRU threshold. A table is provided for each FRU that describes the actions taken by the EHS when the FRU is deconfigured.
4.3.2.2 CPU Module and Memory When memory is deconfigured from the system, it is done by removing the CPU module on which the memory resides. Table 4–8 describes the OpenVMS operating system actions taken when a CPU module or memory is identified as the FRU and is deconfigured by the EHS. These actions are identical for CPU and memory failures.
Table 4–9 I/O Expansion Module Deconfiguration Actions Action Taken Description I/O hard reset The I/O expansion module which is being deconfigured is reset through the cross-link I/O hard reset register. Set I/O expansion The module I2C bus is used to turn on the LED for the failed module LED module.
4.3.2.5 Zone Table 4–11 describes the OpenVMS operating system actions taken when an entire zone is identified as the FRU and is deconfigured by the EHS. Note that some actions are dependent on the system operating mode. Table 4–11 Zone Deconfiguration Actions Action Taken Description Comments...
4.3.3 Application of Thresholds Application of thresholds by the EHS is rate based. An FRU exceeds its threshold when it accumulates a certain number of a given error type in a specified time period. Table 4–13 lists the thresholds associated with each FRU and error type. In most cases, more than one type of error can result in the isolation of an FRU.
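The rate-based rule described above can be sketched in Python as a sliding-window counter kept per FRU and error type. This is an illustration only, not the EHS implementation. The class name, the use of seconds as the time unit, and the choice to treat reaching the limit within the period as exceeding the threshold are all assumptions; actual limits and periods are those listed in Table 4–13.

```python
from collections import deque

class RateThreshold:
    """Sliding-window error counter for one FRU and one error type."""

    def __init__(self, limit, period):
        self.limit = limit        # number of errors that trips the threshold
        self.period = period      # length of the time window (assumed seconds)
        self._times = deque()     # timestamps of errors still inside the window

    def record(self, timestamp):
        """Record one transient error; return True if the threshold is exceeded."""
        self._times.append(timestamp)
        # Discard errors that are older than the time period.
        while timestamp - self._times[0] > self.period:
            self._times.popleft()
        return len(self._times) >= self.limit
```

With a hypothetical limit of 3 errors in 60 seconds, errors at t = 0, 10, and 20 trip the threshold on the third call, whereas errors at t = 0, 50, and 100 do not, because the first error has aged out of the window by the time the third arrives.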
Table 4–13 (Cont.) FRU Thresholds Error Error Time Type Limit Period Comments I/O Expansion Module Transient When the threshold is exceeded, the module is NXIO errors deconfigured except in Simplex system. Transient When the threshold is exceeded, the module is I/O errors deconfigured except in Simplex system.
4.4 OpenVMS Error Log The EHS makes entries in the system error log for all system error interrupts. Figure 4–3 shows the format of the error log. With the exception of the Fault Data block, all blocks have fixed length. Figure 4–3 OpenVMS Error Log Format Number of Longwords Fault Summary...
4.4.1 Fault Summary The Fault Summary block contains the fault ID, fault flags describing the nature of the fault, the cross-link mode at the time the fault occurred, and the cross-link mode after the error handling was completed. All fields in this block are valid for all error entries.
Table 4–15 (Cont.) Fault Summary Block Entry Descriptions Entry Contents 23 - Power gone end action (reserved for future use) 24 - Clock error end action 25 - Other zone halted end action (reserved for future use) 26 - Resynch abort error end action (reserved for future use) 27 - CPU-detected single-bit error end action 28 - JXD-detected single-bit error end action (reserved for future use)
Table 4–15 (Cont.) Fault Summary Block Entry Descriptions
Entry Contents
[07:04] - Not used
XLINK_MODE_ERROR Cross-link mode at the time of error. The following values are defined:
0 - Off (Simplex)
1 - Slave
2 - Master
3 - Duplex
4 - Not used
5 - RESYNCH_SLAVE
6 - RESYNCH_MASTER...
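When reading error log entries by hand, the XLINK_MODE_ERROR values above can be decoded with a small lookup. The following Python sketch covers only the values visible in this table (0 through 6); the enumeration and function names are hypothetical, and any further values defined in the full table are not represented.

```python
from enum import IntEnum

class XlinkMode(IntEnum):
    """Cross-link mode values listed for XLINK_MODE_ERROR (4 is not used)."""
    OFF = 0              # Off (Simplex)
    SLAVE = 1
    MASTER = 2
    DUPLEX = 3
    RESYNCH_SLAVE = 5
    RESYNCH_MASTER = 6

def describe_xlink_mode(value):
    """Return the mode name, or a marker for values not listed above."""
    try:
        return XlinkMode(value).name
    except ValueError:
        return f"UNDEFINED({value})"
```

For instance, a field value of 3 decodes to `DUPLEX`, and the unused value 4 is reported as `UNDEFINED(4)` rather than raising an error.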
Table 4–16 FRU Information Block Entry Descriptions Entry Contents FRU_TYPE The following bits are defined: 01 - The FRU is a module in Zone A (FRU_DATA has slot ID) 02 - The FRU is a module in Zone B (FRU_DATA has slot ID) 03 - Zone A is the FRU 04 - Zone B is the FRU 05 - The cross-link cable is the FRU...
4.4.3 Deconfiguration Information This error log block contains information about any system deconfiguration performed by the EHS. Figure 4–6 identifies each entry in the block and the offset from the start of the block. Table 4–17 describes the content of each entry. Note For errors which require no system deconfiguration, only the FT_FLAGS fields will be filled in.
Table 4–17 (Cont.) Deconfiguration Information Block Entry Descriptions Entry Contents DECONFIG_MODULES This field shows the Zone A modules removed from service as a result of error handling. For example, if the source of a solid or excessive transient error were an I/O expansion module, all attached interface modules have been removed from service.
4.4.4 Threshold Information When the Transient Error flag is set in the FAULT_FLAGS field of the Fault Summary block, the isolated FRU error is compared to its error rate threshold. When the threshold is exceeded, the FRU is removed from the system. In addition, the Excessive Transient Errors flag is set in the FAULT_FLAGS field.
4.4.5 Fault Data The Fault Data block has a variable length specific to the class of the fault which occurred. The error class can be determined by the high-order four bits of the FAULT_ID field in the Fault Summary block (see Table 4–15). The six Fault Data types based on these fault classes are shown in Figure 4–8 and described in the following subsections.
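Extracting the error class from FAULT_ID amounts to a right shift that keeps the high-order four bits. The Python sketch below shows the arithmetic. The 32-bit (longword) field width is an assumption, since the width of FAULT_ID is not stated in this section, so it is exposed as a parameter; the function name is hypothetical.

```python
def fault_class(fault_id, width_bits=32):
    """Return the high-order four bits of FAULT_ID as an integer (0 to 15).

    width_bits is the assumed width of the FAULT_ID field.
    """
    if not 0 <= fault_id < (1 << width_bits):
        raise ValueError("fault_id does not fit in the assumed field width")
    return fault_id >> (width_bits - 4)
```

Under the longword assumption, a FAULT_ID of 30000012 (hexadecimal) belongs to class 3. The mapping from class numbers to the six Fault Data types is given by Table 4–15 and is not reproduced here.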
Table 4–19 System Register Entry Descriptions Entry Content Offset SYSFLT JXD System Fault Register SYSADR JXD System Error Address Register DMAADR DMA Error Address Register DMA_IO_ADDR DMA Engine I/O Error Address Register JCSR_A JXD Control and Status Register - Zone A JCSR_B JXD Control and Status Register - Zone B JDIAG_P_A...
Figure 4–9 shows the format of this Fault Data block entry and its offset. Table 4–21 contains a brief description of the entry. Figure 4–9 End Action Timeout Block TIMEOUT_INT (Timeout Interval) MR−0013−93RAGS Table 4–21 End Action Timeout Block Entry Description Entry Content Offset...
Table 4–22 (Cont.) VAXELN Detected Error Block Entry Descriptions Entry Contents 20080 DAL parity error. Read error - normal read 30080 Cache parity error. Read error - normal read 40080 Uncorrectable read data error. Read error - normal read 50080 DMA error.
Table 4–22 (Cont.) VAXELN Detected Error Block Entry Descriptions Entry Contents Normal successful completion 7C04 Bad parameter count 7C0C Bad job or process creation 7C14 Bad string parameter length 7C1C Bad access mode 7C24 Bad stack 7C2C Bad object state 7C34 Bad object type 7C3C...
Table 4–22 (Cont.) VAXELN Detected Error Block Entry Descriptions Entry Contents 7D2C No virtual address space available 7D34 Power recovery signal 7D3C Quit signal 7D44 Remote port value 7D4C Process exit signal 7D54 Remote system currently unreachable 7D5C Interprocess signal 7D64 Remote system rejected username or password 7D6C...
Table 4–23 (Cont.) Software Detected Error Block Entry Descriptions Entry Contents Fatal memory error has occurred Single-bit error has occurred User command issued to stop a zone Unexpected machine check has occurred Software detected failure has occurred Solid NXIO error has occurred Excessive transient I/O expansion module errors have occurred A solid I/O error has occurred Excessive transient I/O errors have occurred...
Figure 4–12 shows the format of this Fault Data block and the offset of each field from the start of the block. Table 4–24 contains a brief description of each entry. Figure 4–12 Unsynchable Event Block COMPAT_STS (Test Status) DIAG_STS (Diagnostic Status) MR−0008−93RAGS Table 4–24 Unsynchable Event Block Entry Descriptions Description...
Table 4–24 (Cont.) Unsynchable Event Block Entry Descriptions Description CPU memory configuration mismatches with other zone Cables (cross-link/resynchronization) CPU is in burn-in mode Ethernet EEPROM mismatches with other zone CPU console firmware cannot be run in Duplex [31:28] Not used DIAG_STS System diagnostic status longword.
such cases, diagnostics on the remote zone are relied on to report the failure. Table 4–25 Module ID NVRAM/DCB Status Codes Status Code Description Affected Modules The threshold for CPU/MEM faults for CPU module this module has been exceeded. The threshold for resynch abort errors CPU module for this module has been exceeded.
4.6 FTSS Event Reporting Interface The EHS externalizes events by reporting them to the event reporting interface (ERI). The ERI, in turn, passes notification of the event to the FTSS$SERVER process. The server reports the event in one of three ways: 1.
FTSS$_CLOCK_ENDTMO, Clock fault end action timeout on zone [zone_id] Facility: FTSS Explanation: When a clock fault occurs in a non-Simplex system, diagnostics normally run on the failed zone and, upon completion, report status back to the zone running the operating system. If this end action does not occur within a reasonable timeout period, the failure will be treated as solid and the zone will not be automatically resynchronized by FTSS.
FTSS$_CPUDBE, Double-bit memory fault detected on [module_id] in slot [slot_id], zone [zone_id]
Facility: FTSS
Explanation: A double-bit memory error has occurred. This indicates a solid memory failure. This error will only be reported in a Duplex system and a CPU module will be removed from service when it occurs.
FTSS$_DBE_END, DBE end action complete
Facility: FTSS
Explanation: Error processing for a double-bit memory error has been completed and the CPU is available to be resynchronized.
User Action: The system error log should be examined for entries which correspond to the double-bit error. These error logs will identify an FRU.

FTSS$_DBE_ENDTMO, DBE end action timed out on zone [zone_id]
Facility: FTSS
Explanation: When double-bit memory errors occur in a Duplex system,...
FTSS$_ELNJOBFATAL, VAXELN job fatal error detected on [module_id] in slot [slot_id], zone [zone_id]
Facility: FTSS
Explanation: A VAXELN job running on an I/O Expansion module has detected a fatal error and has terminated. This error results in the removal of the associated Interface module from the system.
User Action: The system error log should be examined for entries which correspond to the VAXELN job fatal error.
FTSS$_ELNMASFATAL, VAXELN master job fatal error detected on [module_id] in slot [slot_id], zone [zone_id]
Facility: FTSS
Explanation: The VAXELN master job running on an I/O Expansion module has detected a fatal error and has terminated. This error results in the removal of the indicated I/O Expansion module and associated Interface modules from the system configuration.
FTSS$_POWERGONE, Power gone fault detected on zone [zone_id]
Facility: FTSS
Explanation: Power has been lost in one of the zones. This error is compared to its error rate threshold. If the threshold is not exceeded, the zone will be automatically resynchronized when power returns.
User Action: If power is restored and the zone is automatically resynchronized, no action is required on the part of the user.
FTSS$_SOLIDNXIO, Solid NXIO fault detected on [module_type] in slot [slot_id], zone [zone_id]
Facility: FTSS
Explanation: A fatal nonexistent I/O error has occurred when accessing the indicated I/O module. The module is removed from service by the operating system.
User Action: The system error log should be examined for entries which correspond to the nonexistent I/O error.
FTSS$_TRNSIOMOD, Transient I/O fault detected on [module_type] in slot [slot_id], zone [zone_id]
Facility: FTSS
Explanation: A transient I/O miscompare error was detected and attributed to the indicated module. These errors are compared to their error rate threshold. If the threshold is exceeded and the system mode is not Simplex, the module is removed from service.
FTSS$_ZONEHALT, Zone Halt fault detected on zone [zone_id]
Facility: FTSS
Explanation: A single zone of a Duplex system has been halted. This can be caused by a user command on the system console or by a system error.
User Action: If the Halt was caused by a user command on the system console, a START/ZONE command must be executed to restore the zone to service.
FTSS$_DECONFIG_EXMOD, I/O expansion module in slot [slot_id], zone [zone_id] has been removed from service
Facility: FTSS
Explanation: Due to one or more system errors, the indicated I/O Expansion module and its associated Interface modules have been removed from service.
User Action: The system error log should be examined for entries which correspond to the removal of the I/O expansion module.
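Several of the messages above (for example FTSS$_POWERGONE and FTSS$_TRNSIOMOD) say that an error is compared to its error rate threshold, and that a module is removed only when the threshold is exceeded and the system mode is not Simplex. The following Python sketch illustrates that decision only. The sliding-window counting scheme, the class and function names, and the numeric limits are all hypothetical; this manual does not state how the EHS actually counts errors or what the real threshold values are.

```python
from collections import deque


class ErrorRateThreshold:
    """Hypothetical sliding-window error counter (not the actual EHS algorithm)."""

    def __init__(self, max_errors: int, window_seconds: float):
        self.max_errors = max_errors          # assumed limit, illustrative only
        self.window_seconds = window_seconds  # assumed window, illustrative only
        self._events = deque()

    def record(self, timestamp: float) -> bool:
        """Record one transient error; return True if the threshold is exceeded."""
        self._events.append(timestamp)
        # Discard errors that have aged out of the window.
        while self._events and timestamp - self._events[0] > self.window_seconds:
            self._events.popleft()
        return len(self._events) > self.max_errors


def should_remove_module(threshold_exceeded: bool, system_mode: str) -> bool:
    """Apply the rule quoted in the FTSS$_TRNSIOMOD explanation."""
    return threshold_exceeded and system_mode.strip().lower() != "simplex"


if __name__ == "__main__":
    counter = ErrorRateThreshold(max_errors=2, window_seconds=60.0)
    for t in (0.0, 10.0, 20.0):
        exceeded = counter.record(t)
        print(t, exceeded, should_remove_module(exceeded, "Duplex"))
```

The point of separating the two functions is that the counting policy can change without touching the Simplex rule, which is the only part the message text actually specifies.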
4.7.1.1 System Resets When the EHS determines that a zone or CPU should be removed from the configuration, it forces a reset on the CPU. The reset results in the system console being invoked from serial ROM by the hardware. When system console runs, it attempts to determine the reason for the reset, which in turn may determine the actions performed by the console.
Table 4–27 System Reset Reason Codes Decimal Value Description When the EHS detects zone divergence, it selects one zone to continue the OpenVMS operating system and one zone to stop. Note that the OpenVMS operating system is not indicating an error in this zone; it must stop one of the two.
4.7.1.2 CCA Fields When a CPU or zone completes diagnostics, it enters its halt loop, which reports its status to the OpenVMS operating system in the other zone through the IZC service. The IZC service will in turn call the OpenVMS operating system to report the availability of the other zone.
Table 4–29 I/O Reset Action Code Description Decimal Value Description This reset code will cause the I/O expansion module console to invoke diagnostics. The diagnostics which run depend upon the mode of the cross-link at the time. After diagnostics, console will enter its halt loop. Table 4–30 I/O Reset Reason Code Descriptions Decimal Value...
4.8.1 Console Communications Area The console communications area (CCA) is the main data structure used by the console to interface with the OpenVMS operating system. Table 4–31 describes the CCA components. Table 4–31 CCA Component Descriptions Parameter Size Description CCA size 2 bytes Size of the CCA in bytes.
Table 4–31 (Cont.) CCA Component Descriptions Parameter Size Description Bootability 4 bytes Results of the bootstrap test. Written by the firmware. Field test results breakdown by bit: • 00 = CPU/ATM check. Set when the CPU and ATM are good. •...
Table 4–31 (Cont.) CCA Component Descriptions
Parameter / Size / Description
Diagnostic status, 8 bytes: Results of the diagnostic tests. Initialized by firmware. Breakdown of the status fields:
• [07:00] = Error number
• [15:08] = Subtest number
• [23:16] = Test number
•...
• CPU module ID EEPROM: Valid checksum OpenVMS and firmware status byte is good Module ID and module name compatible with other zone Module hardware revision compatible with other zone (major) Firmware and software revisions compatible with other zone (major) •...
Table 4–32 (Cont.) Duplex Compatibility Test Failure Codes Failure Code Bit Number Code Description CPU ID EEPROM software revision (major) mismatches between zones ATM ID EEPROM is bad ATM ID EEPROM OpenVMS status field shows module is bad ATM ID EEPROM firmware status field shows module is bad ATM ID EEPROM module type field mismatches between zones ATM ID EEPROM module name field mismatches between zones ATM ID EEPROM hardware revision (major) mismatches between...
Table 4–33 Dispatch Block Components
Block Content / Offset / Description
Dispatch reason code, 4 bytes, Base + 00h: Code identifying reset reason. Bytes 03:02 identify the reason for the reset. Bytes 01:00 identify the end action to be taken by the console as specified below:
•...
Table 4–35 BPB Entry Components Component Length Description Unit number 2 bytes Device unit number. Valid numbers are in the 0 to 999 (decimal) range. Device 2 bytes Device name in ASCII (that is, EP and DI). Path identifier 1 byte Path to device.
Table 4–37 (Cont.) DCB Entry Components
Component / Length / Description
Status summary, 1 byte: Module status summary. This field is a summary of the OpenVMS and firmware status fields. The field should be updated whenever OpenVMS or firmware status fields are updated.
Table 4–37 (Cont.) DCB Entry Components
Component / Length / Description
Ethernet address, 32 bytes: Module Ethernet address. Follows the DEC STD format. Valid only for CPU module and LANCE adapter card. Copied from the Ethernet EEPROM by firmware for the CPU. Copied from the LANCE ROM for the LANCE adapter card.
Figure 4–15 SubDCB Links to DCB SubDCB for DCB Entry 1 Number of Entries DCB Entry 1 DCB Entry 2 CCA Base + Offset DCB Entry n−1 Zone A DCB Offset DCB Entry n Zone B DCB Offset Zone A DCB Number of Entries DCB Base + Offset...
Table 4–38 CPU SubDCB Components
Component / Length / Description
Number of entries, 4 bytes: Number of entries in the SubDCB. Initialized by firmware. Is 0 if no entries are present.
SubDCB entries, 16 bytes per entry: An entry describes an MMB found by the firmware. Initialized by firmware.
4.9.2 CPU/MEM Fault End Action Error Log Entry V A X / V M S SYSTEM ERROR REPORT COMPILED 3-FEB-1993 09:33:46 PAGE 56. ******************************* ENTRY 701. ******************************* ERROR SEQUENCE 1048. LOGGED ON: SID 17000002 DATE/TIME 2-FEB-1993 18:16:21.40 SYS_TYPE 02010101 SYSTEM UPTIME: 0 DAYS 01:48:21 SCS NODE: SIXSHL VAX/VMS T5.5-D34 INT60 ERROR KA560 CPU FW REV# 2.
V A X / V M S SYSTEM ERROR REPORT COMPILED 3-FEB-1993 09:33:46 PAGE 57. Fault Data Block END ACTION SYSFLT 30020020 I/O error, zone B CPU/memory fault, zone B XLINK MODE = Duplex SYSADR 61200034 SYSADR = 61200034(X) CNTRL/STAT REG 00000008 System errors enabled DIAG_P REG CAC08000...
V A X / V M S SYSTEM ERROR REPORT COMPILED 3-FEB-1993 09:33:46 PAGE 58. BIU CTL DFE0DEF9 Generate/Expect ECC on check_h pins output enable of cache rams direct mapped 2X CPU Cycle IO Map = 1(X) 512 Kbytes BC TAG 07913800 tag_match tag control V...
4.9.3 CPU or Zone Unsynchable Error Log Entry V A X / V M S SYSTEM ERROR REPORT COMPILED 3-FEB-1993 09:33:46 PAGE 56. ******************************* ENTRY 743. ******************************* ERROR SEQUENCE 1099. LOGGED ON: SID 17000002 DATE/TIME 2-FEB-1993 18:16:21.40 SYS_TYPE 02010101 SYSTEM UPTIME: 0 DAYS 01:48:21 SCS NODE: SIXSHL VAX/VMS T5.5-D34 INT60 ERROR KA560 CPU FW REV# 2.
V A X / V M S SYSTEM ERROR REPORT COMPILED 3-FEB-1993 09:33:46 PAGE 57. CPU or ZONE UNSYNCHABLE EVENTS COMPAR/STAT REG 02000000 CPU is in burnin mode DIAG STATUS REG FFFFFFFF Diagnostic status is valid DIAG ERR NUM DIAG ERR NUM = 255 DIAG SUBTEST NUM DIAG SUBTEST NUM = 255 DIAG TEST NUM...
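The error log entry above prints DIAG ERR NUM and DIAG SUBTEST NUM values that are derived from the DIAG STATUS REG longword, and Table 4–31 describes the diagnostic status as byte-wide fields. The following Python sketch shows how the low three bytes might be split out. It assumes the error number is in bits [07:00], the subtest number in bits [15:08], and the test number in bits [23:16], and that the same layout applies to the longword shown in the log; the function name is illustrative and the meaning of the remaining bits is not covered here.

```python
def decode_diag_status(diag_sts: int) -> dict:
    """Split the low 24 bits of a diagnostic status longword into fields.

    Assumed layout (see lead-in): [07:00] error, [15:08] subtest, [23:16] test.
    """
    return {
        "error_number": diag_sts & 0xFF,            # bits [07:00]
        "subtest_number": (diag_sts >> 8) & 0xFF,   # bits [15:08]
        "test_number": (diag_sts >> 16) & 0xFF,     # bits [23:16], assumed
    }


if __name__ == "__main__":
    # Value taken from the DIAG STATUS REG line in the error log entry above.
    for name, value in decode_diag_status(0xFFFFFFFF).items():
        print(f"{name} = {value}")
```

For the value FFFFFFFF each field decodes to 255, which agrees with the DIAG ERR NUM and DIAG SUBTEST NUM lines in the log.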
FRU Removal and Replacement Procedures 5.1 In This Chapter This chapter includes: • Field replaceable unit list • Before you begin • FRU removal and replacement 5.2 Field Replaceable Unit List A complete list of field replaceable units (FRUs) is given in Table 5–1. Table 5–1 Model 810 FRUs Part Number Modules:...
Table 5–1 (Cont.) Model 810 FRUs Part Number Control and miscellaneous power module (CAMP) 54-21073-01 Options: Ethernet interface module (EIM) 54-21081-01 DSSI extender module 54-21063-01 DSSI interface module (DIM) 54-21065-01 DSSI disk drawer assembly 70-30569-01 Storage: 18.2 Gbyte magazine tape subsystem TF857-AA/AB 2.6 Gbyte cartridge tape drive TF85C-BA...
5.3 Before You Begin
Warning
Hazardous voltages exist within the system. Bodily injury or equipment damage can result when service procedures are performed incorrectly.
Note
FRUs should be handled only by qualified maintenance personnel.
You do not need to shut down the entire system to remove and replace a FRU. You can shut down the zone that houses the faulty FRU while the other zone continues to operate.
5.3.1 Handling FRUs Static electricity can damage FRUs. When you handle FRUs, follow the rules in Table 5–2. Table 5–2 Handling FRUs Rule Action Wear an electrostatic discharge (ESD) wrist strap. When possible, use a grounded ESD workmat. Attach both the wrist strap and the workmat to the system chassis. Before you remove the FRU from the antistatic box, be sure you ground the box to the system chassis.
Example 5–1 How to Shut Down a Zone
$ SHOW ZONE                      ! Displays the status of each zone.
Zone A is ACTIVE                 ! Zone A is running.
Zone B is PROVIDING I/O ONLY     ! Zone B has a faulty component.
$ STOP/ZONE B                    ! Stops zone B....
Figure 5–1 Latches (callouts: Latch Location; Front View and Rear View of the CPU cabinet and expander cabinet) MR-0457-92DG
5.4 FRU Removal and Replacement
The following sections contain FRU removal and replacement procedures.
Caution
Service procedures may be performed only by qualified personnel....
5.4.1 CPU and ATM Modules You use the same steps to remove the CPU and ATM modules. Figure 5–2 shows the locations of the modules. Table 5–3 describes the removal procedure. Figure 5–2 CPU Module and ATM Module Locations Captive Screws Module Release...
5.4.2 SIMMs Figure 5–3 shows the locations of the SIMMs. Table 5–4 describes the removal procedure. Note SIMMs are configured on the MMBs in rows, with a pair of SIMMs (two) in each row. You always replace a pair of SIMMs (a two-SIMM row). Figure 5–3 SIMM Locations Retaining Clip...
Table 5–5 (Cont.) MMB Removal Procedure Step Action Remove the three screws that secure each of the mounting brackets on the MMB. Note the configuration of the SIMMs on the MMB. They must be removed from the faulty MMB and installed in the same locations on the replacement MMB. Remove the SIMMs from the MMB using the procedure in Table 5–4.
Table 5–6 Fan and FCSB Removal Procedure Step Action Ask the operator or system manager to shut down the zone using the procedure in Section 5.3.2. Open the rear door of the cabinet. Set the FEU circuit breaker to the off position. Open the front door of the cabinet.
5.4.5 RF35 Disk Drive Removal and Replacement Figure 5–7 shows an RF35 disk drive in the DSSI disk drawer. Table 5–7 describes the RF35 disk drive removal procedure. Figure 5–7 RF35 Disk Drive Location Release Lever Bracket Phillips Screws (6) Captive Screws (4) Release...
Table 5–7 RF35 Disk Drive Removal Procedure Step Action Ask the operator or system manager to shut down the zone using the procedure in Section 5.3.2. Open the front door of the cabinet. Turn off the RF35 disk drive. Loosen the four screws that secure the DSSI disk drive rack in the CPU cabinet.
5.4.6 DSSI Disk Drawer Figure 5–7 shows the components in the DSSI disk drawer. Table 5–8 describes the DSSI disk drawer removal procedure. Table 5–8 DSSI Disk Drawer Removal Procedure Step Action Ask the operator or system manager to dismount the drive. Open the rear door of the cabinet.
Figure 5–8 Zone Control Panel Captive Screws Zone Control Panel Bracket Signal Cable Controller Module Handle Phillips Screws (6) Captive Screws MR−0023−93RAGS Table 5–9 Zone Control Panel Removal Procedure Step Action Ask the operator or system manager to shut down the zone using the procedure in Section 5.3.2.
5.4.8 FEU, 3.3V Regulator, 5V Regulator, PSC Modules You use the same steps to remove these four FRUs. Figure 5–9 shows the locations of the modules. Table 5–10 describes the removal procedure. Figure 5–9 FEU, 3.3V Regulator, 5V Regulator, and PSC Locations +3.3V Regulator +5V Regulator Rear...
Caution Removing/replacing these four modules without shutting down 48V_DRCT may cause damage to the power components. Table 5–10 FEU, 3.3V Regulator, 5V Regulator, and PSC Removal Procedure Step Action Ask the operator or system manager to shut down the zone using the procedure in Section 5.3.2.
5.4.9 Cross-Link Assembly Figure 5–10 shows the location of the cross-link assembly. Table 5–11 describes the removal procedure. Figure 5–11 shows you how to use the module extraction tool. Figure 5–10 Cross-Link Assembly Rear Upper Retaining Crosslink Module Middle Retaining Crosslink Cable Upper...
Table 5–11 Cross-Link Assembly Removal Procedure Step Action Ask the operator or system manager to shut down the zone using the procedure in Section 5.3.2. Open the rear door of the cabinet. Remove the four screws from the upper retaining bar. Remove the four screws from the middle retaining bar.
Table 5–12 Console Extender Module Removal Procedure Step Action Ask the operator or system manager to shut down the zone using the procedure in Section 5.3.2. Open the rear door of the cabinet. Remove the four screws from the upper retaining bar. Remove the four screws from the middle retaining bar.
Table 5–13 DSSI Extender Module Removal Procedure Step Action Ask the operator or system manager to shut down the zone using the procedure in Section 5.3.2. Open the rear door of the cabinet. Remove the four screws from the upper retaining bar. Remove the four screws from the middle retaining bar.
5.4.12 CAMP Module Figure 5–15 shows the locations of the CAMP modules. Table 5–14 describes the removal procedure. Caution Removing/replacing the CAMP module without shutting down 48V_DRCT may cause damage to the CAMP module. Figure 5–15 CAMP Module Locations Rear CAMP Module CPU Cabinet...
Table 5–14 CAMP Module Removal Procedure Step Action Ask the operator or system manager to shut down the zone using the procedure in Section 5.3.2. Open the rear door of the cabinet. Set the FEU circuit breaker to the off position. Remove the four screws from the upper retaining bar.
5.4.13 DSSI Interface Module (DIM) Figure 5–16 shows the location of the interface logic modules. Figure 5–17 shows how to remove the DIMs. Table 5–15 describes the removal procedure. Figure 5–16 DIM Location Rear Middle Interface Retaining Logic Modules (DIMs and EIMs) Lower Retaining CPU Cabinet...
Figure 5–17 DIM Removal Rear Connector DSSI Cable CPU Cabinet Expansion Cabinet MR−0046−93RAGS Table 5–15 DIM Removal Procedure Step Action Ask the operator or system manager to shut down the zone using the procedure in Section 5.3.2. Open the rear door of the cabinet. Remove the four screws from the middle retaining bar.
Table 5–16 EIM Removal Procedure Step Action Ask the operator or system manager to shut down the zone using the procedure in Section 5.3.2. Open the rear door of the cabinet. Remove the four screws from the middle retaining bar. Remove the four screws from the lower retaining bar.
5.4.16 TF85C-BA Tape Drive Figure 5–19 and Figure 5–20 show how to remove a TF85C-BA tape drive from the system. Table 5–18 describes the removal procedure. Warning Two people are required to lift and carry the TF85C-BA tape drive enclosure. Figure 5–19 TF85C-BA Tape Drive, Rear View Power Supply Fault Indicator...
Figure 5–20 TF85C-BA Tape Drive Removal (callouts: Tape Drive Enclosure, Release Tab, Front Plate, Screws (4), Screws (3), TF85 Tape Drive) MR-0038-93DG
Table 5–18 TF85C-BA Tape Drive Removal Procedure
Step Action
Ask the operator or system manager to dismount the tape....
5.4.17 SF73 Disk Drive Figure 5–21 and Figure 5–22 show how to remove the SF73 disk drives from the system. Figure 5–23 shows how to remove an SF73 disk drive enclosure from the system. Figure 5–24 shows how to remove an SF73 disk ISE from a drive. Table 5–19 describes the removal procedure.
Figure 5–22 SF73 Disk Drive, Front View (callouts: Ready, Write Protect, Fault, and DSSI indicators; Captive Screws; Front Cover Door) MR-0035-93DG...
Figure 5–24 SF73 Disk ISE Removal...
5.4.18 SF35 Storage Array Figure 5–23 shows how to remove an SF35 storage array from the system. Figure 3–7 and Figure 5–26 show the rear and front views of the SF35 storage array. Figure 5–27 shows how to remove an SF35 disk ISE from the storage array.
Figure 5–26 SF35 Storage Array, Front View (callouts: Operator Control Panel (OCP); Front and Rear labels; Ready, Write Protect, and Fault indicators)...
Figure 5–27 SF35 Disk ISE Removal (callouts: Carrier Lever, Screw) MR-0033-93DG
Table 5–20 SF35 Storage Array Removal Procedure
Step Action
Ask the operator or system manager to dismount the disk. Turn off the storage array....
5.4.19 TF857-CA Tape Drive Figure 5–28 shows how to remove the TF857-CA tape drive from the system. Table 5–21 describes the removal procedure. Warning Two people are required to lift and carry the TF857-CA tape drive enclosure. Figure 5–28 TF857-CA Tape Drive, Rear View DSSI Cable Cable Clip...
Table 5–21 TF857-CA Tape Drive Removal Procedure Step Action Ask the operator or system manager to shut down the zone using the procedure in Section 5.3.2. Ask the operator or system manager to dismount the tape drive. Unload the tape magazine, if one is present. At the front of the drive, set the power switch to off (0).
Note If you are replacing the TF857 tape loader, you must set the node ID. Refer to Figure 5–30 for the node ID DIP switch location. Figure 5–30 Setting the TF857 Tape Loader Node ID Node ID DIP Switch Drive Enclosure Controller Module TF857 Tape Drive Assembly...
5.4.20 Power Distribution Box Figure 5–31 shows a domestic power distribution box. Figure 5–32 shows an international power distribution box. Table 5–22 describes the removal procedure. Figure 5–31 Domestic Power Distribution Box AC Power Outlets (8) Screws Circuit Breaker AC Power DEC Power Bus Cable Switch...
Figure 5–32 International Power Distribution Box (callouts: AC Power Outlets (6), Screws, AC Power Connector, Circuit Breaker, DEC Power Bus, Switch Access Hole) MR-0045-93DG
Table 5–22 Power Distribution Box Removal Procedure
Step Action
Turn off any devices connected to the power distribution box....
Managing Integrated Storage Elements 6.1 In This Chapter This chapter includes: • Loading the DUP driver • Using VMS DUP • Using the server setup switch • Assigning DSSI unit numbers • Warm swapping an ISE 6.2 Loading the DUP Driver If the VMS diagnostic utility protocol (DUP) class driver is not loaded, load it as follows: $ MCR SYSGEN...
To stop the PARAMS utility, press Ctrl/C, Ctrl/Y, or Ctrl/Z, or type EXIT at the PARAMS prompt. Table 6–1 lists PARAMS commands.
Table 6–1 PARAMS Commands
Command    Description
EXIT       Stops the PARAMS utility
HELP       Displays information on how to use PARAMS commands
SET        Changes internal ISE parameters
SHOW       Displays the setting of a parameter or a class of parameters...
Figure 6–1 VAXft Model 810 Front View Front SF73 Expansion Cabinet SF35 CPU Cabinet MR−0050−93RAGS 6.6 Warm Swapping an ISE Warm swapping is the procedure by which an ISE can be replaced or added to a running system without interrupting system operations. Caution The procedure must be followed carefully.
Figure 6–2 VAXft Model 810 Rear View Rear SF73 CPU Cabinet Expansion Cabinet SF35 MR−0051−93RAGS • Replacement in a system that is running • Installation in a system that is running When replacing an ISE or installing a new ISE, determine the parameter values for the ISE before performing the warm swap procedure.
6.6.1 Setting ISE Parameters Digital Equipment Corporation recommends maintaining a worksheet of the parameters for all ISEs, as well as the serial number of each ISE. This is especially important at sites that maintain a set of spare drives that may be stored for some time before they are used.
The node name is shown in parentheses. In the following sample output, the node names are RIRRBA and RICYAA. • ALLCLASS The allocation class is found in the device name between the dollar signs ($). In $1$DIA21, the ISE has an allocation class of 1. If the allocation class was 0, the node name would display as RICYAA$DIA21.
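The naming rule described above (allocation class between the dollar signs, or the node name as a prefix when the allocation class is 0) can be sketched as a small parser. This Python example is illustrative only: the function name, the returned dictionary keys, and the regular expressions are assumptions made for the sketch, and it handles only the two name forms quoted in the text ($1$DIA21 and RICYAA$DIA21).

```python
import re

# "$<allclass>$<device><unit>", for example $1$DIA21
_ALLCLASS_FORM = re.compile(r"^\$(\d+)\$([A-Z]+)(\d+)$")
# "<nodename>$<device><unit>", for example RICYAA$DIA21 (allocation class 0)
_NODENAME_FORM = re.compile(r"^([A-Z0-9]+)\$([A-Z]+)(\d+)$")


def parse_device_name(name: str) -> dict:
    """Return allocation class, node name, device, and unit number."""
    text = name.strip().rstrip(":").upper()
    match = _ALLCLASS_FORM.match(text)
    if match:
        return {
            "allclass": int(match.group(1)),
            "nodename": None,  # not visible in this form of the name
            "device": match.group(2),
            "unit": int(match.group(3)),
        }
    match = _NODENAME_FORM.match(text)
    if match:
        return {
            "allclass": 0,  # node-name prefix implies allocation class 0
            "nodename": match.group(1),
            "device": match.group(2),
            "unit": int(match.group(3)),
        }
    raise ValueError(f"unrecognized device name: {name!r}")


if __name__ == "__main__":
    print(parse_device_name("$1$DIA21"))
    print(parse_device_name("RICYAA$DIA21"))
```

A parser like this could be used to fill in the ALLCLASS and UNITNUM columns of a parameter worksheet from a device listing, rather than copying them by hand.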
Caution You must use an ESD wrist strap, ground clip, and grounded ESD workmat whenever you handle ISEs. Use the static protective service kit (PN 29-262446). Use great care when you handle an ISE; excessive shock can damage the head-disk-assembly (HDA). 1.
1. Disable the MSCP server as described in Table 6–4.
Table 6–4 Disabling the MSCP
Disks                  Action
RF-series              Press and hold the SU switch/button
SF72 or SF72-series    Set the MSCP enable switch
SF35                   Press the MSCP/Fault switch (LED is green when enabled)
2....
For example:
$ SET HOST/DUP/SERVER=MSCP$DUP/TASK=PARAMS R1QSAA Return
%HSCPAD-I-LOCPROGEXE, Local program executing - type ^\ to exit
Copyright (C) 1993 Digital Equipment Corporation
PARAMS> SHOW NODENAME Return
Parameter   Current        Default         Type        Radix
----------  -------------  --------------  ----------  ---------
NODENAME    R1QSAA         RF31            String      Ascii
PARAMS>...
Note The SHOW CLUSTER command continues to show the name of the ISE replaced. This does not harm the system. After the next reboot, the replacement ISE name appears. Note also that the following message is displayed if another node is already assigned the same SYSTEMID and NODENAME: %PWA0-REMOTE SYSTEM CONFLICTS WITH KNOWN SYSTEM In this case, shut down the new node and issue a unique SYSTEMID and...
NODENAME — DISK22 • SYSTEMID — no change • UNITNUM — 22 $ SET HOST/DUP/SERVER=MSCP$DUP/TASK=PARAMS R1QSAA Return %HSCPAD-I-LOCPROGEXE, Local program executing - type ^\ to exit Copyright (C) 1990 Digital Equipment Corporation PARAMS> SHOW NODENAME Return Parameter Current Default Type Radix...
When initialization is complete, the new ISE and its parameters are made available to the VMS operating system. 8. On SF-series drives, enable the MSCP switch. Note The SHOW CLUSTER command continues to show the name of the ISE you replaced. This does not harm the system. After the next reboot, the new ISE name appears.
Miscellaneous System Information A.1 In This Appendix This appendix includes: • Processor Halt codes • Console Halt codes • Error register descriptions • I/O physical address space • System control block description A.2 Processor Halt Codes Table A–1 provides the processor Halt code definitions. Table A–1 Processor Halt Code Definitions Halt Code Number...
Table A–1 (Cont.) Processor Halt Code Definitions Halt Code Number Definition CPM$K_PSL_EXC7 PSL [26:24] = 111 during interrupt or exception CPM$K_PSL_REI5 PSL [26:24] = 101 during REI CPM$K_PSL_REI6 PSL [26:24] = 110 during REI CPM$K_PSL_REI7 PSL [26:24] = 111 during REI The following example shows a processor Halt code output.
Table A–2 (Cont.) Processor Halt Reason Code Definitions Reason Code (Hex) Definition 0016 A VAXELN kernel fatal error has occurred 0017 Initializing VAXELN before starting reconfiguration A.3 Console Halt Codes The following example shows a console Halt code output. Table A–3 defines the Halt Reason fields.
Table A–3 (Cont.) Console Halt Reason Code Definitions Reason Code (Hex) Definition 0015 Unexpected VAXELN error occurred 0016 A VAXELN kernel fatal error has occurred 0017 Initializing VAXELN before starting reconfiguration A.4 Error Register Descriptions A.4.1 System Fault (SYSFLT) Register This register is not rail or zone unique (Figure A–1).
Table A–4 (Cont.) Xlink Mode Coding Code Mode Resync Slave Resync Master Not Used [27:26]: - Not used. [25]: LCK - Lock. Latched when an error occurs during an interlock I/O access. (Interlock access refers to the special I/O access mode.) [24]: RSA - Resync Abort.
a two-zone system to diverge. Hardware generates an IPL29 interrupt to both zones within three clock cycles. [12]: MSA - Memory Single-Bit Error (Zone A). Set when a single-bit ECC error is detected in memory during a read and the JXD was not the requester of the data.
A.4.2 System Error Address (SYSADR) Register This register latches when any error is detected at the JXD Jet Bus and below (Figure A–2). It contains the address the CPU was accessing at the time the error occurred. The register is read only and cleared by clearing errors. All bits in this register have the following characteristics: default = 0, type = ro, reset = hr.
Register Address: CPU = E110 1040 (CCA_BASE+180) [31:30]: DL - DMA data length: 00 - Hexword 01 - Longword 10 - Quadword 11 - Octaword [29:00]: DEA - DMA 30-bit address latched during error. A.4.4 Reset Reason 0013 Fault Analysis The following example shows the content of the SYSFLT and SYSADR registers after a Reset Halt.
A.6 System Control Block Description The System Control Block (SCB) contains vectors for servicing interrupts and exceptions. The SCB address should be aligned on a page boundary. The SCB address is contained in the System Control Block Base register (SCBB) (Figure A–5).
Table A–5 (Cont.) Code Field Definition Code Definition The event is to be serviced on the interrupt stack. If the event is an exception, the IPL is raised to 1F (hex). Unimplemented, results in a console error halt. Unimplemented, results in a console error halt. The SCB content is specified in Table A–6.
Table A–6 (Cont.) SCB Layout Vector Name Type Parameter Notes CHMS Trap Parameter is sign- extended operand word CHMU Trap Parameter is sign- extended operand word Unused — — — Soft error notification Interrupt IPL is 1A (hex) 58 to 5C Unused —...
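Section A.6 says that the SCB address should be page aligned and that each vector carries a code field (Table A–5) selecting how the event is serviced. The following Python sketch shows one way to check the alignment and split a vector longword. It relies on general VAX architecture conventions that this excerpt does not spell out: a 512-byte page, the code in bits [1:0] of the vector, and code 0 meaning service on the kernel stack. The assignment of codes 1 to 3 to the Table A–5 descriptions is inferred from their order, and all names are illustrative.

```python
VAX_PAGE_SIZE = 512  # bytes; standard VAX page size, assumed to apply here

# Code field meanings. Codes 1 to 3 follow the order of Table A-5 (Cont.);
# code 0 is taken from general VAX architecture and is an assumption here.
CODE_MEANINGS = {
    0: "service on kernel stack unless already on interrupt stack (assumed)",
    1: "service on interrupt stack; an exception raises IPL to 1F (hex)",
    2: "unimplemented; results in a console error halt",
    3: "unimplemented; results in a console error halt",
}


def scb_base_is_page_aligned(scbb: int) -> bool:
    """Check that an SCB base address falls on a page boundary."""
    return scbb % VAX_PAGE_SIZE == 0


def decode_scb_vector(vector: int) -> tuple:
    """Split a 32-bit SCB vector into (handler address, code, meaning)."""
    code = vector & 0x3               # bits [1:0], assumed position
    handler = vector & 0xFFFFFFFC     # longword-aligned handler address
    return handler, code, CODE_MEANINGS[code]


if __name__ == "__main__":
    handler, code, meaning = decode_scb_vector(0x80001235)
    print(f"handler={handler:08X} code={code} ({meaning})")
    print(scb_base_is_page_aligned(0x00000200))
```

Masking the low two bits off before using the handler address matters because the code field shares the longword with the address.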
ISE Parameter Worksheets B.1 In This Appendix This appendix includes: • Individual ISE parameter worksheets • ISE zone parameter worksheets B.2 Individual ISE Parameter Worksheets Use the following worksheets to record parameters for each ISE. Serial Number: NODENAME: SYSTEMID: ALLCLASS: UNITNUM: FORCEUNI: FORCENUM:...
Serial Number: NODENAME: SYSTEMID: ALLCLASS: UNITNUM: FORCEUNI: FORCENUM: Serial Number: NODENAME: SYSTEMID: ALLCLASS: UNITNUM: FORCEUNI: FORCENUM: Serial Number: NODENAME: SYSTEMID: ALLCLASS: UNITNUM: FORCEUNI: FORCENUM: MR−0053−93RAGS B–2 ISE Parameter Worksheets...
B.3 ISE Zone Parameter Worksheets Use the following worksheets to record parameters for each ISE. Serial No: Serial No: NODENAME: NODENAME: UNITNUM: UNITNUM: Serial No: Serial No: NODENAME: NODENAME: UNITNUM: UNITNUM: Serial No: Serial No: NODENAME: NODENAME: UNITNUM: UNITNUM: Serial No: Serial No: NODENAME: NODENAME:...
Serial No: Serial No: NODENAME: NODENAME: UNITNUM: UNITNUM: Serial No: Serial No: NODENAME: NODENAME: UNITNUM: UNITNUM: Serial No: Serial No: NODENAME: NODENAME: UNITNUM: UNITNUM: Serial No: Serial No: NODENAME: NODENAME: UNITNUM: UNITNUM: Serial No: Serial No: NODENAME: NODENAME: UNITNUM: UNITNUM: Serial No: Serial No: NODENAME:...
Index Console commands (cont’d) START, 2–19 TEST, 2–20, 3–30 Application of thresholds, 4–17 X, 2–21 ATM module Z, 2–22, 3–31 removal and replacement, 5–7 Console communications area data structures, ATM module deconfiguration actions, 4–13 4–55 Console extender module removal and replacement, 5–20 Controls and indicators Before you begin, 5–3 disk drawer, 3–19...
Documentation road map, iii FRUs, 4–12 DSSI cable access, 5–5 removal and replacement, 5–29 FTSS event reporting interface, 4–40 DSSI disk drawer removal and replacement, 5–14 DSSI extender module General troubleshooting procedure removal and replacement, 5–22 system maintenance, 3–4 DSSI interface module removal and replacement, 5–26 DUP, 6–1 PARAMS utility, 6–1...
RF35 disk drive removal and replacement, 5–12 ROM-based diagnostics Page frame number bitmap system diagnostics, 3–29 data structures, 4–65 POST, 3–27 Power distribution box removal and replacement, 5–42 SCB description, A–10 Power distribution boxes Server setup switch, 6–2 system component descriptions, 1–9 Services Power modules, 3–12 error handling, 4–1...
Threshold information block, 4–26 fault data, 4–30 TK85C-BA cartridge tape drive indicators, 3–27 VAXELN error handling, 4–10 Unit number assignment, 6–2 Warm swapping, 6–3 Unsynchable events fault data, 4–36 Z command system diagnostics, 3–31 5V regulator Zone control panel removal and replacement, 5–16 removal and replacement, 5–14 3.3V regulator system component descriptions, 1–6...