Digital Equipment 7300 Series Service Manual

DIGITAL Server 7300/7300R Series
Service Manual
Part Number: EK-K9FWW-SG. A01
This manual is for anyone who services a DIGITAL Server 7300/7300R Series system.
It covers installation, power-up, initial troubleshooting, and component installation.
January 1998
Digital Equipment Corporation
Maynard, Massachusetts


Summary of Contents for Digital Equipment 7300 Series

  • Page 2 The software, if any, described in this document is furnished under a license and may be used or copied only in accordance with the terms of such license. No responsibility is assumed for the use or reliability of software or equipment that is not supplied by Digital Equipment Corporation or its affiliated companies.
  • Page 3: Table Of Contents

    Table of Contents. Chapter 1, System Overview: DIGITAL Server 7300/7300R System Drawer (BA30A); Cover Interlocks; Cabinet System; Cabinet Differences; Cabinet System Fan Tray; Pedestal System; Control Panel and Drives; System Consoles; SRM Console...
  • Page 4 Server Control Module; Power Control Module; Power Supply. Chapter 2, Power-Up: Control Panel; Power-Up Sequence; SROM Power-Up Test Flow; SROM Errors Reported; XSROM Power-Up Test Flow; XSROM Errors Reported; Console Power-Up Tests...
  • Page 5 Power-Up/Down Sequence; Cabinet Power Configuration Rules; Pedestal Power Configuration Rules (North America and Japan); Pedestal Power Configuration Rules (Europe and Asia Pacific). Chapter 5, Error Detection with Error Registers: Overview of Error Detection; Error Registers...
  • Page 6 System Bus to PCI Bus Bridge (B3040-AA) Module Removal and Replacement; System Motherboard Removal and Replacement; PCI/EISA Motherboard (B3050/B3052) Removal and Replacement; Server Control Module Removal and Replacement; PCI/EISA Option Removal and Replacement; Power Supply Removal and Replacement...
  • Page 7 Resetting the RCM to Factory Defaults; Troubleshooting Guide; Modem Dialog Details. Figures: Figure 1-1 Components of the BA30A System Drawer; Figure 1-2 Cover Interlock Circuit; Figure 1-3 DIGITAL Server 7300/7300R Cabinet System; Figure 1-4 Cabinet Fan Tray...
  • Page 8 Figure 4-7 Pedestal Power Distribution (N.A. and Japan); Figure 4-8 Pedestal Power Distribution (Europe and AP); Figure 5-1 Error Detector Placement; Figure 6-1 System Drawer FRU Locations; Figure 6-2 Location of Power System FRUs; Figure 6-3 Exposing System Drawer (H9A10-EN &...
  • Page 9 Table 2-4 IOD Tests; Table 2-5 PCI Motherboard Tests (B3050/B3052); Table 3-1 Power Control Module LED States; Table 5-1 External Interface Status Register; Table 5-2 Loading and Locking Rules for External Interface Registers; Table 5-3 MC Error Information Register 0...
  • Page 10 DIGITAL Server 7300/7300R Series Service Manual...
  • Page 11 Preface Document Audience This manual is written for the customer service engineer. Document Structure This manual uses a structured documentation design. Topics are organized into small sections for efficient online and printed reference. Each topic begins with an abstract, followed by an illustration or example, and ends with descriptive text. This manual has nine chapters, as follows: •...
  • Page 12 • Chapter 8, SRM Console Commands and Environment Variables, summarizes the commands used to examine and alter the system configuration. • Chapter 9, Operating the System Remotely, describes how to use the remote console monitor (RCM) to monitor and control the system remotely. Documentation Titles The following table lists titles related to DIGITAL Server 7300/7300R series systems.
  • Page 13: System Overview

    System Overview This chapter introduces the DIGITAL Server 7300/7300R series systems. These systems are available in cabinets or pedestals. The pedestal system has one system drawer and up to three StorageWorks shelves. The cabinet system can have a combination of system drawers and StorageWorks shelves that occupy the five sections of the cabinet.
  • Page 14 System Overview • Memory Modules • System Bus • System Bus to PCI Bus Bridge Module • PCI I/O Subsystem • Server Control Module • Power Control Module • Power Supply 1–2 DIGITAL Server 7300/7300R Series Service Manual...
  • Page 15: Digital Server 7300/7300R System Drawer (Ba30A)

    System Overview DIGITAL Server 7300/7300R System Drawer (BA30A) Components in the BA30A system drawer are located in the system bus card cage, the PCI card cage, the control panel assembly, and the power and cooling section. The drawer measures 30 cm x 45 cm (11.8 in. x 17.7 in.) and fully configured weighs approximately 45.5 kg (~100 lbs).
  • Page 16: Cover Interlocks

    System Overview (2) PCI/EISA card cage, which holds the PCI motherboard, option cards, and server control module. (3) Server control module, which holds the I/O connectors and remote console monitor. (4) Control panel assembly, which includes the control panel, a floppy drive, and a CD-ROM drive.
  • Page 17: Figure 1-2 Cover Interlock Circuit

    System Overview Figure 1-2 Cover Interlock Circuit (diagram labels: 3 Interlock Switches; Logic; Power Supply; Cover Interlocks; Motherboard; B3040; B305n; Switch; To OCP; signals DC_ENABLE_L, POWER_FAULT_L, RSM_DC_EN_L; part numbers 17-04217-01, 17-04201-01, 17-04201-02, 17-04302-01, 17-04196-01, 70-32016-01) LJ-06315 NOTE: The cover interlocks must be engaged to enable power-up. To override the cover interlocks, find a suitable object to close the interlock circuit.
  • Page 18: Cabinet System

    System Overview Cabinet System The DIGITAL Server 7300/7300R series cabinet system can accommodate multiple systems in a single cabinet. There are two cabinet variations that can hold different system configurations. From the outside, the cabinets look almost identical and are of one basic type.
  • Page 19: Cabinet Differences

    System Overview Cabinet Differences (columns: Cabinet, Power, Mounting, Destination). H9A10-EN: two 120 volt H7600-AA power controllers; pull-out tray (max drawers: 3); North America and Asia Pacific. H9A10-EP: two 240 volt H7600-DB power controllers; pull-out tray (max drawers: 3); Europe. Cabinet System Fan Tray At the top of cabinet systems is a fan tray containing three exhaust fans, a small 12-volt power supply, and a module that distributes power to the server control module in each drawer.
  • Page 20: Pedestal System

    System Overview Pedestal System The pedestal system contains one system drawer with a control panel, a CD-ROM drive, and a floppy drive. In the pedestal control panel area there is space for an optional tape or disk drive. Three StorageWorks shelves provide up to 90 Gbytes of in-cabinet storage. Figure 1-5 Pedestal System Front PK-0301-96 In the pedestal system, the control panel is located at the top left in a tray.
  • Page 21: Figure 1-6 Pedestal System Rear

    System Overview Figure 1-6 Pedestal System Rear PK-0307a-96 DIGITAL Server 7300/7300R Series Service Manual 1–9...
  • Page 22: Control Panel And Drives

    System Overview Control Panel and Drives The control panel includes the On/Off, Halt, and Reset buttons and a display. In a pedestal system the control panel is located in a tray at the top of the system drawer. In a cabinet system, the control panel is at the bottom of the system drawer with the CD-ROM drive and the floppy drive.
  • Page 23 System Overview missing, regardless of the position of the On/Off button. (2) Halt button. Pressing this button in (so the LED at the top of the button is on) has no effect on Windows NT. If the Halt button is in when the system is reset or powered up, the system halts in the SRM console.
  • Page 24: System Consoles

    System Overview System Consoles There are two console programs: the SRM console and the AlphaBIOS console. SRM Console The SRM console is a command-line interface that tests the system after power-up or reset and launches the AlphaBIOS graphical interface. For some configuration and diagnostic or testing tasks, you may need to use the SRM console interface rather than launch the AlphaBIOS console.
  • Page 25: Environment Variables

    System Overview Figure 1-8 AlphaBIOS Boot Menu AlphaBIOS Version 5.12 Please select the operating system to start: Windows NT Server 3.51 to move the highlight to your choice. Press Enter to choose. Alpha Press <F2> to enter SETUP PK-0728-96 Environment Variables Environment variables are software parameters that, among other things, define the system configuration.
  • Page 26: System Architecture

    System Overview System Architecture Alpha microprocessor chips are used in these systems. The CPU, memory and the I/O bridge module are connected to the system bus motherboard. Figure 1-9 Architecture Diagram Memory CPU 0 Pairs System Bus 128-Bit Data Bus + 16 ECC and 40-Bit Command/Address Bus Bridge System to System to...
  • Page 27 System Overview DIGITAL Server 7300/7300R series systems use the Alpha chip for the CPU. The CPU, memory, and I/O bridge module to PCI/EISA I/O buses are connected to the system bus motherboard. A fourth type of module, the power control module, also plugs into the system motherboard.
  • Page 28: System Motherboard

    System Overview System Motherboard The system motherboard is on the floor of the system card cage. It has slots for the CPU, memory, power control, and bridge modules. Figure 1-10 System Motherboard Module Locations PK-0703D-96 1–16 DIGITAL Server 7300/7300R Series Service Manual...
  • Page 29 System Overview The system motherboard has the logic for the system bus. It is the backplane that holds the CPU, memory, bridge, and power control modules. Figure 1-10 shows a diagram of the motherboard used in DIGITAL Server 7300/7300R series Server systems.
  • Page 30: Cpu Types

    System Overview CPU Types DIGITAL Server 7300/7300R series systems can be configured with one of two CPU variants. CPU Variants Module Variant Clock Frequency Onboard Cache B3105-AA 400 MHz 4 Mbytes B3105-CA 533 MHz 4 Mbytes CPU Module Layout Figure 1-11 shows the layout of the CPU module. Figure 1-11 CPU Module Layout System Motherboard CPU Module Slots...
  • Page 31: Alpha Chip Composition

    System Overview Alpha Chip Composition The Alpha chip is made using state-of-the-art chip technology, has a transistor count of 9.3 million, consumes 50 watts of power, and is air cooled (a fan is on the chip). The default cache system is write-back and when the module has an external cache, it is write-back. Chip Description Unit Description...
  • Page 32: Memory Modules

    System Overview Memory Modules Memory modules are used only in pairs: two modules of the same size and type. Each module provides either the low half or the high half of the memory space. The 7300/7300R series system drawer can hold up to four memory module pairs. Figure 1-12 Memory Module Layout (panels: Typical Synchronous Memory; Typical EDO Memory...)
  • Page 33: Memory Variants

    System Overview Memory Variants Each memory option consists of two identical modules. Each DIGITAL Server 7300/7300R series drawer supports up to four memory options, for a total of 4 Gbytes of memory. Memory modules are used only in pairs and are available in 128 Mbyte, 512 Mbyte, 1 Gbyte, and 2 Gbyte sizes.
  • Page 34 System Overview • The largest memory pair must be in slots MEM 0L and MEM 0H. • Other memory pairs must be the same size or smaller than the first memory pair. • Memory pairs must be installed in consecutive slots. 1–22 DIGITAL Server 7300/7300R Series Service Manual...
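The configuration rules above can be checked mechanically. The following is a minimal sketch, not a DIGITAL tool: the function name, the list layout (index = slot number, `None` = empty slot), and the message strings are all assumptions made for illustration.

```python
def check_memory_pairs(pair_sizes_mb):
    """Return a list of rule violations for the memory pairs in one drawer.

    pair_sizes_mb[i] is the size in Mbytes of the pair in slot MEM i
    (hypothetical layout), or None if that slot is empty.
    """
    problems = []
    if len(pair_sizes_mb) > 4:
        problems.append("a drawer holds at most four memory pairs")
    if not pair_sizes_mb or pair_sizes_mb[0] is None:
        # The largest pair must sit in MEM 0L/0H, so MEM 0 cannot be empty.
        problems.append("slots MEM 0L and MEM 0H must be populated")
        return problems
    first = pair_sizes_mb[0]
    seen_empty = False
    for slot, size in enumerate(pair_sizes_mb):
        if size is None:
            seen_empty = True
            continue
        if seen_empty:
            # Pairs must occupy consecutive slots.
            problems.append(f"MEM {slot} is populated after an empty slot")
        if size > first:
            # No pair may be larger than the pair in MEM 0.
            problems.append(f"MEM {slot} pair is larger than the MEM 0 pair")
    return problems
```

For example, `check_memory_pairs([512, 128, 128, None])` returns an empty list, while `check_memory_pairs([128, 512])` reports that the MEM 1 pair is larger than the MEM 0 pair.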
  • Page 35: Memory Addressing

    System Overview Memory Addressing Alpha system memory addressing is unusual: the memory address space is determined not by the amount of physical memory installed but as a multiple of the size of the memory pair in slot MEM0x. Figure 1-13 How Memory Addressing Is Calculated (diagram labels: 2028 Mbyte; Fourth pair address space; 512 Mbyte space empty...)
  • Page 36 System Overview The rules for addressing memory are as follows: • Address space is determined by the memory pair in slot MEM0. • Memory pairs need not be the same size. • The memory pair in slot MEM0 must be the largest of all memory pairs. Other memory pairs may be as large but none may be larger.
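To make the addressing rule concrete, here is a small sketch under one stated assumption: each pair is given an address window equal to the size of the MEM0 pair, and the window for slot n starts at n times that size. The manual text shown here says only that the address space is a multiple of the MEM0 pair size, so the exact base addresses, and both function names, are illustrative rather than taken from the manual.

```python
def address_windows(pair_sizes_mb):
    """Return (base, end_of_memory, end_of_window) in Mbytes for each pair.

    pair_sizes_mb[0] is the MEM0 pair, which must be the largest.
    Assumption: every pair gets a window as large as the MEM0 pair.
    """
    window = pair_sizes_mb[0]
    layout = []
    for slot, size in enumerate(pair_sizes_mb):
        base = slot * window
        layout.append((base, base + size, base + window))
    return layout


def holes(pair_sizes_mb):
    """Return the (start, end) address ranges in Mbytes that hold no memory."""
    return [
        (mem_end, win_end)
        for _, mem_end, win_end in address_windows(pair_sizes_mb)
        if mem_end < win_end
    ]
```

With a 1024-Mbyte pair in MEM0 and three 512-Mbyte pairs after it, the sketch yields a 4096-Mbyte address space in which each smaller pair leaves a 512-Mbyte empty region at the top of its window.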
  • Page 37: System Bus

    System Overview System Bus The system bus consists of a 40-bit command/address bus, a 128-bit plus ECC data bus, and several control signals and clocks. Figure 1-14 System Bus Block Diagram MEM3 MEM2 MEM1 MEM0 SIM_ADR DATA SYNC DRAMS CTRL MEM CTRL &...
  • Page 38 System Overview The system bus motherboard consists of a 40-bit command/address bus, a 128-bit plus ECC data bus, and several control signals, clocks, and a bus arbiter. The bus requires that all CPUs have the same high-speed oscillator providing the clock to the Alpha chip. The DIGITAL Server 7300/7300R series system bus connects up to four CPUs, four pairs of memory modules, and a single I/O bus bridge module.
  • Page 39: System Bus To Pci Bus Bridge Module

    System Overview System Bus to PCI Bus Bridge Module The bridge module is the physical interconnect between the system motherboard and any PCI motherboard in the system. Figure 1-15 Bridge Module PCI Bus Control AD<31:0> Address Control Data A to B bus ECC &...
  • Page 40 System Overview The system bus to PCI bus bridge module converts: • System bus commands and data addressed to I/O space to PCI commands and data • PCI bus commands and data addressed to system memory or CPUs to system bus commands and data.
  • Page 41: Pci I/O Subsystem

    System Overview PCI I/O Subsystem The I/O subsystem is PCI. The DIGITAL Server 7300/7300R series has two four-slot PCI buses that hold up to eight I/O options. One of these buses can be both PCI and EISA, but it can hold no more than four options, up to three of which may be EISA. Figure 1-16 PCI Block Diagram PCI-1 Bus SCSI Control...
  • Page 42 System Overview The logic for two PCI buses is on each PCI motherboard. PCI0 is a 64-bit bus with a built-in PCI to EISA bus bridge. PCI0 has one dedicated PCI slot and three slots that can be either PCI or EISA. Each of these three slots has both an EISA connector and a PCI connector (six connectors in all), only one of which may be used at a time....
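The slot limits described for the two PCI buses can be expressed as a short check. This is a sketch only: the function name and the representation of an option as the string "PCI" or "EISA" are assumptions, and the check covers just the limits stated in the text (four options per bus, at most three EISA options, EISA only on PCI0).

```python
def check_io_options(pci0, pci1):
    """Return a list of violations for the options planned on each bus.

    pci0 and pci1 are lists of option types, each "PCI" or "EISA"
    (a hypothetical representation).
    """
    problems = []
    if len(pci0) > 4:
        problems.append("PCI0 holds at most four options")
    if pci0.count("EISA") > 3:
        problems.append("PCI0 holds at most three EISA options")
    if len(pci1) > 4:
        problems.append("PCI1 holds at most four options")
    if "EISA" in pci1:
        # Only PCI0 has the built-in PCI to EISA bridge.
        problems.append("EISA options are supported only on PCI0")
    return problems
```

A fully loaded system with one PCI and three EISA options on PCI0 and four PCI options on PCI1 passes; four EISA options on PCI0 does not.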
  • Page 43: Server Control Module

    System Overview Server Control Module The server control module enables remote console connections to the system drawer. The module passes signals for COM ports 1 and 2, the keyboard, and the mouse to the standard I/O connectors. Figure 1-17 Server Control Module (labels: Standard I/O; Remote Console Monitor...)
  • Page 44 System Overview The server control module has two sections: the remote console monitor (RCM) and the standard I/O. See Chapter 9 for information on controlling the system remotely. The remote console monitor connects to a modem through the modem port on the bulkhead.
  • Page 45: Power Control Module

    System Overview Power Control Module The power control module controls power sequencing and monitors power supply voltage, temperature, and fans. Figure 1-18 Power Control Module System Motherboard Power Control Module Slot PK-0710-96 DIGITAL Server 7300/7300R Series Service Manual 1–33...
  • Page 46 System Overview The power control module performs the following functions: • Controls power sequencing. • Monitors the combined output of power supplies and shuts down power if it is not in range. • Monitors system temperature and shuts off power if it is out of range. •...
  • Page 47: Power Supply

    System Overview Power Supply The system drawer power supplies provide power only to components in the drawer. One or two power supplies are required, depending on the number of CPU modules and PCI card cages; a second or third can be added for redundancy. The power system is described in detail in Chapter 4.
  • Page 48 System Overview Description One to three power supplies provide power to components in the system drawer. (They supply power only for the drawer in which they are located.) Three power supplies provide redundant power in fully loaded DIGITAL Server 7300/7300R series systems. These power supplies share the load, and redundant configurations are supported.
  • Page 49: Power-Up

    Power-Up This chapter describes system power-up testing and explains the power-up displays. The following topics are covered: • Control Panel • Power-Up Sequence • SROM Power-Up Test Flow • SROM Errors Reported • XSROM Power-Up Test Flow • XSROM Errors Reported •...
  • Page 50: Control Panel

    Power-Up Control Panel The control panel display indicates the likely device when testing fails. Figure 2-1 Control Panel and LCD Display (labels: Potentiometer Access Hole; Reset; Halt; On/Off; sample display: P0 TEST 11 CPU00) PK-0706G-96 When the On/Off button LED is on, power is applied and the system is running.
  • Page 51: Table 2-1 Control Panel Display

    Power-Up Table 2-1 Control Panel Display (columns: Field, Content, Display, Meaning). (1) CPU number: P0–P3, CPU reporting status. (2) Status: TEST, tests are executing; FAIL, failure has been detected; MCHK, machine check has occurred; INTR, error interrupt has occurred. (3) Test number. (4)...
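As an illustration of how the display fields fit together, the sketch below splits a display string such as the sample in Figure 2-1 (P0 TEST 11 CPU00) into its fields. The whitespace-separated layout, the name given to the fourth field ("device", taken from the statement that the display indicates the likely device), and the function name are assumptions; the status meanings come from Table 2-1.

```python
import re

# Status values and meanings as listed in Table 2-1.
STATUS_MEANING = {
    "TEST": "Tests are executing",
    "FAIL": "Failure has been detected",
    "MCHK": "Machine check has occurred",
    "INTR": "Error interrupt has occurred",
}

# Assumed layout: CPU number, status, test number, then one more field.
DISPLAY_RE = re.compile(r"^(P[0-3])\s+(TEST|FAIL|MCHK|INTR)\s+(\S+)\s+(\S+)$")


def parse_display(text):
    """Split a control panel display string into named fields, or return None."""
    match = DISPLAY_RE.match(text.strip())
    if match is None:
        return None
    cpu, status, test, device = match.groups()
    return {
        "cpu": cpu,
        "status": status,
        "meaning": STATUS_MEANING[status],
        "test": test,
        "device": device,  # hypothetical name for the fourth field
    }
```

Calling `parse_display("P0 TEST 11 CPU00")` gives CPU P0, status TEST, test 11, and CPU00 as the fourth field; a string that does not fit the assumed layout returns `None`.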
  • Page 52: Power-Up Sequence

    Power-Up Power-Up Sequence Console and most power-up tests reside on the I/O subsystem, not on the CPU or on any other module on the system bus. Figure 2-2 Power-Up Flow (box labels include: Power-Up/Reset; XSROM tests execute; SRM console loaded; SROM code loaded into each CPU's...)
  • Page 53: Figure 2-3 Contents Of Feproms

    Power-Up XSROM. The XSROM, or extended SROM, contains back-up cache and memory tests, and a fail-safe loader. The XSROM code resides in sector 0 of FEPROM 0 on the XBUS. Sector 2 of FEPROM 0 contains a duplicate copy of the code and is used if sector 0 is bad. FEPROM.
  • Page 54: Figure 2-4 Console Code Critical Path

    Power-Up For the console to run, the path from the CPU to the XSROM must be functional. The XSROM resides in FEPROM 0 on the XBUS, off the EISA bus, off PCI 0, off IOD 0. See Figure 2-4. This path is minimally tested by SROM. Figure 2-4 Console Code Critical Path (diagram labels: Memory; CPU...)
  • Page 55 Power-Up The SROM contents are loaded into each CPU’s I-cache and executed on power-up/reset. After testing the caches on each processor chip, it tests the path to the XSROM. Once this path is tested and deemed reliable, layers of the XSROM are loaded sequentially into the processor chip on each CPU.
  • Page 56: Srom Power-Up Test Flow

    Power-Up SROM Power-Up Test Flow The SROM tests the CPU chip and the path to the XSROM. Figure 2-5 SROM Power-Up Test Flow (flowchart labels include: For each CPU; Initialize CPU chip; Turn off CPU LED; Initialize PCI-EISA bridge...)
  • Page 57 Power-Up The Alpha chip built-in self-test tests the I-cache at power-up and upon reset. Each CPU chip loads its SROM code into its I-cache and starts executing it. If the chip is partially functional, the SROM code continues to execute. However, if the chip cannot perform most of its functions, that CPU hangs and that CPU pass/fail LED remains off.
  • Page 58 Power-Up Table 2-2 lists the tests performed by the SROM. Table 2-2 SROM Tests Test Name Logic Tested D-cache RAM March D-cache access, D-cache data, D-cache address logic test D-cache Tag RAM D-cache tag store RAM, D-cache bank address logic March test S-cache Data March S-cache RAM cells, S-cache data path, S-cache address path...
  • Page 59: Srom Errors Reported

    Power-Up SROM Errors Reported The SROM reports machine checks, pending interrupt/exception errors, and errors related to corruption of FEPROM 0. If SROM errors are fatal, the particular CPU will hang and only the CPU self-test pass LEDs and/or the LEDs on the system bus to PCI bus bridge module will indicate the failure.
  • Page 60: Xsrom Power-Up Test Flow

    Power-Up XSROM Power-Up Test Flow After the SROM has completed its tests and verified the path to the FEPROM containing the XSROM code, it loads the first 8 Kbytes of XSROM into the primary CPU’s S-cache and jumps to it. Figure 2-6 XSROM Power-Up Flowchart (flowchart begins: XSROM banner to OCP/console device...)
  • Page 61: Table 2-2 Xsrom Tests

    Power-Up After jumping to the primary CPU's S-cache, the code then intentionally I-caches itself and is completely register based (no D-stream for stack or data storage is used). The only D-stream accesses are writes/reads during testing. Each FEPROM has sixteen 64-Kbyte sectors. The first sector contains B-cache tests, memory tests, and a fail-safe loader....
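The sector arithmetic is simple enough to sketch. The code below computes byte offsets within a single FEPROM from the sixteen 64-Kbyte sectors described above, and models the stated fallback from sector 0 to the duplicate XSROM copy in sector 2. Offsets are relative to the start of the FEPROM, not bus addresses, and the function names and the `sector_is_good` callback are hypothetical.

```python
SECTOR_SIZE = 64 * 1024  # each sector is 64 Kbytes
SECTOR_COUNT = 16        # sixteen sectors per FEPROM


def sector_range(sector):
    """Return (first_offset, last_offset) of a sector within one FEPROM."""
    if not 0 <= sector < SECTOR_COUNT:
        raise ValueError("sector must be in the range 0-15")
    start = sector * SECTOR_SIZE
    return start, start + SECTOR_SIZE - 1


def pick_xsrom_sector(sector_is_good):
    """Choose the XSROM copy: sector 0 first, then the duplicate in sector 2.

    sector_is_good is a caller-supplied check (hypothetical), for example
    a checksum test. Returns None if neither copy is usable.
    """
    for sector in (0, 2):
        if sector_is_good(sector):
            return sector
    return None
```

Sixteen sectors of 64 Kbytes give 1 Mbyte per FEPROM, and the duplicate XSROM copy in sector 2 begins at offset 131072.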
  • Page 62: Table 2-3 Memory Tests

    Power-Up Table 2-3 Memory Tests (columns: Test, Test Name, Logic Tested, Description). Memory Data test. Logic tested: data path to and from memory; data path on memory and RAMs. Description: 01 – FF. Errors are reported as an 8-bit binary field. A set bit indicates a module failure....
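Because the memory data test reports failures as an 8-bit field in the range 01 to FF, it can help to list which bits are set. The sketch below does only that: it assumes the value is written in hexadecimal and numbers bits from 0 at the least significant end. The mapping from bit position to memory module is not shown in this excerpt, so the function (a hypothetical name) returns bit positions rather than module names.

```python
def failing_bits(error_code):
    """Return the bit positions set in an 8-bit error field given as hex text."""
    value = int(error_code, 16)
    if not 0x01 <= value <= 0xFF:
        raise ValueError("error field must be in the range 01-FF")
    # Bit 0 is the least significant bit (an assumption of this sketch).
    return [bit for bit in range(8) if value & (1 << bit)]
```

For instance, an error field of 0A has bits 1 and 3 set, so two modules would be implicated.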
  • Page 63: Xsrom Errors Reported

    Power-Up XSROM Errors Reported The XSROM reports B-cache test errors and memory test errors. The XSROM also reports a warning if memory is illegally configured. Example 2-2 XSROM Errors Reported at Power-Up B-Cache Error (CPU Error) TEST ERR on cpu0 #CPU running the test cpu0 err#...
  • Page 64 Power-Up Sctr 1 -PAL headr CHKSM fail Sctr 1 -PAL code CHKSM fail Sctr 3 -CONSLE headr PTTRN fail Sctr 3 -CONSLE headr CHKSM fail Sctr 3 -CONSLE code CHKSM fail 2–16 DIGITAL Server 7300/7300R Series Service Manual...
  • Page 65: Console Power-Up Tests

    Power-Up Console Power-Up Tests Once the SRM console is loaded, it does further testing of each IOD. Table 2-4 describes the IOD power-up tests, and Table 2-5 describes the PCI motherboard power-up tests. Table 2-4 IOD Tests Test Test Name Description Number IOD CSR Access test...
  • Page 66: Table 2-5 Pci Motherboard Tests (B3050/B3052)

    Power-Up Table 2-5 PCI Motherboard Tests (B3050/B3052) Test Test Name Diagnostic Description Number Name PCEB pceb_diag Tests the PCI to EISA bridge chip esc_diag Tests the EISA system controller 8K NVRAM nvram_diag Tests the NVRAM Real-Time Clock ds1287_diag Tests the real-time clock chip Keyboard and i8242_diag Tests the keyboard/mouse chip...
  • Page 67: Console Device Determination

    Power-Up Console Device Determination After the SROM and XSROM have completed their tasks, the SRM console program, as it starts, determines where to send its power-up messages. Figure 2-7 Console Device Determination Flowchart (flowchart labels include: Power-Up/Reset; P00>>> Init; Console Envar; Console Envar = graphics...)
  • Page 68: Console Device Options

    Power-Up Console Device Options The console device on a DIGITAL Server 7300/7300R series system must be either a serial terminal set at 9600 baud and connected to COM1 off the server control module, or a graphics monitor off an adapter on PCI0. The console program must be AlphaBIOS. During power-up, the SROM and the XSROM always send progress and error messages to the OCP....
  • Page 69: Console Power-Up Display

    Power-Up Console Power-Up Display The last several lines of the power-up display appear on a graphics monitor, and parts of the display print to the control panel display. Example 2-3 Power-Up Display SROM V1.0 on cpu0 SROM V1.0 on cpu1 SROM V1.0 on cpu2 SROM V1.0 on cpu3 (2)...
  • Page 70 Power-Up At power-up or reset, the SROM code on each CPU module is loaded into that module’s I-cache and tests the module. If all tests pass, the processor’s LED lights. If any test fails, the LED remains off and power-up testing terminates on that CPU. The first determination of the primary processor is made, and the primary processor executes a loopback test to each PCI bridge.
  • Page 71 Power-Up Example 2-3 Power-Up Display (Continued) (6) starting console on CPU 0 (7) sizing memory 128 MB SYNC 128 MB SYNC starting console on CPU 1 starting console on CPU 2 starting console on CPU 3 probing IOD1 hose 1 bus 0 slot 1 - NCR 53C810 bus 0 slot 2 - DECchip 21041-AA bus 0 slot 3 - NCR 53C810...
  • Page 72 Power-Up (6) The final primary CPU determination is made. The primary CPU unloads PALcode and decompression code from the FEPROM on the PCI 0 to its B-cache. The primary CPU then jumps to the PALcode to start the SRM console. The primary CPU prints a message indicating that it is running the console.
  • Page 73: Fail-Safe Loader

    Power-Up Fail-Safe Loader The fail-safe loader is a software routine that loads the SRM console image from floppy. Once the console is running you will want to run LFU to update FEPROM 0 with a new image. NOTE: FEPROM 0 contains images of the SROM, XSROM, decompression, and SRM console code.
  • Page 74 Power-Up 2–26 DIGITAL Server 7300/7300R Series Service Manual...
  • Page 75: Troubleshooting

    Troubleshooting This chapter describes troubleshooting during power-up and booting, as well as diagnostics for DIGITAL Server 7300/7300R series systems. The chapter covers the following topics: • Troubleshooting with LEDs • Troubleshooting Power Problems • Troubleshooting with the Maintenance Bus (I2C Bus) •...
  • Page 76: Troubleshooting With Leds

    Troubleshooting Troubleshooting with LEDs During power-up, reset, initialization, or testing, diagnostics are run on CPUs, memories, bridge modules, PCI motherboards, and sometimes options. The following sections describe possible problems that can be identified by checking LEDs. Figure 3-1 CPU and Bridge Module LEDs Bridge Module LEDs CPU LEDs (IOD 0 &...
  • Page 77: Processor (Cpu) Leds

    Troubleshooting Processor (CPU) LEDs If the CPU STP LED on any processor (CPU) module is lit, that CPU chip is functioning properly. If the CPU STP LED is off, that CPU may or may not be functioning. You can use the Halt button on the OCP to prevent the AlphaBIOS console (which turns off the CPU STP LED) from booting, thus assuring the validity of the CPU STP LED.
  • Page 78: Cabinet Power And Fan Leds

    Troubleshooting Cabinet Power and Fan LEDs Figure 3-2 shows the cabinet power and fan LEDs. Figure 3-2 Cabinet Power and Fan LEDs Fan LED Power LED PK-0664-96 A cabinet system has three exhaust fans at the top of the cabinet. They are powered from a small power supply in the fan tray.
  • Page 79: Troubleshooting Power Problems

    Troubleshooting Troubleshooting Power Problems Power problems can occur before the system is up or while the system is running. If a system stops running, make a habit of checking the PCM. Power Problem List The system will halt for the following: 1....
  • Page 80: If Power Problem Occurs At Power-Up

    Troubleshooting If Power Problem Occurs at Power-Up If the system has a power problem on a cold start, the PCM LEDs are not valid until after DCOK_SENSE has been asserted. The cause is one of the following: • Broken system fan •...
  • Page 81: Power Control Module Leds

    Troubleshooting Power Control Module LEDs The PCM has 11 LEDs visible through the system card cage. The LED display shows the relative placement of the LEDs. Figure 3-3 PCM LEDs DCOK_SENSE PS0_OK PS1_OK PS2_OK TEMP_OK CPUFAN_OK SYSFAN_OK CS_FAN0 CS_FAN1 CS_FAN2 C_FAN3 Normally On Tested at one-second intervals...
  • Page 82: Table 3-1 Power Control Module Led States

    Troubleshooting Table 3-1 Power Control Module LED States State Description DCOK_SENSE Both +5.0V and +3.43V are present and within limits. PS0_OK Power supply 0 is present and has asserted POK_H. PS1_OK Power supply 1 is present and has asserted POK_H. Power supply 1 not present....
  • Page 83: Troubleshooting With The Maintenance Bus

    Troubleshooting Troubleshooting with the Maintenance Bus (I2C Bus) The I2C bus (referred to as the “I squared C bus”) is a small internal maintenance bus used to monitor system conditions scanned by the power control module, write the fault display, store error state, and track configuration information in the system.
  • Page 84: Monitoring System Conditions

    Troubleshooting Monitoring System Conditions The I2C bus monitors the state of system conditions scanned by the PCM. There are two registers on the PCM: One records the state of the fans and power supplies and is latched when there is a fault. The other causes an interrupt on the I2C bus when a CPU or system fan fails, an over-temperature condition exists, or power supplied to the system is out of tolerance.
  • Page 85: Running Diagnostics - Test Command

    Troubleshooting Running Diagnostics — Test Command The test command runs diagnostics on the entire system, CPU devices, memory devices, and the PCI I/O subsystem. The test command runs only from the SRM console. Ctrl/C stops the test. Example 3-1 Test Command Syntax P00>>>...
  • Page 86: Testing An Entire System

    Troubleshooting Testing an Entire System A test command with no modifiers runs all exercisers for subsystems and devices on the system. I/O devices tested are supported boot devices. The test runs for 10 minutes. Example 3-2 Sample Test Command P00>>> test Console is in diagnostic mode System test, runtime 600 seconds Type ^C to stop testing...
  • Page 87 Troubleshooting Starting processor/cache thrasher on each CPU.. Testing SCSI disks (read-only) No CD/ROM present, skipping embedded SCSI test Testing other SCSI devices (read-only).. Testing floppy drive (dva0, read-only) Program Device Pass Hard/Soft Bytes Written Bytes Read -------- ------------ ------------ ------ --------- ------------- ------------ 00003047 memtest memory 134217728...
  • Page 88 Troubleshooting Program Device Pass Hard/Soft Bytes Written Bytes Read -------- ------------ ------------ ------ --------- ------------- ------------ 00003047 memtest memory 727711744 727711744 00003050 memtest memory 1054 1104015744 1104015744 00003059 memtest memory 1039 1088289024 1088289024 00003062 memtest memory 1041 1090385920 1090385920 00003084 memtest memory 467607808 467607808...
  • Page 89: Testing Memory

    Troubleshooting Testing Memory The test mem command tests individual memory devices or all memory. The test shown in Example 3-3 runs for 2 minutes. Example 3-3 Sample Test Memory Command P00>>> test memory Console is in diagnostic mode System test, runtime 120 seconds Type ^C to stop testing Starting background memory test, affinity to all CPUs..
  • Page 90 Troubleshooting 000046fb memtest memory 220174080 220174080 Program Device Pass Hard/Soft Bytes Written Bytes Read -------- ------------ ------------ ------ --------- ------------- ------------ 000046d7 memtest memory 404750336 404750336 000046e0 memtest memory 1011 1058932480 1058932480 000046e9 memtest memory 1000 1047399552 1047399552 000046f2 memtest memory 1046351104 1046351104 000046fb...
  • Page 91 Troubleshooting Memory test complete Test time has expired... P00>>> DIGITAL Server 7300/7300R Series Service Manual 3-17...
  • Page 92: Testing Pci Buses And Devices

    Troubleshooting Testing PCI Buses and Devices The test pci command tests PCI buses and devices. The test runs for 2 minutes. Example 3-4 Sample Test Command for PCI P00>>> test pci* Console is in diagnostic mode System test, runtime 120 seconds Type ^C to stop testing Configuring all PCI buses..
  • Page 93 Troubleshooting Program Device Pass Hard/Soft Bytes Written Bytes Read -------- ------------ ------------ ------ --------- ------------- ------------ 00002c29 exer_kid dkb200.2.0.3 14642176 00002c2a exer_kid dkb400.4.0.3 14642176 00002c5e exer_kid dva0.0.0.100 Program Device Pass Hard/Soft Bytes Written Bytes Read -------- ------------ ------------ ------ --------- ------------- ------------ 00002c29 exer_kid dkb200.2.0.3 48689152...
  • Page 95: Power System

    Power System This chapter describes the DIGITAL Server 7300/7300R series power system: • Power Supply • Power Control Module Features • Power Circuit and Cover Interlocks • Power-Up/Down Sequence • Cabinet Power Configuration Rules • Pedestal Power Configuration Rules (North America and Japan) •...
  • Page 96: Power Supply

    Power System Power Supply Power supply outputs are shown in Figure 4-1. Figure 4-1 Power Supply Outputs (labels: Misc. Signal, Current share, +5V/Return, +3.4V/Return, +3.4V/Return, +12V/Return)
  • Page 97 Power System Power Supply Features • 90–264 Vrms input • 450 watts output. Output voltages are as follows: Output Voltage Min. Voltage Max. Voltage Max. Current +5.0 4.85 5.25 +3.43 3.400 3.465 11.5 12.6 –12 –10.9 –13.2 –5.0 –4.6 –5.5 Vaux 0.05 •...
  • Page 98: Power Control Module Features

    Power System Power Control Module Features The power control module (54-24117-01) is located behind the B3040-AA module, the system bus to PCI bus bridge module. Figure 4-2 Power Control Module (labels: System Motherboard, Power Control Module Slot) The power control module performs the following functions:
  • Page 99 Power System • Controls the power-up/down sequencing. • Monitors the combined output of power supplies VDD (3.43V) and VCC (5.0V). It asserts DCOK_SENSE if both voltages are within range, and asserts POWER_FAULT_L, causing an immediate power shutdown, if either is not. •...
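    The voltage monitoring described above can be illustrated with a minimal Python sketch. It is not the PCM's actual logic: the function name is hypothetical, and the acceptance windows are assumed to equal the +5.0 V and +3.43 V minimum/maximum output limits listed under Power Supply Features, since the PCM's real thresholds are not stated here.

```python
# Hypothetical sketch of the PCM rail check; not taken from the manual.
# ASSUMPTION: the PCM accepts the same min/max window that the power
# supply feature table lists for the +5.0 V and +3.43 V outputs.
VCC_RANGE = (4.85, 5.25)    # +5.0 V output, min/max volts
VDD_RANGE = (3.400, 3.465)  # +3.43 V output, min/max volts


def in_range(volts, limits):
    """Return True when volts lies inside the inclusive (low, high) window."""
    low, high = limits
    return low <= volts <= high


def pcm_rail_check(vdd, vcc):
    """Return the logical (asserted or not) state of both signals.

    POWER_FAULT_L is active low on the hardware; this sketch only
    reports whether it would be asserted.
    """
    ok = in_range(vdd, VDD_RANGE) and in_range(vcc, VCC_RANGE)
    return {"DCOK_SENSE": ok, "POWER_FAULT_L_asserted": not ok}


if __name__ == "__main__":
    print(pcm_rail_check(3.43, 5.0))   # both rails nominal
    print(pcm_rail_check(3.30, 5.0))   # VDD below the assumed window
```

    A single out-of-range rail is enough to flag a fault in this sketch, matching the "if either is not" wording.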
  • Page 100: Power Circuit And Cover Interlocks

    Power System Power Circuit and Cover Interlocks Figure 4-3 is a diagram of the power circuit. Note that B305n in the diagram stands for either the B3050-AA or B3052-AA PCI Motherboard. Figure 4-3 Power Circuit Diagram 17-04217-01 Logic Power Supply Cover Interlocks 17-04201-01...
  • Page 101 Power System Figure 4-3 shows the distribution of power throughout the system drawer. DC power to the system is interrupted by an open in the circuit, by the PCM signal POWER_FAULT_L, or by the SCM signal RSM_DC_EN_L. Opens can be caused by the On/Off button or by the cover interlocks. The PCM asserts POWER_FAULT_L if it detects a fault; RSM_DC_EN_L is controlled remotely.
  • Page 102: Power-Up/Down Sequence

    Power System Power-Up/Down Sequence The On/Off button can be controlled manually or remotely. The button is on the OCP. Remote power control is provided through the remote I/O port connected to the PCI. The power-up/down sequence flow is shown below. Figure 4-4 Power Up/Down Sequence Flowchart Apply AC Power...
  • Page 103 Power System hard fault on power-up, the power supplies shut down immediately. If there is not a hard fault on power-up, the power system powers up and remains up until the system is shut off or the PCM senses a fault. If the PCM senses a power fault, the power system attempts to restore power and will restore power if the fault is not sensed a second time.
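    The fault handling described above can be walked through with a small Python sketch. The state names and the function are hypothetical, and one behavior is assumed rather than taken from the text: a fault sensed again during the restore attempt is treated as leaving the power off.

```python
# Hypothetical walk-through of the power-up/down fault handling.
# ASSUMPTION: if the fault is sensed a second time during the restore
# attempt, the power system stays shut down.
def power_sequence(hard_fault_on_power_up, fault_samples):
    """Return the list of power states visited.

    fault_samples is an iterable of booleans, one per PCM check after
    power-up; True means the PCM sensed a power fault on that check.
    """
    if hard_fault_on_power_up:
        return ["SHUTDOWN"]          # supplies shut down immediately

    states = ["UP"]
    restoring = False
    for fault in fault_samples:
        if not fault:
            if restoring:            # fault not sensed a second time
                states.append("UP")
                restoring = False
            continue
        if restoring:                # fault sensed a second time
            states.append("SHUTDOWN")
            return states
        states.append("RESTORE_ATTEMPT")
        restoring = True
    return states


if __name__ == "__main__":
    print(power_sequence(False, [False, True, False]))
    print(power_sequence(False, [True, True]))
```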
  • Page 104: Cabinet Power Configuration Rules

    Power System Cabinet Power Configuration Rules There are different cabinets with different power delivery systems. See Chapter 1 for a description of the differences. A bar code label designating the cabinet variation is located inside the back door in the upper left corner of the bezel holding the door. The two variations are: H9A10-EN and H9A10-EP.
  • Page 105: Figure 4-6 -En Three Drawer Cabinet Power Configuration

    Power System Figure 4-6 shows an -EN three-drawer cabinet power configuration. The three-drawer -EN is shown with the H7600-AA controller. Figure 4-6 -EN Three Drawer Cabinet Power Configuration (labels: 2 Power Controllers, System Drawer, System Drawer, 3.67 Arms...)
  • Page 106: Pedestal Power Configuration Rules (North America And Japan)

    Power System Pedestal Power Configuration Rules (North America and Japan) Figure 4-7 shows pedestal power distribution in North America and Japan. Figure 4-7 Pedestal Power Distribution (N.A. and Japan) (labels: StorageWorks, StorageWorks, Power Strips, 0.75 Arms, 0.75 Arms, 0.75 Arms...)
  • Page 107 Power System Power Strip A single AC power strip supports one system drawer and one StorageWorks shelf. When two AC power strips are used, the combined AC input line current cannot exceed the site circuit breaker restriction, assuming both strips are plugged into the same circuit.
  • Page 108: Pedestal Power Configuration Rules (Europe And Asia Pacific)

    Power System Pedestal Power Configuration Rules (Europe and Asia Pacific) Figure 4-8 shows pedestal power distribution in Europe and Asia/Pacific. Figure 4-8 Pedestal Power Distribution (Europe and AP) (labels: Power Strips, StorageWorks, StorageWorks, 0.34 Arms, 0.34 Arms, 0.34 Arms, System Drawer...)
  • Page 109: Error Detection With Error Registers

    Error Detection with Error Registers This chapter describes error detection with error registers. It includes the following topics: • Overview of Error Detection • Error Registers • Troubleshooting IOD-Detected Errors • Double Error Halts and Machine Checks While in PAL Mode DIGITAL Server 7300/7300R Series Service Manual 5–1...
  • Page 110: Overview Of Error Detection

    Error Detection with Error Registers Overview of Error Detection Error detection is performed by CPUs, the IOD, and the EISA to PCI bus bridge. (The IOD is the acronym used by software to refer to the system bus to PCI bus bridge.) Figure 5-1 Error Detector Placement Memory...
  • Page 111 Error Detection with Error Registers Lines Protected Device ECC Protected System bus data lines IOD on every transaction, CPU when using the bus B-cache IOD on every transaction, CPU when using the bus Parity Protected System bus command/address lines IOD on every transaction, CPU when using the bus Duplicate tag store IOD on every transaction,...
  • Page 112 Error Detection with Error Registers Internal EV5 or EV56 cache errors CPU B-cache module errors • System-dependent errors detected by both the CPU and IOD. These errors are system machine checks and are: CPU-detected external reference errors IOD hard error interrupts The IOD can detect hard errors on either side of the bridge.
  • Page 113: Error Registers

    Error Detection with Error Registers Error Registers The DIGITAL Server 7300/7300R include registers that hold error information that you can use for troubleshooting. These registers include: • External Interface Status Register – EI_STAT • External Interface Address Register - EI_ADDR •...
  • Page 114 Error Detection with Error Registers External Interface Status Register – EI_STAT The EI_STAT register is a read-only register that is unlocked and cleared by any PALcode read. Subject to some restrictions, a read of EI_STAT also unlocks the EI_ADDR, BC_TAG_ADDR, and FILL_SYN registers. EI_STAT is not unlocked or cleared by reset.
  • Page 115 Error Detection with Error Registers Fill data from B-cache or main memory can have correctable or non-correctable errors in ECC mode. In parity mode, fill data parity errors are treated as non-correctable hard errors. System address/command parity errors are always treated as non-correctable hard errors, irrespective of the mode.
  • Page 116: Table 5-1 External Interface Status Register

    Error Detection with Error Registers Table 5-1 External Interface Status Register Name Bits Type Description COR_ECC_ERR <31> Correctable ECC Error. Indicates that fill data received from outside the CPU contained a correctable ECC error. EI_ES <30> External Interface Error Source. When set, indicates that the error source is fill data from main memory or a system address/command parity error.
  • Page 117 Error Detection with Error Registers Table 5-1 External Interface Status Register (continued) Name Bits Type Description <63:36> All ones. SEO_HRD_ERR <35> Second External Interface Hard Error. Indicates that a fill from B-cache or main memory, or a system address/command received by the CPU, has a hard error while one of the hard error bits in the EI_STAT register is already set.
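    As an illustration of reading Table 5-1, the following Python sketch decodes only the EI_STAT fields named in the excerpts above (COR_ECC_ERR <31>, EI_ES <30>, SEO_HRD_ERR <35>, and the all-ones field <63:36>). The function name is hypothetical. The sample value is built from the cns$ei_stat longwords in the INFO 3 example (04ffffff and fffffff0), assuming the "+4" entry is the upper longword.

```python
# Hypothetical decoder for the EI_STAT bits named in Table 5-1.
# Only fields that appear in the excerpts are decoded; other bits of
# the register are deliberately ignored.
EI_STAT_BITS = {
    "COR_ECC_ERR": 31,   # correctable ECC error in fill data
    "EI_ES": 30,         # error source
    "SEO_HRD_ERR": 35,   # second external interface hard error
}


def decode_ei_stat(value):
    """Return a dict of flag name -> bool for a 64-bit EI_STAT value."""
    flags = {name: bool((value >> bit) & 1)
             for name, bit in EI_STAT_BITS.items()}
    # Bits <63:36> are documented as all ones (28 bits).
    flags["upper_bits_all_ones"] = (value >> 36) == (1 << 28) - 1
    return flags


if __name__ == "__main__":
    # ASSUMPTION: upper longword fffffff0, lower longword 04ffffff.
    sample = (0xFFFFFFF0 << 32) | 0x04FFFFFF
    for name, state in decode_ei_stat(sample).items():
        print(f"{name}: {state}")
```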
  • Page 118: External Interface Address Register - Ei_Addr

    Error Detection with Error Registers External Interface Address Register - EI_ADDR The EI_ADDR register contains the physical address associated with errors reported by the EI_STAT register. It is unlocked by a read of the EI_STAT Register. This register is meaningful only when one of the error bits is set. Address FF FFF0 0148 Access...
  • Page 119: Table 5-2 Loading And Locking Rules For External Interface Registers

    Error Detection with Error Registers Table 5-2 Loading and Locking Rules for External Interface Registers (columns: Correctable Error, Non-correctable Error, Second Hard Error, Load Register, Lock Register, Action When EI_STAT is Read) Clears and possibly unlocks all registers Clears and possibly unlocks all registers...
  • Page 120: Mc Error Information Register 0 (Mc_Err0 - Offset = 800)

    Error Detection with Error Registers MC Error Information Register 0 (MC_ERR0 - Offset = 800) The low-order MC bus (system bus) address bits are latched into this register when the system bus to PCI bus bridge detects an error event. If the event is a hard error, the register bits are locked.
  • Page 121 Error Detection with Error Registers MC Error Information Register 0 (MC_ERR0 - Offset = 800) The low-order MC bus (system bus) address bits are latched into this register when the system bus to PCI bus bridge detects an error event. If the event is a hard error, the register bits are locked.
  • Page 122: Mc Error Information Register 1 (Mc_Err1 - Offset = 840)

    Error Detection with Error Registers MC Error Information Register 1 (MC_ERR1 - Offset = 840) The high-order MC bus (system bus) address bits and error symptoms are latched into this register when the system bus to PCI bus bridge detects an error. If the event is a hard error, the register bits are locked.
  • Page 123 Error Detection with Error Registers Table 5-5 MC Error Information Register 1 Name Bits Type Initial State Description VALID <31> Logical OR of bits <30:23> in the CAP_ERR Register. Set if MC_ERR0 and MC_ERR1 contain a valid address. Reserved <30:21> Dirty <20>...
  • Page 124: Cap Error Register (Cap_Err - Offset = 880)

    Error Detection with Error Registers CAP Error Register (CAP_ERR - Offset = 880) CAP_ERR is used to log information pertaining to an error detected by the CAP or MDP ASIC. If the error is a hard error, the register is locked. All bits, except the LOST_MC_ERR bit, are locked on hard errors.
  • Page 125: Table 5-6 Cap Error Register

    Error Detection with Error Registers Table 5-6 CAP Error Register Name Bits Type Initial State Description MC_ERR VALID <31> Logical OR of bits <30:23> in this register. When set, MC_ERR0 and MC_ERR1 are latched. RDSB <30> RW1C Non-correctable ECC error detected by MDPB.
  • Page 126 Error Detection with Error Registers Table 5-6 CAP Error Register (continued) Name Bits Type Initial State Description LOST_MC_ERR <24> RW1C Set when an error is detected but not logged because the associated symptom fields and registers are locked with the state of an earlier error.
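    The relationship between the MC_ERR VALID bit and bits <30:23> can be checked with a short Python sketch. It decodes only the CAP_ERR bits named in the excerpts above; the function name is hypothetical, and the sample value 84000000 is the CAP_ERR value shown for IOD 1 in the INFO 5 example.

```python
# Hypothetical decoder for the CAP_ERR bits named in Table 5-6.
# Bits not described in the excerpts are left undecoded.
CAP_ERR_BITS = {
    "MC_ERR_VALID": 31,  # logical OR of bits <30:23>
    "RDSB": 30,          # non-correctable ECC error detected by MDPB
    "LOST_MC_ERR": 24,   # error detected but not logged (registers locked)
}


def decode_cap_err(value):
    """Return named flags plus a consistency check of the VALID bit."""
    flags = {name: bool((value >> bit) & 1)
             for name, bit in CAP_ERR_BITS.items()}
    summary = (value >> 23) & 0xFF          # bits <30:23>
    flags["valid_matches_bits_30_23"] = flags["MC_ERR_VALID"] == (summary != 0)
    return flags


if __name__ == "__main__":
    for name, state in decode_cap_err(0x84000000).items():
        print(f"{name}: {state}")
```

    If the consistency flag is False for a value read from real hardware, the value was probably transcribed incorrectly and is worth re-reading before acting on it.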
  • Page 127: Pci Error Status Register 1 (Pci_Err1 - Offset = 1040)

    Error Detection with Error Registers PCI Error Status Register 1 (PCI_ERR1 - Offset = 1040) PCI_ERR1 is used by the system bus to PCI bus bridge to log bus address <31:0> pertaining to an error condition logged in CAP_ERR. This register always captures PCI address <31:0>, even for a PCI DAC cycle.
  • Page 128: Troubleshooting Iod-Detected Errors

    Error Detection with Error Registers Troubleshooting IOD-Detected Errors Step 1 Read the CAP Error Registers on both PCI bridges (F9E0000880 and FBE0000880). If one or both of these registers shows an error, match the register contents with the data pattern and perform the action indicated.
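    The two CAP Error Register addresses in Step 1 are each an IOD base address plus the register offset of 880. The Python sketch below shows that arithmetic for the registers whose offsets appear in this chapter. The function name is hypothetical; the IOD 1 base (fbe0000000) appears in the INFO 5 example, while the IOD 0 base is inferred here by subtracting 880 from F9E0000880.

```python
# Hypothetical helper: IOD register address = base address + offset.
# ASSUMPTION: the IOD 0 base is inferred from F9E0000880 minus 0x880.
IOD_BASE = {0: 0xF9E0000000, 1: 0xFBE0000000}

# Offsets as given in the register section headings of this chapter.
REG_OFFSET = {
    "MC_ERR0": 0x800,
    "MC_ERR1": 0x840,
    "CAP_ERR": 0x880,
    "PCI_ERR1": 0x1040,
}


def register_address(iod, name):
    """Return the physical address of register `name` on bridge `iod`."""
    return IOD_BASE[iod] + REG_OFFSET[name]


if __name__ == "__main__":
    for iod in (0, 1):
        for name in REG_OFFSET:
            print(f"IOD{iod} {name}: {register_address(iod, name):010X}")
```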
  • Page 129: System Bus Ecc Error

    Error Detection with Error Registers System Bus ECC Error Step 2 Read the MC_ERR1 register and match the contents with the data pattern. Perform the action indicated. Table 5-9 System Bus ECC Error Data Pattern MC_ERR1 Data Pattern Most Likely Cause Action For Memory Read 1000 0000 0000 xxxx xxxx 10xx 0xxx xxxx...
  • Page 130: System Bus Nonexistent Address Error

    Error Detection with Error Registers System Bus Nonexistent Address Error Step 3 Determine which node (if any) should have responded to the command/address identified in MC_ERR1. Perform the action indicated. Table 5-10 System Bus Nonexistent Address Error Troubleshooting MC_ERR1 Data Pattern Most Likely Cause Action 1000 0000 000x xxxx xxxx xxxx 0xxx xxxx...
  • Page 131: System Bus Address Parity Error

    Error Detection with Error Registers System Bus Address Parity Error Step 4 Determine which node put the bad command/address on the system bus identified in MC_ERR1. Perform the action indicated. Table 5-11 Address Parity Error Troubleshooting MC_ERR1 Data Pattern Most Likely Cause Action 1000 0000 000x xxx0 10xx xxxx xxxx xxxx Data sourced by MID = 2...
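    The MC_ERR1 data patterns in the troubleshooting tables use x for bits that do not matter. Matching a register value against such a pattern can be sketched in Python as follows. The function name is hypothetical, and the pattern strings are copied from the table rows in the excerpts above.

```python
# Hypothetical matcher for the 32-bit data patterns in the
# troubleshooting tables, where 'x' marks a don't-care bit.
def matches_pattern(value, pattern):
    """Return True if the 32-bit value fits the pattern (MSB first)."""
    bits = pattern.replace(" ", "")
    if len(bits) != 32:
        raise ValueError("pattern must describe exactly 32 bits")
    mask = 0      # 1 where the pattern cares about the bit
    expected = 0  # required bit values at the cared-about positions
    for ch in bits:
        mask <<= 1
        expected <<= 1
        if ch in "01":
            mask |= 1
            expected |= int(ch)
        elif ch not in "xX":
            raise ValueError(f"unexpected pattern character {ch!r}")
    return (value & mask) == expected


if __name__ == "__main__":
    nxm_row = "1000 0000 000x xxxx xxxx xxxx 0xxx xxxx"     # Table 5-10 row
    parity_row = "1000 0000 000x xxx0 10xx xxxx xxxx xxxx"  # Table 5-11 row
    mc_err1 = 0x800E88FD  # MC_ERR1 value from the INFO 5 example
    print(matches_pattern(mc_err1, nxm_row))
    print(matches_pattern(mc_err1, parity_row))
```

    A pattern match only identifies which table row applies; the action to take is still the one listed in that row.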
  • Page 132 Error Detection with Error Registers PIO Buffer Overflow Error (PIO_OVFL) Step 5 Enter the value of the CAP_CTRL register bits<19:16> (Actual_PEND_NUM) in the following formula. Compare the results as indicated in Table 5-12 to determine the most likely cause of the error. When an IOD is implicated in the analysis of the error, replace the one that captured the error in its CAP Error Register.
  • Page 133: Page Table Entry Invalid Error

    Error Detection with Error Registers Page Table Entry Invalid Error Step 6 This error is almost always a software problem. However, if the software is known to be good and the hardware is suspected, swap the IOD. PCI Master Abort Step 7 Master aborts normally occur when the operating system is sizing the PCI bus.
  • Page 134: Broken Memory

    Error Detection with Error Registers Broken Memory Step 10 Refer to the following sections. For a Read Data Substitute Error (Non-Correctable ECC Error) When a read data substitute (RDS) error occurs, determine which memory module pair caused the error as follows: 1.
  • Page 135: Table 5-13 Ecc Syndrome Bits Table

    Error Detection with Error Registers 3. When you have isolated the failing memory pair, determine which of the two modules is bad. (You cannot do this if the operating system is Windows NT.) Read the CPU FIL SYNDROME Register. If this register is non-zero, use the ECC syndrome bits in Table 5-13 to determine which module had the single-bit error.
  • Page 136: Command Codes

    Error Detection with Error Registers Command Codes Table 5-14 shows the codes for transactions on the system bus and how they are affected by the commander in charge of the bus during the transaction. The command is a six-bit field in the command address (bits<5:0>). Bit-to-text translations give six-bit data (although the top two bits may or may not be relevant).
  • Page 137: Node Ids

    Error Detection with Error Registers Table 5-14 Decoding Commands (continued) MC_C Description No B- B-Cache Cache 3 2 1 0 <39> 0 1 1 1 0/2 7 Write Merge - Mem 1 0 0 0 Read0 - Mem 1 0 0 0 Read0 - I/O 1 0 0 1 Read1 - Mem...
  • Page 138: Table 5-15 Node Ids

    Error Detection with Error Registers Table 5-15 Node IDs Node ID <2:0> Six Bit (Hex) Node 0 0 0 0 0 1 Memory 0 1 0 CPU0 0 1 1 CPU1 1 0 0 IOD0 1 0 1 IOD1 1 1 0 CPU2 1 1 1 CPU3...
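    The three-bit node IDs in Table 5-15 translate directly into a lookup table. The Python sketch below covers only the seven assignments visible in the excerpt; the function name is hypothetical, and ID 000, which has no node beside it in the excerpt, is reported as not listed.

```python
# Hypothetical lookup for the node ID <2:0> values in Table 5-15.
NODE_IDS = {
    0b001: "Memory",
    0b010: "CPU0",
    0b011: "CPU1",
    0b100: "IOD0",
    0b101: "IOD1",
    0b110: "CPU2",
    0b111: "CPU3",
}


def node_name(node_id):
    """Return the node for a 3-bit ID, or 'not listed' if unassigned."""
    return NODE_IDS.get(node_id & 0b111, "not listed")


if __name__ == "__main__":
    for node_id in range(8):
        print(f"{node_id:03b} -> {node_name(node_id)}")
```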
  • Page 139: Double Error Halts And Machine Checks While In Pal Mode

    Error Detection with Error Registers Double Error Halts and Machine Checks While in PAL Mode Two error cases require special attention: double error halts and machine checks while the machine is in PAL mode. Information is available that can help determine what error occurred.
  • Page 140: Double Error Halt

    Error Detection with Error Registers Double Error Halt A double error halt occurs under the following conditions: • A machine check occurs. • PAL completes its tasks and returns control of the system to the operating system. • A second machine check occurs before the operating system completes its tasks. The machine returns to the console and displays the following message: halt code = 6 double error halt...
  • Page 141 Error Detection with Error Registers The info 3 command (Example 5-1) causes the SRM console to read the “impure area,” which contains the state of the CPU before it entered PAL. Example 5-1 INFO 3 Command P00>>> info 3 cpu00 per_cpu impure area 00004400 cns$flag...
  • Page 142 Error Detection with Error Registers cns$astrr 00000000 : 0370 cns$astrr+4 00000000 : 0374 cns$isr 00400000 : 0378 cns$isr+4 00000000 : 037c cns$ivptbr 00000000 : 0380 cns$ivptbr+4 00000002 : 0384 cns$mcsr 00000000 : 0388 cns$mcsr+4 00000000 : 038c cns$dc_mode 00000001 : 0390 cns$dc_mode+4 00000000 : 0394 cns$maf_mode...
  • Page 143 Error Detection with Error Registers cns$sc_ctl 0000f000 : 03f0 cns$sc_ctl+4 00000000 : 03f4 cns$bc_tag_addr ff7fefff : 03f8 cns$bc_tag_addr+4 ffffffff : 03fc cns$ei_stat 04ffffff : 0400 cns$ei_stat+4 fffffff0 : 0404 cns$fill_syn 000000a7 : 0410 cns$fill_syn+4 00000000 : 0414 cns$ld_lock 0004eaef : 0418 cns$ld_lock+4 ffffff00 : 041c DIGITAL Server 7300/7300R Series Service Manual 5–35...
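    Each register in the INFO 3 output is listed as two longwords: the name, then the name with "+4". A captured console log can be combined into 64-bit values with a Python sketch like the one below. The function name is hypothetical, and two things are assumed: that the console prints one "name value : offset" entry per line, and that the "+4" entry holds the upper 32 bits.

```python
import re

# Hypothetical parser for captured "info 3" impure-area output.
# ASSUMPTIONS: one "name value : offset" entry per line, and the
# "+4" entry is the upper longword of the 64-bit register.
ENTRY = re.compile(
    r"^\s*(\S+?)(\+4)?\s+([0-9a-fA-F]{8})\s*:\s*([0-9a-fA-F]{4})\s*$"
)


def parse_impure_area(text):
    """Return a dict of register name -> 64-bit integer value."""
    regs = {}
    for line in text.splitlines():
        match = ENTRY.match(line)
        if not match:
            continue                      # skip headers and blank lines
        name, upper, value, _offset = match.groups()
        shift = 32 if upper else 0
        regs[name] = regs.get(name, 0) | (int(value, 16) << shift)
    return regs


if __name__ == "__main__":
    capture = """\
cns$ei_stat       04ffffff : 0400
cns$ei_stat+4     fffffff0 : 0404
cns$fill_syn      000000a7 : 0410
cns$fill_syn+4    00000000 : 0414
"""
    for name, value in parse_impure_area(capture).items():
        print(f"{name}: {value:016x}")
```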
  • Page 144 Error Detection with Error Registers Example 5-2 INFO 5 Command P00>>> info 5 cpu00 per_cpu logout area 00004838 mchk$crd_flag 00000320 : 0000 mchk$crd_flag+4 00000000 : 0004 mchk$crd_offsets 00000118 : 0008 mchk$crd_offsets+4 00001328 : 000c mchk$crd_mchk_code 00980000 : 0010 mchk$crd_mchk_code+4 00000000 : 0014 mchk$crd_ei_stat eba00003 : 0018 mchk$crd_ei_stat+4...
  • Page 145 Error Detection with Error Registers mchk$dc_perr_stat+4 00000000 : 0154 mchk$va ff8000a0 : 0158 mchk$va+4 ffffffff : 015c mchk$mm_stat 000149d0 : 0160 mchk$mm_stat+4 00000000 : 0164 mchk$sc_addr 0001904f : 0168 mchk$sc_addr+4 ffffff00 : 016c mchk$sc_stat 00000000 : 0170 mchk$sc_stat+4 00000000 : 0174 mchk$bc_tag_addr ff7fefff : 0178 mchk$bc_tag_addr+4...
  • Page 146 Error Detection with Error Registers IOD: 1 base address: fbe0000000 WHOAMI: 0000003a PCI_REV: 06000221 CAP_CTL: 02490fb1 HAE_MEM: 00000000 HAE_IO: 00000000 INT_CTL: 00000003 INT_REQ: 00800000 INT_MASK0:00010000 INT_MASK1:00000000 MC_ERR0: e0000000 MC_ERR1: 800e88fd CAP_ERR: 84000000 PCI_ERR: 00000000 MDPA_STAT:00000000 MDPA_SYN: 00000000 MDPB_STAT:00000000 MDPB_SYN: 00000000 5–38 DIGITAL Server 7300/7300R Series Service Manual...
  • Page 147 Error Detection with Error Registers Example 5-3 INFO 8 Command P00>>> info 8 IOD 0 WHOAMI: 0000003a PCI_REV: 06008221 CAP_CTL: 02490fb1 HAE_MEM: 00000000 HAE_IO: 00000000 INT_CTL: 00000003 INT_REQ: 00000000 INT_MASK0: 00210000 INT_MASK1:00000000 MC_ERR0: e0000000 MC_ERR1: 000e88fd CAP_ERR: 00000000 PCI_ERR: 00000000 MDPA_STAT: 00000000 MDPA_SYN: 00000000 MDPB_STAT:00000000 MDPB_SYN: 00000000 INT_TARG: 0000003a INT_ADR:...
  • Page 148 Error Detection with Error Registers INT_TARG: 0000003a INT_ADR: 00006000 INT_ADR_EXT: 0 0000000 PERF_MON: 004e31a6 PERF_CONT:00000000 CAP_DIAG: 00000000 DIAG_CHKA:10000000 DIAG_CHKB:10000000 SCRATCH: 00000000 W0_BASE: 00100001 W0_MASK: 00000000 T0_BASE: 00001000 W1_BASE: 00800001 W1_MASK: 00700000 T1_BASE: 00008000 W2_BASE: 80000001 W2_MASK: 3ff00000 T2_BASE: 00000000 W3_BASE: 00000000 W3_MASK: 1ff00000 T3_BASE: 0000a000...
  • Page 149: Removal And Replacement

    Removal and Replacement This chapter describes removal and replacement procedures for field-replaceable units (FRUs). It covers the following topics: • System Safety • FRU List • Power System FRUs • CPU Removal and Replacement • Memory Removal and Replacement • Power Supply Removal and Replacement •...
  • Page 150: System Safety

    Removal and Replacement System Safety Observe the safety guidelines in this section to prevent personal injury. CAUTION: Wear an anti-static wrist strap whenever you work on a system. The DIGITAL Server 7300/7300R series cabinet system has a wrist strap connected to the frame at the front and rear. The pedestal system does not have an attached strap, so you will have to take one to the site.
  • Page 151: Fru List

    Removal and Replacement FRU List Figure 6-1 shows the locations of FRUs in the system drawer. Table 6-1 lists the part numbers of all field-replaceable units. Figure 6-1 System Drawer FRU Locations (labels: Memory Modules, CPU Modules, Top Cover, Optional and N+1...)
  • Page 152: Table 6-1 Field-Replaceable Unit Part Numbers

    Removal and Replacement Table 6-1 Field-Replaceable Unit Part Numbers CPU Modules B3105-AA 400 MHz 4 MB cached B3105-CA 533 MHz 4 MB cached Memory Modules B3020-CA 64 Mbyte synch B3030-EA 256 Mbyte asynch (EDO) B3030-FA 512 Mbyte asynch (EDO) B3030-GA 2 Gbyte asynch (EDO) Required System Drawer Modules and Display 54-23803-01 System motherboard...
  • Page 153 Removal and Replacement Table 6-1 Field-Replaceable Unit Part Numbers (continued) Power System Components 30-44712-01 Power supply (H7291-AA) 30-46788-01 Internal power source 40W/12V fan tray power (cabinet) H7600-AA Power controller (NA/Japan, H9A10-EN cabinet) H7600-DB Power controller (Europe/AP, H9A10-EP cabinet) 12-23501-01 NEMA power strip (N.A./Japan, pedestal) 12-45334-02 IEC power strip (Europe/AP, pedestal, and all cabinet systems)
  • Page 154 Removal and Replacement Table 6-1 Field-Replaceable Unit Part Numbers (continued) System Drawer Cables and Jumpers From 17-04196-01 Server control Remote I/O SCM signal conn module signal signal conn on cable (60 pin) PCI mbrd 17-04199-01 Current share Current share Current share conn on PS1 cable conn on PS0 and PS2...
  • Page 155 Removal and Replacement Table 6-1 Field-Replaceable Unit Part Numbers (continued) System Drawer Cables and Jumpers From 17-04292-01 SCSI CD-ROM CD-ROM CD-ROM sig conn sig cable conn on PCI mbrd 70-32016-01 Interlock switches Interlock Other OCP DC enable pwr and cable switch assy conn or pwr conn on ped tray pwr drive cable (17-...
  • Page 156 Removal and Replacement Table 6-1 Field-Replaceable Unit Part Numbers (continued) Pedestal Cables From 17-04293-01 Elec harness Power Ped tray bulkhead (system power harness side) cable+5/+12 (17-04217- 17-04302-01 OCP signal cable OCP sig conn OCP sig conn on ped tray on PCI mbrd bulkhead (system side) 17-04305-01 Harness power...
  • Page 157: Power System Frus

    Removal and Replacement Power System FRUs Figure 6-2 Location of Power System FRUs Fan 0 Fan 1 Motherboard Fan 2 Fan Tray Cabinet B3040 To Pedestal B305n Power Source Floppy To Cabinet Pedestal Tray Power Source Tray Interlock SCSI Notes: Only power cables are shown.
  • Page 158 Removal and Replacement Part Number Description 17-04285-01 Power cord to power strip. .5 meter, IEC320 to IEC320 connector used in cabinet systems only. In pedestal systems, cords match country- specific wall outlets. H7600-AA Power controller used in place of 12-45334-02 and 17-04285-02 in the H9A10-EN cabinet in N.
  • Page 159: System Drawer Exposure (Cabinet)

    Removal and Replacement System Drawer Exposure (Cabinet) There is one type of cabinet for these systems: the H9A10-EN/-EP cabinet. In the H9A10-EN and -EP Cabinet, the system drawer sits on a tray that slides out of the front of the cabinet. You must pull the stabilizer bar out from the bottom to prevent the cabinet from tipping over.
  • Page 160 Removal and Replacement CAUTION: The cabinet could tip over if a system drawer is pulled out while the stabilizing bar is not fully extended with its leveler foot on the floor. Exposing Any Section of the System Drawer in an H9A10-EN or -EP Cabinet. 1....
  • Page 161: System Drawer Exposure (Pedestal)

    Removal and Replacement System Drawer Exposure (Pedestal) Figure 6-4 Exposing System Drawer (Pedestal) (labels: Pedestal Tray, System Bus Cover, Cover, Pedestal Tray and Power Section Cover, PCI Bus Cover)
  • Page 162 Removal and Replacement Exposing the System Drawer 1. Open the front door and remove it by lifting and pulling it away from the system. 2. Remove the top cover. Unscrew the two Phillips head screws midway up on each side of the pedestal, tilt the cover up, and lift it away from the frame.
  • Page 163: Cpu Removal And Replacement

    Removal and Replacement CPU Removal and Replacement CAUTION: Two different CPU modules work in these systems: the B3107-AA and the B3107-CA. Unless you are upgrading, be sure you are replacing the broken module with the same variant. Figure 6-5 Removing a CPU Module (labels: CPU Module, System Bus Card Cage...)
  • Page 164 Removal and Replacement Removal 1. Shut down the operating system and power down the system. 2. Expose the system drawer. 3. Expose the system bus card cage. Remove the two Phillips head screws holding the cover in place and slide it off the drawer. 4.
  • Page 165: Cpu Fan Removal And Replacement

    Removal and Replacement CPU Fan Removal and Replacement Figure 6-6 Removing CPU Fan
  • Page 166 Removal and Replacement Removal 1. Follow the CPU Removal and Replacement procedure. 2. Unplug the fan from the module. 3. Remove the four Phillips head screws holding the fan to the Alpha chip’s heat sink. Replacement Reverse the above procedure. Verification If the system powers up, the CPU fan is working.
  • Page 167: Memory Removal And Replacement

    Removal and Replacement Memory Removal and Replacement CAUTION: Several different memory modules work in these systems. Be sure you are replacing the broken module with the same variant. Figure 6-7 Removing a Memory Module (labels: Memory Module, System Bus Card Cage)
  • Page 168 Removal and Replacement Removal 1. Shut down the operating system and power down the system. 2. Expose the system drawer. 3. Expose the system bus card cage. Remove the two Phillips head screws holding the cover in place and slide it off the drawer. 4.
  • Page 169: Power Control Module Removal And Replacement

    Removal and Replacement Power Control Module Removal and Replacement Figure 6-8 Removing Power Control Module (labels: Power Control Module (PCM), System Bus Card Cage)
  • Page 170 Removal and Replacement Removal 1. Shut down the operating system and power down the system. 2. Expose the system drawer. 3. Expose the system bus card cage. Remove the two Phillips head screws holding the cover in place and slide it off the drawer. 4.
  • Page 171: System Bus To Pci Bus Bridge (B3040-Aa) Module Removal And Replacement

    Removal and Replacement System Bus to PCI Bus Bridge (B3040-AA) Module Removal and Replacement Figure 6-9 Removing System Bus to PCI/EISA Bus Bridge Module
  • Page 172 Removal and Replacement Removal 1. Shut down the operating system and power down the system. 2. Expose the system drawer. 3. Expose the system bus card cage. Remove the two Phillips head screws holding the cover in place and slide it off the drawer. 4.
  • Page 173: System Motherboard Removal And Replacement

    Removal and Replacement System Motherboard Removal and Replacement The system motherboard contains an NVRAM that holds the system serial number. Be sure to record this number before replacing the module. The serial number is on a bar code on the side of the system drawer or on the system bus card cage. The part number is 54-23803-01.
  • Page 174 Removal and Replacement 5. Expose the PCI bus card cage. Remove three Phillips head screws holding the cover in place and slide it off the drawer. 6. Remove all the PCI/EISA options. 7. Remove the server control module. 8. Remove the PCI motherboard. 9.
  • Page 175: Pci/Eisa Motherboard (B3050/B3052) Removal And Replacement

    Removal and Replacement PCI/EISA Motherboard (B3050/B3052) Removal and Replacement Figure 6-11 Replacing PCI/EISA Motherboard (labels: Connection to Bridge Module, PCI Motherboard) Removal The PCI motherboard contains an NVRAM with ECU data and customized console environment variables....
  • Page 176 Removal and Replacement 1. Shut down the operating system and power down the system. 2. Expose the system drawer. 3. Expose the PCI bus card cage. Remove three Phillips head screws holding the cover in place and slide it off the drawer. 4.
  • Page 177: Server Control Module Removal And Replacement

    Removal and Replacement Server Control Module Removal and Replacement Figure 6-12 Removing Server Control Module (labels: SCM Bulkhead Connectors: Keyboard, COM1, Parallel, 12V DC, Mouse, COM2, Modem; PCI Motherboard Connectors: Diskette Drive, OCP, Remote I/O...)
  • Page 178 Removal and Replacement Removal 1. Shut down the operating system and power down the system. 2. Expose the system drawer. 3. Expose the PCI bus card cage. Remove three Phillips head screws holding the cover in place and slide it off the drawer. 4.
  • Page 179: Pci/Eisa Option Removal And Replacement

    Removal and Replacement PCI/EISA Option Removal and Replacement Figure 6-13 Removing PCI/EISA Option PKW0418-96 WARNING: To prevent fire, use only modules with current limited outputs. See National Electrical Code NFPA 70 or Safety of Information Technology Equipment, Including Electrical Business Equipment EN 60 950.
  • Page 180 Removal and Replacement Removal 1. Shut down the operating system and power down the system. 2. Expose the system drawer. 3. Expose the PCI bus card cage. Remove three Phillips head screws holding the cover in place and slide it off the drawer. 4.
  • Page 181: Power Supply Removal And Replacement

    Removal and Replacement Power Supply Removal and Replacement Figure 6-14 Removing Power Supply Jumper 17-04199-01 Cable Harness 17-04217-01 ML014295 DIGITAL Server 7300/7300R Series Service Manual 6–33...
  • Page 182 Removal and Replacement Removal 1. Shut down the operating system and power down the system. 2. Expose the system drawer. 3. Remove the cover to the power section of the drawer. Remove the two Phillips head screws holding the cover in place and slide it off the drawer. 4.
  • Page 183: Power Harness Removal And Replacement

    Removal and Replacement Power Harness Removal and Replacement Figure 6-15 Removing Power Harness Holding Bracket System Bus Motherboard Fans Power Supplies PCI Bus Motherboard OCP Tray To OCP PCI Bus Motherboard CD-ROM To Floppy To OCP OCP Tray PKW0419-96 DIGITAL Server 7300/7300R Series Service Manual 6–35...
  • Page 184 Removal and Replacement Removal 1. Shut down the operating system and power down the system. 2. Expose the system drawer. 3. Expose the power, system card cage, and PCI/EISA sections of the drawer by removing all covers. Unscrew the Phillips head screws holding each cover in place and slide the covers off the drawer.
  • Page 185: System Drawer Fan Removal And Replacement

    Removal and Replacement System Drawer Fan Removal and Replacement Figure 6-16 Removing System Drawer Fan Removal 1. Shut down the operating system and power down the system. 2. Expose the system drawer. 3. Expose the power system, the system card cage, and the PCI card cage sections of the drawer by removing all three covers....
  • Page 186 Removal and Replacement 4. Release the power supply tray by removing the two Phillips head screws on the side of the drawer. 5. Lift the power supply tray to release it from the sheet metal and slide it out from the drawer.
  • Page 187: Cover Interlock Removal And Replacement

    Cover Interlock Removal and Replacement. Figure 6-17 Removing Cover Interlocks (callouts: 3 Cover Interlock Switches, 70-32016-01; To OCP).
  • Page 188 Removal and Replacement Removal 1. Shut down the operating system and power down the system. 2. Expose the system drawer. 3. Remove all three section covers to expose the interlock switch assembly. 4. Remove the two screws holding the interlock in place. 5.
  • Page 189: Operator Control Panel Removal And Replacement (Cabinet)

    Operator Control Panel Removal and Replacement (Cabinet). Figure 6-18 Removing OCP (Cabinet).
  • Page 190 Removal and Replacement Removal 1. Shut down the operating system and power down the system. 2. Expose the system drawer. While you need not remove the tray containing the OCP, you do need to slide it forward to access the OCP retaining screws under the tray. The tray is attached to the power system section cover.
  • Page 191: Operator Control Panel Removal And Replacement (Pedestal)

    Operator Control Panel Removal and Replacement (Pedestal). Figure 6-19 Removing OCP (Pedestal).
  • Page 192 Removal and Replacement Removal 1. Shut down the operating system and power down the system. 2. Expose the system drawer. 3. Remove the four Phillips head screws holding the OCP tray to the system drawer. 4. Slide the tray out of the system drawer far enough to disconnect cables attached to the OCP, the floppy, and the CD-ROM drive.
  • Page 193: Floppy Removal And Replacement

    Floppy Removal and Replacement. Figure 6-20 Removing Floppy Drive.
  • Page 194 Removal and Replacement Removal 1. Shut down the operating system and power down the system. 2. Expose the system drawer. 3. Remove the four Phillips head screws holding the OCP tray to the system drawer. 4. Slide the tray out of the system drawer and disconnect cables attached to the OCP (unnecessary on a pedestal system), the floppy, and the CD-ROM drive.
  • Page 195: CD-ROM Removal And Replacement

    CD-ROM Removal and Replacement. Figure 6-21 Removing CD-ROM.
  • Page 196 Removal and Replacement Removal 1. Shut down the operating system and power down the system. 2. Expose the system drawer. 3. Remove the four Phillips head screws holding the OCP tray to the system drawer. 4. Slide the tray out of the system drawer and disconnect cables attached to the OCP (unnecessary on a pedestal system), the floppy, and the CD-ROM drive.
  • Page 197: Cabinet Fan Tray Removal And Replacement

    Cabinet Fan Tray Removal and Replacement. Figure 6-22 Removing Cabinet Fan Tray (callouts: Fan LED; Power LED; Power; To SCM).
  • Page 198 Removal and Replacement Removal 1. Shut down the operating system and power down the system. Unplug the AC power cable from the cabinet tray power supply. 2. If present, unplug any power cables going to the server control modules at the back of system drawers.
  • Page 199: Cabinet Fan Tray Power Supply Removal And Replacement

    Cabinet Fan Tray Power Supply Removal and Replacement. Figure 6-23 Removing Cabinet Fan Tray Power Supply (callouts: Ground; To fan fail detect board; Offsets; To fans; Power supply; Power supply cover).
  • Page 200 Removal and Replacement Removal 1. Remove the cabinet fan tray. 2. Disconnect the power harness from the fan fail detect module and each fan. 3. Remove the power supply cover. It is held in place by two screws that go through the AC bulkhead spot welded to the tray weldment.
  • Page 201: Cabinet Fan Tray Fan Removal And Replacement

    Cabinet Fan Tray Fan Removal and Replacement. Figure 6-24 Removing Cabinet Fan Tray Fan.
  • Page 202 Removal and Replacement Removal 1. Remove the cabinet fan tray. 2. Disconnect the power harness from the fan you wish to replace. 3. Remove the fan finger guard. 4. Remove the two remaining screws holding the fan to the tray and remove the fan. 5.
  • Page 203: Cabinet Fan Tray Fan Fail Detect Module Removal And Replacement

    Cabinet Fan Tray Fan Fail Detect Module Removal and Replacement. Figure 6-25 Removing Fan Tray Fan Fail Detect Module.
  • Page 204 Removal and Replacement Removal 1. Remove the cabinet fan tray. 2. Disconnect the power harness from the fan fail detect module. 3. Remove the fan fail detect module. In early systems, the module is held in place by three screws that go through the weldment, through three standoffs, through the module to nuts.
  • Page 205: StorageWorks Shelf Removal And Replacement

    StorageWorks Shelf Removal and Replacement. Figure 6-26 Removing StorageWorks Shelf (callouts: Cabinet; StorageWorks Shelf; Mounting Rails; StorageWorks Shelf (H910A-EC); Mounting Rails (H910A-EB); Pedestal).
  • Page 206 Removal and Replacement Removal 1. Shut down the operating system and power down the system. 2. Remove the power cord and signal cord(s) from the StorageWorks shelf. 3. Remove the two retaining brackets holding the shelf in the mounting rail by removing the Phillips head screws holding the brackets in place.
  • Page 207: Running Utilities

    Running Utilities This chapter provides a brief overview of how to load and run utilities. The following topics are covered: • Selecting Utilities from the AlphaBIOS Menu • Running Utilities from a Serial Terminal • Running the EISA Configuration Utility •...
  • Page 208: Selecting Utilities From The AlphaBIOS Menu

    Running Utilities Selecting Utilities from the AlphaBIOS Menu Start AlphaBIOS and select Utilities from the menu. The next selection depends on the utility to be run. For example, to run ECU, select Run ECU from floppy. To run RCU, select Run Maintenance Program. Figure 7-1 Running a Utility from a Graphics Monitor AlphaBIOS Setup F1=Help...
  • Page 209: Running Utilities From A Serial Terminal

    Running Utilities Running Utilities from a Serial Terminal Utilities are run from a serial terminal in the same way as from a graphics monitor. The menus are the same, but some keys are different. Table 7-1 AlphaBIOS Option Key Mapping AlphaBIOS Key VTxxx Key Ctrl/A...
  • Page 210: Running The EISA Configuration Utility

    Running Utilities Running the EISA Configuration Utility The EISA Configuration Utility (ECU) is used to configure EISA options on DIGITAL Server systems. The ECU is run from a graphics monitor. 1. Start AlphaBIOS Setup. If the system is in the SRM console, issue the command alphabios.
  • Page 211: Running RAID Standalone Configuration Utility

    Running Utilities Running RAID Standalone Configuration Utility The RAID Standalone Configuration Utility is used to set up RAID disk drives and logical units. The Standalone Utility is run from the AlphaBIOS Utility menu. The DIGITAL Server 7300/7300R series system supports the KZPSC-xx PCI RAID controller (SWXCR).
  • Page 212: Updating Firmware

    Running Utilities Updating Firmware Use the Loadable Firmware Update (LFU) utility to update system firmware from an earlier version of AlphaBIOS. NOTE: If jumper J50 is removed, make sure it is reinserted before you start the upgrade procedure. Otherwise the firmware will not be upgraded.
  • Page 213 Running Utilities 4. When the upgrade is complete, issue the LFU exit command. The system is reset and you return to AlphaBIOS. If you press the Reset button instead of issuing the LFU exit command, the system is reset and you are returned to LFU. The sections that follow show examples of updating firmware from the local CD-ROM, the local floppy, and a network device.
  • Page 214: Updating Firmware From The Internal CD-ROM

    Running Utilities Updating Firmware from the Internal CD-ROM 1. Insert the CD-ROM with the updated firmware and select Upgrade AlphaBIOS from the main AlphaBIOS Setup screen. Use the Loadable Firmware Update (LFU) utility to perform the update. 2. Select the device from which firmware will be loaded. In this case, the choice is the internal CD-ROM.
  • Page 215: Updating Firmware From The Internal Floppy Disk

    Running Utilities Updating Firmware from the Internal Floppy Disk Updating firmware from a floppy disk requires two steps: • Creating the diskettes • Performing the update Creating the Diskettes To update system firmware from floppy disk, you first must create the firmware update diskettes.
  • Page 216: Figure 7-4 Standard Formatting

    Running Utilities Press F6. A dialog box displays, asking whether to perform a quick or standard format (see Figure 7-3). If you select Quick Format, the formatting is completed immediately, but no bad sectors are mapped. If you select Standard Format, a dialog box similar to that in Figure 7-4 displays while the drive is formatted, showing the progress of the formatting.
  • Page 217 Running Utilities 3. Select the file that has the firmware update you want, or press Enter to select the default file. When the internal floppy disk is the load device, the file options are: AS4X00CP (default): SRM console and AlphaBIOS console firmware only; AS4X00IO: I/O adapter firmware only. AS4X00FW is not available, since the file is too large to fit on a 1.44 MB diskette.
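The file choice in step 3 can be pictured as a small lookup. The following is a minimal sketch, not LFU's actual logic: the function name `choose_floppy_file` and the case-insensitive matching are assumptions made for illustration; only the file names, their descriptions, and the default come from the text above.

```python
# File names and descriptions as listed for the floppy load device.
FLOPPY_FILES = {
    "AS4X00CP": "SRM console and AlphaBIOS console firmware only",
    "AS4X00IO": "I/O adapter firmware only",
}
DEFAULT_FILE = "AS4X00CP"
# Listed in the text as unavailable from floppy because of its size.
TOO_LARGE_FOR_FLOPPY = {"AS4X00FW"}


def choose_floppy_file(entered: str) -> str:
    """Return the file to load; an empty entry (just Enter) picks the default."""
    name = entered.strip().upper()  # case-insensitive matching is an assumption
    if not name:
        return DEFAULT_FILE
    if name in TOO_LARGE_FOR_FLOPPY:
        raise ValueError(f"{name} does not fit on a 1.44 MB diskette")
    if name not in FLOPPY_FILES:
        raise ValueError(f"{name} is not offered for the floppy load device")
    return name
```

Calling `choose_floppy_file("")` models pressing Enter at the prompt and yields the default file name.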
  • Page 218: Updating Firmware From A Network Device

    Running Utilities Updating Firmware from a Network Device The basic process of loading a file from a network device is to: 1. Copy files to the local MOP server’s MOP load area. 2. Start LFU. 3. Select ewa0 as the load device. Before starting LFU, download the update files from the Internet (see the Preface of this document for the Internet address).
  • Page 219: LFU Commands

    Running Utilities LFU Commands You can use the commands summarized in Table 7-3 to update system firmware. Table 7-3 LFU Command Summary Command Function display Shows the system physical configuration. exit Terminates the LFU program. help Displays the LFU command list. Restarts the LFU program.
  • Page 220 Running Utilities display The display command shows the system physical configuration. Display is equivalent to issuing the SRM console command show configuration. Because it shows the slot for each module, display can help you identify the location of a device. exit The exit command terminates the LFU program, causes system initialization and testing, and returns the system to the console from which LFU was called.
  • Page 221 Running Utilities list The list command displays the inventory of update firmware on the CD-ROM, network, or floppy. Only the devices listed at your terminal are supported for firmware updates. The list command shows three pieces of information for each device: •...
  • Page 223: SRM Console Commands And Environment Variables

    SRM Console Commands and Environment Variables This chapter provides a summary of the SRM console commands and environment variables. It includes the following topics: • Summary of SRM Console Commands • Summary of SRM Environment Variables • Recording Environment Variables The test command is described in Chapter 3 of this document.
  • Page 224: Summary Of SRM Console Commands

    Summary of SRM Console Commands The SRM console commands are used to examine or modify the system state.
    Table 8-1 Summary of SRM Console Commands
    Command      Function
    alphabios    Loads and starts the AlphaBIOS console.
    boot         Loads and starts firmware upgrades....
  • Page 225 SRM Console Commands and Environment Variables Table 8-1 Summary of SRM Console Commands (Continued) Command Function Displays information about the specified console command. more Displays a file one screen at a time. prcache Initializes and displays status of the PCI NVRAM. set envar Sets or modifies the value of an environment variable.
  • Page 226: Summary Of SRM Environment Variables

    SRM Console Commands and Environment Variables Summary of SRM Environment Variables Environment variables pass configuration information between the console and the operating system. Their settings determine how the system powers up, boots the operating system, and operates. Environment variables are set or changed with the set envar command and returned to their default values with the clear envar command.
  • Page 227 SRM Console Commands and Environment Variables
    Table 8-2 Environment Variable Summary (Continued)
    Environment Variable    Function
    ocp_text                Overrides the default OCP display text with specified text.
    os_type                 Specifies the operating system and sets the appropriate console interface. Should always be set to nt.
    pci_parity              Disables or enables parity checking on the PCI bus....
  • Page 228: Recording Environment Variables

    SRM Console Commands and Environment Variables Recording Environment Variables You can make copies of the table below to record environment variable settings for specific systems. Write the system name in the column provided. Enter the show* command to list the system settings. Table 8-3 Environment Variables Worksheet Environment System Name...
  • Page 229 SRM Console Commands and Environment Variables Table 8-3 Environment Variables Worksheet (Continued). Columns: Environment Variable, System Name, System Name, System Name. Rows: pk*0_soft_term, sys_model_num, sys_serial_num, sys_type, tga_sync_green, tt_allow_login.
  • Page 231: Operating The System Remotely

    Operating the System Remotely This chapter describes how to use the remote console monitor (RCM) to monitor and control the system remotely. It includes the following topics: • RCM Console Overview • Modem Usage • Entering and Leaving Command Mode •...
  • Page 232: RCM Console Overview

    Operating the System Remotely RCM Console Overview You use the remote console monitor (RCM) to monitor and control the system remotely. The RCM resides on the server control module and allows the system administrator to connect remotely to a managed system through a modem, using a serial terminal or terminal emulator.
  • Page 233: Modem Usage

    Operating the System Remotely Modem Usage To use the RCM to monitor a system remotely, first make the connections to the server control module, as shown below. Then configure the modem port for dial-in. Figure 9-1 RCM Connections (callouts: Console Terminal; Phone Jack; External Power Supply...)
  • Page 234 Operating the System Remotely Modem Selection The RCM requires a Hayes-compatible modem. The controls that the RCM sends to the modem have been selected to be acceptable to a wide selection of modems. The modems that have been tested and qualified include: •...
  • Page 235 Operating the System Remotely Dialing In to the RCM Modem Port 1. Dial the modem connected to the server control module. The RCM answers the call and after a few seconds prompts for a password with a “#” character. 2. Enter the password that was loaded using the setpass command. You have three tries to correctly enter the password.
  • Page 236: Entering And Leaving Command Mode

    Operating the System Remotely Entering and Leaving Command Mode Use the default escape sequence to enter RCM command mode for the first time. You can enter RCM command mode from the SRM console level, the operating-system level, or an application. The RCM quit command reconnects the terminal to the system console port. Example 9-2 Entering and Leaving RCM Command Mode ^]^]rcm RCM>...
  • Page 237: RCM Commands

    Operating the System Remotely RCM Commands The RCM commands summarized below are used to control and monitor a system remotely.
    Table 9-1 RCM Command Summary
    Command      Function
    alert_clr    Clears alert flag, stopping dial-out alert cycle
    alert_dis    Disables the dial-out alert function
    alert_ena    Enables the dial-out alert function
    disable...
  • Page 238 Operating the System Remotely Command Conventions • The commands are not case sensitive. • A command must be entered in full. • If a command is entered that is not valid, the command fails with the message: *** ERROR - unknown command *** Enter a valid command.
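The conventions above (case-insensitive names, no abbreviations, a fixed error message for anything else) can be illustrated with a short sketch. It is not the RCM firmware: `parse_rcm_command` is a hypothetical helper, and the command set holds only the names that appear in this summary, since the full command table is truncated here.

```python
# Command names visible in this summary of the RCM chapter; the full
# table is truncated, so this set is deliberately partial.
KNOWN_COMMANDS = {
    "alert_clr", "alert_dis", "alert_ena", "disable", "enable",
    "halt", "poweron", "quit", "setesc", "setpass", "status",
}
UNKNOWN_COMMAND = "*** ERROR - unknown command ***"


def parse_rcm_command(line: str):
    """Split an input line into (command, arguments) or raise ValueError."""
    parts = line.strip().split()
    if not parts:
        raise ValueError(UNKNOWN_COMMAND)
    name = parts[0].lower()  # commands are not case sensitive
    # Commands must be entered in full, so no prefix matching is attempted.
    if name not in KNOWN_COMMANDS:
        raise ValueError(UNKNOWN_COMMAND)
    return name, parts[1:]
```

With this sketch, `STATUS` and `status` are accepted alike, while an abbreviation such as `stat` is rejected with the unknown-command message.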
  • Page 239 Operating the System Remotely alert_clr The alert_clr command clears an alert condition within the RCM. The alert enable condition remains active, and the RCM will again enter the alert condition when it detects a system power failure. RCM>alert_clr alert_dis The alert_dis command disables RCM dial-out capability. It also clears any outstanding alerts.
  • Page 240 Operating the System Remotely disable The disable command disables remote access to the RCM modem port. RCM>disable The module’s remote access default state is DISABLED. The modem enable state is nonvolatile. When the modem is disabled, it remains disabled until the enable command is issued.
  • Page 241 Operating the System Remotely halt The halt command attempts to halt the managed system. It is functionally equivalent to pressing the Halt button on the system operator control panel to the “in” position and then releasing it to the “out” position. The RCM console firmware exits command mode and reconnects the user’s terminal to the server’s COM1 serial port.
  • Page 242 Operating the System Remotely “off” state of the DC On/Off button. If the system is already powered on, the poweron command has no effect. quit The quit command exits the user from command mode and reconnects the user’s terminal to the system console port. The following message is displayed: Focus returned to COM port The next display depends on what the system was doing when the RCM was invoked.
  • Page 243 Operating the System Remotely The following sample escape sequence consists of five iterations of the Ctrl key and the letter “o”. RCM>setesc ^o^o^o^o^o RCM> If the escape sequence entered exceeds 15 characters, the command fails with the message: *** ERROR *** When changing the default escape sequence, avoid using special characters that are used by the system’s terminal emulator or applications.
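The 15-character limit can be checked before typing a new sequence. The sketch below assumes the limit counts characters as received, so that each Ctrl combination written in caret notation (for example ^o) counts as one character; the manual excerpt does not say how the count is made. The helper names `decode_caret` and `is_valid_escape` are hypothetical.

```python
MAX_ESCAPE_LEN = 15  # from the text: longer sequences are rejected


def decode_caret(text: str) -> str:
    """Turn caret notation such as '^o' or '^]' into control characters."""
    out = []
    i = 0
    while i < len(text):
        if text[i] == "^" and i + 1 < len(text):
            # Ctrl/X is the ASCII code of X (uppercased) with bit 0x40 flipped.
            out.append(chr(ord(text[i + 1].upper()) ^ 0x40))
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


def is_valid_escape(typed: str) -> bool:
    """True if the decoded sequence fits within the 15-character limit."""
    return len(decode_caret(typed)) <= MAX_ESCAPE_LEN
```

Under that assumption, the sample sequence `^o^o^o^o^o` decodes to five control characters and is accepted.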
  • Page 244: Table 9-2 RCM Status Command Fields

    Operating the System Remotely status The status command displays the current state of the server’s sensors, as well as the current escape sequence and alarm information.
    RCM>status
    Firmware Rev: V1.0
    Escape Sequence: ^]^]RCM
    Remote Access: ENABLE/DISABLE
    Alerts: ENABLE/DISABLE
    Alert Pending: YES/NO
    Temp (C): 26.0
    RCM Power Control: ON/OFF
    External Power: ON...
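When status output is captured from a terminal session log, the fields can be pulled into a dictionary for comparison across systems. This is a minimal sketch that assumes each field arrives on its own line in `Name: value` form; `parse_status` is a hypothetical helper, not part of the RCM.

```python
def parse_status(text: str) -> dict:
    """Collect 'Name: value' lines from captured status output."""
    fields = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep:  # lines without a colon, such as the prompt, are skipped
            fields[name.strip()] = value.strip()
    return fields


# A few fields from the sample output, one per line (layout assumed).
SAMPLE = "RCM>status\nFirmware Rev: V1.0\nEscape Sequence: ^]^]RCM\nTemp (C): 26.0\nExternal Power: ON\n"
status = parse_status(SAMPLE)
```

The temperature reading is then available as `status["Temp (C)"]`, still as text.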
  • Page 245: Dial-Out Alerts

    Operating the System Remotely Dial-Out Alerts The RCM can be configured to automatically dial out through the modem (usually to a paging service) when it detects a power failure within the system. When a dial-out alert is triggered, the RCM initializes the modem for dial-out, sends the dial-out string, hangs up the modem, and reconfigures the modem for dial-in.
  • Page 246 Operating the System Remotely Enabling the Dial-Out Alert Function: 1. Enter the set rcm_dialout command, followed by a dial-out alert string, from the SRM console. 2. The string is a modem dial-out character string, not to exceed 47 characters, that is used by the RCM when dialing out through the modem.
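The 47-character limit is easy to overrun once pauses and pager digits are added, so a quick length check before entering the string can help. A minimal sketch follows; the helper name and the sample string are hypothetical and only illustrate the counting, not the required string format.

```python
MAX_DIALOUT_LEN = 47  # from the text: the string must not exceed 47 characters


def dialout_chars_remaining(dial_string: str) -> int:
    """Return how many characters are left; negative means the string is too long."""
    return MAX_DIALOUT_LEN - len(dial_string)


# Hypothetical string used only to exercise the check.
example = "ATDT5551234"
remaining = dialout_chars_remaining(example)
```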
  • Page 247 Operating the System Remotely Composing a Modem Dial-Out String The modem dial-out string emulates a user dialing an automatic paging service. Typically, the user dials the pager phone number, waits for a tone, and then enters a series of numbers. The RCM dial-out string (Example 9-5) has the following requirements: •...
  • Page 248: Resetting The RCM To Factory Defaults

    Operating the System Remotely Resetting the RCM to Factory Defaults If the escape sequence has been forgotten, you can reset the controller to factory settings. Reset Procedure 1. Power down the DIGITAL Server system and access the server control module, as follows: Expose the PCI bus card cage.
  • Page 249: Troubleshooting Guide

    Operating the System Remotely Troubleshooting Guide Table 9-3 lists a number of possible causes and suggested solutions for symptoms you might see.
    Table 9-3 RCM Troubleshooting
    Symptom: The local terminal will not communicate...
    Possible Cause: System and terminal baud rate set incorrectly.
    Suggested Solution: Set the system and...
  • Page 250 Operating the System Remotely Table 9-3 RCM Troubleshooting (Continued)
    Symptom: After the system and RCM are powered up, the COM port seems to hang and then starts working after a few seconds.
    Possible Cause: This delay is normal behavior.
    Suggested Solution: Wait a few seconds for the COM port to start working....
  • Page 251 Operating the System Remotely Table 9-3 RCM Troubleshooting (Continued)
    Symptom: Cannot enable modem or modem will not answer.
    Possible Cause: The modem is not configured correctly to work with the RCM.
    Suggested Solution: Modify the modem initialization and/or answer string.
  • Page 252: Modem Dialog Details

    Operating the System Remotely Modem Dialog Details This section provides further details on the dialog between the RCM and the modem and is intended to help you reprogram your modem if necessary. Phases of Modem Operation The RCM is programmed to expect specific responses from the modem during four phases of operation: •...
  • Page 253 Operating the System Remotely This default initialization string works on a wide variety of modems. If your modem does not configure itself to these parameters, the initialization string will need to be modified. See the topic in this section entitled Modifying Initialization and Answer Strings. Ring Detection The RCM expects to be informed of an in-bound call by the modem signaling the RCM with the string, “2<cr>”...
  • Page 254: Table 9-4 RCM/Modem Interchange Summary

    Operating the System Remotely RCM/Modem Interchange Overview Table 9-4 summarizes the actions between the RCM and the modem from initialization to hangup.
    Table 9-4 RCM/Modem Interchange Summary
    Action                       Data to Modem            Data from Modem
    Initialization command       AT&F0EVS0=0S12=50<cr>
    Initialization successful                             0<cr>
    Phone line ringing                                    2<cr>...
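The first rows of the interchange can be walked through with a scripted stand-in for the modem. The sketch below is a simulation for reasoning about the dialog, not RCM firmware: `FakeModem` and `rcm_wait_for_call` are hypothetical names, and sending the answer string with a trailing carriage return is an assumption, since the rest of the table is truncated in this excerpt. The strings themselves are the defaults quoted in this chapter.

```python
INIT_STRING = "AT&F0EVS0=0S12=50"  # default initialization string
ANSWER_STRING = "ATXA"             # default answer string
INIT_OK = "0\r"                    # numeric result after initialization
RING = "2\r"                       # numeric result for an incoming ring


class FakeModem:
    """Scripted stand-in: records what is sent, replays canned responses."""

    def __init__(self, responses):
        self.responses = list(responses)
        self.sent = []

    def write(self, data: str) -> None:
        self.sent.append(data)

    def read(self) -> str:
        return self.responses.pop(0) if self.responses else ""


def rcm_wait_for_call(modem) -> str:
    """Initialize the modem, wait for a ring, then send the answer string."""
    modem.write(INIT_STRING + "\r")
    if modem.read() != INIT_OK:
        return "init-failed"
    if modem.read() != RING:
        return "no-ring"
    # Terminating the answer string with <cr> is an assumption.
    modem.write(ANSWER_STRING + "\r")
    return "answering"


modem = FakeModem(["0\r", "2\r"])
result = rcm_wait_for_call(modem)
```

A modem that returns verbose results (for example `OK` instead of `0`) would fail the first comparison here, which mirrors why the initialization string turns off verbose responses.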
  • Page 255 Operating the System Remotely To display all the RCM user settable strings:
    P00>>> show rcm*
    rcm_answer ATXA
    rcm_dialout
    rcm_init AT&F0EVS0=0S12=50
    P00>>>
    Initialization and Answer String Substitutions The RCM default initialization and answer strings are as follows: Initialization String: “AT&F0EVS0=0S12=50” Answer String: “ATXA”...
  • Page 257 Index B3030-EA memory module, 1–22, 6–4 B3030-FA memory module, 1–22, 6–4 ? command, RCM, 9–11 B3030-GA memory module, 1–22, 6–4 B3040-AA bridge module, 1–28, 6–4 B3050-AA PCI motherboard, 1–30 alert_clr command, RCM, 9–9 B3052-AA PCI motherboard, 1–30 alert_dis command, RCM, 9–9 B3105-AA CPU module, 1–18 alert_ena command, RCM, 9–9 B3105-CA CPU module, 1–18...
  • Page 258 Index fan tray fan fail detect module Halt button, 1–11 removal and replacement, Cover interlocks, 4–7 6–56 overriding, 4–7 power supply removal and replacement, 6–52 removal and replacement, 6–40 Cabinet system, 1–6 Cover interlocks, 1–4 power and fan LEDs, 3-4 CPU and bridge module LEDs, 3-2 power supply for remote access, 3-4 CPU LEDs, 3-3...
  • Page 259 Index Error registers, 5–5 7300/7300R power system, 6–9 exit command (LFU), 7–8, FRU part numbers, 6–4 7–12 to 7–14, 7–16 External Interface Address Register, 5–10 H7600-AA power controller, 1–7 External Interface Registers H7600-DB power controller, 1–7 loading and locking rules, 5–11 halt command, RCM, 9–11 Halts caused by power problem, 3-5...
  • Page 260 Index MC_ERR1 Register, 5–14 exit command, 7–16 Memory addressing, 1–24 rules, 1–25 update command, 7–17 Memory errors updating firmware from CD-ROM, 7–8 corrected read data error, 5–26 updating firmware from floppy read data substitute error, 5–26 disk, 7–9, 7–11 Memory module updating firmware from network variants, 1–22 device, 7–13...
  • Page 261 Index os_type environment variable, SRM, Power control module LEDs, 3-8 2–7 Power cords, internal, 6–5 Power faults, 4–9 Power harness PALcode, 2–24 removal and replacement, 6–36 PALcode, described, 5–31 Power problems PCI Error Status Register 1, 5–19 at power-up, 3-6 PCI I/O subsystem, 1–30 Power supply, 1–36 PCI master abort, 5–25...
  • Page 262 Index dial-out alerts, 9–15 entering and leaving command Safety guidelines, 6–2 mode, 9–6 Serial number, system, 6–25 modem usage, 9–3 restoring with set sys_serial_num, resetting to factory defaults, 9–18 6–26 troubleshooting, 9–19 Serial ports, 1–31 typical dialout command, 9–15 Server control module, 1–32 RCM commands removal and replacment, 6–30 ?, 9–11...
  • Page 263 Index System bus to PCI bus bridge module, Test mem command, 3-17 1–15, 1–28 Test pci command, 3-20 System bus to PCI/EISA bus bridge Troubleshooting module, 1–15 failures at power-up, 3-6 System consoles, 1–12 power problems, 3-5 System drawer using error logs, 5–2 7300, 1–3 System drawer components of, 1–3...

This manual is also suitable for:

7300r series
