Summary of Contents for Digital Equipment 7300 Series
Page 1
DIGITAL Server 7300/7300R Series Service Manual Part Number: EK-K9FWW-SG. A01 This manual is for anyone who services a DIGITAL Server 7300/7300R Series system. It covers installation, power-up, initial troubleshooting, and component installation. January 1998 Digital Equipment Corporation Maynard, Massachusetts...
Page 2
The software, if any, described in this document is furnished under a license and may be used or copied only in accordance with the terms of such license. No responsibility is assumed for the use or reliability of software or equipment that is not supplied by Digital Equipment Corporation or its affiliated companies.
Table of Contents 1 System Overview DIGITAL Server 7300/7300R System Drawer (BA30A) ............1–3 Cover Interlocks ......................1–4 Cabinet System........................1–6 Cabinet Differences ....................... 1–7 Cabinet System Fan Tray....................1–7 Pedestal System........................1–8 Control Panel and Drives....................1–10 System Consoles......................... 1–12 SRM Console.......................
Page 4
Server Control Module ....................... 1–31 Power Control Module ....................... 1–33 Power Supply ........................1–35 2 Power-Up Control Panel ........................2–2 Power-Up Sequence....................... 2–4 SROM Power-Up Test Flow....................2–8 SROM Errors Reported ...................... 2–11 XSROM Power-Up Test Flow .................... 2–12 XSROM Errors Reported....................2–15 Console Power-Up Tests ....................
Page 5
Power-Up/Down Sequence ....................4–8 Cabinet Power Configuration Rules..................4–10 Pedestal Power Configuration Rules (North America and Japan) ........4–12 Pedestal Power Configuration Rules (Europe and Asia Pacific) .......... 4–14 5 Error Detection with Error Registers Overview of Error Detection....................5–2 Error Registers........................
Page 6
System Bus to PCI Bus Bridge (B3040-AA) Module Removal and Replacement ....6–23 System Motherboard Removal and Replacement..............6–25 PCI/EISA Motherboard (B3050/B3052) Removal and Replacement........6–27 Server Control Module Removal and Replacement ............6–29 PCI/EISA Option Removal and Replacement ..............6–31 Power Supply Removal and Replacement................
Page 7
Resetting the RCM to Factory Defaults................ 9–18 Troubleshooting Guide ....................9–19 Modem Dialog Details....................9–22 Figures Figure 1-1 Components of the BA30A System Drawer ....... 1–3 Figure 1-2 Cover Interlock Circuit ............... 1–5 Figure 1-3 DIGITAL Server 7300/7300R Cabinet System ......1–6 Figure 1-4 Cabinet Fan Tray ................
Page 8
Figure 4-7 Pedestal Power Distribution (N.A. and Japan) ......4–12 Figure 4-8 Pedestal Power Distribution (Europe and AP)......4–14 Figure 5-1 Error Detector Placement ............5–2 Figure 6-1 System Drawer FRU Locations........... 6–3 Figure 6-2 Location of Power System FRUs ..........6–9 Figure 6-3 Exposing System Drawer (H9A10-EN &...
Page 9
Table 2-4 IOD Tests .................. 2–17 Table 2-5 PCI Motherboard Tests (B3050/B3052) ........2–18 Table 3-1 Power Control Module LED States ..........3-8 Table 5-1 External Interface Status Register ..........5–8 Table 5-2 Loading and Locking Rules for External Interface Registers..5–11 Table 5-3 MC Error Information Register 0 ..........
Page 11
Preface Document Audience This manual is written for the customer service engineer. Document Structure This manual uses a structured documentation design. Topics are organized into small sections for efficient online and printed reference. Each topic begins with an abstract, followed by an illustration or example, and ends with descriptive text. This manual has nine chapters, as follows: •...
Page 12
• Chapter 8, SRM Console Commands and Environment Variables, summarizes the commands used to examine and alter the system configuration. • Chapter 9, Operating the System Remotely, describes how to use the remote console monitor (RCM) to monitor and control the system remotely. Documentation Titles The following table lists titles related to DIGITAL Server 7300/7300R series systems.
System Overview This chapter introduces the DIGITAL Server 7300/7300R series systems. These systems are available in cabinets or pedestals. The pedestal system has one system drawer and up to three StorageWorks shelves. The cabinet system can have a combination of system drawers and StorageWorks shelves that occupy the five sections of the cabinet.
Page 14
System Overview • Memory Modules • System Bus • System Bus to PCI Bus Bridge Module • PCI I/O Subsystem • Server Control Module • Power Control Module • Power Supply 1–2 DIGITAL Server 7300/7300R Series Service Manual...
System Overview DIGITAL Server 7300/7300R System Drawer (BA30A) Components in the BA30A system drawer are located in the system bus card cage, the PCI card cage, the control panel assembly, and the power and cooling section. The drawer measures 30 cm x 45 cm (11.8 in. x 17.7 in.) and fully configured weighs approximately 45.5 kg (~100 lbs).
System Overview PCI/EISA card cage, which holds the PCI motherboard, option cards, and server control module. Server control module, which holds the I/O connectors and remote console monitor. Control panel assembly, which includes the control panel, a floppy drive, and a CD- ROM drive.
System Overview Figure 1-2 Cover Interlock Circuit 3 Interlock 17-04217-01 Logic Switches Power Supply Cover Interlocks 17-04201-01 70-32016-01 17-04302-01 Motherboard DC_ENABLE_L 32016- B3040 B305n Switch 17-04196-01 POWER_FAULT_L To OCP 17-04201-02 RSM_DC_EN_L LJ-06315 NOTE: The cover interlocks must be engaged to enable power-up. To override the cover interlocks, find a suitable object to close the interlock circuit.
System Overview Cabinet System The DIGITAL Server 7300/7300R series cabinet system can accommodate multiple systems in a single cabinet. There are two cabinet variations that can hold different system configurations. From the outside, the cabinets look almost identical and are of one basic type.
System Overview Cabinet Differences

Cabinet    Power                                     Mounting                        Destination
H9A10-EN   Two 120 volt H7600-AA power controllers   Pull-out tray (max drawers: 3)  North America, Asia Pacific
H9A10-EP   Two 240 volt H7600-DB power controllers   Pull-out tray (max drawers: 3)  Europe

Cabinet System Fan Tray At the top of cabinet systems is a fan tray containing three exhaust fans, a small 12-volt power supply, and a module that distributes power to the server control module in each drawer.
System Overview Pedestal System The pedestal system contains one system drawer with a control panel, a CD-ROM drive, and a floppy drive. In the pedestal control panel area there is space for an optional tape or disk drive. Three StorageWorks shelves provide up to 90 Gbytes of in-cabinet storage. Figure 1-5 Pedestal System Front PK-0301-96 In the pedestal system, the control panel is located at the top left in a tray.
System Overview Control Panel and Drives The control panel includes the On/Off, Halt, and Reset buttons and a display. In a pedestal system the control panel is located in a tray at the top of the system drawer. In a cabinet system, the control panel is at the bottom of the system drawer with the CD-ROM drive and the floppy drive.
Page 23
System Overview missing, regardless of the position of the On/Off button. Halt button. Pressing this button in (so the LED at the top of the button is on) has no effect on Windows NT. If the Halt button is in when the system is reset or powered up, the system halts in the SRM console.
System Overview System Consoles There are two console programs: the SRM console and the AlphaBIOS console. SRM Console The SRM console is a command-line interface that tests the system after power-up or reset and launches the AlphaBIOS graphical interface. For some configuration and diagnostic or testing tasks, you may need to use the SRM console interface rather than launch the AlphaBIOS console.
System Overview Figure 1-8 AlphaBIOS Boot Menu AlphaBIOS Version 5.12 Please select the operating system to start: Windows NT Server 3.51 to move the highlight to your choice. Press Enter to choose. Alpha Press <F2> to enter SETUP PK-0728-96 Environment Variables Environment variables are software parameters that, among other things, define the system configuration.
System Overview System Architecture Alpha microprocessor chips are used in these systems. The CPU, memory and the I/O bridge module are connected to the system bus motherboard. Figure 1-9 Architecture Diagram Memory CPU 0 Pairs System Bus 128-Bit Data Bus + 16 ECC and 40-Bit Command/Address Bus Bridge System to System to...
Page 27
System Overview DIGITAL Server 7300/7300R series systems use the Alpha chip for the CPU. The CPU, memory, and I/O bridge module to PCI/EISA I/O buses are connected to the system bus motherboard. A fourth type of module, the power control module, also plugs into the system motherboard.
System Overview System Motherboard The system motherboard is on the floor of the system card cage. It has slots for the CPU, memory, power control, and bridge modules. Figure 1-10 System Motherboard Module Locations PK-0703D-96 1–16 DIGITAL Server 7300/7300R Series Service Manual...
Page 29
System Overview The system motherboard has the logic for the system bus. It is the backplane that holds the CPU, memory, bridge, and power control modules. Figure 1-10 shows a diagram of the motherboard used in DIGITAL Server 7300/7300R series systems.
System Overview CPU Types DIGITAL Server 7300/7300R series systems can be configured with one of two CPU variants.

CPU Variants
Module Variant   Clock Frequency   Onboard Cache
B3105-AA         400 MHz           4 Mbytes
B3105-CA         533 MHz           4 Mbytes

CPU Module Layout Figure 1-11 shows the layout of the CPU module. Figure 1-11 CPU Module Layout System Motherboard CPU Module Slots...
System Overview Alpha Chip Composition The Alpha chip is made using state-of-the-art chip technology, has a transistor count of 9.3 million, consumes 50 watts of power, and is air cooled (a fan is on the chip). The default cache policy is write-back, and when the module has an external cache, that cache is also write-back. Chip Description Unit Description...
System Overview Memory Modules Memory modules are used only in pairs: two modules of the same size and type. Each module provides either the low half or the high half of the memory space. The 7300/7300R series system drawer can hold up to four memory module pairs. Figure 1-12 Memory Module Layout Typical Synchronous Memory Typical EDO Memory...
System Overview Memory Variants Each memory option consists of two identical modules. Each DIGITAL Server 7300/7300R series drawer supports up to four memory options, for a total of 4 Gbytes of memory. Memory modules are used only in pairs and are available in 128 Mbyte, 512 Mbyte, 1 Gbyte, and 2 Gbyte sizes.
Page 34
System Overview • The largest memory pair must be in slots MEM 0L and MEM 0H. • Other memory pairs must be the same size or smaller than the first memory pair. • Memory pairs must be installed in consecutive slots. 1–22 DIGITAL Server 7300/7300R Series Service Manual...
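The three configuration rules above can be checked mechanically. The following is a minimal sketch, not a DIGITAL tool: the function name, the list-based representation of slot pairs, and the "no memory installed" check are assumptions made for illustration.

```python
from typing import List, Optional

def check_memory_pairs(pair_sizes_mb: List[Optional[int]]) -> List[str]:
    """Return rule violations; an empty list means the layout is legal.

    pair_sizes_mb[i] is the size in Mbytes of the pair in MEM iL / MEM iH,
    or None if that slot pair is empty (hypothetical representation).
    """
    installed = [size for size in pair_sizes_mb if size is not None]
    if not installed:
        # Sanity check added for the sketch; not one of the manual's rules.
        return ["no memory pairs installed"]

    problems = []
    # Rules 1 and 2: the largest pair sits in MEM 0, so no other pair exceeds it.
    if pair_sizes_mb[0] is None or pair_sizes_mb[0] != max(installed):
        problems.append("largest pair is not in MEM 0L/0H")

    # Rule 3: pairs occupy consecutive slots, so no empty slot precedes a pair.
    seen_empty = False
    for slot, size in enumerate(pair_sizes_mb):
        if size is None:
            seen_empty = True
        elif seen_empty:
            problems.append(f"gap before the pair in MEM {slot}L/{slot}H")
    return problems
```

For example, `check_memory_pairs([512, 512, 128, None])` returns an empty list, while a 128-Mbyte pair in MEM 0 followed by a 512-Mbyte pair is reported as a violation.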
System Overview Memory Addressing Alpha system memory addressing is unusual because the memory address space is determined not by the amount of physical memory but as a multiple of the size of the memory pair in slot MEM0x. Figure 1-13 How Memory Addressing Is Calculated 2028 Mbyte Fourth pair address space 512 Mbyte space empty...
Page 36
System Overview The rules for addressing memory are as follows: • Address space is determined by the memory pair in slot MEM0. • Memory pairs need not be the same size. • The memory pair in slot MEM0 must be the largest of all memory pairs. Other memory pairs may be as large but none may be larger.
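One way to picture these rules is to give every pair an address window as large as the pair in MEM0, so that a smaller pair leaves the top of its window empty. The sketch below works on that assumption; the manual states only that address space is a multiple of the MEM0 pair size, so the per-slot base calculation and all names here are illustrative.

```python
def address_map(pair_sizes_mb):
    """Lay out installed pairs, MEM0 first, in windows sized by the MEM0 pair.

    Assumption for this sketch: pair N starts at N * size(MEM0).
    """
    window = pair_sizes_mb[0]
    if any(size > window for size in pair_sizes_mb):
        raise ValueError("the pair in MEM0 must be the largest")
    regions = []
    for slot, size in enumerate(pair_sizes_mb):
        regions.append({
            "slot": slot,
            "base_mb": slot * window,      # start of this pair's window
            "populated_mb": size,          # memory actually present
            "empty_mb": window - size,     # unused top of the window
        })
    return regions
```

With pairs of 512, 512, and 128 Mbytes, the third pair would start at 1024 Mbytes and leave 384 Mbytes of its window empty under this model.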
System Overview System Bus The system bus consists of a 40-bit command/address bus, a 128-bit plus ECC data bus, and several control signals and clocks. Figure 1-14 System Bus Block Diagram MEM3 MEM2 MEM1 MEM0 SIM_ADR DATA SYNC DRAMS CTRL MEM CTRL &...
Page 38
System Overview The system bus motherboard consists of a 40-bit command/address bus, a 128-bit plus ECC data bus, and several control signals, clocks, and a bus arbiter. The bus requires that all CPUs have the same high-speed oscillator providing the clock to the Alpha chip. The DIGITAL Server 7300/7300R series system bus connects up to four CPUs, four pairs of memory modules, and a single I/O bus bridge module.
System Overview System Bus to PCI Bus Bridge Module The bridge module is the physical interconnect between the system motherboard and any PCI motherboard in the system. Figure 1-15 Bridge Module PCI Bus Control AD<31:0> Address Control Data A to B bus ECC &...
Page 40
System Overview The system bus to PCI bus bridge module converts: • System bus commands and data addressed to I/O space to PCI commands and data • PCI bus commands and data addressed to system memory or CPUs to system bus commands and data.
System Overview PCI I/O Subsystem The I/O subsystem is PCI. The DIGITAL Server 7300/7300R series has two four-slot PCI buses that hold up to eight I/O options. One of these buses can be both PCI and EISA, but it can hold no more than four options, up to three of which may be EISA. Figure 1-16 PCI Block Diagram PCI-1 Bus SCSI Control...
Page 42
System Overview The logic for two PCI buses is on each PCI motherboard. PCI0 is a 64-bit bus with a built-in PCI to EISA bus bridge. PCI0 has one dedicated PCI slot and three slots that can be either PCI or EISA. Each of these three slots has both an EISA connector and a PCI connector (six connectors in all), only one of which may be used at a time.
System Overview Server Control Module The server control module enables remote console connections to the system drawer. The module passes signals for COM ports 1 and 2, the keyboard, and the mouse to the standard I/O connectors. Figure 1-17 Server Control Module Standard I/O Remote Console Monitor...
Page 44
System Overview The server control module has two sections: the remote console monitor (RCM) and the standard I/O. See Chapter 9 for information on controlling the system remotely. The remote console monitor connects to a modem through the modem port on the bulkhead.
System Overview Power Control Module The power control module controls power sequencing and monitors power supply voltage, temperature, and fans. Figure 1-18 Power Control Module System Motherboard Power Control Module Slot PK-0710-96 DIGITAL Server 7300/7300R Series Service Manual 1–33...
Page 46
System Overview The power control module performs the following functions: • Controls power sequencing. • Monitors the combined output of power supplies and shuts down power if it is not in range. • Monitors system temperature and shuts off power if it is out of range. •...
System Overview Power Supply The system drawer power supplies provide power only to components in the drawer. One or two power supplies are required, depending on the number of CPU modules and PCI card cages; a second or third can be added for redundancy. The power system is described in detail in Chapter 4.
Page 48
System Overview Description One to three power supplies provide power to components in the system drawer. (They supply power only for the drawer in which they are located.) Three power supplies provide redundant power in fully loaded DIGITAL Server 7300/7300R series systems. These power supplies share the load, and redundant configurations are supported.
Power-Up This chapter describes system power-up testing and explains the power-up displays. The following topics are covered: • Control Panel • Power-Up Sequence • SROM Power-Up Test Flow • SROM Errors Reported • XSROM Power-Up Test Flow • XSROM Errors Reported •...
Power-Up Control Panel The control panel display indicates the likely device when testing fails. Figure 2-1 Control Panel and LCD Display Potentiometer Access Hole Reset Halt On/Off P0 TEST 11 CPU00 PK-0706G-96 When the On/Off button LED is on, power is applied and the system is running.
Power-Up Table 2-1 Control Panel Display

Field        Content   Display Meaning
CPU number   P0–P3     CPU reporting status
Status       TEST      Tests are executing
             FAIL      Failure has been detected
             MCHK      Machine check has occurred
             INTR      Error interrupt has occurred
Test number ...
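When display readings are logged during service, it can help to split them into the fields of Table 2-1. The sketch below parses a reading such as the one shown in Figure 2-1 (P0 TEST 11 CPU00). It assumes four whitespace-separated fields, with the last field naming the suspect device; the function and type names are hypothetical, and the test number is kept as text because its number base is not stated here.

```python
import re
from typing import NamedTuple

# Status codes and meanings as listed in Table 2-1.
STATUS_MEANING = {
    "TEST": "Tests are executing",
    "FAIL": "Failure has been detected",
    "MCHK": "Machine check has occurred",
    "INTR": "Error interrupt has occurred",
}

class OcpReading(NamedTuple):
    cpu: int             # 0-3, from the P0-P3 field
    status: str          # TEST, FAIL, MCHK, or INTR
    meaning: str         # text from Table 2-1
    test_number: str     # kept as text; base is an open question
    suspect_device: str  # assumed to be the last field

_READING = re.compile(r"^P([0-3])\s+(TEST|FAIL|MCHK|INTR)\s+(\S+)\s+(\S+)$")

def parse_ocp_reading(line: str) -> OcpReading:
    """Split one control panel reading into its fields."""
    match = _READING.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized control panel reading: {line!r}")
    cpu, status, test_number, device = match.groups()
    return OcpReading(int(cpu), status, STATUS_MEANING[status], test_number, device)
```

A reading of `P0 TEST 11 CPU00` would parse as CPU 0, status TEST, test number 11, device CPU00.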
Power-Up Power-Up Sequence Console and most power-up tests reside on the I/O subsystem, not on the CPU or any other module on the system bus. Figure 2-2 Power-Up Flow XSROM tests execute Power-Up/Reset SROM code loaded SRM console loaded into each CPU's...
Power-Up XSROM. The XSROM, or extended SROM, contains back-up cache and memory tests, and a fail-safe loader. The XSROM code resides in sector 0 of FEPROM 0 on the XBUS. Sector 2 of FEPROM 0 contains a duplicate copy of the code and is used if sector 0 is bad. FEPROM.
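The primary/duplicate arrangement of the XSROM code amounts to a simple fallback. The sketch below illustrates only that selection order; how a sector is judged "bad" is left to a caller-supplied check, and every name in it is hypothetical rather than taken from the SROM code.

```python
from typing import Callable, Optional

PRIMARY_XSROM_SECTOR = 0    # normal copy of the XSROM code in FEPROM 0
DUPLICATE_XSROM_SECTOR = 2  # duplicate copy, used if sector 0 is bad

def pick_xsrom_sector(sector_is_good: Callable[[int], bool]) -> Optional[int]:
    """Return the first usable XSROM sector, or None if both copies are bad."""
    for sector in (PRIMARY_XSROM_SECTOR, DUPLICATE_XSROM_SECTOR):
        if sector_is_good(sector):
            return sector
    return None
```

For example, `pick_xsrom_sector(lambda s: s != 0)` models a corrupted sector 0 and selects sector 2.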
Power-Up For the console to run, the path from the CPU to the XSROM must be functional. The XSROM resides in FEPROM0 on the XBUS, off the EISA bus, off PCI 0, off IOD 0. See Figure 2-4. This path is minimally tested by SROM. Figure 2-4 Console Code Critical Path M e m ory C P U...
Page 55
Power-Up The SROM contents are loaded into each CPU’s I-cache and executed on power-up/reset. After testing the caches on each processor chip, it tests the path to the XSROM. Once this path is tested and deemed reliable, layers of the XSROM are loaded sequentially into the processor chip on each CPU.
Power-Up SROM Power-Up Test Flow The SROM tests the CPU chip and the path to the XSROM. Figure 2-5 SROM Power-Up Test Flow For each CPU Initialize CPU chip Initialize Turn off CPU LED PCI-EISA bridge...
Page 57
Power-Up The Alpha chip built-in self-test tests the I-cache at power-up and upon reset. Each CPU chip loads its SROM code into its I-cache and starts executing it. If the chip is partially functional, the SROM code continues to execute. However, if the chip cannot perform most of its functions, that CPU hangs and that CPU pass/fail LED remains off.
Page 58
Power-Up Table 2-2 lists the tests performed by the SROM.

Table 2-2 SROM Tests
Test Name                    Logic Tested
D-cache RAM March test       D-cache access, D-cache data, D-cache address logic
D-cache Tag RAM March test   D-cache tag store RAM, D-cache bank address logic
S-cache Data March           S-cache RAM cells, S-cache data path, S-cache address path...
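Several of the SROM tests are march tests. The exact march sequence the SROM uses is not given here, so the sketch below shows a generic March C- pattern on a simulated one-bit-wide memory purely to illustrate the technique. The helper that builds the simulated memory, including its stuck-cell fault model, is an assumption for the example.

```python
def make_memory(size, stuck_at=None):
    """Build read/write callables over a simulated bit array.

    stuck_at is an optional (address, value) pair modelling a stuck cell.
    """
    cells = [0] * size

    def read(addr):
        if stuck_at is not None and addr == stuck_at[0]:
            return stuck_at[1]
        return cells[addr]

    def write(addr, value):
        cells[addr] = value

    return read, write

def march_c_minus(read, write, size):
    """Run March C- and return the first failing address, or None if clean."""
    up = range(size)
    down = range(size - 1, -1, -1)
    for addr in up:                      # element 1: write 0 everywhere
        write(addr, 0)
    # elements 2-5: read the expected value, then write its complement
    for order, expect, new in ((up, 0, 1), (up, 1, 0), (down, 0, 1), (down, 1, 0)):
        for addr in order:
            if read(addr) != expect:
                return addr
            write(addr, new)
    for addr in up:                      # element 6: final read of 0
        if read(addr) != 0:
            return addr
    return None
```

Marching through the addresses in both directions is what lets this kind of test catch address-decoder and coupling faults as well as stuck cells.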
Power-Up SROM Errors Reported The SROM reports machine checks, pending interrupt/exception errors, and errors related to corruption of FEPROM 0. If SROM errors are fatal, the particular CPU will hang and only the CPU self-test pass LEDs and/or the LEDs on the system bus to PCI bus bridge module will indicate the failure.
Power-Up XSROM Power-Up Test Flow After the SROM has completed its tests and verified the path to the FEPROM containing the XSROM code, it loads the first 8 Kbytes of XSROM into the primary CPU’s S-cache and jumps to it. Figure 2-6 XSROM Power-Up Flowchart XSROM banner to OCP/console device...
Power-Up After jumping to the primary CPU's S-cache, the code then intentionally I-caches itself and is completely register based (no D-stream for stack or data storage is used). The only D-stream accesses are writes/reads during testing. Each FEPROM has sixteen 64-Kbyte sectors. The first sector contains B-cache tests, memory tests, and a fail-safe loader....
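The sector geometry works out to 1 Mbyte per FEPROM. The short sketch below computes sector byte ranges from the figures above, assuming 1 Kbyte is 1024 bytes and that sectors are laid out contiguously from offset 0; the names are illustrative.

```python
SECTOR_COUNT = 16          # sectors per FEPROM
SECTOR_BYTES = 64 * 1024   # 64 Kbytes, taking 1 Kbyte = 1024 bytes
FEPROM_BYTES = SECTOR_COUNT * SECTOR_BYTES

def sector_byte_range(sector):
    """Return (first_offset, last_offset) of a sector within the FEPROM."""
    if not 0 <= sector < SECTOR_COUNT:
        raise ValueError(f"sector must be 0-{SECTOR_COUNT - 1}")
    start = sector * SECTOR_BYTES
    return start, start + SECTOR_BYTES - 1
```

Under these assumptions, sector 2 spans offsets 131072 through 196607.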
Power-Up Table 2-3 Memory Tests

Test   Test Name          Logic Tested                     Description
       Memory Data test   Data path to and from memory;    01 – FF: Errors are reported as an 8-bit
                          data path on memory and RAMs     binary field. A set bit indicates a module failure....
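Because the memory data test reports errors as an 8-bit field in which each set bit flags a module, a reported value can be expanded into bit positions as sketched below. Which bit corresponds to which physical module is not shown in this excerpt, so the function returns bit positions only; its name is hypothetical.

```python
def failing_bits(error_field: int):
    """Return the positions (0 = least significant) of set bits in an 8-bit field."""
    if not 0 <= error_field <= 0xFF:
        raise ValueError("error field must fit in 8 bits")
    return [bit for bit in range(8) if error_field & (1 << bit)]
```

A reported value of 05 (hexadecimal), for instance, has bits 0 and 2 set, indicating two failing modules.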
Power-Up XSROM Errors Reported The XSROM reports B-cache test errors and memory test errors. The XSROM also reports a warning if memory is illegally configured. Example 2-2 XSROM Errors Reported at Power-Up B-Cache Error (CPU Error) TEST ERR on cpu0 #CPU running the test cpu0 err#...
Page 64
Power-Up Sctr 1 -PAL headr CHKSM fail Sctr 1 -PAL code CHKSM fail Sctr 3 -CONSLE headr PTTRN fail Sctr 3 -CONSLE headr CHKSM fail Sctr 3 -CONSLE code CHKSM fail 2–16 DIGITAL Server 7300/7300R Series Service Manual...
Power-Up Console Power-Up Tests Once the SRM console is loaded, it does further testing of each IOD. Table 2-4 describes the IOD power-up tests, and Table 2-5 describes the PCI motherboard power-up tests. Table 2-4 IOD Tests Test Test Name Description Number IOD CSR Access test...
Power-Up Table 2-5 PCI Motherboard Tests (B3050/B3052) Test Test Name Diagnostic Description Number Name PCEB pceb_diag Tests the PCI to EISA bridge chip esc_diag Tests the EISA system controller 8K NVRAM nvram_diag Tests the NVRAM Real-Time Clock ds1287_diag Tests the real-time clock chip Keyboard and i8242_diag Tests the keyboard/mouse chip...
Power-Up Console Device Determination After the SROM and XSROM have completed their tasks, the SRM console program, as it starts, determines where to send its power-up messages. Figure 2-7 Console Device Determination Flowchart Power-Up/Reset P00>>> Init Console Envar Console Envar = graphics...
Power-Up Console Device Options The console device on a DIGITAL Server 7300/7300R series system must be either a serial terminal connected to COM1 off the server control module set at 9600 baud or a graphics monitor off an adapter on PCI0. The console program must be AlphaBIOS. During power-up, the SROM and the XSROM always send progress and error messages to the OCP.
Power-Up Console Power-Up Display The last several lines of the power-up display appear on a graphics monitor, and parts of the display print to the control panel display. Example 2-3 Power-Up Display SROM V1.0 on cpu0 SROM V1.0 on cpu1 SROM V1.0 on cpu2 SROM V1.0 on cpu3...
Page 70
Power-Up At power-up or reset, the SROM code on each CPU module is loaded into that module’s I-cache and tests the module. If all tests pass, the processor’s LED lights. If any test fails, the LED remains off and power-up testing terminates on that CPU. The first determination of the primary processor is made, and the primary processor executes a loopback test to each PCI bridge.
Page 71
Power-Up Example 2-3 Power-Up Display (Continued) starting console on CPU 0 sizing memory 128 MB SYNC 128 MB SYNC starting console on CPU 1 starting console on CPU 2 starting console on CPU 3 probing IOD1 hose 1 bus 0 slot 1 - NCR 53C810 bus 0 slot 2 - DECchip 21041-AA bus 0 slot 3 - NCR 53C810...
Page 72
Power-Up The final primary CPU determination is made. The primary CPU unloads PALcode and decompression code from the FEPROM on the PCI 0 to its B-cache. The primary CPU then jumps to the PALcode to start the SRM console. The primary CPU prints a message indicating that it is running the console.
Power-Up Fail-Safe Loader The fail-safe loader is a software routine that loads the SRM console image from floppy. Once the console is running you will want to run LFU to update FEPROM 0 with a new image. NOTE: FEPROM 0 contains images of the SROM, XSROM, decompression, and SRM console code.
Troubleshooting This chapter describes troubleshooting during power-up and booting, as well as diagnostics for DIGITAL Server 7300/7300R series systems. The chapter covers the following topics: • Troubleshooting with LEDs • Troubleshooting Power Problems • Troubleshooting with the Maintenance Bus (I2C Bus) •...
Troubleshooting Troubleshooting with LEDs During power-up, reset, initialization, or testing, diagnostics are run on CPUs, memories, bridge modules, PCI motherboards, and sometimes options. The following sections describe possible problems that can be identified by checking LEDs. Figure 3-1 CPU and Bridge Module LEDs Bridge Module LEDs CPU LEDs (IOD 0 &...
Troubleshooting Processor (CPU) LEDs If the CPU STP LED on any processor (CPU) module is lit, that CPU chip is functioning properly. If the CPU STP LED is off, that CPU may or may not be functioning. You can use the Halt button on the OCP to prevent the AlphaBIOS console (which turns off the CPU STP LED) from booting, thus assuring the validity of the CPU STP LED.
Troubleshooting Cabinet Power and Fan LEDs Figure 3-2 shows the cabinet power and fan LEDs. Figure 3-2 Cabinet Power and Fan LEDs Fan LED Power LED PK-0664-96 A cabinet system has three exhaust fans at the top of the cabinet. They are powered from a small power supply in the fan tray.
Troubleshooting Troubleshooting Power Problems Power problems can occur before the system is up or while the system is running. If a system stops running, make a habit of checking the PCM. Power Problem List The system will halt for the following: 1.
Troubleshooting If Power Problem Occurs at Power-Up If the system has a power problem on a cold start, the PCM LEDs are not valid until after DCOK_SENSE has been asserted. The cause is one of the following: • Broken system fan •...
Troubleshooting Power Control Module LEDs The PCM has 11 LEDs visible through the system card cage. The LED display shows the relative placement of the LEDs. Figure 3-3 PCM LEDs DCOK_SENSE PS0_OK PS1_OK PS2_OK TEMP_OK CPUFAN_OK SYSFAN_OK CS_FAN0 CS_FAN1 CS_FAN2 C_FAN3 Normally On Tested at one-second intervals...
Troubleshooting Table 3-1 Power Control Module LED States

State        Description
DCOK_SENSE   Both +5.0V and +3.43V are present and within limits.
PS0_OK       Power supply 0 is present and has asserted POK_H.
PS1_OK       Power supply 1 is present and has asserted POK_H. Power supply 1 not present.
Troubleshooting Troubleshooting with the Maintenance Bus (I2C Bus) The I2C bus (referred to as the “I squared C bus”) is a small internal maintenance bus used to monitor system conditions scanned by the power control module, write the fault display, store error state, and track configuration information in the system.
Troubleshooting Monitoring System Conditions The I2C bus monitors the state of system conditions scanned by the PCM. There are two registers on the PCM: One records the state of the fans and power supplies and is latched when there is a fault. The other causes an interrupt on the I2C bus when a CPU or system fan fails, an over-temperature condition exists, or power supplied to the system is out of tolerance.
Troubleshooting Running Diagnostics — Test Command The test command runs diagnostics on the entire system, CPU devices, memory devices, and the PCI I/O subsystem. The test command runs only from the SRM console. Ctrl/C stops the test. Example 3-1 Test Command Syntax P00>>>...
Troubleshooting Testing an Entire System A test command with no modifiers runs all exercisers for subsystems and devices on the system. I/O devices tested are supported boot devices. The test runs for 10 minutes. Example 3-2 Sample Test Command P00>>> test Console is in diagnostic mode System test, runtime 600 seconds Type ^C to stop testing...
Page 87
Troubleshooting Starting processor/cache thrasher on each CPU.. Testing SCSI disks (read-only) No CD/ROM present, skipping embedded SCSI test Testing other SCSI devices (read-only).. Testing floppy drive (dva0, read-only) Program Device Pass Hard/Soft Bytes Written Bytes Read -------- ------------ ------------ ------ --------- ------------- ------------ 00003047 memtest memory 134217728...
Troubleshooting Testing Memory The test mem command tests individual memory devices or all memory. The test shown in Example 3-3 runs for 2 minutes. Example 3-3 Sample Test Memory Command P00>>> test memory Console is in diagnostic mode System test, runtime 120 seconds Type ^C to stop testing Starting background memory test, affinity to all CPUs..
Troubleshooting Testing PCI Buses and Devices The test pci command tests PCI buses and devices. The test runs for 2 minutes. Example 3-4 Sample Test Command for PCI P00>>> test pci* Console is in diagnostic mode System test, runtime 120 seconds Type ^C to stop testing Configuring all PCI buses..
Power System This chapter describes the DIGITAL Server 7300/7300R series power system: • Power Supply • Power Control Module Features • Power Circuit and Cover Interlocks • Power-Up/Down Sequence • Cabinet Power Configuration Rules • Pedestal Power Configuration Rules (North America and Japan) •...
Power System Power Supply Power supply outputs are shown in Figure 4-1. Figure 4-1 Power Supply Outputs Misc. Signal Current share +5V/Return +3.4V/Return +3.4V/Return +12V/Return PKW0402A-96 4–2 DIGITAL Server 7300/7300R Series Service Manual...
Page 97
Power System Power Supply Features • 90–264 Vrms input • 450 watts output. Output voltages are as follows:

Output Voltage   Min. Voltage   Max. Voltage   Max. Current
+5.0             4.85           5.25
+3.43            3.400          3.465
+12              11.5           12.6
–12              –10.9          –13.2
–5.0             –4.6           –5.5
Vaux 0.05
•...
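When measuring supply outputs with a meter, the minimum and maximum voltages listed above can serve as pass/fail limits. The sketch below is a hypothetical helper, not part of any DIGITAL diagnostic. It assumes the 11.5 to 12.6 volt row belongs to the +12 output and omits Vaux because its limits are not legible here.

```python
# (low, high) limits in volts, taken from the output voltage list above.
# Negative rails are stored with the more negative value first.
RAIL_LIMITS = {
    "+5.0": (4.85, 5.25),
    "+3.43": (3.400, 3.465),
    "+12": (11.5, 12.6),    # row label assumed from the +12V output in Figure 4-1
    "-12": (-13.2, -10.9),
    "-5.0": (-5.5, -4.6),
}

def rails_out_of_range(measured):
    """Return the names of measured rails that fall outside their limits.

    measured maps a rail name from RAIL_LIMITS to a reading in volts.
    """
    bad = []
    for rail, volts in measured.items():
        low, high = RAIL_LIMITS[rail]
        if not low <= volts <= high:
            bad.append(rail)
    return sorted(bad)
```

For example, a reading of 3.39 V on the +3.43 output is flagged, while 5.0 V on the +5.0 output is not.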
Power System Power Control Module Features The power control module (54-24117-01) is located behind the B3040-AA module, the system bus to PCI bus bridge module. Figure 4-2 Power Control Module System Motherboard Power Control Module Slot PK-0710-96 The power control module performs the following functions: 4–4 DIGITAL Server 7300/7300R Series Service Manual...
Page 99
Power System • Controls the power-up/down sequencing. • Monitors the combined output of power supplies VDD (3.43V) and VCC (5.0V). It asserts DCOK_SENSE if both voltages are within range, and asserts POWER_FAULT_L, causing an immediate power shutdown, if either is not. •...
Power System Power Circuit and Cover Interlocks Figure 4-3 is a diagram of the power circuit. Note that B305n in the diagram stands for either the B3050-AA or B3052-AA PCI Motherboard. Figure 4-3 Power Circuit Diagram 17-04217-01 Logic Power Supply Cover Interlocks 17-04201-01...
Page 101
Power System Figure 4-3 shows the distribution of power throughout the system drawer. Opens in the circuit, the PCM signal POWER_FAULT_L, or the SCM signal RSM_DC_EN_L interrupt DC power applied to the system. The opens can be caused by the On/Off button or by the cover interlocks. The POWER_FAULT_L signal is asserted by the PCM module if it detects a fault, and the RSM_DC_EN_L signal is controlled remotely.
Power System Power-Up/Down Sequence The On/Off button can be controlled manually or remotely. The button is on the OCP. Remote power control is provided through the remote I/O port connected to the PCI. The power-up/down sequence flow is shown below. Figure 4-4 Power Up/Down Sequence Flowchart Apply AC Power...
Page 103
Power System hard fault on power-up, the power supplies shut down immediately. If there is not a hard fault on power-up, the power system powers up and remains up until the system is shut off or the PCM senses a fault. If the PCM senses a power fault, the power system attempts to restore power and will restore power if the fault is not sensed a second time.
Power System Cabinet Power Configuration Rules There are different cabinets with different power delivery systems. See Chapter 1 for a description of the differences. A bar code label designating the cabinet variation is located inside the back door in the upper left corner of the bezel holding the door. The two variations are: H9A10-EN and H9A10-EP.
Power System Figure 4-6 shows an -EN three-drawer cabinet power configuration. The three-drawer -EN is shown with the H7600-AA controller. Figure 4-6 -EN Three Drawer Cabinet Power Configuration 2 Power Controllers System Drawer System Drawer 3.67 Arms...
Power System Pedestal Power Configuration Rules (North America and Japan) Figure 4-7 shows pedestal power distribution in North America and Japan. Figure 4-7 Pedestal Power Distribution (N.A. and Japan) StorageWorks StorageWorks Power Strips 0.75 Arms 0.75 Arms 0.75 Arms...
Page 107
Power System Power Strip A single AC power strip supports one system drawer and one StorageWorks shelf. When two AC power strips are used, combined AC input line current cannot exceed the site circuit breaker restriction, assuming both strips are plugged into the same circuit.
Pedestal Power Configuration Rules (Europe and Asia Pacific)
Figure 4-8 shows pedestal power distribution in Europe and Asia/Pacific.
Figure 4-8 Pedestal Power Distribution (Europe and AP)
[Figure labels: Power Strips, StorageWorks, StorageWorks, 0.34 A rms, 0.34 A rms, 0.34 A rms, System Drawer...]
Error Detection with Error Registers
This chapter describes error detection with error registers. It includes the following topics:
• Overview of Error Detection
• Error Registers
• Troubleshooting IOD-Detected Errors
• Double Error Halts and Machine Checks While in PAL Mode
Overview of Error Detection
Error detection is performed by the CPUs, the IOD, and the EISA to PCI bus bridge. (IOD is the acronym used by software to refer to the system bus to PCI bus bridge.)
Figure 5-1 Error Detector Placement
[Figure labels: Memory...]
Lines Protected                     Device
ECC Protected
System bus data lines               IOD on every transaction, CPU when using the bus
B-cache                             IOD on every transaction, CPU when using the bus
Parity Protected
System bus command/address lines    IOD on every transaction, CPU when using the bus
Duplicate tag store                 IOD on every transaction,...
  Internal EV5 or EV56 cache errors
  CPU B-cache module errors
• System-dependent errors detected by both the CPU and IOD. These errors are system machine checks and are:
  CPU-detected external reference errors
  IOD hard error interrupts
The IOD can detect hard errors on either side of the bridge.
Error Registers
The DIGITAL Server 7300/7300R systems include registers that hold error information that you can use for troubleshooting. These registers include:
• External Interface Status Register – EI_STAT
• External Interface Address Register – EI_ADDR
•...
External Interface Status Register – EI_STAT
The EI_STAT register is a read-only register that is unlocked and cleared by any PALcode read. Subject to some restrictions, a read of EI_STAT also unlocks the EI_ADDR, BC_TAG_ADDR, and FILL_SYN registers. EI_STAT is not unlocked or cleared by reset.
Fill data from B-cache or main memory can have correctable or non-correctable errors in ECC mode. In parity mode, fill data parity errors are treated as non-correctable hard errors. System address/command parity errors are always treated as non-correctable hard errors, irrespective of the mode.
Table 5-1 External Interface Status Register
Name          Bits   Type   Description
COR_ECC_ERR   <31>          Correctable ECC Error. Indicates that fill data received from outside the CPU contained a correctable ECC error.
EI_ES         <30>          External Interface Error Source. When set, indicates that the error source is fill data from main memory or a system address/command parity error.
Table 5-1 External Interface Status Register (continued)
Name          Bits      Type   Description
              <63:36>          All ones.
SEO_HRD_ERR   <35>             Second External Interface Hard Error. Indicates that a fill from B-cache or main memory, or a system address/command received by the CPU, has a hard error while one of the hard error bits in the EI_STAT register is already set.
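As a reading aid, the bit positions listed in Table 5-1 can be decoded from a raw 64-bit EI_STAT value as sketched below. Only the bits that appear in this excerpt of the table are covered; the function and dictionary names are hypothetical, and the sample value is made up for illustration.

```python
# Bit positions taken from Table 5-1 (only the rows shown in this excerpt).
EI_STAT_FLAGS = {"SEO_HRD_ERR": 35, "COR_ECC_ERR": 31, "EI_ES": 30}


def decode_ei_stat(value):
    """Return (flags, upper_bits_all_ones) for a 64-bit EI_STAT value."""
    flags = {name: bool((value >> bit) & 1)
             for name, bit in EI_STAT_FLAGS.items()}
    # Bits <63:36> (28 bits) are documented as all ones.
    upper_bits_all_ones = (value >> 36) == (1 << 28) - 1
    return flags, upper_bits_all_ones


# Made-up sample: upper bits all ones plus a correctable ECC error.
sample = (0xFFFFFFF << 36) | (1 << 31)
print(decode_ei_stat(sample))
```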
External Interface Address Register – EI_ADDR
The EI_ADDR register contains the physical address associated with errors reported by the EI_STAT register. It is unlocked by a read of the EI_STAT register. This register is meaningful only when one of the error bits is set.
Address   FF FFF0 0148
Access...
Table 5-2 Loading and Locking Rules for External Interface Registers
Correctable Error   Non-correctable Error   Second Hard Error   Load Register   Lock Register   Action When EI_STAT Is Read
Clears and possibly unlocks all registers
Clears and possibly unlocks all registers...
MC Error Information Register 0 (MC_ERR0 - Offset = 800)
The low-order MC bus (system bus) address bits are latched into this register when the system bus to PCI bus bridge detects an error event. If the event is a hard error, the register bits are locked.
MC Error Information Register 1 (MC_ERR1 - Offset = 840)
The high-order MC bus (system bus) address bits and error symptoms are latched into this register when the system bus to PCI bus bridge detects an error. If the event is a hard error, the register bits are locked.
Table 5-5 MC Error Information Register 1
Name       Bits      Type   Initial State   Description
VALID      <31>                             Logical OR of bits <30:23> in the CAP_ERR Register. Set if MC_ERR0 and MC_ERR1 contain a valid address.
Reserved   <30:21>
Dirty      <20>...
CAP Error Register (CAP_ERR - Offset = 880)
CAP_ERR is used to log information pertaining to an error detected by the CAP or MDP ASIC. If the error is a hard error, the register is locked. All bits, except the LOST_MC_ERR bit, are locked on hard errors.
Table 5-6 CAP Error Register
Name           Bits   Type   Initial State   Description
MC_ERR VALID   <31>                          Logical OR of bits <30:23> in this register. When set, MC_ERR0 and MC_ERR1 are latched.
RDSB           <30>   RW1C                   Non-correctable ECC error detected by MDPB.
Table 5-6 CAP Error Register (continued)
Name          Bits   Type   Initial State   Description
LOST_MC_ERR   <24>   RW1C                   Set when an error is detected but not logged because the associated symptom fields and registers are locked with the state of an earlier error.
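The CAP_ERR bits named in Table 5-6 can be picked out of a raw register value as in the sketch below. It covers only the bits shown in this excerpt, assumes a 32-bit register, and uses hypothetical function and key names. The consistency flag reflects the statement that MC_ERR VALID is the logical OR of bits <30:23>.

```python
def decode_cap_err(value):
    """Decode the CAP_ERR bits listed in this excerpt of Table 5-6."""
    value &= 0xFFFFFFFF               # assumption: CAP_ERR is 32 bits wide
    error_bits = (value >> 23) & 0xFF  # bits <30:23>
    valid = bool((value >> 31) & 1)
    return {
        "MC_ERR_VALID": valid,
        "RDSB": bool((value >> 30) & 1),
        "LOST_MC_ERR": bool((value >> 24) & 1),
        # VALID should equal the logical OR of bits <30:23>.
        "valid_consistent": valid == (error_bits != 0),
    }


# Made-up sample: VALID set together with RDSB.
print(decode_cap_err((1 << 31) | (1 << 30)))
```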
PCI Error Status Register 1 (PCI_ERR1 - Offset = 1040)
PCI_ERR1 is used by the system bus to PCI bus bridge to log bus address <31:0> pertaining to an error condition logged in CAP_ERR. This register always captures PCI address <31:0>, even for a PCI DAC cycle.
Troubleshooting IOD-Detected Errors
Step 1 Read the CAP Error Registers on both PCI bridges (F9E0000880 and FBE0000880). If either register shows an error, match the register contents with the data pattern and perform the action indicated.
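The two CAP_ERR addresses in Step 1 end in the register's documented offset (880), which suggests that the other error registers sit at their own offsets from the same two bases. The sketch below computes those addresses on that assumption: offsets are read as hexadecimal, the base values are derived by subtracting 880 from the Step 1 addresses, and the bridge labels are hypothetical.

```python
# Bases derived from the Step 1 CAP_ERR addresses minus the 0x880 offset.
BRIDGE_BASES = {
    "bridge A": 0xF9E0000000,  # label is hypothetical
    "bridge B": 0xFBE0000000,  # label is hypothetical
}

# Offsets as given in the register section titles, read as hexadecimal.
REGISTER_OFFSETS = {
    "MC_ERR0": 0x800,
    "MC_ERR1": 0x840,
    "CAP_ERR": 0x880,
    "PCI_ERR1": 0x1040,
}


def register_address(base, register):
    """Return the physical address of an error register on one bridge."""
    return base + REGISTER_OFFSETS[register]


for label, base in BRIDGE_BASES.items():
    for register in REGISTER_OFFSETS:
        print(f"{label} {register}: {register_address(base, register):010X}")
```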
System Bus ECC Error
Step 2 Read the MC_ERR1 register and match the contents with the data pattern. Perform the action indicated.
Table 5-9 System Bus ECC Error Data Pattern
MC_ERR1 Data Pattern   Most Likely Cause   Action
For Memory Read
1000 0000 0000 xxxx xxxx 10xx 0xxx xxxx...
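Matching a register value against a data pattern means comparing only the fixed bits and ignoring every position marked x. A minimal sketch of that comparison follows; the function name is hypothetical and the sample register values are made up.

```python
def matches_pattern(value, pattern):
    """True if value matches a bit pattern in which 'x' means don't care."""
    bits = pattern.replace(" ", "")
    # Mask has a 1 wherever the pattern fixes the bit.
    mask = int("".join("0" if b == "x" else "1" for b in bits), 2)
    # Expected value of the fixed bits, with don't-care bits set to 0.
    expected = int(bits.replace("x", "0"), 2)
    width_mask = (1 << len(bits)) - 1
    return (value & width_mask & mask) == expected


PATTERN = "1000 0000 0000 xxxx xxxx 10xx 0xxx xxxx"  # first row of Table 5-9
print(matches_pattern(0x80000800, PATTERN))
print(matches_pattern(0x80000880, PATTERN))
```

With these made-up values, the first call matches and the second does not, because bit 7 is set where the pattern requires 0.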
System Bus Nonexistent Address Error
Step 3 Determine which node (if any) should have responded to the command/address identified in MC_ERR1. Perform the action indicated.
Table 5-10 System Bus Nonexistent Address Error Troubleshooting
MC_ERR1 Data Pattern   Most Likely Cause   Action
1000 0000 000x xxxx xxxx xxxx 0xxx xxxx...
System Bus Address Parity Error
Step 4 Determine which node put the bad command/address on the system bus identified in MC_ERR1. Perform the action indicated.
Table 5-11 Address Parity Error Troubleshooting
MC_ERR1 Data Pattern                      Most Likely Cause         Action
1000 0000 000x xxx0 10xx xxxx xxxx xxxx   Data sourced by MID = 2...
PIO Buffer Overflow Error (PIO_OVFL)
Step 5 Enter the value of the CAP_CTRL register bits <19:16> (Actual_PEND_NUM) in the following formula. Compare the results as indicated in Table 5-12 to determine the most likely cause of the error. When an IOD is implicated in the analysis of the error, replace the one that captured the error in its CAP Error Register.
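Before applying the formula, the Actual_PEND_NUM field has to be isolated from the full CAP_CTRL value. The helper below is a generic bit-field extractor shown for that purpose; the helper name and the sample register value are hypothetical, and the formula itself is not reproduced here.

```python
def extract_bits(value, high, low):
    """Return bits <high:low> of value as an integer."""
    width = high - low + 1
    return (value >> low) & ((1 << width) - 1)


cap_ctrl = 0x000A0000  # made-up CAP_CTRL reading
actual_pend_num = extract_bits(cap_ctrl, 19, 16)
print(actual_pend_num)
```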
Page Table Entry Invalid Error
Step 6 This error is almost always a software problem. However, if the software is known to be good and the hardware is suspected, swap the IOD.
PCI Master Abort
Step 7 Master aborts normally occur when the operating system is sizing the PCI bus.
Broken Memory
Step 10 Refer to the following sections.
For a Read Data Substitute Error (Non-Correctable ECC Error)
When a read data substitute (RDS) error occurs, determine which memory module pair caused the error as follows:
1....
3. When you have isolated the failing memory pair, determine which of the two modules is bad. (You cannot do this if the operating system is Windows NT.) Read the CPU Fill Syndrome Register (FILL_SYN). If this register is non-zero, use the ECC syndrome bits in Table 5-13 to determine which module had the single-bit error.
Command Codes
Table 5-14 shows the codes for transactions on the system bus and how they are affected by the commander in charge of the bus during the transaction. The command is a six-bit field in the command address (bits <5:0>). Bit-to-text translations give six-bit data (although the top two bits may or may not be relevant).
Double Error Halts and Machine Checks While in PAL Mode
Two error cases require special attention: double error halts and machine checks while the machine is in PAL mode. Information is available that can help determine what error occurred.
Double Error Halt
A double error halt occurs under the following conditions:
• A machine check occurs.
• PAL completes its tasks and returns control of the system to the operating system.
• A second machine check occurs before the operating system completes its tasks.
The machine returns to the console and displays the following message:
halt code = 6
double error halt...
The info 3 command (Example 5-1) causes the SRM console to read the “impure area,” which contains the state of the CPU before it entered PAL.
Example 5-1 INFO 3 Command
P00>>> info 3
cpu00 per_cpu impure area 00004400
cns$flag...
Removal and Replacement
This chapter describes removal and replacement procedures for field-replaceable units (FRUs). It covers the following topics:
• System Safety
• FRU List
• Power System FRUs
• CPU Removal and Replacement
• Memory Removal and Replacement
• Power Supply Removal and Replacement
•...
System Safety
Observe the safety guidelines in this section to prevent personal injury.
CAUTION: Wear an anti-static wrist strap whenever you work on a system. The DIGITAL Server 7300/7300R series cabinet system has a wrist strap connected to the frame at the front and rear. The pedestal system does not have an attached strap, so you will have to take one to the site.
FRU List
Figure 6-1 shows the locations of FRUs in the system drawer. Table 6-1 lists the part numbers of all field-replaceable units.
Figure 6-1 System Drawer FRU Locations
[Figure labels: Memory Modules, CPU Modules, Top Cover, Optional and N+1...]
Table 6-1 Field-Replaceable Unit Part Numbers
CPU Modules
B3105-AA      400 MHz 4 MB cached
B3105-CA      533 MHz 4 MB cached
Memory Modules
B3020-CA      64 Mbyte synch
B3030-EA      256 Mbyte asynch (EDO)
B3030-FA      512 Mbyte asynch (EDO)
B3030-GA      2 Gbyte asynch (EDO)
Required System Drawer Modules and Display
54-23803-01   System motherboard...
Table 6-1 Field-Replaceable Unit Part Numbers (continued)
Power System Components
30-44712-01   Power supply (H7291-AA)
30-46788-01   Internal power source 40W/12V fan tray power (cabinet)
H7600-AA      Power controller (NA/Japan, H9A10-EN cabinet)
H7600-DB      Power controller (Europe/AP, H9A10-EP cabinet)
12-23501-01   NEMA power strip (N.A./Japan, pedestal)
12-45334-02   IEC power strip (Europe/AP, pedestal, and all cabinet systems)
Table 6-1 Field-Replaceable Unit Part Numbers (continued)
System Drawer Cables and Jumpers
From 17-04196-01 Server control Remote I/O SCM signal conn module signal signal conn on cable (60 pin) PCI mbrd 17-04199-01 Current share Current share Current share conn on PS1 cable conn on PS0 and PS2...
Table 6-1 Field-Replaceable Unit Part Numbers (continued)
System Drawer Cables and Jumpers
From 17-04292-01 SCSI CD-ROM CD-ROM CD-ROM sig conn sig cable conn on PCI mbrd 70-32016-01 Interlock switches Interlock Other OCP DC enable pwr and cable switch assy conn or pwr conn on ped tray pwr drive cable (17-...
Table 6-1 Field-Replaceable Unit Part Numbers (continued)
Pedestal Cables
From 17-04293-01 Elec harness Power Ped tray bulkhead (system power harness side) cable+5/+12 (17-04217- 17-04302-01 OCP signal cable OCP sig conn OCP sig conn on ped tray on PCI mbrd bulkhead (system side) 17-04305-01 Harness power...
Power System FRUs
Figure 6-2 Location of Power System FRUs
[Figure labels: Fan 0; Fan 1; Motherboard; Fan 2; Fan Tray; Cabinet; B3040; To Pedestal; B305n; Power Source; Floppy; To Cabinet; Pedestal Tray; Power Source; Tray; Interlock; SCSI]
Notes: Only power cables are shown....
Part Number   Description
17-04285-01   Power cord to power strip. 0.5 meter, IEC320 to IEC320 connector, used in cabinet systems only. In pedestal systems, cords match country-specific wall outlets.
H7600-AA      Power controller used in place of 12-45334-02 and 17-04285-02 in the H9A10-EN cabinet in N....
System Drawer Exposure (Cabinet)
There is one type of cabinet for these systems: the H9A10-EN/-EP cabinet. In the H9A10-EN and -EP cabinet, the system drawer sits on a tray that slides out of the front of the cabinet. You must pull the stabilizer bar out from the bottom to prevent the cabinet from tipping over.
CAUTION: The cabinet could tip over if a system drawer is pulled out while the stabilizing bar is not fully extended with its leveler foot on the floor.
Exposing Any Section of the System Drawer in an H9A10-EN or -EP Cabinet
1....
System Drawer Exposure (Pedestal)
Figure 6-4 Exposing System Drawer (Pedestal)
[Figure labels: Pedestal Tray Cover, System Bus Cover, Pedestal Tray and Power Section Cover, PCI Bus Cover]
Exposing the System Drawer
1. Open the front door and remove it by lifting and pulling it away from the system.
2. Remove the top cover. Unscrew the two Phillips head screws midway up on each side of the pedestal, tilt the cover up, and lift it away from the frame.
CPU Removal and Replacement
CAUTION: Two different CPU modules work in these systems: the B3107-AA and the B3107-CA. Unless you are upgrading, be sure you are replacing the broken module with the same variant.
Figure 6-5 Removing a CPU Module
[Figure labels: CPU Module, System Bus Card Cage...]
Removal
1. Shut down the operating system and power down the system.
2. Expose the system drawer.
3. Expose the system bus card cage. Remove the two Phillips head screws holding the cover in place and slide it off the drawer.
4....
CPU Fan Removal and Replacement
Figure 6-6 Removing CPU Fan
Removal
1. Follow the CPU Removal and Replacement procedure.
2. Unplug the fan from the module.
3. Remove the four Phillips head screws holding the fan to the Alpha chip’s heat sink.
Replacement
Reverse the above procedure.
Verification
If the system powers up, the CPU fan is working.
Memory Removal and Replacement
CAUTION: Several different memory modules work in these systems. Be sure you are replacing the broken module with the same variant.
Figure 6-7 Removing a Memory Module
[Figure labels: Memory Module, System Bus Card Cage...]
Removal
1. Shut down the operating system and power down the system.
2. Expose the system drawer.
3. Expose the system bus card cage. Remove the two Phillips head screws holding the cover in place and slide it off the drawer.
4....
Power Control Module Removal and Replacement
Figure 6-8 Removing Power Control Module
[Figure labels: Power Control Module (PCM), System Bus Card Cage]
Removal
1. Shut down the operating system and power down the system.
2. Expose the system drawer.
3. Expose the system bus card cage. Remove the two Phillips head screws holding the cover in place and slide it off the drawer.
4....
System Bus to PCI Bus Bridge (B3040-AA) Module Removal and Replacement
Figure 6-9 Removing System Bus to PCI/EISA Bus Bridge Module
Removal
1. Shut down the operating system and power down the system.
2. Expose the system drawer.
3. Expose the system bus card cage. Remove the two Phillips head screws holding the cover in place and slide it off the drawer.
4....
System Motherboard Removal and Replacement
The system motherboard contains an NVRAM that holds the system serial number. Be sure to record this number before replacing the module. The serial number is on a bar code on the side of the system drawer or on the system bus card cage. The part number is 54-23803-01.
5. Expose the PCI bus card cage. Remove three Phillips head screws holding the cover in place and slide it off the drawer.
6. Remove all the PCI/EISA options.
7. Remove the server control module.
8. Remove the PCI motherboard.
9....
PCI/EISA Motherboard (B3050/B3052) Removal and Replacement
Figure 6-11 Replacing PCI/EISA Motherboard
[Figure labels: Connection to Bridge Module, PCI Motherboard]
Removal
The PCI motherboard contains an NVRAM with ECU data and customized console environment variables.
1. Shut down the operating system and power down the system.
2. Expose the system drawer.
3. Expose the PCI bus card cage. Remove three Phillips head screws holding the cover in place and slide it off the drawer.
4....
Server Control Module Removal and Replacement
Figure 6-12 Removing Server Control Module
[Figure labels: SCM Bulkhead Connectors: Keyboard, COM1, Parallel, 12V DC, Mouse, COM2, Modem. PCI Motherboard Connectors: Diskette Drive, OCP, Remote I/O...]
Removal
1. Shut down the operating system and power down the system.
2. Expose the system drawer.
3. Expose the PCI bus card cage. Remove three Phillips head screws holding the cover in place and slide it off the drawer.
4....
PCI/EISA Option Removal and Replacement
Figure 6-13 Removing PCI/EISA Option
WARNING: To prevent fire, use only modules with current-limited outputs. See National Electrical Code NFPA 70 or Safety of Information Technology Equipment, Including Electrical Business Equipment EN 60 950.
Removal
1. Shut down the operating system and power down the system.
2. Expose the system drawer.
3. Expose the PCI bus card cage. Remove three Phillips head screws holding the cover in place and slide it off the drawer.
4....
Power Supply Removal and Replacement
Figure 6-14 Removing Power Supply
[Figure labels: Jumper 17-04199-01, Cable Harness 17-04217-01]
Removal
1. Shut down the operating system and power down the system.
2. Expose the system drawer.
3. Remove the cover to the power section of the drawer. Remove the two Phillips head screws holding the cover in place and slide it off the drawer.
4....
Power Harness Removal and Replacement
Figure 6-15 Removing Power Harness
[Figure labels: Holding Bracket; System Bus Motherboard; Fans; Power Supplies; PCI Bus Motherboard; OCP Tray; To OCP; CD-ROM; To Floppy]
Removal
1. Shut down the operating system and power down the system.
2. Expose the system drawer.
3. Expose the power, system card cage, and PCI/EISA sections of the drawer by removing all covers. Unscrew the Phillips head screws holding each cover in place and slide the covers off the drawer.
System Drawer Fan Removal and Replacement
Figure 6-16 Removing System Drawer Fan
Removal
1. Shut down the operating system and power down the system.
2. Expose the system drawer.
3. Expose the power system, the system card cage, and the PCI card cage sections of the drawer by removing all three covers.
4. Release the power supply tray by removing the two Phillips head screws on the side of the drawer.
5. Lift the power supply tray to release it from the sheet metal and slide it out from the drawer.
Cover Interlock Removal and Replacement
Figure 6-17 Removing Cover Interlocks
[Figure labels: 3 Cover Interlock Switches 70-32016-01, To OCP]
Removal
1. Shut down the operating system and power down the system.
2. Expose the system drawer.
3. Remove all three section covers to expose the interlock switch assembly.
4. Remove the two screws holding the interlock in place.
5....
Operator Control Panel Removal and Replacement (Cabinet)
Figure 6-18 Removing OCP (Cabinet)
Removal
1. Shut down the operating system and power down the system.
2. Expose the system drawer. While you need not remove the tray containing the OCP, you do need to slide it forward to access the OCP retaining screws under the tray. The tray is attached to the power system section cover.
Operator Control Panel Removal and Replacement (Pedestal)
Figure 6-19 Removing OCP (Pedestal)
Removal
1. Shut down the operating system and power down the system.
2. Expose the system drawer.
3. Remove the four Phillips head screws holding the OCP tray to the system drawer.
4. Slide the tray out of the system drawer far enough to disconnect cables attached to the OCP, the floppy, and the CD-ROM drive.
Floppy Removal and Replacement
Figure 6-20 Removing Floppy Drive
Removal
1. Shut down the operating system and power down the system.
2. Expose the system drawer.
3. Remove the four Phillips head screws holding the OCP tray to the system drawer.
4. Slide the tray out of the system drawer and disconnect cables attached to the OCP (unnecessary on a pedestal system), the floppy, and the CD-ROM drive.
CD-ROM Removal and Replacement
Figure 6-21 Removing CD-ROM
Removal
1. Shut down the operating system and power down the system.
2. Expose the system drawer.
3. Remove the four Phillips head screws holding the OCP tray to the system drawer.
4. Slide the tray out of the system drawer and disconnect cables attached to the OCP (unnecessary on a pedestal system), the floppy, and the CD-ROM drive.
Cabinet Fan Tray Removal and Replacement
Figure 6-22 Removing Cabinet Fan Tray
[Figure labels: Fan LED, Power LED, Power, To SCM]
Removal
1. Shut down the operating system and power down the system. Unplug the AC power cable from the cabinet tray power supply.
2. If present, unplug any power cables going to the server control modules at the back of system drawers.
Cabinet Fan Tray Power Supply Removal and Replacement
Figure 6-23 Removing Cabinet Fan Tray Power Supply
[Figure labels: Ground, To fan fail detect board, Offsets, To fans, Power supply, Power supply cover]
Removal
1. Remove the cabinet fan tray.
2. Disconnect the power harness from the fan fail detect module and each fan.
3. Remove the power supply cover. It is held in place by two screws that go through the AC bulkhead spot welded to the tray weldment.
Cabinet Fan Tray Fan Removal and Replacement
Figure 6-24 Removing Cabinet Fan Tray Fan
Removal
1. Remove the cabinet fan tray.
2. Disconnect the power harness from the fan you wish to replace.
3. Remove the fan finger guard.
4. Remove the two remaining screws holding the fan to the tray and remove the fan.
5....
Cabinet Fan Tray Fan Fail Detect Module Removal and Replacement
Figure 6-25 Removing Fan Tray Fan Fail Detect Module
Removal
1. Remove the cabinet fan tray.
2. Disconnect the power harness from the fan fail detect module.
3. Remove the fan fail detect module. In early systems, the module is held in place by three screws that go through the weldment, through three standoffs, through the module to nuts.
StorageWorks Shelf Removal and Replacement
Figure 6-26 Removing StorageWorks Shelf
[Figure labels: Cabinet; StorageWorks Shelf; Mounting Rails; StorageWorks Shelf (H910A-EC); Mounting Rails (H910A-EB); Pedestal]
Removal
1. Shut down the operating system and power down the system.
2. Remove the power cord and signal cord(s) from the StorageWorks shelf.
3. Remove the two retaining brackets holding the shelf in the mounting rail by removing the Phillips head screws holding the brackets in place.
Running Utilities
This chapter provides a brief overview of how to load and run utilities. The following topics are covered:
• Selecting Utilities from the AlphaBIOS Menu
• Running Utilities from a Serial Terminal
• Running the EISA Configuration Utility
•...
Selecting Utilities from the AlphaBIOS Menu
Start AlphaBIOS and select Utilities from the menu. The next selection depends on the utility to be run. For example, to run ECU, select Run ECU from floppy. To run RCU, select Run Maintenance Program.
Figure 7-1 Running a Utility from a Graphics Monitor
AlphaBIOS Setup F1=Help...
Running Utilities from a Serial Terminal
Utilities are run from a serial terminal in the same way as from a graphics monitor. The menus are the same, but some keys are different.
Table 7-1 AlphaBIOS Option Key Mapping
AlphaBIOS Key   VTxxx Key
Ctrl/A...
Running the EISA Configuration Utility
The EISA Configuration Utility (ECU) is used to configure EISA options on DIGITAL Server systems. The ECU is run from a graphics monitor.
1. Start AlphaBIOS Setup. If the system is in the SRM console, issue the command alphabios.
Running RAID Standalone Configuration Utility
The RAID Standalone Configuration Utility is used to set up RAID disk drives and logical units. The Standalone Utility is run from the AlphaBIOS Utility menu. The DIGITAL Server 7300/7300R series system supports the KZPSC-xx PCI RAID controller (SWXCR).
Updating Firmware
Use the Loadable Firmware Update (LFU) utility to update system firmware from an earlier version of AlphaBIOS.
NOTE: If jumper J50 is removed, make sure it is reinserted before you start the upgrade procedure. Otherwise, the firmware will not be upgraded.
4. When the upgrade is complete, issue the LFU exit command. The system is reset and you return to AlphaBIOS. If you press the Reset button instead of issuing the LFU exit command, the system is reset and you are returned to LFU.
The sections that follow show examples of updating firmware from the local CD-ROM, the local floppy, and a network device.
Updating Firmware from the Internal CD-ROM
1. Insert the CD-ROM with the updated firmware and select Upgrade AlphaBIOS from the main AlphaBIOS Setup screen. Use the Loadable Firmware Update (LFU) utility to perform the update.
2. Select the device from which firmware will be loaded. In this case, the choice is the internal CD-ROM.
Updating Firmware from the Internal Floppy Disk
Updating firmware from a floppy disk requires two steps:
• Creating the diskettes
• Performing the update
Creating the Diskettes
To update system firmware from floppy disk, you first must create the firmware update diskettes.
Press F6. A dialog box displays, asking whether to perform a quick or standard format (see Figure 7-3). If you select Quick Format, the formatting is completed immediately, but no bad sectors are mapped. If you select Standard Format, a dialog box similar to that in Figure 7-4 displays while the drive is formatted, showing the progress of the formatting.
3. Select the file that has the firmware update you want, or press Enter to select the default file. When the internal floppy disk is the load device, the file options are:
AS4X00CP (default) - SRM console and AlphaBIOS console firmware only
AS4X00IO - I/O adapter firmware only
AS4X00FW is not available, since the file is too large to fit on a 1.44 MB diskette.
Updating Firmware from a Network Device
The basic process of loading a file from a network device is to:
1. Copy files to the local MOP server’s MOP load area.
2. Start LFU.
3. Select ewa0 as the load device.
Before starting LFU, download the update files from the Internet (see the Preface of this document for the Internet address).
LFU Commands
You can use the commands summarized in Table 7-3 to update system firmware.
Table 7-3 LFU Command Summary
Command   Function
display   Shows the system physical configuration.
exit      Terminates the LFU program.
help      Displays the LFU command list.
          Restarts the LFU program....
display
The display command shows the system physical configuration. Display is equivalent to issuing the SRM console command show configuration. Because it shows the slot for each module, display can help you identify the location of a device.
exit
The exit command terminates the LFU program, causes system initialization and testing, and returns the system to the console from which LFU was called.
list
The list command displays the inventory of update firmware on the CD-ROM, network, or floppy. Only the devices listed at your terminal are supported for firmware updates. The list command shows three pieces of information for each device:
•...
SRM Console Commands and Environment Variables
This chapter provides a summary of the SRM console commands and environment variables. It includes the following topics:
• Summary of SRM Console Commands
• Summary of SRM Environment Variables
• Recording Environment Variables
The test command is described in Chapter 3 of this document.
Summary of SRM Console Commands

The SRM console commands are used to examine or modify the system state.

Table 8-1 Summary of SRM Console Commands

Command      Function
alphabios    Loads and starts the AlphaBIOS console.
boot         Loads and starts firmware upgrades.
Table 8-1 Summary of SRM Console Commands (Continued)

Command      Function
             Displays information about the specified console command.
more         Displays a file one screen at a time.
prcache      Initializes and displays status of the PCI NVRAM.
set envar    Sets or modifies the value of an environment variable.
Summary of SRM Environment Variables

Environment variables pass configuration information between the console and the operating system. Their settings determine how the system powers up, boots the operating system, and operates. Environment variables are set or changed with the set envar command and returned to their default values with the clear envar command.
Table 8-2 Environment Variable Summary (Continued)

Environment Variable    Function
ocp_text                Overrides the default OCP display text with specified text.
os_type                 Specifies the operating system and sets the appropriate console interface. Should always be set to nt.
pci_parity              Disables or enables parity checking on the PCI bus.
Recording Environment Variables

You can make copies of the table below to record environment variable settings for specific systems. Write the system name in the column provided. Enter the show* command to list the system settings.

Table 8-3 Environment Variables Worksheet

Environment System Name...
Table 8-3 Environment Variables Worksheet (Continued)

Environment Variable    System Name    System Name    System Name
pk*0_soft_term
sys_model_num
sys_serial_num
sys_type
tga_sync_green
tt_allow_login
Operating the System Remotely

This chapter describes how to use the remote console monitor (RCM) to monitor and control the system remotely. It includes the following topics:

• RCM Console Overview
• Modem Usage
• Entering and Leaving Command Mode
•...
RCM Console Overview

You use the remote console monitor (RCM) to monitor and control the system remotely. The RCM resides on the server control module and allows the system administrator to connect remotely to a managed system through a modem, using a serial terminal or terminal emulator.
Modem Usage

To use the RCM to monitor a system remotely, first make the connections to the server control module, as shown in Figure 9-1. Then configure the modem port for dial-in.

Figure 9-1 RCM Connections (figure labels: Console Terminal, Phone Jack, External Power Supply...)
Modem Selection

The RCM requires a Hayes-compatible modem. The controls that the RCM sends to the modem were chosen to be acceptable to a wide range of modems. The modems that have been tested and qualified include:

•...
Dialing In to the RCM Modem Port

1. Dial the modem connected to the server control module. The RCM answers the call and, after a few seconds, prompts for a password with a “#” character.
2. Enter the password that was loaded using the setpass command. You have three tries to correctly enter the password.
Entering and Leaving Command Mode

Use the default escape sequence to enter RCM command mode for the first time. You can enter RCM command mode from the SRM console level, the operating-system level, or an application. The RCM quit command reconnects the terminal to the system console port.

Example 9-2 Entering and Leaving RCM Command Mode

^]^]rcm
RCM>...
RCM Commands

The RCM commands summarized below are used to control and monitor a system remotely.

Table 9-1 RCM Command Summary

Command      Function
alert_clr    Clears alert flag, stopping dial-out alert cycle
alert_dis    Disables the dial-out alert function
alert_ena    Enables the dial-out alert function
disable...
Command Conventions

• The commands are not case sensitive.
• A command must be entered in full.
• If a command is entered that is not valid, the command fails with the message:

*** ERROR - unknown command ***

Enter a valid command.
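As a rough illustration of these conventions, the following Python sketch accepts a command in any letter case, rejects abbreviations, and reports the unknown-command message. It is not the RCM firmware: the function name, the return shape, and the handling of an empty line are assumptions, and the command set lists only commands named in this chapter.

```python
# Illustrative model of the RCM command conventions; not the RCM firmware.
# The command set below contains only commands named in this chapter.
RCM_COMMANDS = {
    "alert_clr", "alert_dis", "alert_ena", "disable", "enable",
    "halt", "poweron", "quit", "setesc", "setpass", "status",
}
UNKNOWN_COMMAND = "*** ERROR - unknown command ***"


def parse_rcm_command(line: str):
    """Return (command, arguments), or None for an empty line."""
    words = line.split()
    if not words:
        return None  # assumption: an empty line is simply ignored
    name = words[0].lower()  # commands are not case sensitive
    if name not in RCM_COMMANDS:
        # Exact match only, so an abbreviation such as "stat" is rejected:
        # a command must be entered in full.
        raise ValueError(UNKNOWN_COMMAND)
    return name, words[1:]
```

Matching against the full command name, rather than a prefix, is what enforces the "entered in full" rule in this sketch.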
alert_clr

The alert_clr command clears an alert condition within the RCM. The alert enable condition remains active, and the RCM will again enter the alert condition when it detects a system power failure.

RCM>alert_clr

alert_dis

The alert_dis command disables RCM dial-out capability. It also clears any outstanding alerts.
disable

The disable command disables remote access to the RCM modem port.

RCM>disable

The module’s remote access default state is DISABLED. The modem enable state is nonvolatile. When the modem is disabled, it remains disabled until the enable command is issued.
halt

The halt command attempts to halt the managed system. It is functionally equivalent to pressing the Halt button on the system operator control panel to the “in” position and then releasing it to the “out” position. The RCM console firmware then exits command mode and reconnects the user’s terminal to the server’s COM1 serial port.
“off” state of the DC On/Off button. If the system is already powered on, the poweron command has no effect.

quit

The quit command takes the user out of command mode and reconnects the user’s terminal to the system console port. The following message is displayed:

Focus returned to COM port

The next display depends on what the system was doing when the RCM was invoked.
The following sample escape sequence consists of five iterations of the Ctrl key and the letter “o”.

RCM>setesc
^o^o^o^o^o
RCM>

If the escape sequence entered exceeds 15 characters, the command fails with the message:

*** ERROR ***

When changing the default escape sequence, avoid using special characters that are used by the system’s terminal emulator or applications.
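The 15-character limit can be checked before a new sequence is entered. The Python sketch below converts caret notation (such as ^o or ^]) into control characters and then applies the limit. It assumes that each Ctrl combination counts as one character, which this manual does not state explicitly; the function names are hypothetical.

```python
# Illustrative sketch; function names are hypothetical, and counting each
# Ctrl combination as one character is an assumption.
MAX_ESCAPE_LEN = 15


def decode_caret(text: str) -> str:
    """Convert caret notation ('^o', '^]') into control characters."""
    out = []
    i = 0
    while i < len(text):
        if text[i] == "^" and i + 1 < len(text):
            # Ctrl+key is the key's ASCII code with the upper bits cleared.
            out.append(chr(ord(text[i + 1].upper()) & 0x1F))
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


def check_escape_sequence(text: str) -> str:
    """Return the decoded sequence, or raise if it exceeds the limit."""
    decoded = decode_caret(text)
    if len(decoded) > MAX_ESCAPE_LEN:
        raise ValueError("*** ERROR ***")
    return decoded
```

Under that assumption, the sample sequence of five Ctrl+o keystrokes decodes to five characters and is well within the limit.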
status

The status command displays the current state of the server’s sensors, as well as the current escape sequence and alarm information.

RCM>status
Firmware Rev: V1.0
Escape Sequence: ^]^]RCM
Remote Access: ENABLE/DISABLE
Alerts: ENABLE/DISABLE
Alert Pending: YES/NO (C)
Temp (C): 26.0
RCM Power Control: ON/OFF
External Power: ON...
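When status output is captured by a terminal emulator, it can be turned into a lookup table for logging or comparison. The Python sketch below assumes the RCM prints one "name: value" field per line; the function name and the sample capture (with concrete values chosen for illustration) are hypothetical.

```python
# Illustrative sketch: assumes one "name: value" field per output line.
def parse_status(text: str) -> dict:
    """Split captured status output into a field-name-to-value mapping."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return fields


# Hypothetical capture; field names follow the status display above.
SAMPLE = """RCM>status
Firmware Rev: V1.0
Escape Sequence: ^]^]RCM
Remote Access: ENABLE
Alerts: DISABLE
Temp (C): 26.0
External Power: ON
"""

status = parse_status(SAMPLE)
print(status["Temp (C)"])
```

Lines without a colon, such as the echoed command, are skipped rather than treated as errors.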
Dial-Out Alerts

The RCM can be configured to automatically dial out through the modem (usually to a paging service) when it detects a power failure within the system. When a dial-out alert is triggered, the RCM initializes the modem for dial-out, sends the dial-out string, hangs up the modem, and reconfigures the modem for dial-in.
Enabling the Dial-Out Alert Function:

1. Enter the set rcm_dialout command, followed by a dial-out alert string, from the SRM console.
2. The string is a modem dial-out character string, not to exceed 47 characters, that is used by the RCM when dialing out through the modem.
Composing a Modem Dial-Out String

The modem dial-out string emulates a user dialing an automatic paging service. Typically, the user dials the pager phone number, waits for a tone, and then enters a series of numbers. The RCM dial-out string (Example 9-5) has the following requirements:

•...
Resetting the RCM to Factory Defaults

If the escape sequence has been forgotten, you can reset the controller to factory settings.

Reset Procedure

1. Power down the DIGITAL Server system and access the server control module, as follows:
   Expose the PCI bus card cage.
Troubleshooting Guide

Table 9-3 lists possible causes and suggested solutions for symptoms you might see.

Table 9-3 RCM Troubleshooting

Symptom: The local terminal will not communi-...
Possible Cause: System and terminal baud rate set incorrectly.
Suggested Solution: Set the system and...
Table 9-3 RCM Troubleshooting (Continued)

Symptom: After the system and RCM are powered up, the COM port seems to hang and then starts working after a few seconds.
Possible Cause: This delay is normal behavior.
Suggested Solution: Wait a few seconds for the COM port to start working.
Table 9-3 RCM Troubleshooting (Continued)

Symptom: Cannot enable modem or modem will not answer.
Possible Cause: The modem is not configured correctly to work with the RCM.
Suggested Solution: Modify the modem initialization and/or answer string.
Modem Dialog Details

This section provides further details on the dialog between the RCM and the modem and is intended to help you reprogram your modem if necessary.

Phases of Modem Operation

The RCM is programmed to expect specific responses from the modem during four phases of operation:

•...
This default initialization string works on a wide variety of modems. If your modem does not configure itself to these parameters, the initialization string will need to be modified. See the topic in this section entitled Modifying Initialization and Answer Strings.

Ring Detection

The RCM expects to be informed of an in-bound call by the modem signaling the RCM with the string, “2<cr>”...
RCM/Modem Interchange Overview

Table 9-4 summarizes the actions between the RCM and the modem from initialization to hangup.

Table 9-4 RCM/Modem Interchange Summary

Action                       Data to Modem            Data from Modem
Initialization command       AT&F0EVS0=0S12=50<cr>
Initialization successful                             0<cr>
Phone line ringing                                    2<cr>...
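The single-digit replies in Table 9-4 are numeric result codes. The Python sketch below decodes such a reply into the conventional Hayes result name. Only the codes 0 and 2 appear in this section; the other entries follow the general Hayes numeric convention and are included as an assumption, as is the function name.

```python
# Illustrative sketch. Codes 0 and 2 are taken from Table 9-4; codes 1, 3,
# and 4 follow the general Hayes numeric convention and are not confirmed
# by this manual.
HAYES_NUMERIC_RESULTS = {
    "0": "OK",          # initialization successful
    "1": "CONNECT",
    "2": "RING",        # phone line ringing
    "3": "NO CARRIER",
    "4": "ERROR",
}


def decode_modem_reply(raw: bytes) -> str:
    """Map a numeric reply such as b'2\\r' to its result name."""
    code = raw.decode("ascii", errors="replace").strip()
    return HAYES_NUMERIC_RESULTS.get(code, "UNKNOWN")


for reply in (b"0\r", b"2\r"):
    print(reply, decode_modem_reply(reply))
```

A modem that answers with words instead of digits would decode as UNKNOWN here, which is one sign that its initialization string needs to be modified.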
To display all the RCM user-settable strings:

P00>>> show rcm*
rcm_answer     ATXA
rcm_dialout
rcm_init       AT&F0EVS0=0S12=50
P00>>>

Initialization and Answer String Substitutions

The RCM default initialization and answer strings are as follows:

Initialization String: “AT&F0EVS0=0S12=50”
Answer String: “ATXA”...
Index fan tray fan fail detect module Halt button, 1–11 removal and replacement, Cover interlocks, 4–7 6–56 overriding, 4–7 power supply removal and replacement, 6–52 removal and replacement, 6–40 Cabinet system, 1–6 Cover interlocks, 1–4 power and fan LEDs, 3-4 CPU and bridge module LEDs, 3-2 power supply for remote access, 3-4 CPU LEDs, 3-3...
Index Error registers, 5–5 7300/7300R power system, 6–9 exit command (LFU), 7–8, FRU part numbers, 6–4 7–12 to 7–14, 7–16 External Interface Address Register, 5–10 H7600-AA power controller, 1–7 External Interface Registers H7600-DB power controller, 1–7 loading and locking rules, 5–11 halt command, RCM, 9–11 Halts caused by power problem, 3-5...
Index MC_ERR1 Register, 5–14 exit command, 7–16 Memory addressing, 1–24 rules, 1–25 update command, 7–17 Memory errors updating firmware from CD-ROM, 7–8 corrected read data error, 5–26 updating firmware from floppy read data substitute error, 5–26 disk, 7–9, 7–11 Memory module updating firmware from network variants, 1–22 device, 7–13...
Index os_type environment variable, SRM, Power control module LEDs, 3-8 2–7 Power cords, internal, 6–5 Power faults, 4–9 Power harness PALcode, 2–24 removal and replacement, 6–36 PALcode, described, 5–31 Power problems PCI Error Status Register 1, 5–19 at power-up, 3-6 PCI I/O subsystem, 1–30 Power supply, 1–36 PCI master abort, 5–25...
Index dial-out alerts, 9–15 entering and leaving command Safety guidelines, 6–2 mode, 9–6 Serial number, system, 6–25 modem usage, 9–3 restoring with set sys_serial_num, resetting to factory defaults, 9–18 6–26 troubleshooting, 9–19 Serial ports, 1–31 typical dialout command, 9–15 Server control module, 1–32 RCM commands removal and replacment, 6–30 ?, 9–11...
Index System bus to PCI bus bridge module, Test mem command, 3-17 1–15, 1–28 Test pci command, 3-20 System bus to PCI/EISA bus bridge Troubleshooting module, 1–15 failures at power-up, 3-6 System consoles, 1–12 power problems, 3-5 System drawer using error logs, 5–2 7300, 1–3 System drawer components of, 1–3...