Digital Equipment AlphaServer 4000 Service Manual


AlphaServer 4000/4100
Service Manual
Order Number:
EK–4100A–SV. B01
This manual is for anyone who services an AlphaServer 4000/4100
pedestal or cabinet system. It includes troubleshooting information,
configuration rules, and instructions for removal and replacement of
field-replaceable units (FRUs).
Digital Equipment Corporation
Maynard, Massachusetts


Summary of Contents for Digital Equipment AlphaServer 4000

  • Page 1 AlphaServer 4000/4100 Service Manual Order Number: EK–4100A–SV. B01 This manual is for anyone who services an AlphaServer 4000/4100 pedestal or cabinet system. It includes troubleshooting information, configuration rules, and instructions for removal and replacement of field-replaceable units (FRUs). Digital Equipment Corporation...
  • Page 2 The software, if any, described in this document is furnished under a license and may be used or copied only in accordance with the terms of such license. No responsibility is assumed for the use or reliability of software or equipment that is not supplied by Digital Equipment Corporation or its affiliated companies.
  • Page 3: Table Of Contents

    Contents Preface ....................... xi Chapter 1 System Overview AlphaServer 4100 System Drawer (BA30A)..........1-2 AlphaServer 4000 System Drawer (BA30C)..........1-4 AlphaServer 4100 System Drawer (BA30B)..........1-6 Cabinet System ..................1-8 Pedestal System..................1-10 Control Panel and Drives ...............1-12 System Consoles..................1-14 System Architecture ................1-16 System Motherboard ................1-18 1.10...
  • Page 4 Console Device Determination ............... 2-18 Console Power-Up Display..............2-20 2.10 Fail-Safe Loader..................2-24 Chapter 3 Troubleshooting Troubleshooting with LEDs..............3-2 3.1.1 Cabinet Power and Fan LEDs............3-4 Troubleshooting Power Problems .............3-6 3.2.1 Power Control Module LEDs.............3-8 Maintenance Bus (I2C Bus) ..............3-10 Running Diagnostics —Test Command ..........
  • Page 5 5.4.1 System Bus ECC Error ..............5-39 5.4.2 System Bus Nonexistent Address Error..........5-40 5.4.3 System Bus Address Parity Error............. 5-41 5.4.4 PIO Buffer Overflow Error (PIO_OVFL)......... 5-42 5.4.5 Page Table Entry Invalid Error ............5-43 5.4.6 PCI Master Abort ................5-43 5.4.7 PCI System Error ................
  • Page 6 7.16 PCI Motherboard (B3051) Removal and Replacement......7-36 7.17 Server Control Module Removal and Replacement......... 7-38 7.18 PCI/EISA Option Removal and Replacement ......... 7-40 7.19 Power Supply Removal and Replacement..........7-42 7.20 Power Harness (4100 & early 4000) Removal and Replacement ..... 7-44 7.21 Power Harness (Later 4000) Removal and Replacement ......
  • Page 7 Appendix C Operating the System Remotely RCM Console Overview................C-1 C.1.1 Modem Usage .................. C-2 C.1.2 Entering and Leaving Command Mode ..........C-5 C.1.3 RCM Commands................C-6 C.1.4 Dial-Out Alerts................C-15 C.1.5 Resetting the RCM to Factory Defaults........... C-18 C.1.6 Troubleshooting Guide ..............
  • Page 8 Figures Components of the BA30A System Drawer..........1-2 Cover Interlock Circuit (BA30A) .............1-3 Components of the BA30C System Drawer ..........1-4 Cover Interlock Circuit (BA30C)..............1-5 Components of the BA30B System Drawer ..........1-6 Cover Interlock Circuit (BA30B)..............1-7 AlphaServer 4100 Cabinet System ............1-8 Cabinet Fan Tray ..................1-9 Pedestal System Front ................1-10 1-10 Pedestal System Rear................1-11...
  • Page 9 Pedestal Power Distribution (Europe and AP)......... 4-15 Error Detector Placement .................5-2 System Drawer FRU Locations..............7-2 Location of 4100 Power System FRUs............7-8 Location of 4000 Power System FRUs............ 7-10 Exposing System Drawer (H9A10-EB & -EC Cabinet)......7-12 Exposing System Drawer (H9A10-EL & -EM Cabinet) ......7-14 Exposing System Drawer (Pedestal) ............
  • Page 10 SROM Tests...................2-10 XSROM Tests..................2-13 Memory Tests ..................2-14 IOD Tests....................2-16 PCI Motherboard Tests................2-17 Power Control Module LED States............3-9 Types of Error Log Events................5-5 DECevent Report Formats..............5-10 CAP Error Register Data Pattern ............5-38 System Bus ECC Error Data Pattern............5-39 System Bus Nonexistent Address Error Troubleshooting ......
  • Page 11: Preface

    Chapter 3, Troubleshooting, describes troubleshooting during power-up and booting, as well as the test command. • Chapter 4, Power System, describes the AlphaServer 4000/4100 power system. • Chapter 5, Error Logs, explains how to interpret error logs and how to use DECevent.
  • Page 12 Appendix C, Operating the System Remotely, describes how to use the remote console monitor (RCM) to monitor and control the system remotely. Documentation Titles Table 1 lists titles related to AlphaServer 4000/4100 systems. Table 1 AlphaServer 4000/4100 Documentation Title Order Number QZ–00VAA–GZ...
  • Page 13 Information on the Internet Using a Web browser you can access the AlphaServer InfoCenter at: http://www.digital.com/info/alphaserver/products.html Access the latest system firmware either with a Web browser or via FTP as follows: ftp://ftp.digital.com/pub/Digital/Alpha/firmware/ Interim firmware released since the last firmware CD is located at: ftp://ftp.digital.com/pub/Digital/Alpha/firmware/interim/ xiii...
  • Page 15: Chapter 1 System Overview

    There are three system drawers; two, the BA30B and the BA30C, are used in the AlphaServer 4000, and the third, the BA30A, is used in the AlphaServer 4100. The pedestal system has one system drawer and up to three StorageWorks shelves.
  • Page 16: Alphaserver 4100 System Drawer (Ba30A)

    When the system drawer is in a pedestal, the control panel assembly is mounted in a tray at the top of the drawer. The numbered callouts in Figure 1-1 refer to components of the system drawer.
  • Page 17 System card cage, which holds the system motherboard and the CPU, memory, bridge, and power control modules. (The difference between the BA30A and the BA30C is the system motherboard.) PCI/EISA card cage, which holds the PCI motherboard, option cards, and server control module.
  • Page 18: Alphaserver 4000 System Drawer (Ba30C)

    1.2 AlphaServer 4000 System Drawer (BA30C) Components in the BA30C system drawer are located in the system bus card cage, PCI card cage, control panel assembly, and power and cooling section. The drawer measures 30 cm x 45 cm (11.8 in. x 17.7 in.) and fully configured weighs approximately 45.5 kg (~100 lbs).
  • Page 19 System card cage, which holds the system motherboard and the CPU, memory, bridge, and power control modules. (The difference between the BA30A and the BA30C is the system motherboard.) PCI/EISA card cage, which holds the PCI motherboard, option cards, and server control module.
  • Page 20: Components Of The Ba30B System Drawer

    1.3 AlphaServer 4000 System Drawer (BA30B) Components in the BA30B system drawer are located in the system bus card cage, two PCI card cages, the control panel assembly, and the power and cooling section. The drawer measures 30 cm x 45 cm (11.8 in. x 17.7 in.) and fully configured weighs approximately 45.5 kg (~100 lbs).
  • Page 21: Power Control Module

    System card cage holds the system motherboard, the CPU, memory, bridge, and power control modules. PCI/EISA card cage holds the PCI/EISA motherboard for PCI/EISA 0 and PCI 1, option cards, and server control module. Server control module holds the I/O connectors and remote console. Control panel assembly holds the control panel, a floppy, and a CD-ROM.
  • Page 22: Cabinet System

    1.4 Cabinet System The AlphaServer 4000/4100 cabinet system can accommodate multiple systems in a single cabinet. There are four cabinet variations that can hold different system configurations. Differences are in power distribution and drawer mounting; from the outside the cabinets look almost identical.
  • Page 23 Cabinet Differences
    Cabinet    Power                         Mounting                         Destination
    H9A10-EB   AC input box, power strips    C channel (max drawers: 4)       North America, Asia Pacific
    H9A10-EC   AC input box, power strips    C channel (max drawers: 4)       Europe
    H9A10-EL   Two 120 volt H7600-AA power   Pull-out tray (max drawers: 3)   North America, Asia Pacific
    ...
  • Page 24: Pedestal System

    Gbytes of in-cabinet storage. Figure 1-9 Pedestal System Front. In the pedestal system, the control panel is located at the top left in a tray. See Figure 1-11. There is space for an optional device beside it.
  • Page 25: Pedestal System Rear

    Figure 1-10 Pedestal System Rear
  • Page 26: Control Panel And Drives

    On/Off button. Powers the system drawer on or off. When the LED at the top of the button is lit, the power is on. The On/Off button is connected to the power supplies and the system interlocks.
  • Page 27 NOTE: The LEDs on some modules are on when the line cord is plugged in, regardless of the position of the On/Off button. Halt button. Pressing this button in (so the LED at the top of the button is on) does the following: If DIGITAL UNIX or OpenVMS is running, halts the operating system and returns to the SRM console.
  • Page 28: System Consoles

    AlphaBIOS console is invoked: AlphaBIOS Version 5.12 Please select the operating system to start: Windows NT Server 3.51 to move the highlight to your choice. Press Enter to choose. Alpha Press <F2> to enter SETUP
  • Page 29 NT and the Halt button is out (not lit). Refer to Appendix B of this guide for a list of the environment variables used to configure AlphaServer 4000 and 4100 systems. Refer to the AlphaServer 4x00 System Drawer User’s Guide for information on setting environment variables.
  • Page 30: System Architecture

    PCI Slot PCI Slot PCI Slot PCI/EISA PCI Slot PCI Slot PCI Slot Slot PCI/EISA PCI Slot Slot PCI Slot PCI Slot PCI/EISA Slot 4000 PCI Motherboard only PCI Motherboard (right hand) (left hand) PKW0421E-96 1-16 AlphaServer 4000/4100 Service Manual...
  • Page 31 AlphaServer 4000/4100 systems use the Alpha chip for the CPU. The CPU, memory, and I/O bridge modules are connected to the system bus motherboard. One bridge module connects to the PCI/EISA I/O buses; a second (4000 only) connects to another pair of PCI buses. A fourth type of module, the power control module, also plugs into the system motherboard.
  • Page 32: System Motherboard

    The system motherboard is on the floor of the system card cage. It has slots for the CPU, memory, power control, and bridge modules. Figure 1-13 System Motherboard Module Locations 4100 Motherboard (54-23803-01) 4000 Motherboard (54-23803-02) 4000 Motherboard (54-23805-01)
  • Page 33 The system motherboard has the logic for the system bus. It is the backplane that holds the CPU, memory, bridge, and power control modules. Figure 1-13 shows diagrams of the three motherboards used in AlphaServer 4000/4100 systems. The module locations are designated by the callouts.
  • Page 34: Cpu Types

    1.10 CPU Types AlphaServer 4000 and 4100 systems can be configured with one of several CPU variants. Variants are differentiated by CPU speeds and the presence or absence of a backup data cache external to the Alpha microprocessor chip. Figure 1-14 CPU Module Layout...
  • Page 35 Chip Description
    Unit          Description
    Instruction   8-Kbyte cache, 4-way issue
    Execution     4-way execution; 2 integer units, 1 floating-point adder, 1 floating-point multiplier
    Memory        Merge logic, 8-Kbyte write-through first-level data cache, 96-Kbyte write-back second-level data cache, bus interface unit
    CPU Variants
    Module Variant   Clock Frequency   Onboard Cache
    B3001-CA...
  • Page 36: Memory Modules

    The 4100 system drawer can hold up to four memory module pairs. The 4000 system drawer can hold up to two memory module pairs. Figure 1-15 Memory Module Layout Typical Synchronous Memory Typical EDO Memory
  • Page 37 Memory Variants Each memory option consists of two identical modules. Each 4100 drawer supports up to four memory options, for a total of 4 Gbytes of memory; 4000 drawers support half that. Memory modules are used only in pairs and are available in 128 Mbyte, 512 Mbyte, and 1 Gbyte sizes.
  • Page 38: Memory Addressing

    (2 B3020-DA - 128 Mbyte/mod) 1024 Mbyte Address hole Second pair address space 512 Mbyte 1/2 occupied (2 B3020-DA - 128 Mbyte/mod) 512 Mbyte First pair defines total address space always fully occupied (2 B3020-EA 256 Mbyte/mod) PKW0424-96 1-24 AlphaServer 4000/4100 Service Manual...
  • Page 39 The rules for addressing memory are as follows: 1. Address space is determined by the memory pair in slot MEM0. 2. Memory pairs need not be the same size. 3. The memory pair in slot MEM0 must be the largest of all memory pairs. Other memory pairs may be as large but none may be larger.
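The three rules above can be sketched as a small check. This is only an illustration: the function name is hypothetical, and the layout rule (each pair occupies a window as large as the MEM0 pair, so smaller pairs leave an address hole) is an assumption read from the addressing figure, not a description of the firmware.

```python
def memory_address_map(pair_sizes_mb):
    """Return (slot, base_mb, size_mb, hole_mb) for each memory pair.

    pair_sizes_mb[0] is the pair in slot MEM0. The window-per-pair layout
    is an assumption taken from the addressing figure, not firmware code.
    """
    if not pair_sizes_mb:
        raise ValueError("at least one memory pair (MEM0) is required")
    window_mb = pair_sizes_mb[0]  # rule 1: MEM0 determines the address space
    layout = []
    for slot, size_mb in enumerate(pair_sizes_mb):
        if size_mb > window_mb:  # rule 3: no pair may be larger than MEM0
            raise ValueError(
                f"MEM{slot} ({size_mb} MB) is larger than MEM0 ({window_mb} MB)"
            )
        # rule 2: pairs need not match; a smaller pair leaves a hole
        layout.append((slot, slot * window_mb, size_mb, window_mb - size_mb))
    return layout
```

For example, a 512-Mbyte pair in MEM0 followed by a 256-Mbyte pair would give the second pair a base of 512 Mbytes and a 256-Mbyte hole, while swapping the two pairs would be rejected.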
  • Page 40: System Bus

    CPU3 (4100) or IOD3 (4000) System Bus Control CPU2 (4100) or IOD2 (4000) CPU1 MC ADR <39:4> CPU0 CTRL MC DATA EV_ADR <127:0> EV_DATA System to PCI Bus PCI/EISA Bridge PCI/EISA0 IOD0 IOD1 PCI1 PKW0425-96 1-26 AlphaServer 4000/4100 Service Manual...
  • Page 41 IODn where n is the number of the PCI bus. The first bridge is designated IOD0 and IOD1. The AlphaServer 4000 system bus connects up to two CPUs, two pairs of memory modules, and two I/O bus bridge modules. The second bridge on the 4000 system bus is designated IOD2 and IOD3.
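The bridge-to-IOD naming described here can be written out as a one-line mapping. The function name is hypothetical; the sketch assumes bridges are counted from 0 and that each bridge module carries exactly two PCI buses, as the text implies.

```python
def iod_names(bridge_index):
    """Return the two IOD designations for a bridge module (0 = first, 1 = second)."""
    if bridge_index not in (0, 1):
        raise ValueError("the text describes at most two bridge modules")
    first_bus = 2 * bridge_index  # each bridge serves two PCI buses
    return (f"IOD{first_bus}", f"IOD{first_bus + 1}")
```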
  • Page 42: System Bus To Pci Bus Bridge Module

    Figure 1-18 Bridge Module PCI Bus Control AD<31:0> Address Control Data A to B bus ECC & Data <63:0> MDP A Data A to B & B to A bus ECC & Data AD<63:32> <127:64> MDPB PKW0426r-96 1-28 AlphaServer 4000/4100 Service Manual...
  • Page 43 It monitors the data lines for ECC errors and the command/address lines for parity errors. NOTE: When errors are logged, the two bridge modules on the AlphaServer 4000 are differentiated in the error log by their engineering code names, the left hand horse and the right hand horse.
  • Page 44: Pci I/O Subsystem

    4 64-bit slots Interrupt EISA Logic Data XBUS BDATA XBUS Xceivers Xceivers Combo I/O: Flash Realtime Mouse/ I2C Bus NVRAM EISA: serial ports 3 32- parallel port Clock Keyboard Interface 8Kx8 floppy cntrl bit slots PKW0431R-96 1-30 AlphaServer 4000/4100 Service Manual...
  • Page 45: Pci Motherboard Slot Numbering

    1-Mbyte flash ROMs containing system firmware, and an 8-Kbyte NVRAM. The B3050-AB PCI motherboard, used only in the AlphaServer 4000, contains two four-slot 64-bit PCI buses.
  • Page 46: Server Control Module

    The server control module enables remote console connections to the system drawer. The module passes signals for COM ports 1 and 2, the keyboard, and the mouse to the standard I/O connectors. Figure 1-20 Server Control Module Standard I/O Remote Console Monitor
  • Page 47 The server control module has two sections: the remote console monitor (RCM) and the standard I/O. See Appendix C for information on controlling the system remotely. The remote console monitor connects to a modem through the modem port on the bulkhead.
  • Page 48: Power Control Module

    1.17 Power Control Module The power control module controls power sequencing and monitors power supply voltage, temperature, and fans. Figure 1-21 Power Control Module System Motherboard Power Control Module Slot
  • Page 49 The power control module performs these functions: • Controls power sequencing. • Monitors the combined output of power supplies and shuts down power if it is not in range. • Monitors system temperature and shuts off power if it is out of range. •...
  • Page 50: Power Supply

    CPU modules and PCI card cages; a second or third can be added for redundancy. The power system is described in detail in Chapter 4. Figure 1-22 Location of Power Supply Power Supply 2 Power Supply 1 Power Supply 0
  • Page 51 An AlphaServer 4100 system with three or four CPUs requires two power supplies (three for redundancy). • An AlphaServer 4000 system with one or two CPUs and one PCI card cage requires one power supply (two for redundancy). • An AlphaServer 4000 system with one or two CPUs and two PCI card cages requires two power supplies (three for redundancy).
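The power supply rules quoted above can be expressed as a lookup. This is a sketch with a hypothetical function name; it covers only the three configurations stated in this excerpt and deliberately refuses anything else rather than guessing.

```python
def power_supplies_required(model, cpus, pci_card_cages=1):
    """Return (minimum, with_redundancy) power supply counts.

    Only the configurations quoted in the text above are covered.
    """
    if model == "4100" and cpus in (3, 4):
        return (2, 3)
    if model == "4000" and cpus in (1, 2):
        if pci_card_cages == 1:
            return (1, 2)
        if pci_card_cages == 2:
            return (2, 3)
    raise ValueError("configuration not covered by the rules quoted above")
```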
  • Page 53: Chapter 2 Power-Up

    Chapter 2 Power-Up This chapter describes system power-up testing and explains the power-up displays. The following topics are covered: • Control Panel • Power-Up Sequence • SROM Power-Up Test Flow • SROM Errors Reported • XSROM Power-Up Test Flow • XSROM Errors Reported •...
  • Page 54: Control Panel

    When the Halt button LED is lit and the On/Off button is on, the system should be running either the SRM console or Windows NT. If the Halt button is in, but the LED is off, the OCP, its cables, or the PCM is likely to be broken.
  • Page 55: Control Panel Display

    Table 2–1 Control Panel Display Field Content Display Meaning CPU number P0–P3 CPU reporting status Status TEST Tests are executing FAIL Failure has been detected MCHK Machine check has occurred INTR Error interrupt has occurred Test number Suspected device CPU0–3 CPU module number MEM0–3 and Memory pair number and low...
  • Page 56: Power-Up Sequence

    XSROM code into the Alpha chip and jumps to it. XSROM. The XSROM, or extended SROM, contains back-up cache and memory tests, and a fail-safe loader. The XSROM code resides in sector 0 of FEPROM 0 on
  • Page 57: Contents Of Feproms

    the XBUS. Sector 2 of FEPROM 0 contains a duplicate copy of the code and is used if sector 0 is bad. FEPROM. Two 1-Mbyte programmable ROMs are on the XBUS on PCI0. FEPROM 0 contains two copies of the XSROM, the OpenVMS and DIGITAL UNIX PALcode, and the SRM console and decompression code.
  • Page 58: Console Code Critical Path

    Slot EISA PCI/EISA PCI Slot PCI Motherboard Slot PCI/EISA PCI Slot Slot XBUS XBUS BDATA Xceivers Xceivers Combo I/O: Flash Real-Time Mouse/ I2C Bus NVRAM serial ports parallel port Clock Keyboard Interface 8Kx8 floppy cntrl PKW0431E-96 AlphaServer 4000/4100 Service Manual...
  • Page 59 There are two console programs: the SRM console and the AlphaBIOS console, as detailed in the AlphaServer 4100 System Drawer User’s Guide (EK–4100A–UG) and the AlphaServer 4000 System Drawer User’s Guide (EK–4000A–UG). By default, the SRM console is always loaded and I/O system tests are run under it before the system loads AlphaBIOS.
  • Page 60: Srom Power-Up Test Flow

    Determine Primary banks Fail Size IOD twice Check integrity of HANG XSROM Pass Fail Loopback on Load first 8K of each IOD XSROM into S-cache Pass Light IOD LEDs Jump to XSROM overlay in S-cache PKW0432-96 AlphaServer 4000/4100 Service Manual...
  • Page 61 The Alpha chip built-in self-test tests the I-cache at power-up and upon reset. Each CPU chip loads its SROM code into its I-cache and starts executing it. If the chip is partially functional, the SROM code continues to execute. However, if the chip cannot perform most of its functions, that CPU hangs and its pass/fail LED remains off.
  • Page 62: Srom Tests

    S-cache parity error detection, AC_CTL register and test parity error forcing logic, SC_STAT register and reporting logic IOD Access test Access to IOD CSRs, data path through CAP chip and MDP0 on each IOD, PCI0 A/D lines <31:0> 2-10 AlphaServer 4000/4100 Service Manual...
  • Page 63: Srom Errors Reported

    2.4 SROM Errors Reported The SROM reports machine checks, pending interrupt/exception errors, and errors related to corruption of FEPROM 0. If SROM errors are fatal, the particular CPU will hang and only the CPU self-test pass LEDs and/or the LEDs on the system bus to PCI bus bridge module will indicate the failure.
  • Page 64: Xsrom Power-Up Test Flow

    Note: The XSROM can only print to the console device if the environment variable console = serial. It always sends output to the OCP. XSROM tests are described in Table 2-3. Failure indicates a CPU failure.
  • Page 65: Xsrom Tests

    After jumping to the primary CPU’s S-cache, the code then intentionally I-caches itself and is completely register based (no D-stream for stack or data storage is used). The only D-stream accesses are writes/reads during testing. Each FEPROM has sixteen 64-Kbyte sectors. The first sector contains B-cache tests, memory tests, and a fail-safe loader.
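The sector geometry can be checked with a little arithmetic: sixteen 64-Kbyte sectors add up to the 1-Mbyte FEPROM size given earlier. The sketch below assumes sectors are contiguous and numbered from 0; the constant and function names are illustrative only.

```python
SECTOR_SIZE = 64 * 1024          # 64 Kbytes per sector
SECTORS_PER_FEPROM = 16          # sixteen sectors per FEPROM
FEPROM_SIZE = SECTOR_SIZE * SECTORS_PER_FEPROM


def sector_offset(sector):
    """Byte offset of a sector from the start of its FEPROM (assumes contiguous sectors)."""
    if not 0 <= sector < SECTORS_PER_FEPROM:
        raise ValueError("sector must be in the range 0-15")
    return sector * SECTOR_SIZE
```

Under these assumptions the primary XSROM copy in sector 0 would start at offset 0x00000 and the duplicate in sector 2 at offset 0x20000.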
  • Page 66 No new logic Maps out bad memory by Bitmap way of the bitmap. It does not Building completely fail memory. Memory No new logic Maps out bad memory. March test * There is no test 22. 2-14 AlphaServer 4000/4100 Service Manual...
  • Page 67: Xsrom Errors Reported

    2.6 XSROM Errors Reported The XSROM reports B-cache test errors and memory test errors. It also reports a warning if memory is illegally configured. Example 2-2 XSROM Errors Reported at Power-Up B-cache Error (CPU Error) TEST ERR on cpu0 #CPU running the test cpu0 err# tst#...
  • Page 68: Console Power-Up Tests

    PCI Loopback test Loops data through each PCI on each IOD, testing the mask field of the system bus. PCI Peer-to-Peer Tests that devices on the same PCI and on Byte Mask test different PCIs can communicate. 2-16 AlphaServer 4000/4100 Service Manual...
  • Page 69: Pci Motherboard Tests

    Table 2-6 PCI Motherboard Tests (B3050 only) Test Diagnostic Number Test Name Name Description PCEB pceb_diag Tests the PCI to EISA bridge chip esc_diag Tests the EISA system controller 8K NVRAM nvram_diag Tests the NVRAM Real-Time Clock ds1287_diag Tests the real-time clock chip Keyboard and i8242_diag Tests the keyboard/mouse chip...
  • Page 70: Console Device Determination

    VGA adapter as system is powering up console device. PCI0 Enable COM port 1 and send messages as system is powering up. Warning message sent if a VGA adapter is seen on PCI 1 PKW0434-96 2-18 AlphaServer 4000/4100 Service Manual...
  • Page 71 Console Device Options The console device can be either a serial terminal or a graphics monitor. Specifically: • A serial terminal connected to COM1 off the server control module. The terminal connected to COM1 must be set to 9600 baud. This baud rate cannot be changed.
  • Page 72: Console Power-Up Display

    BCache testing complete on cpu3 BCache testing complete on cpu1 mem_pair0 - 128 MB mem_pair1 - 128 MB 20..20..21..20..21..20..21..21..23..24..24..24..24.. Memory testing complete on cpu0 Memory testing complete on cpu1 Memory testing complete on cpu3 Memory testing complete on cpu2 2-20 AlphaServer 4000/4100 Service Manual...
  • Page 73 At power-up or reset, the SROM code on each CPU module is loaded into that module’s I-cache and tests the module. If all tests pass, the processor’s LED lights. If any test fails, the LED remains off and power-up testing terminates on that CPU.
  • Page 74 0 slot 2 - DECchip 21041-AA bus 0 slot 3 - NCR 53C810 bus 0 slot 4 - DECchip 21040-AA probing IOD0 hose 0 bus 0 slot 1 - PCEB Configuring I/O adapters... AlphaServer 4100 Console V1.0, 13-MAR-1996 18:18:26 P00>>> 2-22 AlphaServer 4000/4100 Service Manual...
  • Page 75 The final primary CPU determination is made. The primary CPU unloads PALcode and decompression code from the FEPROM on the PCI 0 to its B-cache. The primary CPU then jumps to the PALcode to start the SRM console. The primary CPU prints a message indicating that it is running the console. Starting with this message, the power-up display is printed to the default console terminal, regardless of the state of the console environment variable.
  • Page 76: Fail-Safe Loader

    The XSROM has completed its B-cache and memory tests but has failed to unload the PALcode in FEPROM 0 sector 1 or the SRM console code. • The XSROM reports the errors encountered and loads the fail-safe loader.
  • Page 77: Chapter 3 Troubleshooting

    Chapter 3 Troubleshooting This chapter describes troubleshooting during power-up and booting, as well as diagnostics for AlphaServer 4000/4100 systems. The following topics are covered: • Troubleshooting with LEDs • Troubleshooting Power Problems • Running Diagnostics—Test Command Troubleshooting...
  • Page 78: Troubleshooting With Leds

    CPU LEDs IOD0 Self-Test Pass DC_OK IOD1 Self-Test Pass SROM Oscillator CPU Self-Test Pass POWER_FAN_OK Regulator OK (EV56) TEMP_OK Bridge Module LEDs (IOD 2 & 3) Normally On Normally Off IOD2 Self-Test Pass IOD3 Self-Test Pass PKW0400C-97 AlphaServer 4000/4100 Service Manual...
  • Page 79 CPU LEDs • If the CPU STP LED on any CPU module is lit, that CPU chip is functioning properly. If the operating system is NT and the CPU STP LED is off, that CPU may or may not be functioning. You can use the Halt button on the OCP to prevent the AlphaBIOS console (which turns off the CPU STP LED) from booting, thus assuring the validity of the CPU STP LED.
  • Page 80: Cabinet Power And Fan Leds

    3.1.1 Cabinet Power and Fan LEDs Figure 3-2 Cabinet Power and Fan LEDs Fan LED Power LED PK-0664-96 AlphaServer 4000/4100 Service Manual...
  • Page 81 A cabinet system has three exhaust fans at the top of the cabinet. They are powered from a small power supply in the fan tray. This power supply also powers the server control module at the bottom of the PCI card cage to allow remote access to the system.
  • Page 82: Troubleshooting Power Problems

    If Halt Is Caused by Power, Fan, or Overtemperature If a system is stopped because of a power, fan, or overtemperature problem, use the PCM LEDs to diagnose the problem. See Section 3.2.1.
  • Page 83 If Power Problem Occurs at Power-Up If the system has a power problem on a cold start, the PCM LEDs are not valid until after DCOK_SENSE has been asserted. The cause is one of the following: • Broken system fan •...
  • Page 84: Power Control Module Leds

    LEDs. Figure 3-3 PCM LEDs DCOK_SENSE PS0_OK PS1_OK PS2_OK TEMP_OK CPUFAN_OK SYSFAN_OK CS_FAN0 CS_FAN1 CS_FAN2 C_FAN3 Normally On Tested at one-second intervals Off if power supply not present or broken PK-0714-96 AlphaServer 4000/4100 Service Manual...
  • Page 85: Power Control Module Led States

    Table 3-1 Power Control Module LED States State Description Both +5.0V and +3.43V are present and within limits. DCOK_SENSE Power supply 0 is present and has asserted POK_H. PS0_OK Power supply 1 is present and has asserted POK_H. PS1_OK Power supply 1 not present. Power supply 2 is present and has asserted POK_H.
  • Page 86: Maintenance Bus

    I2C controller accesses it. Everything written or read on the I2C bus is done by the controller. The block diagram below notes differences between the AlphaServer 4000 and 4100 with respect to the I2C bus. Figure 3-4 I...
  • Page 87 Monitor The I2C bus monitors the state of system conditions scanned by the PCM. There are two registers on the PCM: • One records the state of the fans and power supplies and is latched when there is a fault. •...
  • Page 88: Running Diagnostics -Test Command

    Disables the display of status messages as exerciser processes are started and stopped during testing. option Either cpun, memn, or pcin, where n is 0, 1, 2, 3, or *. If nothing is specified, the entire system is tested. 3-12 AlphaServer 4000/4100 Service Manual...
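The option syntax described above (cpun, memn, or pcin, with n being 0 to 3 or *) can be captured in a short validator. This is a sketch for illustration, not console code: the function name is hypothetical, and it checks only the form stated here, so other spellings the console may accept are not covered.

```python
import re

# cpu, mem, or pci followed by a single digit 0-3 or an asterisk
_OPTION_RE = re.compile(r"(cpu|mem|pci)([0-3]|\*)")


def is_valid_test_option(option):
    """Return True if option matches the cpun/memn/pcin form described in the text."""
    return _OPTION_RE.fullmatch(option) is not None
```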
  • Page 89: Testing An Entire System

    3.5 Testing an Entire System A test command with no modifiers runs all exercisers for subsystems and devices on the system. I/O devices tested are supported boot devices. The test runs for 10 minutes. Example 3-2 Sample Test Command P00>>> test Console is in diagnostic mode System test, runtime 600 seconds Type ^C to stop testing...
  • Page 90 1088289024 00003062 memtest memory 1041 1090385920 1090385920 00003084 memtest memory 467607808 467607808 000030d8 exer_kid dkb200.2.0.3 81488896 000030d9 exer_kid dkb400.4.0.3 81472512 0000310d exer_kid dva0.0.0.100 607232 Testing aborted. Shutting down tests. Please wait.. System test complete P00>>> 3-14 AlphaServer 4000/4100 Service Manual...
  • Page 91: Testing Memory

    3.5.1 Testing Memory The test mem command tests individual memory devices or all memory. The test shown in Example 3-3 runs for 2 minutes. Example 3-3 Sample Test Memory Command P00>>> test memory Console is in diagnostic mode System test, runtime 120 seconds Type ^C to stop testing Starting background memory test, affinity to all CPUs..
  • Page 92 937426944 937426944 000046e0 memtest memory 2346 2458610560 2458610560 000046e9 memtest memory 2337 2449174528 2449174528 000046f2 memtest memory 2333 2444980736 2444980736 000046fb memtest memory 932070272 932070272 Memory test complete Test time has expired... P00>>> 3-16 AlphaServer 4000/4100 Service Manual...
  • Page 93: Testing Pci

    3.5.2 Testing PCI The test pci command tests PCI buses and devices. The test runs for 2 minutes. Example 3-4 Sample Test Command for PCI P00>>> test pci* Console is in diagnostic mode System test, runtime 120 seconds Type ^C to stop testing Configuring all PCI buses..
  • Page 94 Pass Hard/Soft Bytes Written Bytes Read -------- ------------ ------------ ------ --------- ------------- ------------ 00002c29 exer_kid dkb200.2.0.3 48689152 00002c2a exer_kid dkb400.4.0.3 48689152 00002c5e exer_kid dva0.0.0.100 286720 Testing aborted. Shutting down tests. Please wait.. Testing complete P00>>> 3-18 AlphaServer 4000/4100 Service Manual...
  • Page 95: Chapter 4 Power System

    Chapter 4 Power System This chapter describes the AlphaServer 4000/4100 power system: • Power Supply • Power Control Module Features • Power Circuit and Cover Interlocks • Power-Up/Down Sequencing • Cabinet Power Configuration Rules • Pedestal Power Configuration Rules (North America and Japan) •...
  • Page 96: Power Supply

    4.1 Power Supply Power supply outputs are shown in Figure 4-1. Figure 4-1 Power Supply Outputs Misc. Signal Current share +5V/Return +3.4V/Return +3.4V/Return +12V/Return
  • Page 97 Power Supply Features • 90–264 Vrms input • 450 watts output. Output voltages are as follows: Output Voltage Min. Voltage Max. Voltage Max. Current +5.0 4.85 5.25 +3.43 3.400 3.465 11.5 12.6 –12 –10.9 –13.2 –5.0 –4.6 –5.5 Vaux 0.05 •...
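The voltage limits in the table above lend themselves to a simple range check when taking measurements at the supply outputs. The sketch below is an illustration with hypothetical names. It includes only the rows whose output label survives in this excerpt, and it uses ASCII minus signs in the keys. For the negative outputs the table lists the limits by magnitude, so the bounds are sorted before comparing.

```python
# (min, max) limits copied from the table; negative rails are listed by magnitude
RAIL_LIMITS = {
    "+5.0": (4.85, 5.25),
    "+3.43": (3.400, 3.465),
    "-12": (-10.9, -13.2),
    "-5.0": (-4.6, -5.5),
}


def rail_in_range(rail, measured_volts):
    """Return True if the measured voltage lies within the table limits for that rail."""
    low, high = sorted(RAIL_LIMITS[rail])
    return low <= measured_volts <= high
```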
  • Page 98: Power Control Module Features

    4.2 Power Control Module Features The power control module (54-24117-01) is located behind the B3040-AA module, the system bus to PCI bus bridge module. Figure 4-2 Power Control Module System Motherboard Power Control Module Slot
  • Page 99 The power control module performs the following functions: • Controls the power-up/down sequencing. • Monitors the combined output of power supplies VDD (3.43V) and VCC (5.0V) and asserts DCOK_SENSE if these voltages are within range and asserts POWER_FAULT_L causing an immediate power shutdown if either is not. •...
  • Page 100: Power Circuit And Cover Interlocks

    4.3 Power Circuit and Cover Interlocks Figure 4-3 Power Circuit Diagram 17-04217-01 Logic Power Supply Cover Interlocks 17-04201-01 70-32016-01 (4100 & early 4000) 17-04302-01 Motherboard 70-33002-01 (4000 only) DC_ENABLE_L Switch 17- 419 6 - 1 POWER_FAULT_L 17-04201-02 RSM_DC_EN_L PKW0403F-06 AlphaServer 4000/4100 Service Manual...
  • Page 101 Figure 4-3 shows the distribution of power throughout the system drawer. Opens in the circuit, the PCM signal POWER_FAULT_L, or the SCM signal RSM_DC_EN_L can interrupt DC power applied to the system. The opens can be caused by the On/Off button or the cover interlocks. The POWER_FAULT_L signal is asserted by the PCM if it detects a fault, and RSM_DC_EN_L is controlled remotely.
  • Page 102: Power-Up/Down Sequence

    Figure 4-4 Power Up/Down Sequence Flowchart Apply AC Power Vaux on On-Off Button Assert DC_ENABLE_L Power Supply Starts 10 Second Delay 12 Second Deassert Delay Halt Faults DC_ENABLE_L Deassert Assert DC_ENABLE_L DCOK_SENSE DCOK_SENSE Voltages On-Off Button 30 Second Fan/Temp Delay PKW-0402-95 AlphaServer 4000/4100 Service Manual...
  • Page 103 When AC is applied to the system, Vaux (auxiliary voltage) is asserted and is sensed by the PCM. The PCM asserts DC_ENABLE_L starting the power supplies. If there is a hard fault on power-up, the power supplies shut down immediately; otherwise, the power system powers up and remains up until the system is shut off or the PCM senses a fault.
  • Page 104: Cabinet Power Configuration Rules

    0.38 Arms 0.38 Arms 1.83 Arms 1.83 Arms 1.83 Arms System Drawer Tray 0.5Arms 1.83 Arms 1.83 Arms 1.83 Arms AC Distribution Box 6.0 Arms 200 - 240 Vrms 6.3 Arms 14.5 Arms 2.3 Arms PKW0406-95 4-10 AlphaServer 4000/4100 Service Manual...
  • Page 105: Worst-Case -Eb & -Ec Cabinet Power Configuration

    Figure 4-6 Worst-Case -EB & -EC Cabinet Power Configuration StorageWorks StorageWorks Power Strips System Drawer System Drawer 0.38 Arms 0.38 Arms 1.83 Arms 1.83 Arms 1.83 Arms 1.83 Arms 0.38 Arms 0.38 Arms System Drawer 1.83 Arms 1.83 Arms 1.83 Arms 1.83 Arms System Drawer Tray...
  • Page 106: El & -Em Single Drawer Cabinet Power Configuration

    StorageWorks StorageWorks 0.38 0.38 Ams 0.38 Ams StorageWorks StorageWorks 0.38 Ams 0.38 Ams 0.38 Ams 0.38 Ams 0.38 Ams 240 V, 16 AMP Controller with 12 IEC C13 outlets (Europe & A.P.) 240 VMS PKW0406E-97 4-12 AlphaServer 4000/4100 Service Manual...
  • Page 107 Figure 4-8 -EL Three Drawer Cabinet Power Configuration (Three drawer -EL shown with H7600-AA controller) 2 Power System Drawer System Drawer Controllers 3.67 Ams 3.67 Ams 3.67 Ams 3.67 Ams StorageWorks 3.67 Ams System Drawer 0.75 Ams 0.75 Ams 1.0 Ams 3.67 Ams 3.67 Ams 3.67 Ams...
  • Page 108: Pedestal Power Configuration Rules (North America And Japan)

    A single AC power strip supports one system drawer and one StorageWorks shelf. When two AC power strips are used, the combined AC input line current cannot exceed the site circuit breaker restriction, assuming both strips are plugged into the same circuit. 4-14 AlphaServer 4000/4100 Service Manual...
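The two-strip rule above amounts to summing the input currents and comparing the total with the breaker rating. A minimal sketch follows; the function name is hypothetical, and both the per-strip currents and the breaker limit must come from the site and from the power distribution figures, so the numbers in the usage note are placeholders.

```python
def combined_current_ok(strip_currents_arms, breaker_limit_arms):
    """Check two (or more) power strips sharing one branch circuit.

    strip_currents_arms: input current of each strip, in amperes RMS.
    breaker_limit_arms: site circuit breaker restriction (site-specific).
    Returns (within_limit, total_current).
    """
    total = sum(strip_currents_arms)
    return total <= breaker_limit_arms, total
```

For instance, `combined_current_ok([5.0, 3.0], 15.0)` reports a total of 8.0 Arms within a hypothetical 15 A limit.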
  • Page 109: Pedestal Power Configuration Rules (Europe And Asia Pacific)

    Pedestal Power Configuration Rules (Europe and Asia Pacific) Figure 4-10 Pedestal Power Distribution (Europe and AP) Power Strips StorageWorks StorageWorks 0.34 Arms 0.34 Arms 0.34 Arms System Drawer 0.34 Arms 1.67 Arms 1.67 Arms 1.67 Arms 200 - 240 Vrms 5.0 Arms 200 - 240 Vrms 3.0 Arms...
  • Page 111: Chapter 5 Error Logs

    Chapter 5 Error Logs This chapter provides information on troubleshooting with error logs. The following topics are covered: • Using Error Logs • Using DECevent • Error Log Examples and Analysis • Troubleshooting IOD-Detected Errors • Double Error Halts and Machine Checks While in PAL Mode Error registers are described in Chapter 6.
  • Page 112: Using Error Logs

    CPU Module System Bus Sys/PCI Bus Bridge Data CPU Chip System Bus Comd/add B-cache EISA Bus Bridge Tag & Status Data Duplicate Tag EISA Tag & Status Parity stored Parity logic ECC stored ECC logic PKW0450-96 AlphaServer 4000/4100 Service Manual...
  • Page 113 Lines Protected Device ECC Protected System bus data lines IOD on every transaction, CPU when using the bus B-cache IOD on every transaction, CPU when using the bus Parity Protected System bus command/address lines IOD on every transaction, CPU when using the bus Duplicate tag store IOD on every transaction, CPU when using the bus...
  • Page 114: Hard Errors

    System-independent errors detected and corrected by the CPU. These errors are CPU module correctable errors handled as MCHK 630 interrupts. • System-dependent errors that are correctable single-bit errors on the system bus and are handled as MCHK 620 interrupts. AlphaServer 4000/4100 Service Manual...
  • Page 115: Error Log Events

    5.1.3 Error Log Events Several different events are logged by OpenVMS and DIGITAL UNIX. Windows NT does not log errors in this fashion. Table 5-1 Types of Error Log Events Error Log Event Description MCHK 670 Processor machine checks. These are synchronous errors that report precisely what happened at the time the error occurred.
  • Page 116: Using Decevent

    To access on-line help: OpenVMS $ HELP DIAGNOSE $ DIA /INTERACTIVE DIA> HELP DIGITAL UNIX > man dia > dia hlp Privileges necessary to use DECevent: • SYSPRV for the utility • DIAGNOSE to use the /CONTINUOUS qualifier AlphaServer 4000/4100 Service Manual...
  • Page 117: Translating Event Files

    5.2.1 Translating Event Files To produce a translated event report using the default event log file, SYS$ERRORLOG:ERRLOG.SYS, enter the following command: OpenVMS $ DIAGNOSE DIGITAL UNIX > dia -a The DIAGNOSE command allows DECevent to use built-in defaults. This command produces a full report, directed to the terminal screen, from the input event file, SYS$ERRORLOG:ERRLOG.SYS.
  • Page 118: Filtering Events

    The commands shown here create output using only the entries for RZ disks, RA92 disks, and CPUs. The /EXCLUDE qualifier is used to create output for all devices except those named in the command. OpenVMS $ DIAGNOSE/TRANSLATE/EXCLUDE=(MEMORY) DIGITAL UNIX > dia -x mem AlphaServer 4000/4100 Service Manual...
  • Page 119 Use the /BEFORE and /SINCE qualifiers to select events before or after a certain date and time. OpenVMS $ DIAGNOSE/TRANSLATE/BEFORE=15-JAN-1996:10:30:00 $ DIAGNOSE/TRANSLATE/SINCE=15-JAN-1996:10:30:00 DIGITAL UNIX > dia -t s:15-jan-1996 e:20-jan-1996 If no time is specified, the default time is 00:00:00, and all events for that day are selected.
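When these filters are driven from a script, the date strings can be generated rather than typed. The sketch below only formats the two command forms shown above; the helper names are hypothetical, zero-padding of the day is an assumption, and no qualifiers beyond those quoted in the text are used.

```python
from datetime import datetime

_MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
           "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]


def vms_timestamp(dt):
    """Format a datetime as DD-MMM-YYYY:HH:MM:SS, as in the OpenVMS examples."""
    return f"{dt.day:02d}-{_MONTHS[dt.month - 1]}-{dt.year}:{dt:%H:%M:%S}"


def openvms_since(dt):
    """Build the /SINCE form of the DIAGNOSE command shown above."""
    return f"DIAGNOSE/TRANSLATE/SINCE={vms_timestamp(dt)}"


def unix_range(start, end):
    """Build the 'dia -t s:<date> e:<date>' form shown above (dates only)."""
    def date_only(d):
        return f"{d.day:02d}-{_MONTHS[d.month - 1].lower()}-{d.year}"
    return f"dia -t s:{date_only(start)} e:{date_only(end)}"
```

Because the DIGITAL UNIX form carries no time, it relies on the default time of 00:00:00 described above.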
  • Page 120: Selecting Alternative Reports

    ASCII messages in a condensed format /Summary Produces a statistical summary of the events in the log /Fsterr Produces a one-line-per-entry report for disk and tape devices The syntax is: OpenVMS $ DIAGNOSE/TRANSLATE/<format> DIGITAL UNIX > dia -o <format> 5-10 AlphaServer 4000/4100 Service Manual...
  • Page 121: Error Log Examples And Analysis

    5.3 Error Log Examples and Analysis The following sections provide examples and analysis of error logs. 5.3.1 MCHK 670 CPU-Detected Failure The error log in Example 5-1 shows the following: CPU1 logged the error in a system with two CPUs. During a D-ref fill, the External Interface Status Register logged an uncorrectable ECC error.
  • Page 122 Base addr for palcode = x0000000008 Interrupt Summary Reg x00000000 AST requests 3 - 0 x00000000 IBOX Ctrl and Status Reg x000000C160000000 Timeout Bit Not Set PAL Shadow Registers Enabled Correctable Err Intrpts Enabled ICACHE BIST Successful 5-12 AlphaServer 4000/4100 Service Manual...
  • Page 123 TEST_STATUS_H Pin Asserted Icache Par Err Stat Reg x00000000 Dcache Par Err Stat Reg x00000000 Virtual Address Reg xFFFFFFFE8F63BD38 Memory Mgmt Flt Sts Reg x000000000166D1 Ref which caused err was a write Ref resulted in DTB miss RA Field x0000000000001B Opcode Field x0000000000002C Scache Address Reg...
  • Page 124: Cap Error Register

    Cycle 1 ECC Syndrome x00000000 Cycle 2 ECC Syndrome x00000000 Cycle 3 ECC Syndrome x00000000 MDPB Status Register x00000000 MDPB Chip Revision x00000000 MDPB Error Syndrome Reg x00000000 Cycle 0 ECC Syndrome x00000000 Cycle 1 ECC Syndrome x00000000 5-14 AlphaServer 4000/4100 Service Manual...
  • Page 125 Cycle 2 ECC Syndrome x00000000 Cycle 3 ECC Syndrome x00000000 PALcode Revision Palcode Rev: 1.21-3 Error Logs 5-15...
  • Page 126: Mchk 670 Cpu And Iod Detected Failure

    PCI bus bridge module, the B3040 module. The “Saddle” module is the PCI motherboard, the B3050 module. The “MC” bus is the system bus. Refer to Table 5-9 for information on decoding commands, and refer to Table 5-10 for information on node IDs. 5-16 AlphaServer 4000/4100 Service Manual...
  • Page 127 Example 5-2 MCHK 670 CPU and IOD-Detected Failure Logging OS 2. DIGITAL UNIX System Architecture 2. Alpha Event sequence number Timestamp of occurrence 08-APR-1996 11:27:55 Host name whip16 System type register x00000016 AlphaStation 4x00 Number of CPUs (mpnum) x00000004 CPU logging event (mperr) x00000003 Event validity 1.
  • Page 128 IO Host Addr Extension x00000000 Interrupt Control x00000003 MC-PCI Intr Enabled Device intr info enabled if en_int Interrupt Request x00810000 Interrupts asserted x00010000 Hard Error Interrupt Mask Register 0 x00C50010 Interrupt Mask Register 1 x00000000 5-18 AlphaServer 4000/4100 Service Manual...
  • Page 129 MC Error Info Register 0 x28681A80 MC bus trans addr <31:4> x028681A8 MC Error Info Register 1 x800FD800 MC bus trans addr <39:32> x00000000 MC_Command x00000018 Device Id x0000003F MC error info valid CAP Error Register xC0000000 Uncorrectable ECC err det by MDPB MC error info latched PCI Bus Trans Error Adr x000003FD...
  • Page 130 MPDB Error Syndrome of uncorrectable read error MDPB Error Syndrome Reg x0000004B Cycle 0 ECC Syndrome x0000000000004B Cycle 1 ECC Syndrome x00000000 Cycle 2 ECC Syndrome x00000000 Cycle 3 ECC Syndrome x00000000 PALcode Revision Palcode Rev: 1.21-3 5-20 AlphaServer 4000/4100 Service Manual...
  • Page 131: Mchk 670 Read Dirty Cpu Detected Failure

    5.3.3 MCHK 670 Read Dirty CPU-Detected Failure The error log in Example 5-3 shows the following: CPU0 logged the error in a system with two CPUs. The External Interface Status Register records an uncorrectable ECC error from the system (bit <30> set). Both IOD CAP Error Registers logged an error.
  • Page 132 PAL Base Address Reg x0000000000020000 Base Addr for PALcode: x0000000000000008 Interrupt Summary Reg x0000000000200000 External HW Interrupt at IPL21 AST Requests 3-0: x0000000000000000 IBOX Ctrl and Status Reg x000000C160000000 Timeout Counter Bit Clear. IBOX Timeout Counter Enabled. 5-22 AlphaServer 4000/4100 Service Manual...
  • Page 133 Floating Point Instructions will cause FEN Exceptions. PAL Shadow Registers Enabled. Correctable Error Interrupts Enabled. ICACHE BIST (Self Test) Was Successful. TEST_STATUS_H Pin Asserted Icache Par Err Stat Reg x0000000000000000 Dcache Par Err Stat Reg x0000000000000000 Virtual Address Reg x0000000000044000 Memory Mgmt Flt Sts Reg x0000000000005D10 If Err, Reference Resulted in DTB...
  • Page 134 Bcache Size = 2MB Base Address of Bridge x000000FBE0000000 Dev Type & Rev Register x06000021 CAP Chip Revision: x00000001 HORSE Module Revision: x00000002 SADDLE Module Revision: x00000000 SADDLE Module Type: LeftHand Internal CAP Chip Arbiter: Enabled 5-24 AlphaServer 4000/4100 Service Manual...
  • Page 135 PCI Class Code x00000600 MC-PCI Command Register x06480FF1 Module SelfTest Passed LED on Delayed PCI Bus Reads Protocol: Enabled Bridge to PCI Transactions: Enabled Bridge REQUESTS 64 Bit Data Transactions Bridge ACCEPTS 64 Bit Data Transactions PCI Address Parity Check: Enabled MC Bus CMD/Addr Parity Check: Enabled MC Bus NXM Check: Enabled...
  • Page 136: Mchk 660 Iod-Detected Failure

    PCI bus bridge module, the B3040 module. The “Saddle” module is the PCI motherboard, the B3050 module. The “MC” bus is the system bus. Refer to Table 5-9 for information on decoding commands, and refer to Table 5-10 for information on node IDs. 5-26 AlphaServer 4000/4100 Service Manual...
  • Page 137 Example 5-4 MCHK 660 IOD-Detected Failure (System Bus Error) Logging OS 2. DIGITAL UNIX System Architecture 2. Alpha Event sequence number Timestamp of occurrence 04-APR-1996 17:20:04 Host name whip16 System type register x00000016 AlphaStation 4x00 Number of CPUs (mpnum) x00000002 CPU logging event (mperr) x00000000 Event validity 1.
  • Page 138 MC Error Info Register 1 x800ED600 MC bus trans addr <39:32> x00000000 MC_Command x00000016 Device Id x0000003B MC error info valid CAP Error Register xA0000000 Uncorrectable ECC err det by MDPA MC error info latched 5-28 AlphaServer 4000/4100 Service Manual...
  • Page 139 PCI Bus Trans Error Adr x00000000 MDPA Status Register x80000000 MDPA Chip Revision x00000000 MDPA Error Syndrome of uncorrectable read error MDPA Error Syndrome Reg x1E00001E Cycle 0 ECC Syndrome x0000000000001E Cycle 1 ECC Syndrome x00000000 Cycle 2 ECC Syndrome x00000000 Cycle 3 ECC Syndrome x0000000000001E MDPB Status Register...
  • Page 140 MDPB Chip Revision x00000000 MDPB Error Syndrome Reg x00000000 Cycle 0 ECC Syndrome x00000000 Cycle 1 ECC Syndrome x00000000 Cycle 2 ECC Syndrome x00000000 Cycle 3 ECC Syndrome x00000000 PALcode Revision Palcode Rev: 1.21-3 5-30 AlphaServer 4000/4100 Service Manual...
  • Page 141 5.3.5 MCHK 660 IOD-Detected Failure (PCI Error) The error log in Example 5-5 shows the following: CPU 0 logged the error in a system with three CPUs. The External Interface Status register records that the error occurred during a D-ref Fill but does not indicate what the error is. The CAP Error register for IOD0 did not see an error.
  • Page 142 PALTEMP4 x0000000000000003 PALTEMP5 x0000000000000000 PALTEMP6 x000000000001C6AF PALTEMP7 xFFFFFC000043C820 PALTEMP8 x1F1E171515020100 PALTEMP9 xFFFFFC000043CB10 PALTEMP10 xFFFFFC0000433E0C PALTEMP11 xFFFFFC000043C970 PALTEMP12 xFFFFFC000043CD10 PALTEMP13 x0000000000026E80 PALTEMP14 x0000000000000000 PALTEMP15 x00000000000E0000 PALTEMP16 x0000020306600001 PALTEMP17 x0000000000000000 PALTEMP18 x0000000000000000 PALTEMP19 xFFFFFFFFB589F958 PALTEMP20 x00000000009D2000 5-32 AlphaServer 4000/4100 Service Manual...
  • Page 143 PALTEMP21 xFFFFFC000043CD40 PALTEMP22 xFFFFFC000058D540 PALTEMP23 x000000007FC67A58 Exception Address Reg xFFFFFC0000433E0C Native-mode Instruction Exception PC x3FFFFF000010CF83 Exception Summary Reg x0000000000000000 Exception Mask Reg x0000000000000000 PAL Base Address Reg x0000000000020000 Base Addr for PALcode: x0000000000000008 Interrupt Summary Reg x0000000000200000 External HW Interrupt at IPL21 AST Requests 3-0: x0000000000000000 IBOX Ctrl and Status Reg...
  • Page 144 Bridge ACCEPTS 64 Bit Data Transactions PCI Address Parity Check: Enabled MC Bus CMD/Addr Parity Check: Enabled MC Bus NXM Check: Enabled Check ALL Transactions for Errors Use MC_BMSK for 16 Byte Align Blk Mem Wrt Wrt PEND_NUM Threshold: 5-34 AlphaServer 4000/4100 Service Manual...
  • Page 145 RD_TYPE Memory Prefetch Algorithm: Short RL_TYPE Mem Rd Line Prefetch Type: Medium RM_TYPE Mem Rd Multiple Cmd Type: Long ARB_MODE Arbitration: MC-PCI Priority Mode Mem Host Address Ext Reg x00000000 HAE Sparse Mem Adr<31:27> x00000000 IO Host Adr Ext Register x00000000 PCI Upper Adr Bits<31:25>...
  • Page 146 Parity Error Detection Response: Normal Wait Cycle Address/Data Stepping: DISABLED SERR# Sys Err Driver Capability: Enabled Fast Back-to-Back to Many Target: DISABLED Status Register x0200 Device is 33 Mhz Capable. No Support for User Defineable Features. 5-36 AlphaServer 4000/4100 Service Manual...
  • Page 147 Fast Back-to-Back to Different Targets, Is Not Supported in Target Device. Device Select Timing: Medium. Revision ID Device Class Code x010000 Mass Storage: SCSI Bus Controller Cache Line S Latency T. Header Type Single Function Device Bist Base Address Register 1 x00101100 Base Address Register 2 x01119100...
  • Page 148 Base Address Register 2 x00000000 Base Address Register 3 x00100000 Base Address Register 4 x01000000 Base Address Register 5 x00000000 Base Address Register 6 x00000000 Expansion Rom Base Addres x01100000 Interrupt P1 Interrupt P2 Min Gnt Max Lat 5-38 AlphaServer 4000/4100 Service Manual...
  • Page 149: Mchk 630 Correctable Cpu Error

    5.3.6 MCHK 630 Correctable CPU Error The error log in Example 5-6 shows the following: CPU0 logged the error in a system with two CPUs. During a D-ref fill, the External Interface Status Register shows no error but states that the “data source is b-cache. ” (When a CPU chip does not find data it needs to perform a task in any of its caches, it requests data from off the chip to fill its D-cache.
  • Page 150 MDPA Error Syndrome Reg x00000000 MDPA Syndrome Register Data Not Valid MDPB Status Register x00000000 MDPB Status Register Data Not Valid MDPB Error Syndrome Reg x00000000 MDPB Syndrome Register Data Not Valid PALcode Revision Palcode Rev: 1.21-3 5-40 AlphaServer 4000/4100 Service Manual...
  • Page 151 Error Logs 5-41...
  • Page 152: Mchk 620 Correctable Error

    5. Low Priority Entry type 100. CPU Machine Check Errors CPU Minor class 4. 620 System Correctable Error Software Flags x0000000000000000 Active CPUs x00000003 Hardware Rev x00000000 System Serial Number C1563 Module Serial Number Module Type x0000 5-42 AlphaServer 4000/4100 Service Manual...
  • Page 153 System Revision x00000000 Machine Check Reason x0204 IOD Detected Soft Error Ext Interface Status Reg x0000000000000000 Not Valid for 620 System Correctable Errors Ext Interface Address Reg x0000000000000000 Not Valid for 620 System Correctable Errors Fill Syndrome Reg x0000000000000000 Not Valid for 620 System Correctable Errors Interrupt Summary Reg x0000000000000000...
  • Page 154: Troubleshooting Iod-Detected Errors

    Go to Step 7 0000 0000 0000 0000 0000 0000 0001 xx1x SERR - PCI system error Go to Step 8 0000 0000 0000 0000 0000 0000 0001 xxx1 PERR - PCI parity error Go to Step 9 5-44 AlphaServer 4000/4100 Service Manual...
  • Page 155: System Bus Ecc Error

    5.4.1 System Bus ECC Error Step 2 Read the MC_ERR1 register and match the contents with the data pattern. Perform the action indicated. Table 5-4 System Bus ECC Error Data Pattern MC_ERR1 Data Pattern Most Likely Cause Action for Memory Read 1000 0000 0000 xxxx xxxx 10xx 0xxx xxxx Bad nondirty data from Go to Step 10...
  • Page 156: System Bus Nonexistent Address Error

    1000 0000 0000 xxxx xxxx xxxx 1xxx 110x PCI2 bridge did not Replace IOD1 respond 1000 0000 0000 xxxx xxxx xxxx 1xxx 111x PCI3 bridge did not Replace IOD1 respond NOTE: IOD0 = B3040-AA bridge module; IOD1 = B3040-AB bridge module. 5-46 AlphaServer 4000/4100 Service Manual...
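The data patterns in these troubleshooting tables use x for bits that do not matter. Matching a register value against such a pattern can be sketched as a mask-and-compare. The code below is an illustration, not part of any DIGITAL tool; the function names are hypothetical, and the two table rows are the ones quoted above.

```python
def matches(pattern, value, width=32):
    """Return True if value fits a bit pattern in which 'x' means don't care."""
    bits = pattern.replace(" ", "")
    if len(bits) != width:
        raise ValueError(f"pattern must have {width} bits, got {len(bits)}")
    # mask has a 1 wherever the pattern specifies a bit; want holds those bits.
    mask = int("".join("0" if b in "xX" else "1" for b in bits), 2)
    want = int("".join("0" if b in "xX" else b for b in bits), 2)
    return (value & mask) == want


# Rows copied from the nonexistent address table above.
NXM_ROWS = [
    ("1000 0000 0000 xxxx xxxx xxxx 1xxx 110x",
     "PCI2 bridge did not respond", "Replace IOD1"),
    ("1000 0000 0000 xxxx xxxx xxxx 1xxx 111x",
     "PCI3 bridge did not respond", "Replace IOD1"),
]


def lookup(mc_err1):
    """Return (cause, action) for the first matching row, or None."""
    for pattern, cause, action in NXM_ROWS:
        if matches(pattern, mc_err1):
            return cause, action
    return None
```

The same helper applies to the other data pattern tables in this section, since they use the same notation.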
  • Page 157: System Bus Address Parity Error

    5.4.3 System Bus Address Parity Error Step 4 Determine which node put the bad command/address on the system bus, as identified in MC_ERR1. Perform the action indicated. Table 5-6 Address Parity Error Troubleshooting MC_ERR1 Data Pattern Most Likely Cause Action 1000 0000 000x xxx0 10xx xxxx xxxx xxxx Data sourced by MID = 2 Replace CPU0 1000 0000 000x xxx0 11xx xxxx xxxx xxxx
  • Page 158 Broken hardware on IOD Replace IOD Expected_PEND_NUM Actual_PEND_NUM < Broken hardware on IOD Replace IOD Expected_PEND_NUM Actual_PEND_NUM > PEND_NUM setup incorrect Fix the software Expected_PEND_NUM NOTE: IOD0 = B3040-AA bridge module; IOD1 = B3040-AB bridge module. 5-48 AlphaServer 4000/4100 Service Manual...
  • Page 159: Page Table Entry Invalid Error

    5.4.5 Page Table Entry Invalid Error Step 6 This error is almost always a software problem. However, if the software is known to be good and the hardware is suspected, swap the IOD. 5.4.6 PCI Master Abort Step 7 Master aborts normally occur when the operating system is sizing the PCI bus. However, if the master abort occurs after the system is booted, read PCI_ERR1 and determine which PCI device should have responded to this PCI address.
  • Page 160: Broken Memory

    P00>>> show mem 2. Compare this address to the failing address from the MC_ERR1 and MC_ERR0 Registers to determine which memory slot is failing. 5-50 AlphaServer 4000/4100 Service Manual...
  • Page 161: Ecc Syndrome Bits Table

    3. When you have isolated the failing memory pair, determine which of the two modules is bad. (You cannot do this if the operating system is Windows NT.) Read the CPU FIL SYNDROME Register. If this register is non-zero, use the ECC syndrome bits in Table 5-8 to determine which module had the single-bit error.
  • Page 162: Command Codes

    1 0 0 0 Read0 - Mem 1 0 0 0 Read0 - I/O 1 0 0 1 Read1 - Mem 1 0 0 1 Read1 - I/O 1 0 1 0 Read Mod0 - 5-52 AlphaServer 4000/4100 Service Manual...
  • Page 163: Node Ids

    Table 5-9 Decoding Commands (continued) MC_C No B- Cache Cache 3 2 1 0 <39> Description 1 0 1 0 Read Mod0 - 1 0 1 0 Read Peer0 - I/O 1 0 1 1 Read Mod1 - 1 0 1 1 Read Peer1 - I/O 1 1 0 0 FILL0 (due to...
  • Page 164: Double Error Halts And Machine Checks While In Pal Mode

    Instructions that require VAX-style interlocked memory access • Privileged instructions • Memory management • Context swapping • Interrupt and exception dispatching • Power-up initialization and booting • Console functions • Emulation of instructions with no hardware support 5-54 AlphaServer 4000/4100 Service Manual...
  • Page 165: Double Error Halt

    5.5.2 Double Error Halt A double error halt occurs under the following conditions: • A machine check occurs. • PAL completes its tasks and returns control of the system to the operating system. • A second machine check occurs before the operating system completes its tasks. The machine returns to the console and displays the following message: halt code = 6 double error halt...
  • Page 166 00000000 : 03e4 cns$sc_addr 000047cf : 03e8 cns$sc_addr+4 ffffff00 : 03ec cns$sc_ctl 0000f000 : 03f0 cns$sc_ctl+4 00000000 : 03f4 cns$bc_tag_addr ff7fefff : 03f8 cns$bc_tag_addr+4 ffffffff : 03fc cns$ei_stat 04ffffff : 0400 cns$ei_stat+4 fffffff0 : 0404 5-56 AlphaServer 4000/4100 Service Manual...
  • Page 167 cns$fill_syn 000000a7 : 0410 cns$fill_syn+4 00000000 : 0414 cns$ld_lock 0004eaef : 0418 cns$ld_lock+4 ffffff00 : 041c Error Logs 5-57...
  • Page 168 0001904f : 0168 mchk$sc_addr+4 ffffff00 : 016c mchk$sc_stat 00000000 : 0170 mchk$sc_stat+4 00000000 : 0174 mchk$bc_tag_addr ff7fefff : 0178 mchk$bc_tag_addr+4 ffffffff : 017c mchk$ei_addr 066bc3ef : 0180 mchk$ei_addr+4 ffffff00 : 0184 mchk$fill_syn 000000a7 : 0188 5-58 AlphaServer 4000/4100 Service Manual...
  • Page 169 mchk$fill_syn+4 00000000 : 018c mchk$ei_stat 04ffffff : 0190 mchk$ei_stat+4 fffffff0 : 0194 mchk$ld_lock 00005b6f : 0198 mchk$ld_lock+4 ffffff00 : 019c IOD: 0 base address: f9e0000000 WHOAMI: 0000003a PCI_REV: 06008221 CAP_CTL: 02490fb1 HAE_MEM: 00000000 HAE_IO: 00000000 INT_CTL: 00000003 INT_REQ: 00800000 INT_MASK0: 00010000 INT_MASK1: 00000000...
  • Page 170 10000000 DIAG_CHKB: 10000000 SCRATCH: 00000000 W0_BASE: 00100001 W0_MASK: 00000000 T0_BASE: 00001000 W1_BASE: 00800001 W1_MASK: 00700000 T1_BASE: 00008000 W2_BASE: 80000001 W2_MASK: 3ff00000 T2_BASE: 00000000 W3_BASE: 00000000 W3_MASK: 1ff00000 T3_BASE: 0000a000 W_DAC: 00000000 SG_TBIA: 00000000 HBASE: 00000000 5-60 AlphaServer 4000/4100 Service Manual...
  • Page 171: External Interface Status Register

    Chapter 6 Error Registers This chapter describes the registers used to hold error information. These registers include: • External Interface Status Register • External Interface Address Register • MC Error Information Register 0 • MC Error Information Register 1 • CAP Error Register •...
  • Page 172 EI_STAT register is not unlocked or cleared by reset. Address FF FFF0 0168 Type 3130 29 28 27 24 23 All 1s CHIP_ID <3:0> BC_TPERR BC_TC_PERR EI_ES COR_ECC_ERR 35 34 33 32 All 1s SEO_HRD_ERR FIL_IRD EI_PAR_ERR UNC_ECC_ERR PKW0453-96 AlphaServer 4000/4100 Service Manual...
  • Page 173 Fill data from B-cache or main memory could have correctable or uncorrectable errors in ECC mode. In parity mode, fill data parity errors are treated as uncorrectable hard errors. System address/command parity errors are always treated as uncorrectable hard errors, irrespective of the mode. The sequence for reading, unlocking, and clearing EI_STAT, EI_ADDR, BC_TAG_ADDR, and FILL_SYN is as follows: 1.
  • Page 174 B-Cache Tag Address Parity Error. Indicates that a B-cache read transaction encountered bad parity in the tag address RAM. CHIP_ID <27:24> Chip Identification. Read as “4.” Future update revisions to the chip will return new unique values. <23:0> All ones. AlphaServer 4000/4100 Service Manual...
  • Page 175 Table 6-1 External Interface Status Register (continued) Name Bits Type Description <63:36> All ones. SEO_HRD_ERR <35> Second External Interface Hard Error. Indicates that a fill from B-cache or main memory, or a system address/command received by the CPU has a hard error while one of the hard error bits in the EI_STAT register is already set.
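A raw EI_STAT value from an error log or console dump can be split into its fields with shifts and masks. In the sketch below, CHIP_ID <27:24> and SEO_HRD_ERR <35> come from Table 6-1; the positions of the remaining flags (bits 28 through 34) are read off the register figure and should be treated as assumptions. The function name is hypothetical.

```python
# Bit positions for EI_STAT flags. Bits 28-34 are assumed from the
# register figure; bit 35 is stated in Table 6-1.
EI_STAT_BITS = {
    "BC_TPERR": 28,
    "BC_TC_PERR": 29,
    "EI_ES": 30,
    "COR_ECC_ERR": 31,
    "UNC_ECC_ERR": 32,
    "EI_PAR_ERR": 33,
    "FIL_IRD": 34,
    "SEO_HRD_ERR": 35,
}


def decode_ei_stat(value):
    """Return (chip_id, list of flag names that are set) for a 64-bit EI_STAT."""
    chip_id = (value >> 24) & 0xF          # CHIP_ID <27:24>
    flags = [name for name, bit in EI_STAT_BITS.items() if (value >> bit) & 1]
    return chip_id, flags
```

Applied to the cns$ei_stat value in the double error halt dump (low longword 04ffffff, high longword fffffff0), this yields chip ID 4 with no error flags set.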
  • Page 176 EI_STAT register. It is unlocked by a read of the EI_STAT Register. This register is meaningful only when one of the error bits is set. Address FF FFF0 0148 Access All 1s 40 39 EI_ADDR All 1s <39:32> PKW0454-96 AlphaServer 4000/4100 Service Manual...
  • Page 177: Chapter 6 Error Registers

    Table 6-2 Loading and Locking Rules for External Interface Registers Correct Uncorrect- Second -able able Error Hard Load Lock Action When Error Error Register Register EI_STAT Is Read Clears and unlocks possible all registers Clears and unlocks possible all registers Clears and unlocks all registers Clear bit (c) does...
  • Page 178: Mc Error Information Register 0

    31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 09 08 07 06 05 04 03 02 01 00 Failing Address ADDR<31:04> Table 6-3 MC Error Information Register 0 Initial Name Bits Type State Description ADDR<31:4> <31:4> Contains the address of the transaction on the system bus when an error is detected. Reserved <3:0> AlphaServer 4000/4100 Service Manual...
  • Page 179: Mc Error Information Register 1

    6.1.3 MC Error Information Register 1 (MC_ERR1 - Offset = 840) The high-order MC bus (system bus) address bits and error symptoms are latched into this register when the system bus to PCI bus bridge detects an error. If the event is a hard error, the register bits are locked. A write to clear symptom bits in the CAP Error Register unlocks this register.
  • Page 180 MC_CMD<5:0> <13:8> Active command at the time the error was detected. ADDR<39:32> <7:0> Address bits <39:32> of the transaction on the system bus when an error is detected. 6-10 AlphaServer 4000/4100 Service Manual...
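The failing system bus address is split across MC_ERR0 and MC_ERR1, so the two registers are usually decoded together. The following sketch is a hypothetical helper: the MC_CMD <13:8>, ADDR<39:32> <7:0>, and ADDR<31:4> fields follow the tables, while the device ID position (<19:14>) and the valid bit (<31>) are inferred from the register values printed in Examples 5-2 and 5-4 and are therefore assumptions.

```python
def decode_mc_err(mc_err0, mc_err1):
    """Decode MC_ERR0/MC_ERR1 into command, device ID, and 40-bit address."""
    # ADDR<39:32> lives in MC_ERR1<7:0>; ADDR<31:4> in MC_ERR0<31:4>.
    address = ((mc_err1 & 0xFF) << 32) | (mc_err0 & 0xFFFFFFF0)
    return {
        "valid": bool((mc_err1 >> 31) & 1),     # assumed: bit 31
        "device_id": (mc_err1 >> 14) & 0x3F,    # assumed: bits <19:14>
        "mc_cmd": (mc_err1 >> 8) & 0x3F,        # MC_CMD<5:0> in bits <13:8>
        "address": address,
    }
```

With the Example 5-2 values (MC_ERR0 x28681A80, MC_ERR1 x800FD800), the result is command x18 and device ID x3F, which agrees with the translated log.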
  • Page 181: Cap Error Register

    6.1.4 CAP Error Register (CAP_ERR - Offset = 880) CAP_ERR is used to log information pertaining to an error detected by the CAP or MDP ASIC. If the error is a hard error, the register is locked. All bits, except the LOST_MC_ERR bit, are locked on hard errors.
  • Page 182 CPU will also get a fill error on reads. MC_ADR_PERR <25> RW1C Set when a system bus command/address parity error is detected. 6-12 AlphaServer 4000/4100 Service Manual...
  • Page 183 Table 6-5 CAP Error Register (continued) Initial Name Bits Type State Description LOST_MC_ERR <24> RW1C Set when an error is detected but not logged because the associated symptom fields and registers are locked with the state of an earlier error. PIO_OVFL <23>...
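The CAP_ERR values in the error log examples can be decoded the same way. In this sketch, MC_ADR_PERR <25>, LOST_MC_ERR <24>, and PIO_OVFL <23> are taken from Table 6-5. The top three bits and their names are assumptions inferred from the translated logs, where xC0000000 reports an uncorrectable ECC error detected by MDPB and xA0000000 reports one detected by MDPA, both with MC error info latched.

```python
# (bit, name) pairs, highest bit first. The first three entries are
# inferred from the log examples and use hypothetical names.
CAP_ERR_BITS = [
    (31, "MC_ERR_INFO_LATCHED"),   # assumed
    (30, "UNC_ECC_MDPB"),          # assumed
    (29, "UNC_ECC_MDPA"),          # assumed
    (25, "MC_ADR_PERR"),
    (24, "LOST_MC_ERR"),
    (23, "PIO_OVFL"),
]


def decode_cap_err(value):
    """Return the names of the CAP_ERR bits set in value, highest bit first."""
    return [name for bit, name in CAP_ERR_BITS if (value >> bit) & 1]
```

Bits not listed here are ignored by the helper, so an empty result does not prove the register is clear.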
  • Page 184: Pci Error Status Register 1

    31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 09 08 07 06 05 04 03 02 01 00 Failing Address ADDR<31:0> Table 6-6 PCI Error Status Register 1 Initial Name Bits Type State Description ADDR<31:0> <31:0> Contains address bits <31:0> of the transaction on the PCI bus when an error is detected. 6-14 AlphaServer 4000/4100 Service Manual...
  • Page 185: Chapter 7 Removal And Replacement

    Chapter 7 Removal and Replacement This chapter describes removal and replacement procedures for field-replaceable units (FRUs). 7.1 System Safety Observe the safety guidelines in this section to prevent personal injury. CAUTION: Wear an antistatic wrist strap whenever you work on a system. The AlphaServer cabinet system has a wrist strap connected to the frame at the front and rear.
  • Page 186: Fru List

    Figure 7-1 System Drawer FRU Locations Memory Modules CPU Modules Top Cover Optional and N+1 Rear Power Supplies PCI/EISA Options Power Supply Top Cover Front Fan Tray Power Cable OCP, Floppy, Chassis and CD-ROM Assembly PKW0452-96 AlphaServer 4000/4100 Service Manual...
  • Page 187: Field-Replaceable Unit Part Numbers

    Table 7-1 Field-Replaceable Unit Part Numbers CPU Modules B3001-CA 300 MHz CPU, uncached B3002-AB 300 MHz CPU, 2 Mbyte cache B3004-BA 300 MHz CPU, 2 Mbyte cache B3004-AA 400 MHz, 4 Mbyte cache B3004-DA 466 MHz, 4 Mbyte cache Memory Modules B3020-CA 64 Mbyte synch B3030-EA...
  • Page 188 2 meter IEC to IEC (Europe/AP, pedestal, and all cabinet systems.) 17-04285-03 IEC to IEC StorageWorks shelf Fan Tray Cables (Cabinet Only) 17-04324-01 Elec fan power harness 17-04325-01 12V power for SCM 17-04338-01 Power ground cable 17-04339-01 AC cable power AlphaServer 4000/4100 Service Manual...
  • Page 189 Table 7-1 Field-Replaceable Unit Part Numbers (continued) Server Control Module Power (Pedestal Only) 30-46485-01 110V North America 30-46485-02 220V Europe 30-46485-03 Australia/N.Z. 30-46485-04 220V U.K. System Drawer Cables and Jumpers From 17-04196-01 Server control Remote I/O SCM signal conn module signal signal conn on cable (60 pin) PCI mbrd...
  • Page 190 17-04351-01 SCM 12V power Power harness Sys fan 2 and SCM jumper (4100 & (17-04217-01) internal 12V conn early 4000 only) 17-04363-01 SCM 16-position SCM sig conn 16 pos conn on SCM jumper on PCI mbrd AlphaServer 4000/4100 Service Manual...
  • Page 191 Table 7-1 Field-Replaceable Unit Part Numbers (continued) Pedestal Cables From 17-04293-01 Elec harness Power harness Ped tray bulkhead power (17-04217-01) (system side) cable+5/+12 17-04302-01 OCP signal cable OCP sig conn OCP sig conn on ped tray on PCI mbrd bulkhead (system side) 17-04305-01 Harness power Power conn...
  • Page 192: 4100 Power System Frus

    Amer./Japan only and has six NEMA outlets and a 15 ft. cord to the 12-45334-02 wall outlet; the 12-45334-02 is used on pedestals in Eur./AP and on cabinet systems worldwide and has six IEC320 outlets. In pedestal systems, cords match country-specific wall outlets. AlphaServer 4000/4100 Service Manual...
  • Page 193 Part Number Description 17-04285-01 Power cord from AC input box to power strip. .5 meter, IEC320 to IEC320 connector used in cabinet systems only. In pedestal systems, cords match country-specific wall outlets. 1, 2, H7600-AA Power controller used in place of 30-45353-01, 12-45334-02, and 17-04285-02 in the H9A10-EL cabinet in N.
  • Page 194: 4000 Power System Frus

    Amer./Japan only and has six NEMA outlets and a 15 ft. cord to the 12-45334-02 wall outlet; the 12-45334-02 is used on pedestals in Eur./AP and on cabinet systems worldwide and has six IEC320 outlets. In pedestal systems, cords match country-specific wall outlets. 7-10 AlphaServer 4000/4100 Service Manual...
  • Page 195 Part Number Description 17-04285-01 Power cord from AC input box to power strip. .5 meter, IEC320 to IEC320 connector used in cabinet systems only. In pedestal systems, cords match country-specific wall outlets. 1, 2, H7600-AA Power controller used in place of 30-45353-01, 12-45334-02, and 17-04285-02 in the H9A10-EL cabinet in N.
  • Page 196: System Drawer Exposure (Cabinet)

    Figure 7-4 Exposing System Drawer (H9A10-EB & -EC Cabinet) Shipping Brackets on Cabinet Rails System Bus Cover Power Section Cover PCI Bus Cover PKW0404-96 7-12 AlphaServer 4000/4100 Service Manual...
  • Page 197 Exposing the System Bus or PCI Bus Card Cages 1. Open the front and rear doors of the cabinet. 2. At the front of the cabinet, unplug the drawer’s power supplies. 3. At the rear, remove the two Phillips screws holding the shipping bracket on the right rail so that the drawer can be pulled out.
  • Page 198: Cabinet Drawer Exposure (H9A10-El & Em)

    A stabilizer bar must be pulled out from the bottom to prevent the cabinet from tipping over. Figure 7-5 Exposing System Drawer (H9A10-EL & -EM Cabinet) PCI (4000) Cover PKW0457-97
  • Page 199 CAUTION: The cabinet could tip over if a system drawer is pulled out while the stabilizing bar is not fully extended with its leveler foot on the floor. Exposing any section of the system drawer in an H9A10-EL or -EM Cabinet. 1.
  • Page 200: System Drawer Exposure (Pedestal)

    7.6 System Drawer Exposure (Pedestal) Figure 7-5 Exposing System Drawer (Pedestal) Pedestal Tray System Bus Cover Cover Pedestal Tray and Power Section Cover PCI Bus Cover
  • Page 201 Exposing the System Drawer 1. Open the front door and remove it by lifting and pulling it away from the system. 2. Remove the top cover. Unscrew the two Phillips head screws midway up on each side of the pedestal, tilt the cover up, and lift it away from the frame. 3.
  • Page 202: Cpu Removal And Replacement

    Figure 7-6 Removing CPU Module CPU Module System Bus Card Cage PKW0411-96 WARNING: CPU modules and memory modules have parts that operate at high temperatures. Wait 2 minutes after power is removed before touching any module. 7-18 AlphaServer 4000/4100 Service Manual...
  • Page 203 Removal 1. Shut down the operating system and power down the system. 2. Expose the system drawer. 3. Expose the system bus card cage. Remove the two Phillips head screws holding the cover in place and slide it off the drawer. 4.
  • Page 204: Cpu Fan Removal And Replacement

    7.8 CPU Fan Removal and Replacement Figure 7-7 Removing CPU Fan PKW411A-96 7-20 AlphaServer 4000/4100 Service Manual...
  • Page 205 Removal 1. Follow the CPU Removal and Replacement procedure. 2. Unplug the fan from the module. 3. Remove the four Phillips head screws holding the fan to the Alpha chip’s heatsink. Replacement Reverse the above procedure. Verification If the system powers up, the CPU fan is working. Removal and Replacement 7-21...
  • Page 206: Memory Removal And Replacement

    Figure 7-8 Removing Memory Module Memory Module System Bus Card Cage PKW0408-96 WARNING: CPU modules and memory modules have parts that operate at high temperatures. Wait 2 minutes after power is removed before touching any module. 7-22 AlphaServer 4000/4100 Service Manual...
  • Page 207 Removal 1. Shut down the operating system and power down the system. 2. Expose the system drawer. 3. Expose the system bus card cage. Remove the two Phillips head screws holding the cover in place and slide it off the drawer. 4.
  • Page 208: Power Control Module Removal And Replacement

    7.10 Power Control Module Removal and Replacement Figure 7-9 Removing Power Control Module Power Control Module (PCM) System Bus Card Cage PKW0412-96
  • Page 209 Removal 1. Shut down the operating system and power down the system. 2. Expose the system drawer. 3. Expose the system bus card cage. Remove the two Phillips head screws holding the cover in place and slide it off the drawer. 4.
  • Page 210: Removing System Bus To Pci/Eisa Bus Bridge Module (B3040-Aa)

    7.11 System Bus to PCI Bus Bridge (B3040-AA) Module Removal and Replacement Figure 7-10 Removing System Bus to PCI/EISA Bus Bridge Module (B3040-AA) PKW0413-96 7-26 AlphaServer 4000/4100 Service Manual...
  • Page 211 Removal 1. Shut down the operating system and power down the system. 2. Expose the system drawer. 3. Expose the system bus card cage. Remove the two Phillips head screws holding the cover in place and slide it off the drawer. 4.
  • Page 212: Removing System Bus To Pci Bus Bridge Module (B3040-Ab)

    7.12 System Bus to PCI Bus Bridge (B3040-AB) Module Removal and Replacement Figure 7-11 Removing System Bus to PCI Bus Bridge Module (B3040-AB) PKW0413A-96 7-28 AlphaServer 4000/4100 Service Manual...
  • Page 213 Removal 1. Shut down the operating system and power down the system. 2. Expose the system drawer. 3. Expose the system bus card cage. Remove the two Phillips head screws holding the cover in place and slide it off the drawer. 4.
  • Page 214: System Motherboard (4100 & Early 4000) Removal And Replacement

    3. Expose the system bus card cage by removing the two Phillips head screws holding it in place and sliding the cover off the drawer. 4. Remove all CPUs, memory modules, and the PCM from the system motherboard. 7-30 AlphaServer 4000/4100 Service Manual...
  • Page 215 5. Expose the PCI bus card cage. Remove three Phillips head screws holding the cover in place and slide it off the drawer. 6. Remove all the PCI/EISA options. 7. Remove the server control module. 8. Remove the PCI motherboard. 9.
  • Page 216 4. Remove all CPUs, memory modules, and the PCM from the system motherboard. 5. Expose both PCI bus card cages. Remove three Phillips head screws holding each cover in place and slide them off the drawer. 7-32 AlphaServer 4000/4100 Service Manual...
  • Page 217 6. Remove all the PCI/EISA options. 7. Remove the server control module. 8. Remove the PCI motherboards. 9. Remove both bridge modules from the system motherboard. 10. Remove the bracket holding the power cables in place as they pass from the system bus section to the power section of the drawer.
  • Page 218: Pci/Eisa Motherboard (B3050) Removal And Replacement

    These environment variables are used to display the system model number and type, and they compute certain information passed to the operating system. When you replace the PCI 7-34 AlphaServer 4000/4100 Service Manual...
  • Page 219 motherboard, these environment variables are lost and must be restored after the module swap. 1. Shut down the operating system and power down the system. 2. Expose the system drawer. 3. Expose the PCI bus card cage. Remove three Phillips head screws holding the cover in place and slide it off the drawer.
  • Page 220: Pci Motherboard (B3051) Removal And Replacement

    7.16 PCI Motherboard (B3051) Removal and Replacement Figure 7-15 Replacing PCI Motherboard PKW0409A-96 7-36 AlphaServer 4000/4100 Service Manual...
  • Page 221 Removal 1. Shut down the operating system and power down the system. 2. Expose the system drawer. 3. Expose the PCI bus card cage on the right when viewing the drawer from the rear. Remove three Phillips head screws holding the cover in place, and slide it off the drawer.
  • Page 222: Server Control Module Removal And Replacement

    7.17 Server Control Module Removal and Replacement Figure 7-16 Removing Server Control Module SCM Bulkhead Connectors Keyboard COM1 Parallel 12VDC Mouse COM2 Modem PCI Motherboard Connectors Diskette Drive Remote I/O CD-ROM Drive PKW0415-96 7-38 AlphaServer 4000/4100 Service Manual...
  • Page 223 Removal 1. Shut down the operating system and power down the system. 2. Expose the system drawer. 3. Expose the PCI bus card cage. Remove three Phillips head screws holding the cover in place and slide it off the drawer. 4.
  • Page 224: Pci/Eisa Option Removal And Replacement

    Figure 7-17 Removing PCI/EISA Option PKW0418-96 WARNING: To prevent fire, use only modules with current limited outputs. See National Electrical Code NFPA 70 or Safety of Information Technology Equipment, Including Electrical Business Equipment EN 60 950. 7-40 AlphaServer 4000/4100 Service Manual...
  • Page 225 Removal 1. Shut down the operating system and power down the system. 2. Expose the system drawer. 3. Expose the PCI bus card cage. Remove three Phillips head screws holding the cover in place and slide it off the drawer. 4.
  • Page 226: Power Supply Removal And Replacement

    7.19 Power Supply Removal and Replacement Figure 7-18 Removing Power Supply Jumper 17-04199-01 Cable Harness 17-04217-01 or 17-04358-01 PKW0410-96 7-42 AlphaServer 4000/4100 Service Manual...
  • Page 227 Removal 1. Shut down the operating system and power down the system. 2. Expose the system drawer. 3. Remove the cover to the power section of the drawer. Remove the two Phillips head screws holding the cover in place and slide it off the drawer. 4.
  • Page 228: Power Harness (4100 & Early 4000) Removal And Replacement

    Figure 7-19 Removing Power Harness Holding Bracket System Bus Motherboard Fans Power Supplies PCI Bus Motherboard OCP Tray To OCP PCI Bus Motherboard CD-ROM To Floppy To OCP OCP Tray PKW0419-96
  • Page 229 Removal 1. Shut down the operating system and power down the system. 2. Expose the system drawer. 3. Expose the power, system card cage, and PCI/EISA sections of the drawer by removing all covers. Unscrew the Phillips head screws holding each cover in place and slide the covers off the drawer.
  • Page 230 Figure 7-20 Removing Power Harness System Bus Motherboard Holding Bracket PCI Bus2 & 3 Motherboard Fans Power Supplies PCI Bus0 & 1 Motherboard OCP Tray To OCP CD-ROM To Floppy To OCP OCP Tray PKW0419T-97 7-46 AlphaServer 4000/4100 Service Manual...
  • Page 231 Removal 1. Shut down the operating system and power down the system. 2. Expose the system drawer. 3. Expose the power and system card cage sections of the drawer by removing the two covers. Unscrew the two Phillips head screws holding each cover in place and slide the covers off the drawer.
  • Page 232: System Drawer Fan Removal And Replacement

    Unscrew the two Phillips head screws holding each cover on top of the drawer in place and slide them off the drawer. Release the two lever latches holding the PCI card cage cover in place and slide it off. 7-48 AlphaServer 4000/4100 Service Manual...
  • Page 233 4. Release the power supply tray by removing the two Phillips head screws on the side of the drawer. 5. Lift the power supply tray to release it from the sheet metal and slide it out from the drawer. 6. Tilt the tray to allow easier access to the fans. 7.
  • Page 234: Cover Interlock (4100 & Early 4000) Removal And Replacement

    7.23 Cover Interlock (4100 and early 4000) Removal and Replacement Figure 7-22 Removing Cover Interlocks 3 Cover Interlock Switches 70-32016-01 To OCP PKW-0403D-96 7-50 AlphaServer 4000/4100 Service Manual...
  • Page 235 Removal 1. Shut down the operating system and power down the system. 2. Expose the system drawer. 3. Remove all three section covers to expose the interlock switch assembly. 4. Remove the two screws holding the interlock in place. 5. Push the interlock toward the opposite side of the system drawer (be sure not to twist it) and tilt it so that the switches affected by the power and system card cage covers clear the openings in the side of the drawer.
  • Page 236: Cover Interlock (Later 4000) Removal And Replacement

    7.24 Cover Interlock (later 4000) Removal and Replacement Figure 7-23 Removing Cover Interlocks 4 Cover Interlock Switches 70-33002-01 To OCP PKW-0403H-96 7-52 AlphaServer 4000/4100 Service Manual...
  • Page 237 Removal 1. Shut down the operating system and power down the system. 2. Expose the system drawer. 3. Remove all three section covers to expose the interlock switch assemblies. 4. Remove the two screws holding the interlocks in place. 5. Push the interlock toward the opposite side of the system drawer (be sure not to twist it) and tilt it so that the switches affected by the power and system card cage covers clear the openings in the side of the drawer.
  • Page 238: Operator Control Panel Removal And Replacement (Cabinet)

    7.25 Operator Control Panel Removal and Replacement (Cabinet) Figure 7-24 Removing OCP (Cabinet) PKW0417C-96 7-54 AlphaServer 4000/4100 Service Manual...
  • Page 239 Removal 1. Shut down the operating system and power down the system. 2. Expose the system drawer. 3. While you need not remove the tray containing the OCP, you do need to slide it forward to access the OCP retaining screws under the tray. The tray is attached to the power system section cover.
  • Page 240: Operator Control Panel Removal And Replacement (Pedestal)

    7.26 Operator Control Panel Removal and Replacement (Pedestal) Figure 7-25 Removing OCP (Pedestal)
  • Page 241 Removal 1. Shut down the operating system and power down the system. 2. Expose the system drawer. 3. Remove the four Phillips head screws holding the OCP tray to the system drawer. 4. Slide the tray out of the system drawer far enough to disconnect cables attached to the OCP, the floppy, and the CD-ROM drive.
  • Page 242: Floppy Removal And Replacement

    7.27 Floppy Removal and Replacement Figure 7-26 Removing Floppy Drive PKW0417B-96 7-58 AlphaServer 4000/4100 Service Manual...
  • Page 243 Removal 1. Shut down the operating system and power down the system. 2. Expose the system drawer. 3. Remove the four Phillips head screws holding the OCP tray to the system drawer. 4. Slide the tray out of the system drawer and disconnect cables attached to the OCP (unnecessary on a pedestal system), the floppy, and the CD-ROM drive.
  • Page 244: Cd-Rom Removal And Replacement

    7.28 CD-ROM Removal and Replacement Figure 7-27 Removing CD-ROM PKW0417A-96 7-60 AlphaServer 4000/4100 Service Manual...
  • Page 245 Removal 1. Shut down the operating system and power down the system. 2. Expose the system drawer. 3. Remove the four Phillips head screws holding the OCP tray to the system drawer. 4. Slide the tray out of the system drawer and disconnect cables attached to the OCP (unnecessary on a pedestal system), the floppy, and the CD-ROM drive.
  • Page 246: Cabinet Fan Tray Removal And Replacement

    7.29 Cabinet Fan Tray Removal and Replacement Figure 7-28 Removing Cabinet Fan Tray Fan LED Power LED Power To SCM PKW0441A-96 7-62 AlphaServer 4000/4100 Service Manual...
  • Page 247 Removal 1. Shut down the operating system and power down the system. Unplug the AC power cable from the cabinet tray power supply. 2. If present, unplug any power cables going to the server control modules at the back of system drawers. 3.
  • Page 248: Cabinet Fan Tray Power Supply Removal And Replacement

    7.30 Cabinet Fan Tray Power Supply Removal and Replacement Figure 7-29 Removing Cabinet Fan Tray Power Supply Ground To fan fail detect board Offsets To fans Power supply Power PKW0441B-96 supply cover 7-64 AlphaServer 4000/4100 Service Manual...
  • Page 249 Removal 1. Remove the cabinet fan tray. 2. Disconnect the power harness from the fan fail detect module and each fan. 3. Remove the power supply cover. It is held in place by two screws that go through the AC bulkhead spot welded to the tray weldment. 4.
  • Page 250: Cabinet Fan Tray Fan Removal And Replacement

    7.31 Cabinet Fan Tray Fan Removal and Replacement Figure 7-30 Removing Cabinet Fan Tray Fan PKW0441F-96 7-66 AlphaServer 4000/4100 Service Manual...
  • Page 251 Removal 1. Remove the cabinet fan tray. 2. Disconnect the power harness from the fan you wish to replace. 3. Remove the fan finger guard. 4. Remove the two remaining screws holding the fan to the tray and remove the fan.
  • Page 252: Cabinet Fan Tray Fan Fail Detect Module Removal And Replacement

    7.32 Cabinet Fan Tray Fan Fail Detect Module Removal and Replacement Figure 7-31 Removing Fan Tray Fan Fail Detect Module PKW0441D-96 7-68 AlphaServer 4000/4100 Service Manual...
  • Page 253 Removal 1. Remove the cabinet fan tray. 2. Disconnect the power harness from the fan fail detect module. 3. Remove the fan fail detect module. In early systems, the module is held in place by three screws that go through the weldment, through three standoffs, through the module to nuts.
  • Page 254: Storageworks Shelf Removal And Replacement

    7.33 StorageWorks Shelf Removal and Replacement Figure 7-32 Removing StorageWorks Shelf Cabinet StorageWorks Shelf Mounting Rails StorageWorks Shelf (H910A-EC) Mounting Rails (H910A-EB) Pedestal PKW0451-96 7-70 AlphaServer 4000/4100 Service Manual...
  • Page 255 Removal 1. Shut down the operating system and power down the system. 2. Remove the power cord and signal cord(s) from the StorageWorks shelf. 3. Remove the two retaining brackets holding the shelf in the mounting rail by removing the Phillips head screws holding the brackets in place. 4.
  • Page 257: Running Utilities

    Appendix A Running Utilities This appendix provides a brief overview of how to load and run utilities. The following topics are covered: • Running Utilities from a Graphics Monitor • Running Utilities from a Serial Terminal • Running ECU • Running RAID Standalone Configuration Utility •...
  • Page 258: Running Utilities From A Graphics Monitor

    Figure A-1 Running a Utility from a Graphics Monitor AlphaBIOS Setup F1=Help Display System Configuration... Upgrade AlphaBIOS Hard Disk Setup... CMOS Setup... Install Windows NT Utilities Run ECU from floppy... About AlphaBIOS... OS Selection Setup... Run Maintenance Program... PK-0729-96 AlphaServer 4000/4100 Service Manual...
  • Page 259: Running Utilities From A Serial Terminal

    Running Utilities from a Serial Terminal Utilities are run from a serial terminal in the same way as from a graphics monitor. The menus are the same, but some keys are different. Table A-1 AlphaBIOS Option Key Mapping AlphaBIOS Key VTxxx Key Ctrl/A Ctrl/B...
  • Page 260: Running Ecu

    View or edit details STEP 4: Examine required details STEP 5: Save and exit NOTE: Step 1 of the ECU provides online help. It is recommended that you select this step and become familiar with the utility before proceeding. AlphaServer 4000/4100 Service Manual...
  • Page 261: Running Raid Standalone Configuration Utility

    Running RAID Standalone Configuration Utility The RAID Standalone Configuration Utility is used to set up RAID disk drives and logical units. The Standalone Utility is run from the AlphaBIOS Utility menu. The AlphaServer 4100 system supports the KZPSC-xx PCI RAID controller (SWXCR).
  • Page 262: Updating Firmware With Lfu

    Figure A-2 Starting LFU from the AlphaBIOS Console AlphaBIOS Setup Display System Configuration... Upgrade AlphaBIOS Hard Disk Setup CMOS Setup... Install Windows NT Utilities About AlphaBIOS... Press ENTER to upgrade your AlphaBIOS from floppy or CD-ROM. ESC=Exit PK-0726A-96 AlphaServer 4000/4100 Service Manual...
  • Page 263 Use the Loadable Firmware Update (LFU) utility to update system firmware. You can start LFU from either the SRM console or the AlphaBIOS console. • From the SRM console, start LFU by issuing the lfu command. • From the AlphaBIOS console, select Upgrade AlphaBIOS from the AlphaBIOS Setup screen (see Figure A-2).
  • Page 264: Updating Firmware From The Internal Cd-Rom

    Replaces current firmware with loadable data image. Verify Compares loadable and hardware images. ? or Help Scrolls this function table. ----------------------------------------------------------------- UPD> list Device Current Revision Filename Update Revision AlphaBIOS V5.12-2 arcrom V6.40-1 srmflash V1.0-9 srmrom V2.0-3 Continued next page AlphaServer 4000/4100 Service Manual...
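The `list` display pairs each device's current revision with the revision held in the update file. The following is a minimal Python sketch, not part of LFU, of how a technician's script might flag devices whose revisions differ. The tuple layout and the name `needs_update` are assumptions made for illustration; the revision values are the ones shown in the display above.

```python
# Rows transcribed from the `UPD> list` display. The tuple layout
# (device, current revision, filename, update revision) is an assumption.
ROWS = [
    ("AlphaBIOS", "V5.12-2", "arcrom", "V6.40-1"),
    ("srmflash", "V1.0-9", "srmrom", "V2.0-3"),
]


def needs_update(rows):
    """Return the devices whose current revision differs from the update revision."""
    return [device for device, current, _filename, update in rows
            if current != update]


print(needs_update(ROWS))
```

The comparison is a plain string inequality; it does not try to decide which revision is newer.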
  • Page 265 Select the device from which firmware will be loaded. The choices are the internal CD-ROM, the internal floppy disk, or a network device. In this example, the internal CD-ROM is selected. Select the file that has the firmware update, or press Enter to select the default file.
  • Page 266 Confirm update on: AlphaBIOS [Y/(N)] y DO NOT ABORT! AlphaBIOS Updating to V6.40-1... Verifying V6.40-1... PASSED. Confirm update on: srmflash [Y/(N)] y DO NOT ABORT! srmflash Updating to V2.0-3... Verifying V2.0-3... PASSED. UPD> exit A-10 AlphaServer 4000/4100 Service Manual...
  • Page 267 The update command updates the device specified or all devices. In this example, the wildcard indicates that all devices supported by the selected update file will be updated. For each device, you are asked to confirm that you want to update the firmware.
  • Page 268: Updating Firmware From The Internal Floppy Disk - Creating The Diskettes

    1. Download the update files from the Internet (see the Preface of this book). 2. On a PC, copy files onto two FAT-formatted diskettes. From an OpenVMS system, copy files onto two ODS2-formatted diskettes as shown in Example A-3. A-12 AlphaServer 4000/4100 Service Manual...
  • Page 269 Example A-3 Creating Update Diskettes on an OpenVMS System Console Update Diskette
    $ inquire ignore "Insert blank HD floppy in DVA0, then continue"
    $ set verify
    $ set proc/priv=all
    $ init /density=hd/index=begin dva0: rhods2cp
    $ mount dva0: rhods2cp
    $ create /directory dva0:[as4x00]
    $ copy as4x00fw.sys dva0:[as4x00]as4x00fw.sys
    $ copy as4x00cp.sys dva0:[as4x00]as4x00cp.sys
    $ copy rhreadme.sys dva0:[as4x00]rhreadme.sys...
  • Page 270: Updating Firmware From The Internal Floppy Disk - Performing The Update

    . (The function table displays, followed by the UPD> prompt, as shown in Example A-2.) UPD> list Device Current Revision Filename Update Revision AlphaBIOS V5.12-3 arcrom Missing file pfi0 2.46 dfpaa_fw 2.52 srmflash T3.2-21 srmrom Missing file cipca_fw A214 kzpsa_fw Continued on next page A-14 AlphaServer 4000/4100 Service Manual...
  • Page 271 Select the device from which firmware will be loaded. The choices are the internal CD-ROM, the internal floppy disk, or a network device. In this example, the internal floppy disk is selected. Select the file that has the firmware update, or press Enter to select the default file.
  • Page 272 Please enter the name of the options firmware files list, or Press <return> to use the default filename [AS4X00IO,(AS4X00CP)]: . (The function table displays, followed by the UPD> prompt. . Console firmware can now be updated.) UPD> exit A-16 AlphaServer 4000/4100 Service Manual...
  • Page 273 The update command updates the device specified or all devices. For each device, you are asked to confirm that you want to update the firmware. The default is no. Once the update begins, do not abort the operation. Doing so will corrupt the firmware on the module. The lfu command restarts the utility so that console firmware can be updated.
  • Page 274 . [The function table displays, followed by the UPD> prompt, as . shown in Example A-2.] UPD> list Device Current Revision Filename Update Revision AlphaBIOS V5.12-2 arcrom V6.40-1 kzpsa0 kzpsa_fw kzpsa1 kzpsa_fw srmflash V1.0-9 srmrom V2.0-3 cipca_fw A214 dfpaa_fw 2.46 Continued on next page A-18 AlphaServer 4000/4100 Service Manual...
  • Page 275 Before starting LFU, download the update files from the Internet (see Preface). You will need the files with the extension .SYS. Copy these files to your local MOP server’s MOP load area. Select the device from which firmware will be loaded. The choices are the internal CD-ROM, the internal floppy disk, or a network device.
  • Page 276: Updating Firmware From A Network Device

    Updating to V6.40-1... Verifying V6.40-1... PASSED. DO NOT ABORT! kzpsa0 Updating to A11 ... Verifying A11... PASSED. DO NOT ABORT! kzpsa1 Updating to A11 ... Verifying A11... PASSED. DO NOT ABORT! srmflash Updating to V2.0-3... Verifying V2.0-3... PASSED. UPD> exit A-20 AlphaServer 4000/4100 Service Manual...
  • Page 277 The update command updates the device specified or all devices. In this example, the wildcard indicates that all devices supported by the selected update file will be updated. Typically, LFU requests confirmation before updating each console’s or device’s firmware. The -all option removes the update confirmation requests.
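The confirmation behavior described for the update command can be modeled in a few lines. This Python sketch is illustrative only: the function name `confirmed` and the `update_all` flag are hypothetical stand-ins for the `[Y/(N)]` prompt and the -all option, and treating only "y" as yes is an assumption.

```python
def confirmed(answer, update_all=False):
    """Model the 'Confirm update on: <device> [Y/(N)]' prompt.

    An empty answer takes the default, which is no. `update_all` stands in
    for the -all option, which removes the confirmation request.
    """
    if update_all:
        return True
    # Assumption: only an explicit "y" (any case) counts as yes.
    return answer.strip().lower() == "y"


for reply in ("y", "", "n"):
    print(repr(reply), confirmed(reply))
```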
  • Page 278: Lfu Commands

    Lists release notes for the LFU program. update Writes new firmware to the module. verify Reads the firmware from the module into memory and compares it with the update firmware. These commands are described in the following pages. A-22 AlphaServer 4000/4100 Service Manual...
  • Page 279 display The display command shows the system physical configuration. Display is equivalent to issuing the SRM console command show configuration. Because it shows the slot for each module, display can help you identify the location of a device. exit The exit command terminates the LFU program, causes system initialization and testing, and returns the system to the console from which LFU was called.
  • Page 280 The verify command reads the firmware from the module into memory and compares it with the update firmware. If a module already verified successfully when you updated it, but later failed tests, you can use verify to tell whether the firmware has become corrupted. A-24 AlphaServer 4000/4100 Service Manual...
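The verify step is a read-back comparison. A minimal Python sketch of that idea follows, assuming both images are already available as byte strings; the function name `first_mismatch` is hypothetical and this is not how LFU itself is implemented.

```python
def first_mismatch(module_image: bytes, update_image: bytes):
    """Compare the image read from the module with the update image.

    Returns None when the images are identical, otherwise the offset of
    the first byte that differs (or the shorter length if one image is a
    prefix of the other).
    """
    if module_image == update_image:
        return None
    for offset, (a, b) in enumerate(zip(module_image, update_image)):
        if a != b:
            return offset
    return min(len(module_image), len(update_image))


print(first_mismatch(b"\x01\x02\x03", b"\x01\x02\x03"))
```

A non-None result here corresponds to the case in which the firmware on the module no longer matches the update file.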
  • Page 281: Updating Firmware From Alphabios

    A.6 Updating Firmware from AlphaBIOS Insert the CD-ROM or diskette with the updated firmware and select Upgrade AlphaBIOS from the main AlphaBIOS Setup screen. Use the Loadable Firmware Update (LFU) utility to perform the update. The LFU exit command causes a system reset. Figure A-3 AlphaBIOS Setup Screen AlphaBIOS Setup Display System Configuration...
  • Page 282: Upgrading Alphabios

    4. When the upgrade is complete, issue the LFU exit command. The system is reset and you are returned to AlphaBIOS. If you press the Reset button instead of issuing the LFU exit command, the system is reset and you are returned to LFU. A-26 AlphaServer 4000/4100 Service Manual...
  • Page 283: Srm Console Commands And Environment Variables

    The test command is described in Chapter 3 of this document. For complete reference information on the other SRM commands and environment variables, see the AlphaServer 4000/4100 System Drawer User’s Guide. NOTE: It is recommended that you keep a list of the environment variable settings for systems that you service, because you will need to restore certain environment variable settings after swapping modules.
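One way to keep the recommended list is to record each system's settings in a small table and generate the SRM `set` lines to type back in after a module swap. The Python sketch below assumes the `set envar value` form summarized in Table B-1; the function name and the sample values are hypothetical.

```python
def restore_commands(saved):
    """Build SRM `set <envar> <value>` lines from a saved {name: value} record."""
    return [f"set {name} {value}" for name, value in sorted(saved.items())]


# Sample values are placeholders, not recommended settings.
saved_settings = {"os_type": "unix", "auto_action": "halt"}

for line in restore_commands(saved_settings):
    print(line)
```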
  • Page 284: Summary Of Srm Console Commands

    Info 5 reads the PAL-built logout area that contains the data used by the operating system to create the error entry. Info 8 reads the IOD and IOD1 registers. initialize Resets the system. lfu Runs the Loadable Firmware Update Utility. Continued on next page
  • Page 285 Table B-1 Summary of SRM Console Commands (Continued) Command Function Displays information about the specified console command. more Displays a file one screen at a time. prcache Initializes and displays status of the PCI NVRAM. set envar Sets or modifies the value of an environment variable. set host Connects to an MSCP DUP server on a DSSI device.
  • Page 286: Environment Variable Summary

    Specifies network protocols for booting over the Ethernet controller. kbd_hardware_ Specifies the default console keyboard type. type kzpsa*_host_id Specifies the default value for the KZPSA host SCSI bus node ID. language Specifies the console keyboard layout. Continued on next page AlphaServer 4000/4100 Service Manual...
  • Page 287 Table B-2 Environment Variable Summary (Continued) Environment Variable Function memory_test Specifies the extent to which memory will be tested. For DIGITAL UNIX systems only. ocp_text Overrides the default OCP display text with specified text. os_type Specifies the operating system and sets the appropriate console interface.
  • Page 288: Environment Variables Worksheet

    Table B-3 Environment Variables Worksheet Environment Variable System Name System Name System Name auto_action bootdef_dev boot_osflags com2_baud console cpu_enabled ew*0_mode ew*0_protocols kbd_hardware_ type kzpsa*_host_id language memory_test ocp_text os_type pci_parity pk*0_fast pk*0_host_id pk*0_soft_term AlphaServer 4000/4100 Service Manual...
  • Page 289 Table B-3 Environment Variables Worksheet (Continued) Environment Variable System Name System Name System Name pk*0_soft_term sys_model_num sys_serial_num sys_type tga_sync_green tt_allow_login SRM Console Commands and Environment Variables...
  • Page 291: Appendix C Operating The System Remotely

    Appendix C Operating the System Remotely This appendix describes how to use the remote console monitor (RCM) to monitor and control the system remotely. C.1 RCM Console Overview The remote console monitor (RCM) is used to monitor and control the system remotely.
  • Page 292: Modem Usage

    To use the RCM to monitor a system remotely, first make the connections to the server control module, as shown below. Then configure the modem port for dial-in. Figure C-1 RCM Connections Console Terminal Phone Jack External Power Supply Modem PK-0651-96
  • Page 293 Modem Selection The RCM requires a Hayes-compatible modem. The controls that the RCM sends to the modem have been selected to be acceptable to a wide selection of modems. The modems that have been tested and qualified include: Motorola LifeStyle Series 28.8; AT&T DATAPORT 14.4/FAX; Zoom Model 360. The U.S.
  • Page 294 This process can take a minute or more, and the local terminal will be locked out until the auto hangup process completes. If the modem link is idle for more than 20 minutes, the RCM initiates an auto hangup. C-4 AlphaServer 4000/4100 Service Manual...
  • Page 295: Entering And Leaving Command Mode

    C.1.2 Entering and Leaving Command Mode Use the default escape sequence to enter RCM command mode for the first time. You can enter RCM command mode from the SRM console level, the operating system level, or an application. The RCM quit command reconnects the terminal to the system console port.
  • Page 296: Rcm Commands

    Exits console mode and returns to system console port reset Resets the server setesc Changes the escape sequence for entering command mode setpass Changes the modem access password status Displays server’s status and sensors C-6 AlphaServer 4000/4100 Service Manual...
  • Page 297 Command Conventions • The commands are not case sensitive. • A command must be entered in full. • If a command is entered that is not valid, the command fails with the message: *** ERROR - unknown command *** Enter a valid command. The RCM commands are described on the following pages.
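The command conventions above amount to a case-insensitive, exact-match lookup. The following Python sketch shows that rule under stated assumptions: the `KNOWN` set lists only commands named in this appendix and is not the complete RCM command set, and `check_command` is a hypothetical name.

```python
# Commands named in this appendix; not the complete RCM command set.
KNOWN = {"disable", "enable", "poweron", "quit", "reset",
         "setesc", "setpass", "status"}

UNKNOWN_MSG = "*** ERROR - unknown command ***"


def check_command(line):
    """Return the command name if valid, else the unknown-command message.

    Matching ignores case, and abbreviations are rejected because a
    command must be entered in full.
    """
    words = line.split()
    name = words[0].lower() if words else ""
    return name if name in KNOWN else UNKNOWN_MSG


print(check_command("STATUS"))
print(check_command("stat"))
```

How the RCM treats an empty line is not stated in the text; this sketch reports it as unknown.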
  • Page 298 A modem dial-out string must be entered with the system console. • Remote access to the RCM modem port must be enabled with the enable command. If the alert_enable command is entered when remote access is disabled, the following message is returned: *** error *** C-8 AlphaServer 4000/4100 Service Manual...
  • Page 299 C.1.3.4 disable The disable command disables remote access to the RCM modem port. RCM>disable The module’s remote access default state is DISABLED. The modem enable state is nonvolatile. When the modem is disabled, it remains disabled until the enable command is issued. If a modem connection is in progress, entering the disable command terminates it.
  • Page 300 The external power to the RCM must be connected in order to power off the system from the RCM firmware console. If the external power supply is not connected, the command will not power the system down, and displays the message: *** ERROR *** C-10 AlphaServer 4000/4100 Service Manual...
  • Page 301 C.1.3.10 poweron The poweron command requests the RCM module to power on the system. For the system power to come on, the following conditions must be met: • AC power must be present at the power supply inputs. • The DC On/Off button must be in the “on” position. •...
  • Page 302 Although the module factory defaults can be restored if the user has forgotten the escape sequence, this involves accessing the server control module and moving a jumper. The following sample escape sequence consists of five iterations of the Ctrl key and the letter “o”. RCM>setesc ^o^o^o^o^o RCM> C-12 AlphaServer 4000/4100 Service Manual...
  • Page 303 If the escape sequence entered exceeds 15 characters, the command fails with the message: *** ERROR *** When changing the default escape sequence, avoid using special characters that are used by the system’s terminal emulator or applications. Control characters are not echoed when entering the escape sequence. To verify the complete escape sequence, use the status command.
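    The length rule above can be illustrated with a short check. This is a sketch under stated assumptions, not firmware behavior: the function name is hypothetical, and it assumes each control character (such as Ctrl-O, ASCII 0x0F) counts as one character toward the 15-character limit.

```python
MAX_ESCAPE_LEN = 15  # from the text: longer sequences are rejected


def escape_sequence_accepted(sequence: str) -> bool:
    """Return True if the sequence does not exceed the documented limit."""
    # Assumption: every character, control or printable, counts as one.
    return len(sequence) <= MAX_ESCAPE_LEN


# The sample sequence from the manual: five iterations of Ctrl-O.
sample = "\x0f" * 5
```

    With these assumptions the sample sequence is accepted, and any 16-character sequence would be rejected with the *** ERROR *** message.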
  • Page 304: RCM Status Command Fields

    Current system temperature in degrees Celsius.
    RCM Power Control: Current state of RCM system power control. (ON/OFF)
    External Power: Current state of power from external power supply to server control module. (ON/OFF)
    Server Power: Current state of system power. (ON/OFF)
  • Page 305: Dial-Out Alerts

    C.1.4 Dial-Out Alerts The RCM can be configured to automatically dial out through the modem (usually to a paging service) when it detects a power failure within the system. When a dial-out alert is triggered, the RCM initializes the modem for dial-out, sends the dial-out string, hangs up the modem, and reconfigures the modem for dial-in.
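    The four-step alert sequence described above (initialize for dial-out, send the dial-out string, hang up, reconfigure for dial-in) can be sketched as follows. The RecordingModem class and its method names are hypothetical stand-ins used only to show the ordering; they are not an RCM or modem API, and the dial-out string is a placeholder.

```python
class RecordingModem:
    """Hypothetical stand-in for a modem that records each action in order."""

    def __init__(self):
        self.log = []

    def init_for_dialout(self):
        self.log.append("init_for_dialout")

    def send(self, text):
        self.log.append("send:" + text)

    def hang_up(self):
        self.log.append("hang_up")

    def configure_for_dialin(self):
        self.log.append("configure_for_dialin")


def dial_out_alert(modem, dialout_string):
    """Run the alert steps in the order the manual describes."""
    modem.init_for_dialout()       # 1. initialize the modem for dial-out
    modem.send(dialout_string)     # 2. send the dial-out string
    modem.hang_up()                # 3. hang up the modem
    modem.configure_for_dialin()   # 4. restore dial-in configuration


modem = RecordingModem()
dial_out_alert(modem, "PLACEHOLDER_DIALOUT_STRING")
```

    The last step matters: because the modem is reconfigured for dial-in after each alert, remote access remains available once the alert has been sent.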
  • Page 306 2. Enter the RCM firmware console and enter the enable command to enable remote access dial-in. The RCM firmware status command should display “Remote Access: ENABLE.” (See 3. Enter the RCM firmware alert_ena command to enable outgoing alerts. (See C-16 AlphaServer 4000/4100 Service Manual...
  • Page 307 Composing a Modem Dial-Out String The modem dial-out string emulates a user dialing an automatic paging service. Typically, the user dials the pager phone number, waits for a tone, and then enters a series of numbers. The RCM dial-out string (Example C-4) has the following requirements: •...
  • Page 308: Resetting the RCM to Factory Defaults

    7. Power up the system to the SRM console prompt and type the default escape sequence to enter RCM command mode: ^]^]RCM
    8. Configure the module as desired. You must reset the password and modem enable states in order to enable remote access.
  • Page 309: Troubleshooting Guide

    C.1.6 Troubleshooting Guide Table C-3 lists a number of possible causes and suggested solutions for symptoms you might see. Table C-3 RCM Troubleshooting. Symptom: The local terminal will not communi-... Possible Cause: System and terminal baud rate set incorrectly. Suggested Solution: Set the system and...
  • Page 310 “new line” is not displayed when the return. selected. user enters a carriage return by itself.
  • Page 311 Table C-3 RCM Troubleshooting (Continued). Symptom: Cannot enable modem or modem will not answer. Possible Cause: The modem is not configured correctly to work with the RCM. Suggested Solution: Modify the modem initialization and/or answer string.
  • Page 312: Modem Dialog Details

    Guard-band = 1 second (S12=50)
    Fixed modem-to-RCM baud rate
    Connect at highest possible reliability and speed
    The RCM expects to receive a “0<cr>” (OK) in response to the initialization string. If it does not, the enable command will fail.
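    The response check described above can be sketched in a few lines. This is an assumption-laden illustration rather than the RCM's actual logic: the function name is hypothetical, and it assumes the comparison is an exact match on the numeric result code "0" followed by a carriage return.

```python
EXPECTED_OK = b"0\r"  # "0<cr>": numeric result code for OK


def init_string_accepted(response: bytes) -> bool:
    """Return True if the modem answered the initialization string with OK."""
    # Assumption: exact match; any other reply makes the enable command fail.
    return response == EXPECTED_OK
```

    A modem left in verbose-response mode would answer with text such as "OK" instead of "0", which under this sketch counts as a failure; that is one reason the initialization string may need modifying for some modems.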
  • Page 313 This default initialization string works on a wide variety of modems. If your modem does not configure itself to these parameters, the initialization string will need to be modified. See the topic in this section entitled Modifying Initialization and Answer Strings.
  • Page 314: RCM/Modem Interchange Summary

    SRM set and show commands are provided to enable the user to define and examine the initialization and answer strings.
    To replace the initialization string:
    P00>>> set rcm_init "new_init_string"
    To replace the answer string:
    P00>>> set rcm_answer "new_answer_string"
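    As an illustration of the set command form shown above, the following sketch composes the command text to type at the P00>>> prompt. The helper name srm_set_command is hypothetical, and rejecting values that contain a double quote is an assumption made to keep the quoting unambiguous; the manual does not state how the console handles embedded quotes.

```python
# Variables shown with the set command in this section.
SETTABLE = ("rcm_init", "rcm_answer")


def srm_set_command(variable: str, value: str) -> str:
    """Build the console command text for replacing an RCM modem string."""
    if variable not in SETTABLE:
        raise ValueError("unsupported variable: " + variable)
    if '"' in value:
        # Assumption: embedded quotes are refused rather than escaped.
        raise ValueError("value must not contain a double quote")
    return 'set {} "{}"'.format(variable, value)
```

    For example, srm_set_command("rcm_answer", "ATXA") produces the text set rcm_answer "ATXA", matching the form of the examples above.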
  • Page 315 To display all the RCM user settable strings: P00>>> show rcm* rcm_answer ATXA rcm_dialout rcm_init AT&F0EVS0=0S12=50 P00>>> Initialization and Answer String Substitutions The RCM default initialization and answer strings are as follows: Initialization String: “AT&F0EVS0=0S12=50” Answer String: “ATXA” The following modem requires a modified answer string. Initialization String Answer String USRobotics Sportster...
  • Page 317: Index

    Index B3050-AA PCI motherboard, 1-30, 7- ? command, RCM, C-10 B3051-AA PCI motherboard, 7-3 BA30A system drawer, 1-2 BA30B system drawer, 1-6 BA30C system drawer, 1-4 4000 system drawer, 1-4, 1-6 B-cache, 2-21, 2-23 4100 system drawer, 1-2 Bridge module (B3040-AA) removal and replacement, 7-26 Bridge module (B3040-AB) alert_clr command, RCM, C-8...
  • Page 318 CD-ROM removal and replacement, 7-60 ECC syndrome bits, 5-53 COM1 port, 2-19 ECU, running, A-4 Command codes, 5-54 EL_ADDR Register, 6-6 Command summary (SRM), B-2 EL_STAT Register, 6-2 Components enable command, RCM, C-9 housed in system drawer, 1-2, 1- Environment variables 4, 1-6 SRM console, B-4 Console...
  • Page 319: Info 3 Command

    updating from network device, A- LCD, 2-2 updating, AlphaBIOS selection, LEDs troubleshooting with, 3-2 updating, SRM command, A-6 LEDs, fan and power in cabinet, 3-5 Floppy removal and replacement, 7-58 exit command, A-25 FRU list, 7-2 starting, A-6, A-8 4000 power system, 7-10 starting the utility, A-6 4100 power system, 7-8 typical update procedure, A-8...
  • Page 320: Memory Tests

    MCHK 620 correctable error, 5-44 MCHK 630 correctable CPU error, 5- Page table entry invalid error, 5-51 PALcode, 2-23 MCHK 660 IOD detected failure, 5- PALcode, described, 5-56 27, 5-32 PCI Error Status Register 1, 6-14 MCHK 670 CPU and IOD detected PCI I/O subsystem, 1-30 failure, 5-16 PCI master abort, 5-51...
  • Page 321: Server Control Module

    voltages, 4-3 setesc, C-12 Power system components, 7-4 setpass, C-13 poweroff command, RCM, C-10 status, C-14 poweron command, RCM, C-11 rcm_dialout command, C-15 Power-up readme command (LFU), A-24, A-26 SROM and XSROM messages Redundant power, 1-37 during, 2-19 Registers, 6-1 Power-up display, 2-20 Remote console monitor.
  • Page 322 System bus ECC error, 5-47 Test pci command, 3-17 System bus nonexistent address error, Troubleshooting failures at power-up, 3-7 5-48 System bus to PCI bus bridge module, IOD detected errors, 5-46 1-17, 1-28 power problems, 3-6 System bus to PCI/EISA bus bridge using error logs, 5-2 module, 1-17 System consoles, 1-14...

This manual is also suitable for:

AlphaServer 4100, BA30A, BA30C, BA30B
