Digital Semiconductor
Alpha 21164PC Microprocessor
Hardware Reference Manual
Order Number: EC–R2W0A–TE
Revision/Update Information: This is a preliminary document.
Preliminary
Digital Equipment Corporation
Maynard, Massachusetts
http://www.digital.com/semiconductor

Summary of Contents for Digital Equipment Alpha 21164PC

  • Page 2 Digital Equipment Corporation makes no representations that the use of its products in the manner described in this publication will not infringe on existing or future patent rights, nor do the descriptions contained in this publication imply the granting of licenses to make, use, or sell equipment or software in accordance with the description.
  • Page 3: Table Of Contents

    Contents: Preface; Introduction; The Architecture; 1.1.1 Addressing.
  • Page 4 Pipeline Organization; 2.2.1 Pipeline Stages and Instruction Issue.
  • Page 5 Clocks; 4.2.1 CPU Clock.
  • Page 6 4.11 21164PC/System Race Conditions; 4.11.1 Rules for 21164PC and System Use of External Interface.
  • Page 7 5.1.24 Interrupt Summary (ISR) Register (100); 5.1.25 Serial Line Transmit (SL_XMIT) Register (116); 5.1.26 Serial Line Receive (SL_RCV) Register (117).
  • Page 8 Invoking PALcode; PALcode Entry Points.
  • Page 9 Electrical Data; Electrical Characteristics; DC Characteristics.
  • Page 10 Testability and Diagnostics; 12.1 Test Port Pins; 12.2 Test Interface.
  • Page 11 Figures: 2–1 21164PC Microprocessor Block/Pipe Flow Diagram; 2–2 Instruction Pipeline Stages; 2–3 Floating-Point Control Register (FPCR) Format.
  • Page 12 5–9 Virtual Page Table Base (IVPTBR) Register (NT_Mode=1); 5–10 Icache Parity Error Status (ICPERR_STAT) Register; 5–11 Exception Address (EXC_ADDR) Register.
  • Page 13 6–4 HW_MFPR and HW_MTPR Instruction Format; 9–1 osc_clk_in_h,l Input Network and Terminations; 9–2 Impedance vs Clock Input Frequency.
  • Page 14 Tables: 2–1 Effect of Branching Instructions on the Branch-Prediction Stack; 2–2 Pipeline Examples-All Cases; 2–3 Pipeline Examples-Integer Add.
  • Page 15 5–22 Dcache Test Tag Control Register Fields; 5–23 Dcache Test Tag Register Fields.
  • Page 16 A–4 Opcodes Reserved for PALcode; A–5 IEEE Floating-Point Instruction Function Codes.
  • Page 17 Preface This manual provides information about the architecture, internal design, external interface, and specifications of the Digital Semiconductor Alpha 21164PC microprocessor (referred to as the 21164PC) and its associated software. Audience This reference manual is for system designers and programmers who use the 21164PC.
  • Page 18 Chapter 7, Initialization and Configuration, describes the initialization and configuration sequence. Chapter 8, Error Detection and Error Handling, describes error detection and error handling. Chapter 9, Electrical Data, provides electrical data and describes signal integrity issues. Chapter 10, Thermal Management, provides information about thermal management...
  • Page 19 Conventions This section defines product-specific terminology, abbreviations, and other conventions used throughout this manual. Abbreviations • Binary Multiples The abbreviations K, M, and G (kilo, mega, and giga) represent binary multiples and have the following values: K = 2^10 (1024), M = 2^20 (1,048,576), G = 2^30 (1,073,741,824). For example: = 2 ×...
  • Page 20 RC — Read To Clear A register field specified as RC is written by hardware and remains unchanged until read. The value may be read by software, at which point, hardware may write a new value into the field. RES — Reserved Bits and fields specified as RES are reserved by Digital Semiconductor and should not be used;...
  • Page 21 Bit Notation Multiple-bit fields can include contiguous and noncontiguous bits contained in angle brackets (<>). Multiple contiguous bits are indicated by a pair of numbers separated by a colon (:). For example, <9:7,5,2:0> specifies bits 9,8,7,5,2,1, and 0. Similarly, single bits are frequently indicated with angle brackets. For example, <27> specifies bit 27.
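The bit notation described on this page can be expanded mechanically. The following is a minimal Python sketch, not part of the manual; the function names `bits_in_field` and `field_mask` are hypothetical and chosen only for illustration.

```python
def bits_in_field(spec):
    """Expand angle-bracket bit notation, e.g. '<9:7,5,2:0>', into bit numbers."""
    inner = spec.strip().strip("<>")
    bits = []
    for part in inner.split(","):
        if ":" in part:
            # A colon-separated pair denotes a contiguous range, high bit first.
            high, low = (int(x) for x in part.split(":"))
            bits.extend(range(high, low - 1, -1))
        else:
            bits.append(int(part))
    return bits


def field_mask(spec):
    """Build an integer mask with a 1 in every bit position named by spec."""
    mask = 0
    for bit in bits_in_field(spec):
        mask |= 1 << bit
    return mask


print(bits_in_field("<9:7,5,2:0>"))
print(hex(field_mask("<9:7,5,2:0>")))
```

A mask built this way can be ANDed with a register value to isolate a noncontiguous field.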
  • Page 22 Security Holes Security holes exist when unprivileged software (that is, software that is running outside of kernel mode) can: • Affect the operation of another process without authorization from the operating system. • Amplify its privilege without authorization from the operating system. •...
  • Page 23 An UNPREDICTABLE result may acquire an arbitrary value subject to a few constraints. Such a result may be an arbitrary function of the input operands or of any state information that is accessible to the process in its current access mode. UNPREDICTABLE results may be unchanged from their previous values.
  • Page 25: Introduction

    Equipment Corporation’s RISC (reduced instruction set computing) architecture designed for high performance. The chapter then summarizes the specific features of the Digital Semiconductor Alpha 21164PC microprocessor (hereafter called the 21164PC) that implements the Alpha architecture. Appendix A provides a list of Alpha instructions.
  • Page 26: Addressing

    The Architecture The 21164PC uses a set of subroutines, called privileged architecture library code (PALcode), that is specific to a particular Alpha operating system implementation and hardware platform. These subroutines provide operating system primitives for context switching, interrupts, exceptions, and memory management. These subroutines can be invoked by hardware or CALL_PAL instructions.
  • Page 27: Integer Data Types

    The Architecture 1.1.2 Integer Data Types Alpha architecture supports four integer data types. Data Type Description Byte A byte is eight contiguous bits that start at an addressable byte boundary. A byte is an 8-bit value. A byte is supported in Alpha architecture by the EXTRACT, INSERT, LDBU, MASK, SEXTB, STB, ZAP, PACK, UNPACK, MIN, MAX, and PERR instructions.
  • Page 28: 21164Pc Microprocessor Features

    21164PC Microprocessor Features • VAX floating-point formats – F_floating – G_floating – D_floating (limited support) 1.2 21164PC Microprocessor Features The 21164PC is a superscalar pipelined processor manufactured using 0.35-µm CMOS technology. It is packaged in a 413-pin IPGA carrier and has removable application-specific heat sinks.
  • Page 29 21164PC Microprocessor Features • An onchip, dual-read-ported, 8KB data cache. • An onchip write buffer with six 32-byte entries. • A 128-bit data bus with onchip parity and offchip longword parity. • Support for an external second-level cache. The size and access time of the external second-level cache is programmable.
  • Page 31: Internal Architecture

    Internal Architecture This chapter provides both an overview of the 21164PC microarchitecture and a system designer’s view of the 21164PC implementation of the Alpha architecture. The combination of the 21164PC microarchitecture and privileged architecture library code (PALcode) defines the chip’s implementation of the Alpha architecture. If a certain piece of hardware seems to be “architecturally incomplete,”...
  • Page 32: 21164Pc Microarchitecture

    21164PC Microarchitecture 2.1 21164PC Microarchitecture The 21164PC microprocessor is a high-performance implementation of Digital Equipment Corporation’s Alpha architecture. Figure 2–1 is a block diagram of the 21164PC that shows the major functional blocks relative to pipeline stage flow. The following paragraphs provide an overview of the chip’s architecture and major functional units.
  • Page 33: Instruction Fetch/Decode Unit And Branch Unit

    21164PC Microarchitecture – Instruction translation buffer – Branch prediction – Instruction slotting/issue – Interrupt support • Integer execution unit (IEU) (Section 2.1.2) • Floating-point execution unit (FPU) (Section 2.1.3) • Memory address translation unit (MTU) (Section 2.1.4), which includes: – Data translation buffer (DTB) –...
  • Page 34 21164PC Microarchitecture 2.1.1.1 Instruction Decode and Issue The IDU decodes up to four instructions in parallel and checks that the required resources are available for each instruction. The IDU issues only the instructions for which all required resources are available. The IDU does not issue instructions out of order, even if the resources are available for a later instruction and not for an earlier one.
  • Page 35 21164PC Microarchitecture Prefetching does not begin until there is a “true” miss. A true miss is a reference that misses in the Icache and then also misses in the refill buffer. If an Icache miss results in a refill buffer hit, prefetching is not started until all the data has been moved from the refill buffer entry into the pipeline.
  • Page 36: Effect Of Branching Instructions On The Branch-Prediction Stack

    21164PC Microarchitecture on the top two count values and is predicted not-taken on the bottom two count values. The history status is not initialized on Icache fill; therefore, it may “remember” a branch that was evicted from the Icache and subsequently reloaded. The 21164PC does not limit the number of branch predictions outstanding to one.
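The prediction rule in this excerpt (taken on the top two count values, not-taken on the bottom two) implies a counter with four states. The sketch below models such a counter in Python. The update policy (increment on a taken branch, decrement on a not-taken branch, saturating at both ends) is an assumption, because the excerpt is truncated before it states how the count changes; the class name `TwoBitHistory` is hypothetical.

```python
class TwoBitHistory:
    """Four-state branch history counter (values 0..3)."""

    def __init__(self, count=0):
        # The excerpt notes the history is not initialized on Icache fill,
        # so any starting value is possible; 0 is used here for illustration.
        self.count = count

    def predict_taken(self):
        # Top two count values (2, 3) predict taken; bottom two (0, 1) do not.
        return self.count >= 2

    def update(self, taken):
        # Assumed saturating update policy, not confirmed by the excerpt.
        if taken:
            self.count = min(3, self.count + 1)
        else:
            self.count = max(0, self.count - 1)


history = TwoBitHistory()
for outcome in (True, True, True, False):
    history.update(outcome)
print(history.count, history.predict_taken())
```

With a saturating policy, a single not-taken outcome after a run of taken branches does not flip the prediction, which is the usual motivation for using more than one history bit.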
  • Page 37: Instruction Translation Buffer

    21164PC Microarchitecture The RET, JSR_COROUTINE, and HW_REI instructions predict the next PC by using the index from the subroutine return stack. The upper bits of the PC are formed from the data in the Icache tag at that index. These predictions are checked against the actual PC in exactly the same way that JMP and JSR predictions are checked.
  • Page 38: Interrupts

    21164PC Microarchitecture • One superpage maps virtual address bits <39:13> to physical address bits <39:13>, on a one-to-one basis, when virtual address bits <42:41> equal 2. This maps the entire physical address space four times over to the quadrant of the virtual address space.
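The superpage mapping in this excerpt can be expressed as a short check-and-mask operation. This Python sketch assumes superpage mapping is already enabled and covers only the one region described here; the function name `superpage_translate` is hypothetical, and passing the page-offset bits <12:0> through unchanged is an assumption based on ordinary page translation.

```python
def superpage_translate(va):
    """Return the physical address for a superpage VA, or None if VA<42:41> != 2."""
    if (va >> 41) & 0x3 != 2:
        return None  # not in this superpage region; normal translation applies
    # VA<39:13> maps one-to-one to PA<39:13>; offset bits <12:0> are assumed
    # to pass through unchanged, so the result is simply VA<39:0>.
    return va & ((1 << 40) - 1)


va = (2 << 41) | 0x12345678
print(hex(superpage_translate(va)))
```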
  • Page 39: Integer Execution Unit

    21164PC Microarchitecture Each interrupt source, or group of sources, is assigned an interrupt priority level (IPL), as shown in Table 4–11. The current IPL is set using the IPLR register (see Section 5.1.18). Any interrupts that have an equal or lower IPL are masked. When an interrupt occurs that has an IPL greater than the value in the IPLR register, program control passes to the INTERRUPT PALcode entry point.
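The masking rule above reduces to a strict comparison against the current IPL. A minimal Python sketch follows; the function name `deliverable` and the numeric levels in the example are illustrative only and are not taken from Table 4–11.

```python
def deliverable(pending_ipls, current_ipl):
    """Return the pending interrupt levels that are not masked.

    An interrupt is masked when its IPL is equal to or lower than the
    current IPL, so only strictly greater levels pass.
    """
    return [ipl for ipl in pending_ipls if ipl > current_ipl]


print(deliverable([20, 21, 22], current_ipl=21))
```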
  • Page 40 21164PC Microarchitecture The floating-point divide unit is associated with the floating-point add pipeline but is not pipelined. The FPU can accept two instructions every cycle, with the exception of floating-point divide instructions. The result latency for nondivide, floating-point instructions is four cycles.
  • Page 41 21164PC Microarchitecture The DTB also supports the optional superpage extensions that are enabled using ICSR<SPE>. The DTB superpage maps provide virtual-to-physical address translation for two regions of the virtual address space, as described in Section 2.1.1.4. PALcode fills and maintains the DTB. The operating system, using PALcode, must ensure that virtual addresses be mapped either through a single DTB entry or through superpage mapping.
  • Page 42 21164PC Microarchitecture A load instruction that is issued one cycle after a store instruction in the pipeline creates a conflict if both the load and store operations access the same memory location. (The store instruction has not yet updated the location when the load instruction reads it.) This conflict is handled by forcing the load instruction to take a replay trap;...
  • Page 43 Pipeline Organization 2.1.6.1 Data Cache The data cache (Dcache) is a dual-read-ported, single-write-ported, 8KB cache. It is a write-through, read-allocate, direct-mapped, byte-accessible, physical cache with 32-byte blocks and data parity at the byte level. 2.1.6.2 Instruction Cache The instruction cache (Icache) is a 16KB, virtual, direct-mapped cache with 64-byte blocks and 32-byte fills.
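The cache parameters quoted here determine how many blocks each cache holds and how an address splits into offset and index bits. The Python sketch below works that arithmetic out for a direct-mapped cache; the function name `direct_mapped_geometry` is hypothetical, and the sketch assumes power-of-two sizes.

```python
def direct_mapped_geometry(size_bytes, block_bytes):
    """Return (blocks, offset_bits, index_bits) for a direct-mapped cache."""
    blocks = size_bytes // block_bytes
    offset_bits = block_bytes.bit_length() - 1  # log2 of the block size
    index_bits = blocks.bit_length() - 1        # log2 of the block count
    return blocks, offset_bits, index_bits


# Dcache: 8KB with 32-byte blocks; Icache: 16KB with 64-byte blocks.
print(direct_mapped_geometry(8 * 1024, 32))
print(direct_mapped_geometry(16 * 1024, 64))
```

Both caches hold the same number of blocks; the Icache is larger only because its blocks are twice as long.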
  • Page 44: Instruction Pipeline Stages

    Pipeline Organization Figure 2–2 Instruction Pipeline Stages Instruction Cache Read Instruction Buffer, Branch Decode, Determine Next PC Slot by Function Unit Register File Access Checks, Integer Register File Access Integer Operate Arithmetic, logical, shift, and compare Pipeline instructions complete in pipeline stage 4 (1-cycle latency).
  • Page 45: Pipeline Examples-All Cases

    Pipeline Organization Table 2–2 Pipeline Examples-All Cases Pipeline Stage Events Access Icache tag and data. Buffer four instructions, check for branches, calculate branch displacements, and check for Icache hit. Slot-swap instructions around so they are headed for pipelines capable of executing them.
  • Page 46: Pipeline Examples-Load (Dcache Hit)

    Pipeline Organization Table 2–5 Pipeline Examples—Load (Dcache Hit) Pipeline Stage Events Calculate the effective address. Begin the Dcache data and tag store access. Finish the Dcache data and tag store access. Detect Dcache hit. Format the data as required. Bcache arbitration defaults to pipe E0 in anticipation of a possible miss.
  • Page 47: Pipeline Examples-Store (Dcache Hit)

    Pipeline Organization Table 2–7 Pipeline Examples—Store (Dcache Hit) Pipeline Stage Events Calculate the effective address. Begin the Dcache tag store access. Finish the Dcache tag store access. Detect Dcache hit. Send store to the write buffer simultaneously. Write the Dcache data store if hit (write begins this cycle). 2.2.1 Pipeline Stages and Instruction Issue The 21164PC pipeline divides instruction processing into four static and a number of dynamic stages of execution.
  • Page 48 Pipeline Organization The nonexception case does not need to drain the pipeline of all outstanding instructions ahead of the aborting instruction. The pipeline can be restarted immediately at a redirected address. Examples of nonexception abort conditions are branch mispredictions, subroutine call/return mispredictions, and replay traps.
  • Page 49: Instruction Classes And Slotting

    Scheduling and Issuing Rules 2.2.3 Nonissue Conditions There are two reasons for nonissue conditions. The first is a pipeline stall wherein a valid instruction or set of instructions is prepared to issue but cannot due to a resource conflict (register conflict or function unit conflict). These types of nonissue cycles can be minimized through code scheduling.
  • Page 50 Scheduling and Issuing Rules Table 2–8 Instruction Classes and Slotting (Sheet 2 of 3) Class Name Pipeline Instruction List MXPR E0 or E1 HW_MFPR, HW_MTPR (depends on the IPR) Integer conditional branches Floating-point conditional branches Jump-to-subroutine instructions: JMP, JSR, RET, or JSR_COROUTINE, BSR, BR, HW_REI, CALLPAL IADD E0 or E1...
  • Page 51 Scheduling and Issuing Rules Table 2–8 Instruction Classes and Slotting (Sheet 3 of 3) Class Name Pipeline Instruction List FCPYS FM or FA CPYS, not including CPYSN or CPYSE MISC RPCC, TRAPB UNOP None UNOP IEU pipeline 0. IEU pipeline 1. FPU add pipeline.
  • Page 52: Coding Guidelines

    Scheduling and Issuing Rules • An instruction of class LD cannot be issued simultaneously with an instruction of class ST. • All instructions are discarded at the slotting stage after a predicted-taken IBR or FBR class instruction, or a JSR class instruction. •...
  • Page 53: Instruction Latencies

    Scheduling and Issuing Rules Instructions [a] (the LDL) and [b] (the first ADDL) in the following example are slotted together. Instruction [b] stalls (split-issue), thus preventing instruction [c] from advancing to the issue stage: Code example showing Code example showing incorrect ordering correct ordering (1) [a] LDL...
  • Page 54: Instruction Latencies

    Scheduling and Issuing Rules Table 2–9 Instruction Latencies (Sheet 1 of 2) Additional Time Before Result Available to Class Latency Integer Multiply Unit Dcache hits, latency=2. 1 cycle Dcache miss/Bcache hit, latency=10 or longer. Store operations produce no result. — LDx_L Dcache hits, latency=2.
  • Page 55 Scheduling and Issuing Rules Table 2–9 Instruction Latencies (Sheet 2 of 2) Additional Time Before Result Available to Class Latency Integer Multiply Unit IMULQ Latency=12, plus up to 2 cycles of added latency, depending on 1 cycle the source of the data. Latency until next IMULL, IMULQ, or IMULH instruction can issue (if there are no data dependencies) is 8 cycles plus the number of cycles added to the latency.
  • Page 56: Producer–Producer Latency

    Scheduling and Issuing Rules 2.3.3.1 Producer–Producer Latency Producer–producer latency, also known as a write-after-write conflict, causes issue stalls to preserve write order. If two instructions write the same register, they are forced to do so in different cycles by the IDU. This is necessary to ensure that the correct result is left in the register file after both instructions have executed.
  • Page 57: Issue Rules

    Scheduling and Issuing Rules 2.3.4 Issue Rules The following is a list of conditions that prevent the 21164PC from issuing an instruction: • No instruction can be issued until all of its source and destination registers are clean; that is, all outstanding write operations to the destination register are guaranteed to complete in issue order and there are no outstanding write operations to the source registers, or those write operations can be bypassed.
  • Page 58: Replay Traps

    Replay Traps • No instruction can be issued to pipe E0 or E1 exactly two cycles before an integer register fill is requested (speculatively) by the CBU, except IMULL, IMULQ, and IMULH instructions and instructions that do not produce any result.
  • Page 59: Miss Address File And Load-Merging Rules

    Miss Address File and Load-Merging Rules • Load-after-store trap: A replay trap occurs if a load instruction is issued in the cycle immediately following a store instruction that hits in the Dcache, and both access the same location. The address match is exact for address bits <12:2> (longword granularity), but ignores address bits <42:13>.
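The address comparison for the load-after-store trap uses only bits <12:2>, so two addresses that differ only in bits <42:13> still count as a match. The following Python sketch illustrates the comparison alone; it does not model the timing condition (load issued in the cycle after the store) or the Dcache hit requirement. The names `MATCH_MASK` and `addresses_conflict` are hypothetical.

```python
# Mask selecting address bits <12:2> (longword granularity within 8KB).
MATCH_MASK = ((1 << 13) - 1) & ~0x3


def addresses_conflict(store_va, load_va):
    """True when the store and load addresses agree in bits <12:2>."""
    return (store_va & MATCH_MASK) == (load_va & MATCH_MASK)


print(addresses_conflict(0x1000, 0x1003))   # same longword
print(addresses_conflict(0x1000, 0x1004))   # adjacent longword
print(addresses_conflict(0x1000, 0x3000))   # differs only above bit 12
```

Because the upper bits are ignored, the comparison can report a match for two different memory locations; this costs an unnecessary replay but never misses a real conflict.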
  • Page 60: Noncacheable Space Load-Merge Rules

    Miss Address File and Load-Merging Rules • Merging is prevented for the MAF entry after the first data fill (to that MAF entry) from the Bcache, regardless of whether the Bcache access hits or not. • Load misses that match any MAF address down to the INT32 boundary, but could not merge (for any reason), are replay trapped.
  • Page 61: Maf Entries And Maf Full Conditions

    Miss Address File and Load-Merging Rules A bypass is provided so that if the load instruction issues in IEU pipe E0, and no MAF requests are pending, the load instruction’s read request is sent to the CBU immediately, provided the CBU is ready for such an access. Similarly, if a load instruction from IEU pipe E1 misses, and there was no load instruction in pipe E0 to begin with, the E1 load miss is sent to the CBU immediately.
  • Page 62: Mtu Store Instruction Execution

    MTU Store Instruction Execution Up to two floating or integer registers may be written for each CBU fill cycle. Fills deliver 32 bytes in two cycles: two INT8s per cycle. The MAF merging rules ensure that there is no more than one register to write for each INT8, so that there is a register file write port available for each INT8.
  • Page 63: Write Buffer And The Wmb Instruction

    Write Buffer and the WMB Instruction A load instruction that is issued one cycle after a store instruction in the pipeline creates a conflict if both access exactly the same memory location. This occurs because the store instruction has not yet updated the location when the load instruction reads it.
  • Page 64: The Write Buffer

    Write Buffer and the WMB Instruction 2.7.1 The Write Buffer The write buffer contains six fully associative 32-byte entries. The purpose of the write buffer is to minimize the number of CPU stall cycles by providing a finite, high-bandwidth resource for receiving store data. This is required because the 21164PC can generate store data at the peak rate of one INT8 every CPU cycle.
  • Page 65: Write Buffer Entry Processing

    Write Buffer and the WMB Instruction Each time the write buffer is presented with a store instruction, the physical address generated by the instruction is compared to the address in each valid write buffer entry that is open for merging. If the address is in the same INT32 as an address in a valid write buffer entry (that also contains a store instruction), and the entry is open for merging, then the new store data is merged into that entry and the entry’s byte mask bits are updated.
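The merge test described above can be sketched as a lookup over the buffer entries. This Python model is illustrative only: it tracks byte masks but not data, it assumes each store is naturally aligned and stays within one 32-byte block, and the allocation of a new entry when no merge is possible is an assumption, since the excerpt is truncated before that case. The names `WriteBufferEntry` and `present_store` are hypothetical; the six-entry capacity comes from Section 2.7.1.

```python
from dataclasses import dataclass

CAPACITY = 6  # six 32-byte entries, per Section 2.7.1


@dataclass
class WriteBufferEntry:
    block: int                 # physical address with the low 5 bits dropped
    byte_mask: int = 0         # one bit per byte of the 32-byte (INT32) block
    open_for_merge: bool = True


def present_store(buffer, pa, size):
    """Merge a store into a matching open entry, else allocate (assumed)."""
    block = pa >> 5
    mask = ((1 << size) - 1) << (pa & 31)
    for entry in buffer:
        if entry.open_for_merge and entry.block == block:
            entry.byte_mask |= mask  # same INT32: merge and update byte mask
            return entry
    if len(buffer) >= CAPACITY:
        # Real hardware would stall rather than fail; modeled as an error here.
        raise RuntimeError("write buffer full")
    entry = WriteBufferEntry(block, mask)
    buffer.append(entry)
    return entry


buf = []
present_store(buf, 0x100, 8)
present_store(buf, 0x108, 4)
print(len(buf), hex(buf[0].byte_mask))
```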
  • Page 66: Ordering Of Noncacheable Space Write Instructions

    Performance Measurement Support–Performance Counters • The number of entries in the write buffer exceeds the number programmed in MAF_MODE<WB_CLR_LO_THRESH>. This ensures that these instructions complete as quickly as possible. The MTU requests that a write buffer entry be processed every 264 cycles (provided there is a valid entry in the write buffer), even if the write buffer is not arbitrating.
  • Page 67: Cbu Performance Counters

    Performance Measurement Support–Performance Counters therefore, the exception PC might not reflect the exact instruction causing counter overflow. Three counters are provided to allow accurate comparison of two variables under a potentially nonrepeatable experimental condition. The three counters are designated counter 0 (16 bits), counter 1 (16 bits), and counter 2 (14 bits). Counter inputs include: •...
  • Page 68 Performance Measurement Support–Performance Counters Read and write requests can be to either cacheable or I/O space addresses, but the CBU performance counters only count requests to cacheable address space. The total number of read requests is equal to the sum of the Dstream read requests and the Istream read requests.
  • Page 69 Performance Measurement Support–Performance Counters Misses in the onchip caches can merge in the MTU before being issued to the CBU. Therefore, MTU read or write requests are not the same as onchip cache misses. Also, two Bcache misses can merge in the CBU and appear on the system bus as a single READ MISS request.
  • Page 70: Floating-Point Control Register (Fpcr) Format

    Floating-Point Control Register 2.9 Floating-Point Control Register Figure 2–3 shows the format of the floating-point control register (FPCR) and Table 2–10 describes the fields. Figure 2–3 Floating-Point Control Register (FPCR) Format RAZ/IGN RAZ/IGN INVD DZED OVFD DYN_RM UNDZ UNFD INED LJ-05358.AI4 Table 2–10 Floating-Point Control Register Bit Descriptions (Sheet 1 of 2)
  • Page 71 Floating-Point Control Register Table 2–10 Floating-Point Control Register Bit Descriptions (Sheet 2 of 2) Name Extent Description (Meaning When Set) DYN_RM <59:58> Dynamic rounding mode. Indicates the rounding mode to be used by an IEEE floating-point operate instruction when the instruction’s function field specifies dynamic mode (/D).
  • Page 72: Typical Uniprocessor Configuration

    Design Examples 2.10 Design Examples The 21164PC can be designed into many different uniprocessor system configurations. Figure 2–4 illustrates one possible configuration. This configuration employs additional system/memory controller chipsets. Figure 2–4 shows a typical uniprocessor system with a board-level cache. This system configuration could be used in standalone or networked workstations.
  • Page 73 Hardware Interface This chapter contains the 21164PC microprocessor logic symbol and provides a list of signal names and their functions. 3.1 21164PC Microprocessor Logic Symbol Figure 3–1 shows the logic symbol for the 21164PC chip. 29 September 1997 – Subject To Change Hardware Interface 3–1...
  • Page 74: Clocks

    21164PC Microprocessor Logic Symbol Figure 3–1 21164PC Microprocessor Logic Symbol 21164PC addr_bus_req_h addr_h<39:4> cack_h System/Bcache addr_res_h<1:0> dack_h cmd_h<3:0> Interface data_bus_req_h data_h<127:0> fill_h fill_dirty_h data_adsc_l data_adv_l fill_error_h data_ram_oe_l fill_id_h data_ram_we_l<3:0> idle_bc_h index_h<21:4> int4_valid_h<3:0> lw_parity_h<3:0> st_clk1_h st_clk2_h st_clk3_h tag_data_h<32:19> tag_data_par_h tag_dirty_h tag_ram_oe_l tag_ram_we_l tag_valid_h victim_pending_h...
  • Page 75 21164PC Signal Names and Functions 3.2 21164PC Signal Names and Functions The 21164PC is contained in a 413-pin interstitial pin grid array (IPGA) package. There are 264 functional signal pins, 2 spare signal pins (unused), 5 voltage refer- ence pins (unused), 46 external power (Vdd) pins, 22 internal power (Vddi) pins, and 74 ground (Vss) pins.
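As a quick consistency check, the pin categories listed above should add up to the 413 pins of the package. The short Python sketch below performs that sum; the dictionary keys simply restate the categories from the text.

```python
pin_counts = {
    "functional signal": 264,
    "spare signal (unused)": 2,
    "voltage reference (unused)": 5,
    "external power (Vdd)": 46,
    "internal power (Vddi)": 22,
    "ground (Vss)": 74,
}

total_pins = sum(pin_counts.values())
print(total_pins)
```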
  • Page 76: 21164Pc Signal Descriptions

    21164PC Signal Names and Functions The remaining two tables describe the function of each 21164PC external signal. Table 3–1 lists all signals in alphanumeric order. This table provides full signal descriptions. Table 3–2 lists signals by function and provides an abbreviated description.
  • Page 77 21164PC Signal Names and Functions Table 3–1 21164PC Signal Descriptions (Sheet 2 of 10) Signal Type Count Description clk_mode_h<1:0> Clock test mode. These signals specify a relationship between osc_clk_in_h,l, the CPU cycle time, and the duty-cycle equalizer. These signals should be deasserted in normal operation mode.
  • Page 78 21164PC Signal Names and Functions Table 3–1 21164PC Signal Descriptions (Sheet 3 of 10) Signal Type Count Description cmd_h<3:0> Command bus. These signals drive and receive the commands from the command bus. The following tables define the commands that can be driven on the cmd_h<3:0> bus by the 21164PC or the system.
  • Page 79 21164PC Signal Names and Functions Table 3–1 21164PC Signal Descriptions (Sheet 4 of 10) Signal Type Count Description System Commands to 21164PC: cmd_h <3:0> Command Meaning 0000 Nothing. 0001 FLUSH Removes block from caches; return dirty data. 0010 INVALIDATE Invalidates the block from caches.
  • Page 80 21164PC Signal Names and Functions Table 3–1 21164PC Signal Descriptions (Sheet 5 of 10) Signal Type Count Description data_ram_we_l<3:0> Data RAM write-enable. These signals are asserted for any Bcache write operation. Refer to Section 5.3.1 for timing details. dc_ok_h dc voltage OK. Must be deasserted until dc voltage reaches proper operating level.
  • Page 81 21164PC Signal Names and Functions Table 3–1 21164PC Signal Descriptions (Sheet 6 of 10) Signal Type Count Description int4_valid_h<3:0> INT4 data valid. During write operations to noncached space, these signals are used to indicate which INT4 bytes of data are valid.
  • Page 82 21164PC Signal Names and Functions Table 3–1 21164PC Signal Descriptions (Sheet 7 of 10) Signal Type Count Description When addr_h<39> is asserted, the int4_valid_h<3:0> signals are considered the addr_h<3:0> bits required for byte/word transactions. The functionality of these bits is tied to the value stored in addr_h<38:37>.
  • Page 83 21164PC Signal Names and Functions Table 3–1 21164PC Signal Descriptions (Sheet 8 of 10) Signal Type Count Description irq_h<3:0> System interrupt requests. These signals have multiple modes of operation. During normal operation, these level-sensitive signals are used to signal interrupt requests. During initialization, these signals are used to set up the CPU cycle time divisor for sys_clk_out1_h as follows: irq_h<3>...
  • Page 84 21164PC Signal Names and Functions Table 3–1 21164PC Signal Descriptions (Sheet 9 of 10) Signal Type Count Description pwr_fail_irq_h Power failure interrupt request. This signal has multiple modes of operation. During initialization, this signal is used to set up sys_clk_out2_h delay (see Table 4–3). During normal operation, this signal is used to signal a power failure.
  • Page 85 21164PC Signal Names and Functions Table 3–1 21164PC Signal Descriptions (Sheet 10 of 10) Signal Type Count Description tag_data_h<32:19> Bcache tag data bits. This bit range supports .5MB to 4MB Bcaches. tag_data_par_h Tag data parity bit. This signal indicates odd parity for tag_data_h<32:19>.
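The tag parity signal covers the 14 bits of tag_data_h<32:19>. The Python sketch below computes a parity bit under the usual odd-parity convention, in which the parity bit is chosen so that the data bits plus the parity bit contain an odd number of ones; that convention is an assumption here, since the excerpt does not define it. The function name `odd_parity_bit` is hypothetical.

```python
TAG_WIDTH = 32 - 19 + 1  # tag_data_h<32:19> is 14 bits wide


def odd_parity_bit(tag_bits, width=TAG_WIDTH):
    """Return the parity bit that makes the total count of ones odd."""
    ones = bin(tag_bits & ((1 << width) - 1)).count("1")
    return 0 if ones % 2 == 1 else 1


print(odd_parity_bit(0b00000000000000))
print(odd_parity_bit(0b00000000000001))
```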
  • Page 86: 21164Pc Signal Descriptions By Function

    21164PC Signal Names and Functions Table 3–2 lists signals by function and provides an abbreviated description. Table 3–2 21164PC Signal Descriptions by Function (Sheet 1 of 3) Signal Type Count Description Clocks clk_mode_h<1:0> Clock test mode. cpu_clk_out_h CPU clock output. osc_clk_in_h,l Oscillator clock inputs.
  • Page 87 21164PC Signal Names and Functions Table 3–2 21164PC Signal Descriptions by Function (Sheet 2 of 3) Signal Type Count Description System Interface addr_h<39:4> Address bus. addr_bus_req_h Address bus request. addr_res_h<1:0> Address response. cack_h Command acknowledge. cmd_h<3:0> Command bus. dack_h Data acknowledge. data_bus_req_h Data bus request.
  • Page 88 21164PC Signal Names and Functions Table 3–2 21164PC Signal Descriptions by Function (Sheet 3 of 3) Signal Type Count Description srom_oe_l Serial ROM output enable. srom_present_l Serial ROM present. tck_h JTAG boundary-scan clock. tdi_h JTAG serial boundary-scan data in. tdo_h JTAG serial boundary-scan data out.
  • Page 89 Clocks, Cache, and External Interface This chapter describes the 21164PC microprocessor external interface, which includes the backup cache (Bcache) and system interfaces. It also describes the clock circuitry, interrupt signals, and parity generation. It is organized as follows: • Introduction to the external interface •...
  • Page 90: Introduction To The External Interface

    Introduction to the External Interface 4.1 Introduction to the External Interface A 21164PC-based system can be divided into three major sections: • 21164PC microprocessor • External Bcache • System interface logic The 21164PC external interface is optimized for uniprocessor-based systems and mandates few design rules.
  • Page 91: 21164Pc System/Bcache Interface

    Introduction to the External Interface Figure 4–1 21164PC System/Bcache Interface System Memory 21164PC addr_h<39:4> and I/O addr_bus_req_h addr_res_h<1:0> cack_h cmd_h<3:0> dack_h data_bus_req_h System Interface fill_h fill_error_h fill_id_h idle_bc_h Dcache int4_valid_h<3:0> Victim Buffers victim_pending_h data_h<127:0> Miss index_h<21:4> Optional Bcache State Data SRAM SRAM Bcache...
  • Page 92: Bcache Interface

    Introduction to the External Interface The BIU contains a three-entry BIU command/address buffer (BAF) capable of queueing up to three Bcache misses or I/O references. These buffers are capable of merging both read and write miss references, to reduce external system bus traffic. 4.1.2 Bcache Interface The 21164PC includes an interface and control for a required backup cache (Bcache).
  • Page 93: Write Interleaving

    Introduction to the External Interface Figure 4–2 Merits of a Multiprobes In Flight – Pipelined Cache Pipelining allows 100% utilization of the data bus. Nonpipelined Cache: index latency 1 latency 2 data D10 D11 D20 D21 Pipelined Cache: index latency 1 latency 2 latency 3 data...
  • Page 94: Clocks

    Clocks Figure 4–3 Tag/Data Store Interleaving Interleaving tag write probes with data write operations allows 100% utilization of the data bus. Data writes interleaved with tag probes index latency 1 latency 2 latency 3 latency 4 latency 5 Hit 1 Hit 2 Hit 3 Hit 4...
  • Page 95: Cpu Clock

    Clocks 4.2.1 CPU Clock The 21164PC uses the differential input clock lines osc_clk_in_h,l as a source to generate its CPU clock. The input signals clk_mode_h<1:0> control generation of the CPU clock, as listed in Table 4–1 and as shown in Figure 4–4. The 21164PC uses clk_mode_h<0>...
  • Page 96: System Clock

    Clocks Figure 4–4 Clock Signals and Functions 21164PC osc_clk_in_h, l CPU Clock cpu_clk_out_h Divider Symmetrator clk_mode_h<1:0> (/1 or /4) System Clock sys_clk_out1_h Divider irq_h<3:0> (/4 through /15) mch_hlt_irq_h System Clock sys_clk_out2_h pwr_fail_irq_h Delay sys_mch_chk_irq_h (0 through 7) sys_reset_l dc_ok_h MK5502B 4.2.2 System Clock The CPU clock is the source clock used to generate the system clock sys_clk_out1_h.
  • Page 97: 21164Pc Uniprocessor Clock

    Clocks Table 4–2 System Clock Divisor (Sheet 2 of 2); columns: irq_h<3>, irq_h<2>, irq_h<1>, irq_h<0>, Ratio. Figure 4–5 shows the 21164PC driving the system clock on a uniprocessor system. Figure 4–5 21164PC Uniprocessor Clock Memory ASIC...
  • Page 98: Physical Address Considerations

    Physical Address Considerations even. The output is asymmetric if the divisor is odd. When the divisor is odd, the clock is high for an extra cycle. Refer to Section 7.2 for information on sysclk behavior during reset. Table 4–3 System Clock Delay sys_mch_chk_irq_h pwr_fail_irq_h mch_hlt_irq_h Delay Cycles...
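    The duty-cycle rule above can be sketched as a small calculation. This is only an illustrative sketch: the function name `sysclk_phases` is hypothetical, and it assumes the divisor range of 4 through 15 shown for the system clock divider.

```python
def sysclk_phases(divisor: int) -> tuple:
    """Return (high_cycles, low_cycles) of one sysclk period, in CPU cycles."""
    if not 4 <= divisor <= 15:
        raise ValueError("sysclk divisor must be in the range 4..15")
    low = divisor // 2
    # For an odd divisor the clock stays high for one extra CPU cycle.
    high = divisor - low
    return high, low
```

    With an even divisor such as 4 the two phases are equal; with an odd divisor such as 7 the high phase is one CPU cycle longer than the low phase.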
  • Page 99: Data Wrapping

    Physical Address Considerations system environment as to which INT8s are accessed. Write merging is permitted. Write accesses are INT32 requests with a mask indicating which INT4s are actually modified. The 21164PC never writes more than 32 bytes at a time in noncached space. The 21164PC does not broadcast accesses to the CBU IPR region if they map to a CBU IPR.
  • Page 100: Noncached Read Operations

    Bcache Structure 4.3.3 Noncached Read Operations Read operations to physical addresses that have addr_h<39> asserted are not cached in the Dcache or Bcache. They are merged like other read operations in the miss address file (MAF). To prevent several read operations to noncached memory from being merged into a single 32-byte bus request, software must insert memory barrier (MB) instructions or set MAF_MODE IPR bit (IO_NMERGE).
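    As a minimal sketch of the address test described above, the following checks whether a physical address falls in noncached space. The names `NONCACHED_BIT` and `is_noncached` are hypothetical and are not taken from the manual.

```python
NONCACHED_BIT = 39  # addr_h<39> asserted selects noncached space


def is_noncached(phys_addr: int) -> bool:
    """True if the physical address has addr_h<39> set."""
    return bool((phys_addr >> NONCACHED_BIT) & 1)
```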
  • Page 101: Bcache Victim Buffers

    Cache Coherency The 21164PC partitions physical address (addr_h<32:4>) into an index field and a tag field. The 21164PC presents index_h<21:4> and tag_data_h<32:19> to the Bcache interface. The tag size required is Bcache_size/block_size. The system designer uses the signal lines needed for a particular size Bcache. For example, the 1MB Bcache needs index_h<19:4>...
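    The index/tag split can be illustrated with a short sketch. It assumes power-of-two Bcache sizes from 0.5MB to 4MB and assumes that the tag bits a given cache size needs are the address bits from log2(size) up to bit 32; the function name `bcache_index_tag` is hypothetical.

```python
def bcache_index_tag(phys_addr: int, bcache_bytes: int) -> tuple:
    """Split addr_h<32:4> into (index, tag) field values for a given Bcache size."""
    sizes = (512 * 1024, 1 << 20, 2 << 20, 4 << 20)
    if bcache_bytes not in sizes:
        raise ValueError("Bcache size must be 0.5MB, 1MB, 2MB, or 4MB")
    top = bcache_bytes.bit_length() - 1  # log2 of the cache size, e.g. 20 for 1MB
    # Index uses address bits <top-1:4>, e.g. index_h<19:4> for a 1MB Bcache.
    index = (phys_addr >> 4) & ((1 << (top - 4)) - 1)
    # Tag (assumed) uses address bits <32:top>.
    tag = (phys_addr >> top) & ((1 << (33 - top)) - 1)
    return index, tag
```

    For a 1MB Bcache this yields a 16-bit index, matching the index_h<19:4> example in the text.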
  • Page 102: Flush Cache Coherency Protocol

    Cache Coherency The system hardware designer need not be concerned about Icache and Dcache coherency. Coherency of the Icache is a software concern—it is flushed with an IMB (PALcode) instruction. The 21164PC requires the system to allow only one change to a block at a time. This means that if the 21164PC gains the bus to read or write a block, I/O devices on the system bus should not be allowed to access that block until the data has been moved.
  • Page 103: Flush-Based Protocol 21164Pc States

    Cache Coherency System logic notifies the 21164PC of all DMA write operations that occur on the system bus by using the interface FLUSH command. If the block is dirty, the 21164PC provides the data to the system and invalidates the block in the Bcache. If the block is not dirty (clean), data is not returned, and the block is invalidated.
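    The FLUSH behavior described above can be summarized in a small state sketch. The function name and the (valid, dirty) representation of a Bcache block are assumptions made for illustration; only the dirty/clean outcomes come from the text.

```python
def flush_block(valid: bool, dirty: bool) -> tuple:
    """Model a FLUSH on one Bcache block.

    Returns (data_returned_to_system, valid_after_flush).
    """
    if not valid:
        # Nothing cached for this block: no data is returned.
        return False, False
    # A dirty block supplies its data; either way the block is invalidated.
    return dirty, False
```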
  • Page 104: 21164Pc-To-Bcache Transactions

    21164PC-to-Bcache Transactions Figure 4–7 Flush-Based Protocol System/Bus States FLUSH FLUSH (DMA Write Operation) (DMA Write Operation) No Data Returned Data Returned to System to System INVAL Data Returned to System INVAL No Data Returned to System READ READ (DMA Read Operation) (DMA Read Operation) PCA017 4.6 21164PC-to-Bcache Transactions...
  • Page 105: Ssram/Bcache Interface

    21164PC-to-Bcache Transactions Figure 4–8 SSRAM/Bcache Interface 21164PC index_h<21:4> tag_ram_we_l A<X:0> GW_L ADSC_L ADSP_L ADV_L BWE_L<3:0> tag_ram_oe_l Store OE_L MODE CE_L st_clkx_h DATA tag_data_h<32:19> A<X:0> data_ram_we_l<3> GW_L A<X:0> data_ram_we_l<2> GW_L A<X:0> data_ram_we_l<1> GW_L A<X:0> data_ram_we_l<0> GW_L data_adsc_l ADSC_L ADSP_L data_adv_l ADV_L DATA BWE_L<3:0>...
  • Page 106: Bcache Timing

    21164PC-to-Bcache Transactions For every Bcache access, the 21164PC drives the index, address strobe (data_adsc_l), and the SSRAM clock (st_clk) to the SSRAMs to load the initial address. The st_clk may be delayed a programmable number of CPU cycles to facilitate better control over module timing.
  • Page 107 21164PC-to-Bcache Transactions Bcache timing is configured using the CBOX_CONFIG and CBOX_CONFIG2 IPRs. Figures 5–48 and 5–51 show the layout of these registers. These registers are normally configured by 21164PC initialization code. Both the 21164PC and system require access to the Bcache through a shared 128-bit data bus.
  • Page 108: Bcache Private Read Transaction

    21164PC-to-Bcache Transactions latency and repetition rate are programmable using the CBOX_CONFIG register fields <11:08> (BC_LATENCY_OFF<3:0>) and <07:04> (BC_CLK_RATIO<3:0>). For private Bcache writes, the 21164PC uses the early write SSRAM protocol controlled by the ADSC# pin. The repetition rate for data writes is programmable through the (BC_CLK_RATIO<3:0>).
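    A minimal sketch of extracting the two timing fields named above from a CBOX_CONFIG value follows. The function name and the dictionary keys are hypothetical; the bit positions are the ones given in the text.

```python
def cbox_config_timing(value: int) -> dict:
    """Decode the Bcache timing fields of a CBOX_CONFIG register value."""
    return {
        "BC_CLK_RATIO": (value >> 4) & 0xF,    # bits <07:04>
        "BC_LATENCY_OFF": (value >> 8) & 0xF,  # bits <11:08>
    }
```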
  • Page 109: Bcache Private Read Transaction

    21164PC-to-Bcache Transactions Figure 4–9 Bcache Private Read Transaction Arrows indicate when the 21164PC clocks Bcache data into the pad ring. cpu_clk bc_clk_delay (=1) bc_clk_ratio (=3) st_clk x _h index_h<21:4> bc_rd_latency (=5) tag_ram_oe_l tag_data<32:19> tag_ram_we_l data_adsc_l data_adv_l bc_rd_latency+bc_clk_ratio (=8) data_ram_oe_l data<127:0> data_ram_we_l<3:0>...
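    The annotations in Figure 4–9 (bc_rd_latency = 5, bc_rd_latency + bc_clk_ratio = 8) suggest a simple way to compute when each 16-byte transfer is clocked in. The sketch below assumes the counts are in cpu_clk cycles from a common reference point and that a 32-byte block takes two transfers on the 128-bit data bus; the function name is hypothetical.

```python
def read_sample_cycles(bc_rd_latency: int, bc_clk_ratio: int, transfers: int = 2) -> list:
    """Cycle counts at which successive 16-byte Bcache read transfers are sampled."""
    return [bc_rd_latency + i * bc_clk_ratio for i in range(transfers)]
```

    With the figure's values, `read_sample_cycles(5, 3)` gives the two sample points 5 and 8.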
  • Page 110: Bcache Private Write Transactions

    21164PC-to-Bcache Transactions 4.6.5 Bcache Private Write Transactions CPU-initiated write operations are broken into two suboperations, namely a write-probe operation and a subsequent data-write operation. The write-probe operation performs the tag store lookup to determine hit or miss status as well as to determine the tag state, clean (V*/D) or dirty (V*D).
  • Page 111: Bcache Private Data-Write Operation

    21164PC-to-Bcache Transactions Figure 4–10 Bcache Private Write Probe Arrow indicates when the 21164PC clocks Bcache probe data. cpu_clk bc_clk_delay (=1) bc_clk_ratio (=3) st_clk x _h index_h<21:4> bc_rd_latency (=5) tag_ram_oe_l tag_data<32:19> tag_ram_we_l data_adsc_l data_adv_l data_ram_oe_l data<127:0> data_ram_we_l<3:0> FM-05561.AI4 4.6.5.2 Bcache Private Data-Write Operation If a CPU-initiated write command hits in the Bcache, the data-write operation is scheduled immediately.
  • Page 112: Bcache Private Data - Write Hit Clean

    21164PC-to-Bcache Transactions fore, the data-write operation after the fill operation completes does not update the tag store. The Bcache is nonblocking and allows other transactions to use the Bcache while waiting for outstanding Bcache misses. Figure 4–11 shows an example of the timing for a data-write operation that hits clean to the Bcache during the write probe.
  • Page 113: Bcache Private Data - Write Hit Dirty

    21164PC-to-Bcache Transactions data_adv_l one bc_clk cycle after the launch of the index. It is deasserted in the following bc_clk cycle. The longword write enables, data_ram_we_l<3:0>, are driven for each 16 bytes of write data at index launch time and at the subsequent bc_clk cycle.
  • Page 114: Interleaving Write-Probes

    21164PC-to-Bcache Transactions 4.6.5.3 Interleaving Write-Probes The 21164PC is able to interleave data-write operations that hit dirty with write-probe operations, since both operations access different stores (tag and data). This technique is used to fully saturate the data bus during write-hit streams as is shown in Figure 4–13.
  • Page 115: Selecting Bcache Options

    21164PC-Initiated System Transactions 4.6.6 Selecting Bcache Options Table 4–7 lists the variables to consider when designing and implementing a Bcache. Table 4–7 Bcache Options Parameter Selection sysclk ratio (4-15) ____ CPU cycles Cache protocol, flush or flush invalidate ____ Longword parity or no parity ____ Bcache size (.5MB to 4MB) ____ MB...
  • Page 116: 21164Pc-Initiated Interface Commands

    21164PC-Initiated System Transactions • If there is a tag mismatch or the valid bit is clear, a Bcache miss has been detected. If the block to be replaced is clean, the Bcache continues operation while the READ MISS request is sent to the system. If the block to be replaced is dirty, the 21164PC waits for all outstanding probes in flight to complete, and then starts an external READ MISS with VICTIM PENDING transaction that instructs the system logic to access and return data.
  • Page 117 21164PC-Initiated System Transactions Table 4–8 21164PC-Initiated Interface Commands (Sheet 2 of 2) cmd_h Command <3:0> Description WRITE BLOCK 0110 Request to write a block. When the 21164PC wants to write a 32-byte block of data to noncached memory, it drives the command, address, and first INT16 of data on a sysclk edge.
  • Page 118: Read Miss Clean - No Victim

    21164PC-Initiated System Transactions 4.7.1 READ MISS Clean - No Victim A READ MISS command is launched to the system interface when: 1. The Bcache probe for a CPU-initiated READ command detects a miss. 2. The Bcache probe for a CPU-initiated WRITE command detects a miss. 3.
  • Page 119: Read Miss Clean - Bcache Timing Diagram

    21164PC-Initiated System Transactions Figure 4–14 READ MISS Clean – Bcache Timing Diagram 29 September 1997 – Subject To Change Clocks, Cache, and External Interface 4–31...
  • Page 120: Fill

    21164PC-Initiated System Transactions 4.7.2 FILL The 21164PC provides an st_clkx_h pulse a certain number of cycles after the rising edge of the system clock, determined by the sum of the BC_CLK_DELAY<1:0> and the FILL_OFFSET<2:0> values in the CBOX_CONFIG register (see Section 5.3.1). The value must be from 1 to 7 and cannot be greater than the sysclk ratio.
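    The constraint on the fill st_clkx_h offset can be checked with a short helper. This is a sketch only; the function name is hypothetical, and the rule it encodes is the one stated above (sum from 1 to 7, not greater than the sysclk ratio).

```python
def fill_st_clk_offset(bc_clk_delay: int, fill_offset: int, sysclk_ratio: int) -> int:
    """Return BC_CLK_DELAY + FILL_OFFSET after validating it against the stated limits."""
    total = bc_clk_delay + fill_offset
    if not 1 <= total <= 7:
        raise ValueError("BC_CLK_DELAY + FILL_OFFSET must be from 1 to 7")
    if total > sysclk_ratio:
        raise ValueError("offset cannot be greater than the sysclk ratio")
    return total
```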
  • Page 121: Read Miss With Victim

    21164PC-Initiated System Transactions At the end of the fill transaction, the 21164PC does not assert data_ram_oe_l or begin to drive the data bus until the fifth cpu_clk cycle after the sysclk that loads the last dack_h. If systems require more time to turn off their drivers, they must use idle_bc_h in combination with data_bus_req_h to stop 21164PC requests and not send any system requests.
  • Page 122 21164PC-Initiated System Transactions The use of dack_h for a system Bcache read command (Bcache victim or system command with data movement) is very dependent on the SSRAM style, either pipelined or flow-through. The assertion of dack_h is responsible for the assertion of the data_adv_l pin, and is not to be confused with the sampling of data.
  • Page 123: Read Miss With Victim Timing Diagram, Pipelined Mode

    21164PC-Initiated System Transactions Figure 4–15 READ MISS with Victim Timing Diagram, Pipelined Mode
  • Page 124: Read Miss With Victim Timing Diagram, Flow-Through Mode

    21164PC-Initiated System Transactions Figure 4–16 READ MISS with Victim Timing Diagram, Flow-Through Mode
  • Page 125: Write Block

    21164PC-Initiated System Transactions 4.7.4 WRITE BLOCK The WRITE BLOCK command is used to complete write operations to noncached memory. The 21164PC asserts the WRITE BLOCK command, along with the address at the start of a sysclk cycle. The first 16 bytes of data and the int4_valid signals are driven one cpu_clk cycle later, so that system interface can be assured a one cpu_clk cycle minimum hold time when sampling data on the next sysclk edge.
  • Page 126: System-Initiated Transactions

    System-Initiated Transactions Figure 4–17 WRITE BLOCK Timing Diagram sys_clk(4:1) addr_h<39:4> cmd_h<3:0> WRBLK victim_pending_h cack_h fill_h fill_id_h idle_bc_h data_h<127:0> dack_h FM-05560.AI4 4.8 System-Initiated Transactions System commands to the 21164PC are driven on the cmd_h<3:0> signal lines. Before driving these signals, the system must gain control of the command and address buses by using addr_bus_req_h, as described in Section 4.9.1.
  • Page 127: Algorithm For System Sending Commands To The 21164Pc

    System-Initiated Transactions The 21164PC can hold two outstanding commands from the system at any time. The algorithm used by the system to send commands to the 21164PC without overflowing the two CBU BIU command buffers is shown in Figure 4–18. Figure 4–18 Algorithm for System Sending Commands to the 21164PC Start Init?
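    One way to picture the two-command limit is as a credit counter kept by the system logic. The sketch below is not the algorithm of Figure 4–18; it assumes, for illustration only, that a buffer slot is freed when the 21164PC responds to a command, and the class and method names are hypothetical.

```python
class SystemCommandGate:
    """Tracks commands outstanding at the 21164PC so the limit of two is never exceeded."""

    MAX_OUTSTANDING = 2  # the 21164PC can hold two outstanding system commands

    def __init__(self) -> None:
        self.outstanding = 0

    def can_send(self) -> bool:
        return self.outstanding < self.MAX_OUTSTANDING

    def send(self) -> None:
        if not self.can_send():
            raise RuntimeError("both command buffers are in use")
        self.outstanding += 1

    def response_received(self) -> None:
        # Assumption: a response from the 21164PC frees one buffer slot.
        if self.outstanding == 0:
            raise RuntimeError("no command is outstanding")
        self.outstanding -= 1
```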
  • Page 128: Write Invalidate Protocol Commands

    System-Initiated Transactions 4.8.2 Write Invalidate Protocol Commands All 21164PC-based systems that use the write invalidate protocol are expected to use the READ, FLUSH, and INVALIDATE commands to maintain cache coherency. These commands are defined in Table 4–9. Table 4–9 System-Initiated Interface Commands (Write Invalidate Protocol) cmd_h Command...
  • Page 129: Flush

    System-Initiated Transactions 4.8.2.1 21164PC Responses to Flush-Based Protocol Commands The 21164PC responds to flush-based protocol commands on addr_res_h<1:0>, as shown in Table 4–10. Table 4–10 21164PC Responses to Flush-Based Protocol Commands READ and FLUSH Commands Bcache 21164PC Response Bcache Miss NOACK Bcache Hit, Not Dirty NOACK...
  • Page 130: Flush Timing Diagram (Bcache Hit) Flow-Through Ssram

    System-Initiated Transactions Figure 4–19 FLUSH Timing Diagram (Bcache Hit) Flow-Through SSRAM
  • Page 131: Invalidate

    System-Initiated Transactions 4.8.2.3 INVALIDATE The INVALIDATE command can be used to remove a block from the cache system. Unlike the FLUSH command, any modified data will not be read. The Bcache is probed and invalidated if the block is found. Figure 4–20 shows the timing of an INVALIDATE transaction.
  • Page 132: Read Timing Diagram (Bcache Hit) Flow-Through Ssram

    System-Initiated Transactions When using the pipelined SSRAMs, the data output register delays the data an additional sysclk cycle. When the CBOX_CONFIG<BC_REG_REG> bit is set, the data_ram_oe_l deassertion is delayed an additional sysclk cycle to allow the system ample time to sample the delayed Bcache read data. Figure 4–21 READ Timing Diagram (Bcache Hit) Flow-Through SSRAM
  • Page 133: Data Bus And Command/Address Bus Contention

    Data Bus and Command/Address Bus Contention 4.9 Data Bus and Command/Address Bus Contention The data bus is composed of data_h<127:0> and lw_parity_h<3:0>. The command/address bus is composed of cmd_h<3:0> and addr_h<39:4>. The following sections describe situations that have contention for use of the data bus or contention for use of the command/address bus.
  • Page 134: Read/Write Spacing-Data Bus Contention

    Data Bus and Command/Address Bus Contention 4.9.2 Read/Write Spacing—Data Bus Contention The data bus, data_h<127:0>, can be driven by the 21164PC, the Bcache array, or the system. In the case of private Bcache write operations followed by private Bcache read operations, the 21164PC stops driving the data bus well in advance of the Bcache turning on. For private Bcache read operations followed by private Bcache write operations, the 21164PC inserts a programmable number of cpu_clk cycles between the read and...
  • Page 135: Using Data_Bus_Req_H

    Data Bus and Command/Address Bus Contention For example, if the sysclk ratio is 7, the Bcache read latency is 5, the bc_clk ratio is 3, and two cycles are necessary for tristate turnoff, then the equations would work out to: cpu_clk sysclk ratio sys_clk...
  • Page 136: Tristate Overlap

    Data Bus and Command/Address Bus Contention To gain control of the data bus, the system must ensure that the Bcache is idle by asserting idle_bc_h for the required time. It can then assert data_bus_req_h. If data_bus_req_h is received asserted at the rising edge of sysclk N, the 21164PC stops driving the bus on the rising edge of sysclk N+1.
  • Page 137: System Read To Fill Spacing

    Data Bus and Command/Address Bus Contention 4.9.5.2 System READ to FILL (System WRITE) Spacing The time to turn off the Bcache drivers at the end of a system READ (Bcache victim or system command with data movement) is fixed by the 21164PC design (refer to Figure 4–24).
  • Page 138: Fill To Private Read Or Write Operation

    21164PC Interface Restrictions 4.9.5.3 FILL to Private READ or WRITE Operation At the end of the fill, the 21164PC does not begin to drive the data bus until the fifth cpu_clk cycle after the sysclk that loads the last dack_h (refer to Figure 4–25). The 21164PC does not assert data_ram_oe_l until the fifth cycle after the sysclk that loads the last dack_h.
  • Page 139: Command Acknowledge For Write Block Commands

    21164PC/System Race Conditions For a WRITE BLOCK operation followed by a fill operation, the earliest point the system can assert the fill_h signal is at the sysclk after the last assertion of dack_h. Fill operations followed by fill operations are special cases. Fill operations can be pipelined back-to-back so that 100% of the data bus bandwidth can be used.
  • Page 140 21164PC/System Race Conditions 6. There is one exception to rules 3, 4, and 5. If idle_bc_h or a system command arrives while the 21164PC is reading the Bcache, and that read transaction turns into a read miss transaction, and it does not produce a victim, then the 21164PC loads the miss into the pad ring.
  • Page 141: Read Miss With Victim Aborted By Fill Example

    21164PC/System Race Conditions 4.11.2 READ MISS with Victim Aborted by FILL Example In Figure 4–26, the 21164PC asserts a READ MISS command with a victim. The system asserts dack_h for two data cycles received from the Bcache and then asserts idle_bc_h.
  • Page 142: Idle_Bc_H And Cack_H Race Examples

    21164PC/System Race Conditions 4.11.3 idle_bc_h and cack_h Race Example In Figure 4–27, idle_bc_h and cack_h are asserted in the same sysclk cycle. The system takes the READ MISS and BCACHE VICTIM commands before doing anything else. The last dack_h meets the requirement that the cack_h arrive before or with the last dack_h.
  • Page 143: Read Miss With Idle_Bc_H Asserted Example

    21164PC/System Race Conditions 4.11.4 READ MISS with idle_bc_h Asserted Example In Figure 4–28, the 21164PC has started a Bcache read operation that misses. The signal idle_bc_h is asserted, but no victim was created, so the read miss request is loaded into the pad ring. The system then takes the request. Figure 4–28 READ MISS with idle_bc_h Asserted Example sys_clk_out1_h READ MISS...
  • Page 144: Read Miss With Victim Abort Example

    21164PC/System Race Conditions 4.11.5 READ MISS with Victim Aborted by System Command Example In Figure 4–29, the 21164PC produces a READ MISS command with a victim and is waiting for the system to take it when the system takes the bus and requests a flush transaction.
  • Page 145: Bcache Hit Under Read Miss Example

    Data Integrity and Bcache Errors 4.11.6 Bcache Hit Under READ MISS Example In Figure 4–30, the 21164PC produces a read miss transaction and requests a fill from the system. A Bcache hit to index j takes place while waiting for the fill. The system then returns the requested data in two bursts, asserting cack_h at the same time as the last assertion of dack_h.
  • Page 146: 21164Pc Interrupt Signals

    Interrupts 4.12.2 Bcache Tag Data Parity The signal line tag_data_par_h is used to maintain parity over tag_data_h<32:19>, tag_valid_h, and tag_dirty_h. A Bcache tag data parity error is usually not recoverable. A Bcache hit is determined based on the tag alone, not the tag parity bit. The CBU records the Bcache probe address and the tag value read from the Bcache.
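    The coverage of the tag parity bit can be illustrated with a short calculation over the 14 tag bits plus the valid and dirty bits. The parity polarity is not stated in this excerpt, so the sketch takes it as a parameter (defaulting to even parity as an assumption); the function name is hypothetical.

```python
def tag_parity(tag: int, valid: bool, dirty: bool, odd: bool = False) -> int:
    """Parity bit over tag_data_h<32:19>, tag_valid_h, and tag_dirty_h.

    `tag` is the 14-bit field value. The polarity (`odd`) is an assumption,
    not something this excerpt specifies.
    """
    bits = (tag & 0x3FFF) | (int(valid) << 14) | (int(dirty) << 15)
    parity = bin(bits).count("1") & 1  # 1 if the covered bits hold an odd number of ones
    return parity ^ 1 if odd else parity
```

    Note that, per the text, a Bcache hit is decided from the tag alone; a check like this would only flag an error, not change the hit or miss result.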
  • Page 147: Interrupt Priority Level Effect

    Interrupts 4.13.1 Interrupt Signals During Initialization The 21164PC interrupt signals work in tandem with the sys_reset_l signal to set the values for clock ratios and clock delays. During initialization, the 21164PC reads system clock configuration parameters from the interrupt pins. Section 4.2.2 and Section 4.2.3 describe how the interrupt signals are used to set system clock values when the system is initialized.
  • Page 148 Interrupts Table 4–11 Interrupt Priority Level Effect (Sheet 2 of 2) Interrupt Source Target IPL Source Software Interrupt Request 15 Internal Asynchronous system trap ATR pending (for Internal current or more privileged mode) Performance counter interrupt Internal Powerfail interrupt pwr_fail_irq_h System machine check interrupt sys_mch_chk_irq_h and internal...
  • Page 149 Internal Processor Registers This chapter describes the 21164PC microprocessor internal processor registers (IPRs). It is organized as follows: • Instruction fetch/decode unit and branch unit (IDU) IPRs • Memory address translation unit (MTU) IPRs • Cache control and bus interface unit (CBU) IPRs •...
  • Page 150: Idu, Mtu, Dcache, And Paltemp Ipr Encodings

    Table 5–1 IDU, MTU, Dcache, and PALtemp IPR Encodings (Sheet 2 of 4) IPR Mnemonic Access Index IDU Slots to Pipe ITB_PTE_TEMP ITB_IA ITB_IAP ITB_IS SIRR ASTRR ASTER EXC_ADDR EXC_SUM R/W0C EXC_MASK PAL_BASE IPLR INTID IFAULT_VA_FORM IVPTBR HWINT_CLR SL_XMIT SL_RCV ICSR IC_FLUSH_CTL ICPERR_STAT...
  • Page 151 Table 5–1 IDU, MTU, Dcache, and PALtemp IPR Encodings (Sheet 3 of 4) IPR Mnemonic Access Index IDU Slots to Pipe PALtemp2 PALtemp3 PALtemp4 PALtemp5 PALtemp6 PALtemp7 PALtemp8 PALtemp9 PALtemp10 PALtemp11 PALtemp12 PALtemp13 PALtemp14 PALtemp15 PALtemp16 PALtemp17 PALtemp18 PALtemp19 PALtemp20 PALtemp21 PALtemp22 PALtemp23...
  • Page 152 Table 5–1 IDU, MTU, Dcache, and PALtemp IPR Encodings (Sheet 4 of 4) IPR Mnemonic Access Index IDU Slots to Pipe DTB_PTE DTB_PTE_TEMP MM_STAT VA_FORM MVPTBR DTB_IAP DTB_IA DTB_IS ALT_MODE CC_CTL MCSR DC_FLUSH DC_PERR_STAT R/W1C DC_TEST_CTL DC_TEST_TAG DC_TEST_TAG_TEMP R/W DC_MODE MAF_MODE 29 September 1997 –...
  • Page 153: Istream Translation Buffer Tag (Itb_Tag) Register

    Instruction Fetch/Decode Unit and Branch Unit (IDU) IPRs 5.1 Instruction Fetch/Decode Unit and Branch Unit (IDU) IPRs The IDU internal processor registers (IPRs) are described in Section 5.1.1 through Section 5.1.27. 5.1.1 Istream Translation Buffer Tag (ITB_TAG) Register (101) ITB_TAG is a write-only register written by hardware on an ITBMISS/IACCVIO, with the tag field of the faulting virtual address.
  • Page 154: Instruction Translation Buffer Page Table Entry (Itb_Pte) Register Write Format

    Instruction Fetch/Decode Unit and Branch Unit (IDU) IPRs contents of the ITB_TAG register. The PTE field is provided by the HW_MTPR ITB_PTE instruction. Write operations to this register use the memory format bits, as described in the Alpha AXP Architecture Reference Manual. Figure 5–2 shows the ITB_PTE register write format.
  • Page 155: Instruction Translation Buffer Address Space Number (Itb_Asn) Register

    Instruction Fetch/Decode Unit and Branch Unit (IDU) IPRs 5.1.3 Instruction Translation Buffer Address Space Number (ITB_ASN) Register (103) ITB_ASN is a read/write register that contains the address space number (ASN) of the current process. Figure 5–4 shows the ITB_ASN register format. Figure 5–4 Instruction Translation Buffer Address Space Number (ITB_ASN) Register RAZ/IGN...
  • Page 156: Instruction Translation Buffer Is (Itb_Is) Register

    Instruction Fetch/Decode Unit and Branch Unit (IDU) IPRs 5.1.6 Instruction Translation Buffer Invalidate All (ITB_IA) Register (105) ITB_IA is a write-only register. A write operation to this register invalidates all ITB entries, and resets the ITB not-last-used (NLU) pointer to its initial state. RESET PALcode must execute an HW_MTPR ITB_IA instruction in order to initialize the NLU pointer.
  • Page 157: Formatted Faulting Virtual Address (Ifault_Va_Form) Register (Nt_Mode=0)

    Instruction Fetch/Decode Unit and Branch Unit (IDU) IPRs 5.1.8 Formatted Faulting Virtual Address (IFAULT_VA_FORM) Register (112) IFAULT_VA_FORM is a read-only register containing the formatted faulting virtual address on an ITBMISS/IACCVIO (except on IACCVIOs generated by sign-check errors). The formatted faulting address generated depends on whether NT superpage mapping is enabled through ICSR bit SPE<0>.
  • Page 158: Virtual Page Table Base (Ivptbr) Register (Nt_Mode=0)

    Instruction Fetch/Decode Unit and Branch Unit (IDU) IPRs 5.1.9 Virtual Page Table Base (IVPTBR) Register (113) IVPTBR is a read/write register. Bits <32:30> are UNDEFINED on a read of this register in non-NT mode. Figure 5–8 shows the IVPTBR register format in non-NT mode.
  • Page 159: Icache Parity Error Status (Icperr_Stat) Register

    Instruction Fetch/Decode Unit and Branch Unit (IDU) IPRs 5.1.10 Icache Parity Error Status (ICPERR_STAT) Register (11A) ICPERR_STAT is a read/write register. The Icache parity error status bits may be cleared by writing a 1 to the appropriate bits. Figure 5–10 and Table 5–3 describe the ICPERR_STAT register format.
  • Page 160: Exception Address (Exc_Addr) Register

    Instruction Fetch/Decode Unit and Branch Unit (IDU) IPRs 5.1.12 Exception Address (EXC_ADDR) Register (10B) EXC_ADDR is a read/write register used to restart the system after exceptions or interrupts. The HW_REI instruction causes a return to the instruction pointed to by the EXC_ADDR register.
  • Page 161: Exception Summary (Exc_Sum) Register

    Instruction Fetch/Decode Unit and Branch Unit (IDU) IPRs Figure 5–12 Exception Summary (EXC_SUM) Register RAZ/IGN RAZ/IGN RAZ/IGN LJ-03484.AI4 Table 5–4 Exception Summary Register Fields Name Extent Type Description <10> Indicates software completion possible. This bit is set after a floating-point instruction containing the /S modifier completes with an arithmetic trap, and if all previous floating-point instructions that trapped since the last HW_MTPR EXC_SUM instruction also contained the /S modifier.
  • Page 162: Exception Mask (Exc_Mask) Register

    Instruction Fetch/Decode Unit and Branch Unit (IDU) IPRs 5.1.14 Exception Mask (EXC_MASK) Register (10D) EXC_MASK is a read/write register that records the destinations of instructions that have caused an arithmetic trap between EXC_MASK write operations. The destination is recorded as a single bit mask in the 64-bit IPR representing F0–F31 and I0–I31.
  • Page 163: Pal Base Address (Pal_Base) Register

    Instruction Fetch/Decode Unit and Branch Unit (IDU) IPRs 5.1.15 PAL Base Address (PAL_BASE) Register (10E) PAL_BASE is a read/write register containing the base address for PALcode. The register is cleared by hardware on reset. Figure 5–14 shows the PAL_BASE register format.
  • Page 164: Idu Control And Status (Icsr) Register

    Instruction Fetch/Decode Unit and Branch Unit (IDU) IPRs 5.1.17 IDU Control and Status (ICSR) Register (118) ICSR is a read/write register containing IDU-related control and status information. Figure 5–16 and Table 5–5 describe the ICSR register format. Figure 5–16 IDU Control and Status (ICSR) Register RAZ/IGN RAZ/IGN PME<1:0>...
  • Page 165 Instruction Fetch/Decode Unit and Branch Unit (IDU) IPRs Table 5–5 IDU Control and Status Register Fields (Sheet 2 of 3) Name Extent Type Description <19> RW,0 If set, enables the motion video instruction (MVI) set. If clear, causes any MVI class instructions to generate a RESDEC trap.
  • Page 166: Interrupt Priority Level (Iplr) Register

    Instruction Fetch/Decode Unit and Branch Unit (IDU) IPRs Table 5–5 IDU Control and Status Register Fields (Sheet 3 of 3) Name Extent Type Description <36> RW,0 If set, forces bad Icache data parity. MBZ in normal operation. <37> RW,1 Reserved to DIGITAL.
  • Page 167: Interrupt Id (Intid) Register

    Instruction Fetch/Decode Unit and Branch Unit (IDU) IPRs 5.1.19 Interrupt ID (INTID) Register (111) INTID is a read-only register that is written by hardware with the target IPL of the highest priority pending interrupt. The hardware recognizes an interrupt if the IPL being read is greater than the IPL given by IPLR<04:00>.
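    The recognition rule above amounts to a single comparison. The sketch below is illustrative only; the function and argument names are hypothetical, and the current IPL is taken from the low five bits of an IPLR value as the text indicates.

```python
def interrupt_recognized(intid_ipl: int, iplr: int) -> bool:
    """True if the pending interrupt's target IPL exceeds the current IPL."""
    current_ipl = iplr & 0x1F  # IPLR<04:00>
    return intid_ipl > current_ipl
```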
  • Page 168: Asynchronous System Trap Request (Astrr) Register

    Instruction Fetch/Decode Unit and Branch Unit (IDU) IPRs 5.1.20 Asynchronous System Trap Request (ASTRR) Register (109) ASTRR is a read/write register containing bits to request asynchronous system trap (AST) interrupts in each of the four processor modes (U,S,E,K). In order to generate an AST interrupt, the corresponding enable bit in the ASTER must be set and the current processor mode given in the ICM<04:03>...
  • Page 169: Software Interrupt Request (Sirr) Register

    Instruction Fetch/Decode Unit and Branch Unit (IDU) IPRs 5.1.22 Software Interrupt Request (SIRR) Register (108) SIRR is a read/write register used to control software interrupt requests. A software request for a particular IPL may be requested by setting the appropriate bit in SIRR<15:01>.
  • Page 170: Hardware Interrupt Clear (Hwint_Clr) Register

    Instruction Fetch/Decode Unit and Branch Unit (IDU) IPRs 5.1.23 Hardware Interrupt Clear (HWINT_CLR) Register (115) HWINT_CLR is a write-only register used to clear edge-sensitive hardware interrupt requests. Figure 5–22 and Table 5–7 describe the HWINT_CLR register format. Figure 5–22 Hardware Interrupt Clear (HWINT_CLR) Register PC0C PC1C PC2C...
  • Page 171: Interrupt Summary (Isr) Register

    Instruction Fetch/Decode Unit and Branch Unit (IDU) IPRs 5.1.24 Interrupt Summary (ISR) Register (100) ISR is a read-only register containing information about all pending hardware, software, and asynchronous system trap (AST) interrupt requests. Figure 5–23 and Table 5–8 describe the ISR register format. Refer to Table 4–11 for a description of which interrupts are enabled for a given interrupt priority level (IPL).
  • Page 172 Instruction Fetch/Decode Unit and Branch Unit (IDU) IPRs Table 5–8 Interrupt Summary Register Fields (Sheet 2 of 2) Name Extent Type Description <22> External hardware interrupt—irq_h<2>. <23> External hardware interrupt—irq_h<3>. <27> External hardware interrupt—performance counter 0 (IPL 29). <28> External hardware interrupt—performance counter 1 (IPL 29).
  • Page 173: Serial Line Transmit (Sl_Xmit) Register

    Instruction Fetch/Decode Unit and Branch Unit (IDU) IPRs 5.1.25 Serial Line Transmit (SL_XMIT) Register (116) SL_XMIT is a write-only register used to transmit bit-serial data out of the microprocessor chip under the control of a software timing loop. The value of the TMT bit is transmitted offchip on the srom_clk_h signal.
  • Page 174: Serial Line Receive (Sl_Rcv) Register

    Instruction Fetch/Decode Unit and Branch Unit (IDU) IPRs 5.1.26 Serial Line Receive (SL_RCV) Register (117) SL_RCV is a read-only register used to receive bit-serial data under the control of a software timing loop. The RCV bit in the SL_RCV register is functionally connected to the srom_data_h signal.
  • Page 175: Performance Counter (Pmctr) Register

    Instruction Fetch/Decode Unit and Branch Unit (IDU) IPRs 5.1.27 Performance Counter (PMCTR) Register (11C) PMCTR is a read/write register that controls the three onchip performance counters. Figure 5–26 and Table 5–11 describe the PMCTR register format. Performance counter interrupt requests are summarized in Section 5.1.24. CBU inputs to the counter select options are described in the PM0_MUX<2:0>...
  • Page 176: Performance Counter Register Fields

    Instruction Fetch/Decode Unit and Branch Unit (IDU) IPRs Table 5–11 Performance Counter Register Fields Name Extent Type Description CTR0<15:0> <63:48> A 16-bit counter of events selected by SEL0 and enabled by CTL0<1:0>. CTR1<15:0> <47:32> A 16-bit counter. SEL0 <31> Counter0 Select—refer to Table 5–12. <30>...
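    As a small sketch of the field layout listed above, the following decodes the two 16-bit counters and the SEL0 bit from a 64-bit PMCTR value. The function name is hypothetical, and only the fields whose extents appear in this excerpt are decoded.

```python
def pmctr_fields(value: int) -> tuple:
    """Return (CTR0, CTR1, SEL0) from a 64-bit PMCTR register value."""
    ctr0 = (value >> 48) & 0xFFFF  # CTR0<15:0> in bits <63:48>
    ctr1 = (value >> 32) & 0xFFFF  # CTR1<15:0> in bits <47:32>
    sel0 = (value >> 31) & 0x1     # SEL0 in bit <31>
    return ctr0, ctr1, sel0
```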
  • Page 177: Pmctr Counter Select Options

    Instruction Fetch/Decode Unit and Branch Unit (IDU) IPRs Table 5–12 shows the PMCTR counter select options. Table 5–12 PMCTR Counter Select Options (Sheet 1 of 2) Counter0 Counter1 Counter2 SEL0<0> SEL1<3:0> SEL2<3:0> 0:Cycles 0x0: nonissue cycles 0x0: long(>15 cycle) stalls Valid instruction in S3 but none issued.
  • Page 178: Measurement Mode Control

    Instruction Fetch/Decode Unit and Branch Unit (IDU) IPRs Table 5–12 PMCTR Counter Select Options (Sheet 2 of 2) Counter0 Counter1 Counter2 SEL0<0> SEL1<3:0> SEL2<3:0> 0xB: Reserved 0xC: CPU cycles 0xD: MB stall cycles 0xE: LDxL instructions issued 0xF: pick CBU<0> input 0xF: pick CBU<1>...
  • Page 179: Dstream Translation Buffer Address Space Number (Dtb_Asn) Register

    Memory Address Translation Unit (MTU) IPRs 5.2 Memory Address Translation Unit (MTU) IPRs The MTU internal processor registers (IPRs) are described in Section 5.2.1 through Section 5.2.23. 5.2.1 Dstream Translation Buffer Address Space Number (DTB_ASN) Register (200) DTB_ASN is a write-only register that must be written with an exact duplicate of the ITB_ASN register ASN field.
  • Page 180: Dstream Translation Buffer Tag (Dtb_Tag) Register

    Memory Address Translation Unit (MTU) IPRs 5.2.3 Dstream Translation Buffer Tag (DTB_TAG) Register (202) DTB_TAG is a write-only register that writes the DTB tag and the contents of the DTB_PTE register to the DTB. To ensure the integrity of the DTBs, the DTB’s PTE array is updated simultaneously from the internal DTB_PTE register when the DTB_TAG register is written.
  • Page 181 Memory Address Translation Unit (MTU) IPRs Read operations of the DTB_PTE require two instructions. First, a read from the DTB_PTE sends the PTE data to the DTB_PTE_TEMP register. A zero value is returned to the integer register file (IRF) on a DTB_PTE read operation. A second instruction reading from the DTB_PTE_TEMP register returns the PTE entry to the register file.
  • Page 182: Dstream Translation Buffer Page Table Entry Temporary (Dtb_Pte_Temp)

    Memory Address Translation Unit (MTU) IPRs 5.2.5 Dstream Translation Buffer Page Table Entry Temporary (DTB_PTE_TEMP) Register (204) DTB_PTE_TEMP is a read-only holding register used for DTB_PTE data. Read operations of the DTB_PTE require two instructions to return the PTE data to the register file.
  • Page 183: Dstream Memory Management Fault Status (Mm_Stat) Register

    5.2.6 Dstream Memory Management Fault Status (MM_STAT) Register (205) MM_STAT is a read-only register that stores information on Dstream faults and Dcache parity errors. The VA, VA_FORM, and MM_STAT registers are locked against further updates until software reads the VA register. The MM_STAT bits are only modified by hardware when the register is not locked and a memory management error, DTB miss, or Dcache parity error occurs.
  • Page 184: Faulting Virtual Address (Va) Register

    Memory Address Translation Unit (MTU) IPRs Table 5–14 Dstream Memory Management Fault Status Register Fields (Sheet 2 of 2) Name Extent Type Description BAD_VA <05> Set if reference had a bad virtual address. <10:06> RA field of the faulting instruction. OPCODE <16:11>...
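The extents listed on this sheet of Table 5–14 can be decoded with a small table-driven helper. This is a sketch only: the name RA for bits <10:06> is taken from the field description (the Name column did not survive extraction), and the fields on sheet 1 of the table are not covered.

```python
# Extents from Table 5-14 (Sheet 2 of 2): name -> (high bit, low bit).
# "RA" is an assumed label based on the field description.
MM_STAT_FIELDS = {
    "BAD_VA": (5, 5),
    "RA": (10, 6),
    "OPCODE": (16, 11),
}


def decode_mm_stat(value: int) -> dict:
    """Return the listed MM_STAT fields from a register image."""
    decoded = {}
    for name, (hi, lo) in MM_STAT_FIELDS.items():
        width = hi - lo + 1
        decoded[name] = (value >> lo) & ((1 << width) - 1)
    return decoded
```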
  • Page 185: Formatted Virtual Address (Va_Form) Register (Nt_Mode=1)

    Memory Address Translation Unit (MTU) IPRs 5.2.8 Formatted Virtual Address (VA_FORM) Register (207) VA_FORM is a read-only register containing the virtual page table entry (PTE) address calculated as a function of the faulting virtual address and the virtual page table base (VA and MVPTBR registers). This is done as a performance enhancement to the Dstream TBmiss PAL flow.
  • Page 186: Mtu Virtual Page Table Base (Mvptbr) Register

    Memory Address Translation Unit (MTU) IPRs Table 5–15 describes the VA_FORM register fields. Table 5–15 Formatted Virtual Address Register Fields Name Extent Type Description NT_Mode=0 VPTB <63:33> Virtual page table base address as stored in MVPTBR. VA<42:13> <32:03> Subset of the original faulting virtual address. NT_Mode=1 VPTB <63:30>...
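For the NT_Mode=0 layout in Table 5–15, the formatted address can be sketched as the virtual page table base in bits <63:33> combined with VA<42:13> placed in bits <32:03>. The function below is an illustration under that reading; it assumes bits <02:00> read as zero, which the excerpt does not state, and it does not model the NT_Mode=1 layout.

```python
def va_form_nt0(va: int, mvptbr: int) -> int:
    """Compose a VA_FORM value for NT_Mode=0 (illustrative sketch).

    VPTB occupies <63:33>; VA<42:13> is placed in <32:03>.
    Bits <02:00> are assumed to be zero.
    """
    vptb = mvptbr & (((1 << 31) - 1) << 33)    # keep bits 63:33 only
    va_field = (va >> 13) & ((1 << 30) - 1)    # VA<42:13>, 30 bits
    return vptb | (va_field << 3)
```

With 8KB pages (VA<12:00> is the byte offset), consecutive virtual pages map to entries 8 bytes apart in the result.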
  • Page 187: Dcache Parity Error Status (Dc_Perr_Stat) Register

    Memory Address Translation Unit (MTU) IPRs 5.2.10 Dcache Parity Error Status (DC_PERR_STAT) Register (212) DC_PERR_STAT is a read/write register that locks and stores Dcache parity error status. The VA, VA_FORM, and MM_STAT registers are locked against further updates until software reads the VA register. If a Dcache parity error is detected while the Dcache parity error status register is unlocked, the error status is loaded into DC_PERR_STAT<05:02>.
  • Page 188: Dcache Parity Error Status Register Fields

    Memory Address Translation Unit (MTU) IPRs Table 5–16 Dcache Parity Error Status Register Fields Name Extent Type Description <00> Set if second Dcache parity error occurred in a cycle after the register was locked. The SEO bit is not set as a result of a second parity error that occurs within the same cycle as the first.
  • Page 189: Dstream Translation Buffer Invalidate Single (Dtb_Is) Register (20B)

    Memory Address Translation Unit (MTU) IPRs 5.2.13 Dstream Translation Buffer Invalidate Single (DTB_IS) Register (20B) DTB_IS is a write-only register. Writing a virtual address to this register invalidates the DTB entry that meets either of the following criteria: • A DTB entry whose VA field matches DTB_IS<42:13> and whose ASN field matches DTB_ASN<63:57>.
  • Page 190: Mtu Control (Mcsr) Register

    Memory Address Translation Unit (MTU) IPRs 5.2.14 MTU Control (MCSR) Register (20F) MCSR is a read/write register that controls features and records status in the MTU. This register is cleared on chip reset but not on timeout reset. Figure 5–39 and Table 5–17 describe the MCSR register format.
  • Page 191: Mtu Control Register Fields

    Table 5–17 MTU Control Register Fields Name Extent Type Description M_BIG_ENDIAN <00> RW,0 MTU Big Endian mode enable. When set, bit 2 of the physical address is inverted for all longword Dstream references. SP<1:0>...
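The M_BIG_ENDIAN behavior described above amounts to toggling bit 2 of the physical address for longword references. A minimal sketch, with a hypothetical function name:

```python
def longword_physical_address(pa: int, m_big_endian: bool) -> int:
    """Return the address used for a longword Dstream reference.

    When M_BIG_ENDIAN is set, bit 2 of the physical address is inverted,
    which swaps the two longwords within an aligned quadword.
    """
    return pa ^ 0x4 if m_big_endian else pa
```

Applying the inversion twice returns the original address, so the mapping is its own inverse.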
  • Page 192: Dcache Mode (Dc_Mode) Register

    Memory Address Translation Unit (MTU) IPRs 5.2.15 Dcache Mode (DC_MODE) Register (216) DC_MODE is a read/write register that controls diagnostic and test modes in the Dcache. This register is cleared on chip reset but not on timeout reset. Figure 5–40 and Table 5–18 describe the DC_MODE register format.
  • Page 193: Dcache Mode Register Fields

    Memory Address Translation Unit (MTU) IPRs Table 5–18 Dcache Mode Register Fields Name Extent Type Description DC_ENA <00> RW,0 Software Dcache enable. When set, the DC_ENA bit enables the Dcache. When clear, the Dcache command is not updated by ST or FILL operations, and all LD operations are forced to miss in the Dcache.
  • Page 194: Miss Address File Mode (Maf_Mode) Register

    Memory Address Translation Unit (MTU) IPRs 5.2.16 Miss Address File Mode (MAF_MODE) Register (217) MAF_MODE is a read/write register that controls diagnostic and test modes in the MTU miss address file (MAF). This register is cleared on chip reset. MAF_MODE<05> is also cleared on timeout reset. Figure 5–41 and Table 5–19 describe the MAF_MODE register format.
  • Page 195: Miss Address File Mode Register Fields

    Table 5–19 Miss Address File Mode Register Fields (Sheet 1 of 2) Name Extent Type Description DREAD_NOMERGE <00> RW,0 Miss address file (MAF) DREAD Merge Disable. When set, this bit disables all merging in the DREAD portion of the MAF.
  • Page 196 Memory Address Translation Unit (MTU) IPRs Table 5–19 Miss Address File Mode Register Fields (Sheet 2 of 2) Name Extent Type Description WB_PENDING <07> This bit indicates the status of the MAF WB file. When set, there are one or more outstanding WB requests in the MAF file.
  • Page 197: Alternate Mode (Alt_Mode) Register

    Memory Address Translation Unit (MTU) IPRs 5.2.17 Dcache Flush (DC_FLUSH) Register (210) DC_FLUSH is a write-only register. A write operation to this register clears all the valid bits in both banks of the Dcache. 5.2.18 Alternate Mode (ALT_MODE) Register (20C) ALT_MODE is a write-only register that specifies the alternate processor mode used by some HW_LD and HW_ST instructions.
  • Page 198: Cycle Counter (Cc) Register

    5.2.19 Cycle Counter (CC) Register (20D) CC is a read/write register. The 21164PC supports it as described in the Alpha AXP Architecture Reference Manual. The low half of the counter, when enabled, increments once each CPU cycle. The upper half of the CC register is the counter offset. An HW_MTPR instruction writes CC<63:32>.
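The split between the counter and its offset can be sketched as follows. This toy model assumes the low half wraps modulo 2^32 without carrying into the offset; the excerpt above does not say so explicitly, and the function names are illustrative.

```python
LOW32 = 0xFFFF_FFFF


def split_cc(cc: int) -> tuple:
    """Return (offset, count): CC<63:32> and CC<31:00>."""
    return (cc >> 32) & LOW32, cc & LOW32


def tick(cc: int, cycles: int = 1) -> int:
    """Advance the low half by `cycles` CPU cycles (counter enabled).

    Assumption: the low half wraps at 2**32 and the offset is unchanged.
    """
    offset, count = split_cc(cc)
    count = (count + cycles) & LOW32
    return (offset << 32) | count
```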
  • Page 199: Cycle Counter Control (Cc_Ctl) Register

    Memory Address Translation Unit (MTU) IPRs 5.2.20 Cycle Counter Control (CC_CTL) Register (20E) CC_CTL is a write-only register that writes the low 32 bits of the cycle counter to enable or disable the counter. Bits CC<31:04> are written with the value in CC_CTL<31:04>...
  • Page 200: Dcache Test Tag Control (Dc_Test_Ctl) Register

    5.2.21 Dcache Test Tag Control (DC_TEST_CTL) Register (213) DC_TEST_CTL is a read/write register used exclusively for testing and diagnostics. An address written to this register is used to index into the Dcache array when reading or writing to the DC_TEST_TAG register.
  • Page 201 Memory Address Translation Unit (MTU) IPRs Table 5–22 Dcache Test Tag Control Register Fields (Sheet 2 of 2) Name Extent Type Description DATA <13> Data for Dcache soft repair. When set, a logic level 1 for the programmable soft repair fuses is sent to the Dcache. When clear, a logic level 0 is sent to the Dcache.
  • Page 202: Dcache Test Tag (Dc_Test_Tag) Register

    Memory Address Translation Unit (MTU) IPRs 5.2.22 Dcache Test Tag (DC_TEST_TAG) Register (214) DC_TEST_TAG is a read/write register used exclusively for testing and diagnostics. When DC_TEST_TAG is read, the value in the DC_TEST_CTL register is used to index into the Dcache. The value in the tag, tag parity, valid, and data parity bits for that index are read out of the Dcache and loaded into the DC_TEST_TAG_TEMP register.
  • Page 203: Dcache Test Tag Register Fields

    Memory Address Translation Unit (MTU) IPRs Table 5–23 Dcache Test Tag Register Fields Name Extent Type Description TAG_PARITY <02> Tag parity. This bit refers to the Dcache tag parity bit that covers tag bits 32 through 13 (valid bits not covered). OW0_VALID <11>...
  • Page 204: Dcache Test Tag Temporary (Dc_Test_Tag_Temp) Register

    5.2.23 Dcache Test Tag Temporary (DC_TEST_TAG_TEMP) Register (215) DC_TEST_TAG_TEMP is a read-only register used exclusively for testing and diagnostics. Reading the Dcache tag array requires a two-step read process: 1. The first read operation from DC_TEST_TAG reads the tag array and data parity bits and loads them into the DC_TEST_TAG_TEMP register.
  • Page 205: Dcache Test Tag Temporary Register Fields

    Memory Address Translation Unit (MTU) IPRs Table 5–24 Dcache Test Tag Temporary Register Fields Name Extent Type Description TAG_PARITY <02> Tag parity. This bit refers to the Dcache tag parity bit that covers tag bits 32 through 13 (valid bits not covered).
  • Page 206: Cbu Internal Processor Register Descriptions

    5.3 External Interface Control (CBU) IPRs Table 5–25 lists specific IPRs for controlling Bcache, system configuration, and logging error information. These IPRs cannot be read or written from the system. They are placed in the 1MB region of 21164PC-specific I/O address space ranging from FF FFF0 0000 to FF FFFF FFFF.
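The address range above can be checked with a few lines of Python. The register addresses are the ones given in the headings of Sections 5.3.1 through 5.3.4; the constant and function names are illustrative assumptions.

```python
CBU_IPR_BASE = 0xFF_FFF0_0000
CBU_IPR_LIMIT = 0xFF_FFFF_FFFF

# Addresses as given in the section headings for each CBU IPR.
CBU_IPRS = {
    "CBOX_CONFIG": 0xFF_FFF0_0008,
    "CBOX_ADDR": 0xFF_FFF0_0088,
    "CBOX_STATUS": 0xFF_FFF0_0108,
    "CBOX_CONFIG2": 0xFF_FFF0_0188,
}


def in_cbu_ipr_space(pa: int) -> bool:
    """True if a physical address falls in the 21164PC-specific region."""
    return CBU_IPR_BASE <= pa <= CBU_IPR_LIMIT
```

The region spans 0x10_0000 bytes, which matches the 1MB size stated in the text.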
  • Page 207: Cbu Configuration (Cbox_Config) Register

    5.3.1 CBU Configuration (CBOX_CONFIG) Register (FF FFF0 0008) CBOX_CONFIG is a read/write register that controls Bcache activity. Figure 5–48 and Table 5–26 describe the CBOX_CONFIG register format. The bits in this register are initialized to the value indicated in Table 5–26 on reset, but not on timeout reset.
  • Page 208 External Interface Control (CBU) IPRs Table 5–26 CBU Configuration Register Fields (Sheet 2 of 3) Name Extent Type Description <13:12> RW,0 This field is used to indicate the size of the Bcache. SIZE<1:0> At power-up, this field is initialized to a value that represents a 512KB Bcache.
  • Page 209 External Interface Control (CBU) IPRs Table 5–26 CBU Configuration Register Fields (Sheet 3 of 3) Name Extent Type Description BC_FILL_DLY_OFF<2:0> <22:20> RW,1 This offset field represents the additional number of CPU cycles to delay the st_clk when processing FILL commands.
  • Page 210: Cbu Address (Cbox_Addr) Register

    5.3.2 CBU Address (CBOX_ADDR) Register (FF FFF0 0088) CBOX_ADDR is a read-only register that contains the physical address associated with errors reported by the CBOX_STATUS register. Its contents are meaningful only when one of the error bits is set. A read of CBOX_STATUS unlocks the CBOX_ADDR register.
  • Page 211: Cbu Status (Cbox_Status) Register

    External Interface Control (CBU) IPRs 5.3.3 CBU Status (CBOX_STATUS) Register (FF FFF0 0108) CBOX_STATUS is a read-only register. It is locked when any of the error bits are set. Additional errors set the MULTI_ERR error bit in CBOX_STATUS. A read of CBOX_STATUS unlocks and clears CBOX_STATUS and unlocks CBOX_ADDR.
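The locking behavior of CBOX_STATUS and CBOX_ADDR can be pictured with a small behavioral model. This is a toy sketch, not a description of the hardware: the class, its method names, and the error labels passed to it are hypothetical, and individual error bits are reduced to a single label.

```python
class CboxErrorModel:
    """Toy model of the CBOX_STATUS / CBOX_ADDR lock described above."""

    def __init__(self):
        self.error = None        # label of the first error; None means unlocked
        self.multi_err = False   # models the MULTI_ERR bit
        self.addr = None         # models CBOX_ADDR contents

    def report(self, error_label: str, pa: int) -> None:
        """Record an error; later errors only set MULTI_ERR while locked."""
        if self.error is None:
            self.error, self.addr = error_label, pa
        else:
            self.multi_err = True

    def read_status(self) -> tuple:
        """Return (error, multi_err), then clear and unlock, as a read does."""
        snapshot = (self.error, self.multi_err)
        self.error, self.multi_err = None, False
        return snapshot
```

In this model, error-handling code would want to capture `addr` before calling `read_status()`, since the status read is what unlocks the address register for the next error.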
  • Page 212 External Interface Control (CBU) IPRs Table 5–28 CBU Status Register Fields (Sheet 2 of 2) Name Extent Type Description TAG_DIRTY <17> RO,0 This bit is the value of the TAG_DIRTY bit for the failing address. If set, the data had been modified and not written to memory.
  • Page 213: Cbu Configuration #2 (Cbox_Config2) Register

    5.3.4 CBU Configuration #2 (CBOX_CONFIG2) Register (FF FFF0 0188) CBOX_CONFIG2 is a read/write register that controls Bcache and memory, the performance counters, and the debug test port. Figure 5–51 and Table 5–29 describe the CBOX_CONFIG2 register format.
  • Page 214 External Interface Control (CBU) IPRs Table 5–29 CBU Configuration #2 Register Fields (Sheet 2 of 3) Name Extent Type Description DBG_SEL=0 DBG_SEL=1 spa req spc != NOP replay req scc code<0> io_wr or scc code<1> rmv req BC_THREE_MISS <6> RW,0 Allow three read misses to be launched to the system. This feature assumes the system can guarantee that fills can be returned in order.
  • Page 215 External Interface Control (CBU) IPRs Table 5–29 CBU Configuration #2 Register Fields (Sheet 3 of 3) Name Extent Type Description PM1_MUX<2:0> <13:11> RW,0 This field selects the CBU events used for performance counter #1. PM1_MUX <2:0> Counter 1 is used to count: Bcache Dstream read requests (the total number of Dstream read requests from the MTU).
  • Page 216: Palcode Storage Registers

    PALcode Storage Registers 5.4 PALcode Storage Registers The 21164PC IEU register file has eight extra registers that are called the PALshadow registers. The PALshadow registers overlay R8 through R14 and R25 when the CPU is in PALmode and ICSR<SDE> is set. Thus, PALcode can consider R8 through R14 and R25 as local scratch.
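The overlay rule above can be written down as a short predicate. This is only a sketch of the stated condition; the names are illustrative, and the register numbers are the integer register indices R8 through R14 and R25.

```python
# R8..R14 plus R25: the eight registers overlaid by the PALshadow registers.
PAL_SHADOW_REGS = frozenset(range(8, 15)) | {25}


def uses_shadow(reg: int, palmode: bool, sde: bool) -> bool:
    """True if an access to integer register `reg` goes to a PALshadow register.

    The overlay applies only in PALmode with ICSR<SDE> set.
    """
    return palmode and sde and reg in PAL_SHADOW_REGS
```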
  • Page 217: Cbu Ipr Palcode Restrictions

    Restrictions Table 5–30 CBU IPR PALcode Restrictions (Sheet 2 of 2) Condition Restriction Any undefined CBU IPR address. No store instructions. Bcache in force hit mode. No STx_C to cacheable space. Clearing of BC_FORCE_HIT in Must be followed by MB, read operation of CBOX_CONFIG.
  • Page 218 Restrictions Table 5–31 PALcode Restrictions Table (Sheet 2 of 5) Y if checked The following in cycle 0: Restrictions (Note: Numbers refer to cycle number): by PVC Any store instruction No HW_MFPR DC_PERR_STAT in 1,2. No HW_MFPR MAF_MODE in 1,2 (WB_PENDING may not be updated).
  • Page 219 Restrictions Table 5–31 PALcode Restrictions Table (Sheet 3 of 5) Y if checked The following in cycle 0: Restrictions (Note: Numbers refer to cycle number): by PVC HW_MTPR ICSR: SPE, If HW_REI_STALL, then no HW_REI_STALL in 0,1. If HW_REI, then no HW_REI in 0,1,2,3,4. HW_MTPR ICSR: SPE Must flush Icache.
  • Page 220 Restrictions Table 5–31 PALcode Restrictions Table (Sheet 4 of 5) Y if checked The following in cycle 0: Restrictions (Note: Numbers refer to cycle number): by PVC HW_MTPR No load or store instructions in 1. DC_PERR_STAT No HW_MFPR DC_PERR_STAT in 1,2. HW_MTPR No HW_MFPR DC_TEST_TAG in 1,2,3.
  • Page 221 Restrictions Table 5–31 PALcode Restrictions Table (Sheet 5 of 5) Y if checked The following in cycle 0: Restrictions (Note: Numbers refer to cycle number): by PVC HW_MFPR ITB_PTE No HW_MFPR ITB_PTE_TEMP in 1,2,3. HW_MFPR No outstanding DC fills in 0. DC_TEST_TAG No HW_MFPR DC_TEST_TAG_TEMP issued or slotted in 1.
  • Page 223: Palcode Description

    Privileged Architecture Library Code This chapter describes the 21164PC privileged architecture library code (PALcode). The chapter is organized as follows: • PALcode description • PALmode environment • Invoking PALcode • PALcode entry points • Required PALcode function codes • 21164PC implementation of the architecturally reserved opcodes 6.1 PALcode Description Privileged architecture library code (PALcode) is macrocode that provides an architecturally defined operating-system-specific programming interface that is common...
  • Page 224: Palmode Environment

    PALmode Environment • CALL_PAL instructions PALcode has characteristics that make it appear to be a combination of microcode, ROM BIOS, and system service routines, though the analogy to any of these other items is not exact. PALcode exists for several major reasons: •...
  • Page 225: Invoking Palcode

    Invoking PALcode • The program has privileged access to all the computer hardware. Most of the functions handled by PALcode are privileged and need control of the lowest levels of the system. • Interrupts are disabled. If a long sequence of instructions needs to be executed atomically, interrupts cannot be allowed.
  • Page 226 Invoking PALcode behaves as if the PC were still longword aligned for the purposes of Istream fetch and execute. On HW_REI, the new state of PALmode is copied from EXC_ADDR<00>. When an event occurs that needs to invoke PALcode, the 21164PC first drains the pipeline.
  • Page 227: Palcode Entry Points

    6.4 PALcode Entry Points PALcode is invoked at specific entry points. The 21164PC has two types of PALcode entry points: CALL_PAL and traps. 6.4.1 CALL_PAL Entry CALL_PAL entry points are used whenever the IDU encounters a CALL_PAL instruction in the instruction stream (Istream).
  • Page 228: Palcode Trap Entry Points

    PALcode Entry Points • PC<05:01> = 0 • PC<00> = 1 (PALmode) The minimum number of cycles for a CALL_PAL execution is four. Number of Cycles Description Minimum TRAPB for empty pipe. Typically this will be four cycles. Issue the CALL_PAL instruction. The minimum length of a PAL flow.
  • Page 229: Required Palcode Function Codes

    Required PALcode Function Codes Table 6–1 PALcode Trap Entry Points (Sheet 2 of 2) Entry Name Offset Description MCHK 0400 Uncorrected hardware error OPCDEC 0480 Illegal opcode ARITH 0500 Arithmetic exception 0580 Floating-point operation attempted with: • Floating-point instructions (LD, ST, and operates) disabled through FPE bit in the ICSR IPR •...
  • Page 230: Hw_Ld Instruction

    21164PC Implementation of the Architecturally Reserved Opcodes Note: These architecturally reserved opcodes contain different options from the 21064 opcodes of the same names. Table 6–3 Opcodes Reserved for PALcode 21164PC Architecture Mnemonic Opcode Mnemonic Function HW_LD PAL1B Performs Dstream load instructions. HW_ST PAL1F Performs Dstream store instructions.
  • Page 231: Hw_Ld Instruction Format

    21164PC Implementation of the Architecturally Reserved Opcodes Figure 6–1 HW_LD Instruction Format OPCODE DISP LOCK VPTE QUAD WRTCK PHYS LJ-03469.AI4 Table 6–4 HW_LD Format Description Field Value Description OPCODE The OPCODE field contains 1B — Destination register number. — Base register for memory address. PHYS The effective address for the HW_LD is virtual.
  • Page 232: Hw_St Instruction

    6.6.2 HW_ST Instruction PALcode uses the HW_ST instruction to access memory outside of the realm of normal Alpha memory management and to do special forms of Dstream store instructions. Figure 6–2 and Table 6–5 describe the format and fields of the HW_ST instruction.
  • Page 233: Hw_Rei Instruction

    21164PC Implementation of the Architecturally Reserved Opcodes 6.6.3 HW_REI Instruction The HW_REI instruction is used to return instruction flow to the PC pointed to by the EXC_ADDR IPR. The value in EXC_ADDR<0> will be used as the new value of PALmode after the HW_REI instruction. The IDU uses the return prediction stack to speed the execution of HW_REI.
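The way HW_REI interprets EXC_ADDR can be sketched as follows. The function is illustrative only: it takes the new PALmode from EXC_ADDR<0> as stated above, and it assumes the fetch address is obtained by treating the PC as longword aligned (clearing the two low bits), following the statement in the Invoking PALcode section that the hardware behaves as if the PC were longword aligned.

```python
MASK64 = (1 << 64) - 1


def hw_rei_target(exc_addr: int) -> tuple:
    """Return (pc, palmode) implied by an EXC_ADDR value.

    palmode comes from EXC_ADDR<0>; the low two bits are cleared for
    the fetch address (an assumption based on longword alignment).
    """
    palmode = bool(exc_addr & 1)
    pc = exc_addr & ~0x3 & MASK64
    return pc, palmode
```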
  • Page 234: Hw_Mfpr And Hw_Mtpr Instructions

    21164PC Implementation of the Architecturally Reserved Opcodes 6.6.4 HW_MFPR and HW_MTPR Instructions The HW_MFPR and HW_MTPR instructions are used to access internal state from the IDU, MTU, and Dcache. The HW_MFPR from IDU IPRs has a latency of one cycle (HW_MFPR in cycle n results in data available to the using instruction in cycle n+1).
  • Page 235: Initialization And Configuration

    Initialization and Configuration This chapter provides information on 21164PC-specific microprocessor/system initialization and configuration. It is organized as follows: • Input signals sys_reset_l and dc_ok_h and booting • sysclk ratio and delay • Built-in self-test (BiSt) • Serial read-only memory (SROM) interface port •...
  • Page 236: Input Signals Sys_Reset_L And Dc_Ok_H And Booting

    Input Signals sys_reset_l and dc_ok_h and Booting After power has reached the proper operating point, signal dc_ok_h must be asserted. Then, signal sys_reset_l must be deasserted. At this point, the 21164PC recognizes a powered-up state. If signal dc_ok_h is not asserted, signal sys_reset_l is forced asserted internally.
  • Page 237: 21164Pc Signal Pin Reset State

    Input Signals sys_reset_l and dc_ok_h and Booting Table 7–1 provides the reset state of each external signal pin. Table 7–1 21164PC Signal Pin Reset State (Sheet 1 of 3) Signal Reset State Clocks clk_mode_h<1:0> NA (input). cpu_clk_out_h Clock output. osc_clk_in_h,l Must be clocking.
  • Page 238 Input Signals sys_reset_l and dc_ok_h and Booting Table 7–1 21164PC Signal Pin Reset State (Sheet 2 of 3) Signal Reset State System Interface addr_h<39:4> Driven or tristated depending upon addr_bus_req_h at most recent sysclk edge. If driven, the value is unspecified. addr_bus_req_h NA (input).
  • Page 239: Pin State With Dc_Ok_H Not Asserted

    Input Signals sys_reset_l and dc_ok_h and Booting Table 7–1 21164PC Signal Pin Reset State (Sheet 3 of 3) Signal Reset State srom_data_h NA (input). srom_oe_l Deasserted. srom_present_l NA (input). tck_h NA (input). tdi_h NA (input). tdo_h NA (input). temp_sense NA (input). test_status_h<1>...
  • Page 240: Sysclk Ratio And Delay

    7.2 sysclk Ratio and Delay While in reset, the 21164PC reads sysclk configuration parameters from the interrupt signal pins. These inputs should be driven with the correct configuration values whenever sys_reset_l is asserted. Refer to Section 4.2.2 and Section 4.2.3 for relevant input signals and ratio/delay values.
  • Page 241: Serial Instruction Cache Load Operation

    Serial Read-Only Memory Interface Port If srom_present_l is asserted during setup, then the system performs an SROM load as follows: 1. The srom_oe_l signal supplies the output enable to the SROM. 2. The srom_clk_h signal supplies the clock to the ROM that causes it to advance to the next bit.
  • Page 242: Serial Terminal Port

    Serial Terminal Port srom_data_h serial input -> BHT Array 0 -> 1 -> ... -> 7 -> Data 127 -> 95 -> 126 -> 94 -> ... -> 96 -> 64 -> Predecodes 19 -> 14 -> 18 -> 13 -> ... -> 15 -> 10 -> Data parity 1 ->...
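The interleaved bit order shown for the Data and Predecodes fields can be generated programmatically. The sketch below assumes the elided middle of each sequence continues the same two-stream pattern implied by its endpoints (127 -> 95 -> ... -> 96 -> 64 and 19 -> 14 -> ... -> 15 -> 10); the helper name is hypothetical, and the fields after Data parity are not modeled because they are cut off in the excerpt.

```python
def interleaved(first_start: int, second_start: int, pairs: int) -> list:
    """Alternate two descending bit streams: a, b, a-1, b-1, ..."""
    order = []
    for i in range(pairs):
        order.append(first_start - i)
        order.append(second_start - i)
    return order


BHT_ORDER = list(range(8))                   # 0 -> 1 -> ... -> 7
DATA_ORDER = interleaved(127, 95, 32)        # 127 -> 95 -> ... -> 96 -> 64
PREDECODE_ORDER = interleaved(19, 14, 5)     # 19 -> 14 -> ... -> 15 -> 10
```

Under this assumption the Data field covers bits 127 through 64 exactly once each, which is a quick sanity check on the pattern.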
  • Page 243: Icache Initialization

    7.6.1 Icache Initialization The Icache is not kept coherent with memory. When it is necessary to make it coherent with memory, the following procedure is used. The CALL_PAL IMB function uses this procedure. 1.
  • Page 244: Internal Processor Register Reset State

    7.8 Internal Processor Register Reset State Many IPR bits are not initialized by reset. They are located in error-reporting registers and other IPR states. They must be initialized by initialization PALcode. Table 7–2 lists the state of all internal processor registers (IPRs) immediately following reset.
  • Page 245 Internal Processor Register Reset State Table 7–2 Internal Processor Register Reset State (Sheet 2 of 3) Reset State Comments INTID UNDEFINED ASTRR UNDEFINED PALcode must initialize. ASTER UNDEFINED PALcode must initialize. SIRR UNDEFINED PALcode must initialize. HWINT_CLR UNDEFINED PALcode must initialize. UNDEFINED SL_XMIT Cleared...
  • Page 246: Timeout Reset

    Timeout Reset Table 7–2 Internal Processor Register Reset State (Sheet 3 of 3) Reset State Comments MCSR Cleared Cleared on chip reset but not on timeout reset. DC_MODE Cleared Cleared on chip reset but not on timeout reset. MAF_MODE Cleared Cleared on chip reset.
  • Page 247: Ieee 1149.1 Test Port Reset

    IEEE 1149.1 Test Port Reset 7.10 IEEE 1149.1 Test Port Reset Signal trst_l must be asserted when sys_reset_l is asserted or when dc_ok_h is deasserted. Continuous trst_l assertion during normal operation is used to guarantee that the IEEE 1149.1 test port does not affect 21164PC operation. 29 September 1997 –...
  • Page 249: Error Detection And Error Handling

    Error Detection and Error Handling This chapter provides an overview of the error handling strategy of the 21164PC. Each internal cache (instruction cache [Icache] and data cache [Dcache]) implements parity protection for tag and data. Longword parity protection is implemented for memory and backup cache (Bcache) data.
  • Page 250: Dcache Data Parity Error

    Error Flows Note: The Icache is not flushed by hardware in this event. If an Icache parity error occurs early in the PALcode routine at the machine check entry point, an infinite loop may result. • Recommendation: Flush the Icache early in the MCHK routine. 8.1.2 Dcache Data Parity Error •...
  • Page 251: Istream Data Parity Errors (Bcache Or Memory)

    Error Flows • Probably will not be able to recover by deleting a single process, because exact address is unknown, and a load may have falsely hit. 8.1.4 Istream Data Parity Errors (Bcache or Memory) • Machine check occurs before the instruction causing the error is executed. •...
  • Page 252: Bcache Tag Parity Errors-Istream

    Error Flows • CBOX_STATUS: <MEMORY> is set if source of fill data is memory/system; is clear if source is Bcache. • CBOX_ADDR: Contains the physical address bits <39:04> of the octaword associated with the error. 8.1.6 Bcache Tag Parity Errors—Istream •...
  • Page 253: System Read Operations Of The Bcache

    Error Flows • CBOX_ADDR: Contains the physical address bits <39:04> of the octaword associated with the error. 8.1.8 System Read Operations of the Bcache The 21164PC does not check the parity on outgoing Bcache data. If it is bad, the receiving processor will detect it.
  • Page 254: Mchk Flow

    MCHK Flow to determine if certain hardware is present). The purpose of this error detection mechanism is to attempt to prevent system hang in order to write a machine check stack frame. • ICPERR_STAT: <TMR> is set. 8.2 MCHK Flow The following flow is the recommended IPR access order to determine the source of a machine check.
  • Page 255: Mck_Interrupt Flow

    MCK_INTERRUPT Flow • If none of the previous conditions are true, then there is either an IRD that can be retried or the source of the MCHK is a fill_error_h. Add code for query of system status. • The case can be retried if any one or several of the following are true (and none of the previous conditions were true): –...
  • Page 257: 21164Pc Absolute Maximum Ratings

    Electrical Data This chapter describes the electrical characteristics of the 21164PC component and its interface pins. It is organized as follows: • Electrical characteristics • dc characteristics • Clocking scheme • ac characteristics • Power supply considerations 9.1 Electrical Characteristics Table 9–1 lists the maximum ratings for the 21164PC and Table 9–2 lists the operating voltages.
  • Page 258: Operating Voltages

    DC Characteristics Table 9–1 21164PC Absolute Maximum Ratings (Sheet 2 of 2) Characteristics Ratings −0.5 V to 4.6 V Signal input or output applied Typical Vdd worst case power @ Vdd = 3.3 V Frequency = 400 MHz 2.5 W Frequency = 466 MHz 2.5 W Frequency = 533 MHz...
  • Page 259: Cmos Dc Input/Output Characteristics

    DC Characteristics Vclamp will be clamped to Vclamp provided that the current does not exceed Iclamp. The 21164PC may be damaged if the voltage exceeds Vclamp or the current exceeds Iclamp. 9.2.3 Output Signal Pins Output pins are ordinary 3.3-V CMOS outputs. Although output signals are rail-to-rail, timing is specified to Vdd/2.
  • Page 260 DC Characteristics Table 9–3 CMOS DC Input/Output Characteristics (Sheet 2 of 2) Parameter Requirements Symbol Description Min. Max. Units Test Conditions Iozh_pu Output with pull-up leakage — ±100 µA Vin = Vdd V current (tristate) Vclamp Maximum clamping voltage — Vdd + 1.0 V Iclamp = 100 mA Peak power supply current for...
  • Page 261: Clocking Scheme

    Clocking Scheme 9.3 Clocking Scheme The differential input clock signals osc_clk_in_h,l run at the internal frequency of the time base for the 21164PC. The output signal cpu_clk_out_h toggles with an unspecified propagation delay relative to the transitions on osc_clk_in_h,l. The 21164PC provides a system clock to run the chip synchronous to the system. The 21164PC generates and drives out a system clock, sys_clk_out1_h.
  • Page 262: Osc_Clk_In_H,L Input Network And Terminations

    Figure 9–1 osc_clk_in_h,l Input Network and Terminations (schematic; regions: Module Circuitry and Onchip Circuitry; labeled elements: oscillator, osc_clk_in_h, osc_clk_in_l, differential amplifier, VREF; labeled component values: 50 ohms, 3.5 nH, 4.0 pF, 6.0 pF)...
  • Page 263: Impedance Vs Clock Input Frequency

    Clocking Scheme allows a clock source of arbitrary dc bias to be ac coupled to the 21164PC. The peak-to-peak amplitude of the clock source must be between 0.6 V and 3.0 V. Either a square-wave or a sinusoidal source may be used. Full-rail clocks may be driven by testers.
  • Page 264: Input Clock Specification

    AC Characteristics 9.3.3 AC Coupling Using series coupling (blocking) capacitors renders the 21164PC clock input pins insensitive to the oscillator’s dc level. When connected this way, oscillators with any dc offset relative to Vss can be used provided they can drive a signal into the osc_clk_in_h,l pins with a peak-to-peak level of at least 600 mV, but no greater than 3.0 V peak-to-peak.
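The amplitude requirement for an ac-coupled clock source reduces to a range check on the peak-to-peak swing. A minimal sketch, with a hypothetical function name; it checks only the 600 mV to 3.0 V window stated above and ignores the dc level, since the blocking capacitors make the inputs insensitive to it.

```python
V_PP_MIN = 0.6   # volts, minimum peak-to-peak level at osc_clk_in_h,l
V_PP_MAX = 3.0   # volts, maximum peak-to-peak level


def osc_amplitude_ok(v_high: float, v_low: float) -> bool:
    """True if the source swing (v_high - v_low) is within the allowed window."""
    v_pp = v_high - v_low
    return V_PP_MIN <= v_pp <= V_PP_MAX
```

For example, a 1.0 V swing passes whether it sits between 1.0 V and 2.0 V or between 4.0 V and 5.0 V.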
  • Page 265: Input/Output Pin Timing

    Figure 9–3 Input/Output Pin Timing (input timing panel: internal CPU clock, Tcycle, Tdsu, input signals with 0.8 V and 2.0 V levels; output timing panel: internal CPU clock, output signals) Because the speed and complexity of microprocessors have increased substantially over the years, it is necessary to change the way they are tested. Traditional assumptions that all loads can be lumped into some accumulation of capacitance can no longer be employed.
  • Page 266: Bcache Loop Timing

    AC Characteristics There is no source termination resistor in the 21164PC fabricated in 0.35-µm CMOS process technology. The source impedance of the driver is approximately 32 Ω ±17. The circuit is designed to deliver a TTL signal under worst-case conditions. Under light load, high drive voltages, and fast process conditions there may be considerable overdrive.
  • Page 267: Normal Output Driver Characteristics

    AC Characteristics Outgoing Bcache index and data signals are driven off the internal clock edge and the incoming Bcache tag and data signals are latched on the same internal clock edge. Table 9–6 and Table 9–7 show the output driver characteristics for the normal driver and big driver, respectively.
  • Page 268: Bcache Timing

    AC Characteristics Output pin timing is specified for lumped 40-pF and 10-pF loads for the normal driver and lumped 60-pF, 40-pF, and 10-pF loads for the big driver. In some cases, the circuit may have loads higher than 40 pF (60 pF for big driver). The 21164PC can safely drive higher loads provided the average charging or discharging current from each pin is 11 mA or less for normal output drivers or 25 mA or less for big output drivers.
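The per-pin current limits above can be turned into a rough load check. The sketch below assumes the average charging current of a lumped capacitive load is I = C × V / T (one full charge per waveform period); that relation, the function names, and the voltage swing parameter are assumptions for illustration and are not taken from this excerpt. Only the 11 mA and 25 mA limits come from the text.

```c
#include <stdbool.h>

/* Average-current limits quoted in the text. */
#define NORMAL_DRIVER_LIMIT_MA 11.0
#define BIG_DRIVER_LIMIT_MA    25.0

/* Assumed model: I_avg = C * V / T.
 * With C in pF, V in volts, and T in ns, the result is in mA
 * (1 pF * 1 V / 1 ns = 1 mA). */
static double avg_pin_current_ma(double c_pf, double swing_v, double period_ns)
{
    return c_pf * swing_v / period_ns;
}

/* Hypothetical helper: true if the load stays within the given limit. */
static bool load_within_limit(double c_pf, double swing_v,
                              double period_ns, double limit_ma)
{
    return avg_pin_current_ma(c_pf, swing_v, period_ns) <= limit_ma;
}
```

Under this model, a heavier load is acceptable on a pin that toggles less often, which matches the text's statement that loads above 40 pF can be driven if the average current stays within the limit.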
  • Page 269: 21164Pc System Clock Output Timing

    AC Characteristics 9.4.2.2 sys_clk-Based Systems All timing is specified relative to the rising edge of the internal CPU clock. Table 9–8 shows 21164PC system clock sys_clk_out1_h output timing. Setup and hold times are specified independent of the relative capacitive loading of sys_clk_out1_h,l, addr_h<39:4>, data_h<127:0>, and cmd_h<3:0>...
  • Page 270: Sys_Clk System Timing

    AC Characteristics Figure 9–5 shows sys_clk system timing. Figure 9–5 sys_clk System Timing Relationship of CPU Clock and sys_clk_out1 CPU Clock Tsysd sys_clk_out1 Memory Read (Pipe_Latch Mode) sys_clk_out1 Tsysd Tsysd Tsysd CPU Clock Taod Taoh Address/Command Out Ttacksu dack Tdsu Data In Memory Read (Non-Pipe_Latch Mode) sys_clk_out1...
  • Page 271: Input Timing For Sys_Clk_Out-Based Systems

    AC Characteristics 9.4.3 Timing—Additional Signals This section lists timing for all other signals. Asynchronous Input Signals The following is a list of the asynchronous input signals: clk_mode_h<1:0> dc_ok_h sys_reset_l irq_h<3:0> mch_hlt_irq_h pwr_fail_irq_h sys_mch_chk_irq_h These signals can also be used synchronously. Miscellaneous Signals Table 9–9 and Table 9–10 list the timing for miscellaneous input-only and output- only signals.
  • Page 272: Output Timing For Sys_Clk_Out-Based Systems

    AC Characteristics Table 9–10 Output Timing for sys_clk_out-Based Systems Signal Specification Value Name Unidirectional Signals addr_res_h, int4_valid_h, srom_clk_h, Output delay Tdd + 0.2 ns Taod srom_oe_l, victim_pending_h addr_res_h, int4_valid_h, srom_clk_h, Output hold Tmdd Taoh srom_oe_l, victim_pending_h int4_valid_h Output delay Tdd + Tcycle + 0.2 ns Tdod int4_valid_h Output hold...
  • Page 273: Bcache Control Signal Timing

    AC Characteristics Signals in Table 9–11 are used to control Bcache data transfers. These signals are driven off the CPU clock. The timing of these signals does not change when switching over to the sys_clk_out timing domain. Table 9–11 Bcache Control Signal Timing Signal Specification Value...
  • Page 274: Bist Timing Event -Timeline

    AC Characteristics Figure 9–6 BiSt Timing Event —Timeline Deassert BiSt Start Deassert* BiSt Done sys_reset_l (test_status_h<1:0>=01) Internal Reset (test_status_h<1:0>=00) (T%Z_RESET_B_L) The timing for deassertion of internal reset (time t , see asterisk) is valid only if an SROM is not present (indicated by keeping signal srom_present_l deasserted). If an SROM is present, the SROM load is performed once the BiSt completes.
  • Page 275: Srom Load Timing Event-Timeline

    AC Characteristics 9.4.4.2 Automatic SROM Load Timing The SROM load is triggered by the conclusion of BiSt if srom_present_l is asserted. The SROM load occurs at the internal cycle time of approximately 126 CPU cycles for srom_clk_h, but the behavior at the pins may shift slightly. Refer to Chapter 7 for more information on input signals, booting, and the SROM interface port.
  • Page 276: Serial Rom Load Timing

    AC Characteristics Figure 9–8 is a timing diagram of an SROM load sequence. Figure 9–8 Serial ROM Load Timing sys_reset_l srom_oe_l srom_clk_h srom_data_h = 4 x sysclk period + 1.1 ns 131, 072 Bits Total = 0 ns MK145507B The minimum srom_clk_h cycle = (126 − sysclk ratio) × (CPU cycle time). The maximum srom_clk_h to srom_data_h delay allowable (in order to meet the required setup time) = [126 −...
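The minimum srom_clk_h cycle expression above can be evaluated directly. The following is a minimal sketch in C; the function names are illustrative, and the load-time estimate assumes one SROM bit is shifted per srom_clk_h cycle, which this excerpt does not state. The constant 126 and the 131,072-bit total are taken from the text and Figure 9–8.

```c
/* Total number of bits in an SROM load, per Figure 9-8. */
#define SROM_TOTAL_BITS 131072L

/* Minimum srom_clk_h cycle in ns, per the expression in the text:
 * (126 - sysclk ratio) * (CPU cycle time). */
static double srom_clk_min_cycle_ns(int sysclk_ratio, double cpu_cycle_ns)
{
    return (126 - sysclk_ratio) * cpu_cycle_ns;
}

/* Assumed estimate: lower bound on the load time in ms if one bit is
 * shifted in per srom_clk_h cycle at the minimum cycle time. */
static double srom_load_min_time_ms(int sysclk_ratio, double cpu_cycle_ns)
{
    return SROM_TOTAL_BITS *
           srom_clk_min_cycle_ns(sysclk_ratio, cpu_cycle_ns) / 1.0e6;
}
```

For example, with a hypothetical sysclk ratio of 4 and a 2.5 ns CPU cycle (400 MHz), the minimum srom_clk_h cycle works out to 122 × 2.5 ns = 305 ns.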
  • Page 277: Ieee 1149.1 Circuit Performance Specifications

    Power Supply Considerations Table 9–16 lists the clock test modes. Table 9–16 Clock Test Modes clk_mode_h Mode <1> <0> Notes Normal (1×) clock mode Normal (1×) clock mode Symmetrator is enabled. Clock reset Clock reset Symmetrator is enabled. 9.4.6 IEEE 1149.1 (JTAG) Performance Table 9–17 lists the standard mandated performance specifications for the IEEE 1149.1 circuits.
  • Page 278: Decoupling

    Power Supply Considerations Plus 5 V is not used in the 21164PC. The voltage difference between the Vdd pins and Vss pins must never be greater than 3.46 V, and the voltage difference between the Vddi pins and Vss pins must never be greater than 2.6 V. If the differentials exceed these limits, the 21164PC chip will be damaged.
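A power-monitoring routine could encode the two differential limits above as follows. This is a minimal sketch in C; the function name and the millivolt representation are assumptions for illustration, and only the 3.46 V and 2.6 V limits come from the text.

```c
#include <stdbool.h>

/* Maximum supply differentials quoted in the text, in millivolts. */
#define VDD_MAX_DIFF_MV  3460  /* Vdd  - Vss must never exceed 3.46 V */
#define VDDI_MAX_DIFF_MV 2600  /* Vddi - Vss must never exceed 2.6 V  */

/* Hypothetical helper: true if both rails are within their limits. */
static bool supply_within_limits(int vdd_mv, int vddi_mv, int vss_mv)
{
    return (vdd_mv - vss_mv) <= VDD_MAX_DIFF_MV &&
           (vddi_mv - vss_mv) <= VDDI_MAX_DIFF_MV;
}
```

Both differentials are measured against Vss, so a shifted ground reference is accounted for by passing the measured Vss level rather than assuming 0 V.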
  • Page 279: Power Supply Sequencing

    Power Supply Considerations Use capacitors that are as physically small as possible. Connect the capacitors directly to the 21164PC Vddi and Vss pins by short surface etch (0.64 cm [0.25 in] or less). The small capacitors generally have better electrical characteristics than the larger units, and will more readily fit close to the IPGA pin field.
  • Page 280 Power Supply Considerations There is no derating for shorter transient periods or lower transient voltages (for example, a 400-mV transient voltage lasting for 100 µs is not acceptable). All input and bidirectional signals are diode-clamped to Vdd and Vss. A current greater than Iclamp on an individual pin could damage the 21164PC.
  • Page 281: A At Various Airflows

    Thermal Management This chapter describes the 21164PC thermal management and thermal design considerations. 10.1 Operating Temperature The 21164PC is specified to operate when the temperature at the center of the heat sink (Tc) is 71.8°C for 400 MHz, 69.8°C for 466 MHz, or 67.5°C for 533 MHz. Temperature (Tc) should be measured at the center of the heat sink (between the two package studs).
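The three temperature figures above can be kept in a small lookup for use in thermal monitoring code. The following is a minimal sketch in C; the function name and the tenths-of-a-degree representation are illustrative assumptions, and the three values are the ones quoted in the text.

```c
/* Specified heat-sink center temperature, in tenths of a degree Celsius,
 * for the three frequencies quoted in the text.
 * Returns -1 for any frequency the text does not list. */
static int heat_sink_temp_spec_tenths_c(int freq_mhz)
{
    switch (freq_mhz) {
    case 400: return 718;  /* 71.8 C */
    case 466: return 698;  /* 69.8 C */
    case 533: return 675;  /* 67.5 C */
    default:  return -1;   /* not specified in this excerpt */
    }
}
```

Integer tenths avoid floating-point comparison when a measured value is checked against the specification.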
  • Page 282: Maximum T A At Various Airflows

    Operating Temperature Table 10–2 Maximum Ta at Various Airflows Airflow (linear ft/min) 1000 Frequency: 400 MHz, Power: 26.5 W @Vdd = 3.3 V and @Vddi = 2.5 V with heat sink 1 (°C) — 26.8 46.6 51.9 54.6 57.2 with heat sink 2 (°C) 51.9 (includes 52×10 mm fan) Frequency: 466 MHz, Power: 30.5 W @Vdd = 3.3 V and @Vddi = 2.5 V...
  • Page 283: Heat-Sink Specifications

    Heat-Sink Specifications 10.2 Heat-Sink Specifications Figure 10–1 describes the specifications of heat sink 1. Heat sink 2 has the exact same specifications, plus an added 52×10 mm fan. Figure 10–1 Heat Sink 1 (1.870 in) 4.75 cm 4.75 cm 4.20 cm (1.655 in) (1.870 in) 2.16 cm...
  • Page 284: Thermal Design Considerations

    Thermal Design Considerations 10.3 Thermal Design Considerations Follow these guidelines for printed circuit board (PCB) component placement: • Orient the 21164PC on the PCB with the heat-sink fins aligned with the airflow direction. • Avoid preheating ambient air. Place the 21164PC on the PCB so that inlet air is not preheated by any other PCB components.
  • Page 285: Mechanical Specifications

    Mechanical Packaging Information This chapter describes the 21164PC mechanical packaging including chip package physical specifications and a signal/pin list. For heat-sink dimensions, refer to Chapter 10. 11.1 Mechanical Specifications Figure 11–1 shows the package physical dimensions without a heat sink. 29 September 1997 –...
  • Page 286: Package Dimensions

    Mechanical Specifications Figure 11–1 Package Dimensions 1.27 mm typ. 1.75 mm (.050 in) (.069 in) 2.54 mm typ. (.100 in) 4X Standoff 413X 1.40 mm (.055 in) 2X 10-32 Stud 22.86 mm (.900 in) .46 mm 6.35 mm (.018 in) (.250 in) 04 06 08 10 12 14 16 18 20 22 24 26 28 30 32 34 36 03 05 07 09 11 13 15 17...
  • Page 287: Signal Descriptions And Pin Assignment

    Signal Descriptions and Pin Assignment 11.2 Signal Descriptions and Pin Assignment This section provides detailed information about the 21164PC pinout. The 21164PC has 413 pins aligned in an interstitial pin grid array (IPGA) design. 11.2.1 Signal Pin Lists Table 11–1 lists the 21164PC signal pins and their corresponding pin grid array (PGA) locations in alphabetic order;...
  • Page 288: Alphabetic Signal Pin List

    Signal Descriptions and Pin Assignment Table 11–1 Alphabetic Signal Pin List (Sheet 2 of 4) Signal Location Signal Location Signal Location data_h<3> data_h<4> data_h<5> data_h<6> data_h<7> data_h<8> data_h<9> data_h<10> data_h<11> data_h<12> data_h<13> data_h<14> data_h<15> data_h<16> data_h<17> data_h<18> data_h<19> data_h<20> data_h<21> data_h<22>...
  • Page 289 Signal Descriptions and Pin Assignment Table 11–1 Alphabetic Signal Pin List (Sheet 3 of 4) Signal Location Signal Location Signal Location data_h<81> data_h<82> data_h<83> data_h<84> data_h<85> data_h<86> data_h<87> data_h<88> data_h<89> data_h<90> data_h<91> data_h<92> data_h<93> data_h<94> data_h<95> data_h<96> data_h<97> data_h<98> data_h<99> data_h<100>...
  • Page 290 Signal Descriptions and Pin Assignment Table 11–1 Alphabetic Signal Pin List (Sheet 4 of 4) Signal Location Signal Location Signal Location index_h<18> index_h<19> index_h<20> index_h<21> int4_valid_h<0> E35 int4_valid_h<1> int4_valid_h<2> H4 int4_valid_h<3> E1 irq_h<0> AJ27 irq_h<1> AL27 irq_h<2> AH26 irq_h<3> AN27 lw_parity_h<0>...
  • Page 291: Voltage Reference, Power, And Ground Pins

    Signal Descriptions and Pin Assignment Table 11–2 lists the voltage reference, power, and ground pins. Table 11–2 Voltage Reference, Power, and Ground Pins Signal PGA Location A3, A5, A7, A33, A35, B2, B8, B14, B24, B30, B36, C1, C37, D12, Metal planes 2, 6 D16, D22, D26, E5, E19, E33, F10, F16, F22, F28, H2, H36, K8, K30, L5, L33, P2, P8, P30, P36, S5, S33, W5, W33, Z2, Z8, Z30,...
  • Page 292: 21164Pc Top View (Pin Down)

    Signal Descriptions and Pin Assignment 11.2.2 Pin Assignment Figure 11–2 shows the 21164PC pinout from the top view with pins facing down. Figure 11–2 21164PC Top View (Pin Down) [Figure: pin grid diagram.]
  • Page 293: 21164Pc Bottom View (Pin Up)

    Signal Descriptions and Pin Assignment Figure 11–3 shows the 21164PC pinout from the bottom view with pins facing up. Figure 11–3 21164PC Bottom View (Pin Up) [Figure: pin grid diagram.]
  • Page 295: 21164Pc Test Port Pins

    Testability and Diagnostics This chapter describes the 21164PC user-oriented testability features. The 21164PC also has several internal testability features that are implemented for factory use only. These features are beyond the scope of this document. 12.1 Test Port Pins Table 12–1 summarizes the test port pins and their functions. Table 12–1 21164PC Test Port Pins Pin Name Type Function...
  • Page 296: Compliance Enable Inputs

    Test Interface 12.2 Test Interface The 21164PC test interface supports a serial ROM interface, a serial diagnostic terminal interface, and an IEEE 1149.1 test access port. These ports are available and set to normal test interface mode when port_mode_h<1:0>=00. Driving these pins to any value other than 00 redefines all other test interface pins and invokes special factory test modes not covered in this document.
  • Page 297: Ieee 1149.1 Test Access Port

    Test Interface Note: DIGITAL recommends that the trst_l pin be driven low (asserted) when the JTAG (IEEE 1149.1) logic is not in use. 2. Coverage of oscillator differential input pins The two differential clock input pins, osc_clk_in_h and osc_clk_in_l, do not have any boundary-scan cells associated with them (noncompliant spec 10.4.1(b) in IEEE 1149.1–1993).
  • Page 298: Tap Controller State Machine

    Test Interface TAP Controller The TAP controller contains a state machine. It interprets IEEE 1149.1 protocols received on signal tms_h and generates appropriate clocks and control signals for the testability features under its jurisdiction. The state machine is shown in Figure 12–2. Figure 12–2 TAP Controller State Machine Test Logic Reset...
  • Page 299 Test Interface During the capture operation, the shift register stage of IR is loaded with the value 00001. This automatic load feature is useful for testing the integrity of the IEEE 1149.1 scan chain on the module. Table 12–3 Instruction Register Selected IR<4:0>...
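The automatic load of 00001 during the capture operation gives a simple scan-chain integrity test: shift the instruction register out and compare. The following is a minimal sketch in C; the function name and the convention that the first bit shifted out is stored in bit 0 are assumptions for illustration. The 5-bit width and the value 00001 come from the text.

```c
#include <stdbool.h>
#include <stdint.h>

#define IR_LENGTH        5      /* IR<4:0> */
#define IR_CAPTURE_VALUE 0x01u  /* 00001, loaded during the capture operation */

/* Hypothetical helper: captured_bits holds the IR_LENGTH bits shifted out
 * of the device, with the first bit shifted out stored in bit 0 (an
 * assumed packing convention). Returns true if the expected capture
 * pattern was observed. */
static bool ir_capture_ok(uint32_t captured_bits)
{
    uint32_t mask = (1u << IR_LENGTH) - 1u;
    return (captured_bits & mask) == IR_CAPTURE_VALUE;
}
```

A stuck-at-0 or stuck-at-1 fault in the chain would return 00000 or 11111, both of which fail this comparison.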
  • Page 300: Test Status Pin

    Boundary-Scan Register 12.2.2 Test Status Pin One test status pin, test_status_h<1>, is used for extracting test status information from the chip. System reset drives the test status pin low. The default operation for test_status_h<1> is to output the IPR-written value. •...
  • Page 301: Boundary-Scan Register Organization

    Boundary-Scan Register Table 12–4 Boundary-Scan Register Organization (Sheet 1 of 3) Control Signal Name Pin Type BSR Count Cell Type Group Remarks addr_h<11:4> 260:253 io_bcell gr_6 — fill_dirty_h in_bcell — — temp_sense — None — Analog pin. test_status_h<1> io_bcell — —...
  • Page 302 Boundary-Scan Register Table 12–4 Boundary-Scan Register Organization (Sheet 2 of 3) Control Signal Name Pin Type BSR Count Cell Type Group Remarks sys_mch_chk_irq_h in_bcell — — mch_hlt_irq_h in_bcell — — pwr_fail_irq_h in_bcell — — irq_h<3:0> 237:234 in_bcell — — SPARE<2> io_bcell —...
  • Page 303 Boundary-Scan Register Table 12–4 Boundary-Scan Register Organization (Sheet 3 of 3) Control Signal Name Pin Type BSR Count Cell Type Group Remarks tag_ram_oe_l io_bcell — — victim_pending_h io_bcell — — TMIS1 Control io_bcell gr_3 — tag_dirty_h io_bcell gr_3 — tag_data_par_h io_bcell gr_3 —...
  • Page 305: Instruction Format And Opcode Notation

    Alpha Instruction Set A.1 Alpha Instruction Summary This appendix contains a summary of all Alpha architecture instructions. All values are in hexadecimal radix. Table A–1 describes the contents of the Format and Opcode columns that are in Table A–2. Table A–1 Instruction Format and Opcode Notation Instruction Format Opcode...
  • Page 306: Architecture Instructions

    Alpha Instruction Summary Qualifiers for operate instructions are shown in Table A–2. Qualifiers for IEEE and VAX floating-point instructions are shown in Tables A–5 and A–6, respectively. Table A–2 Architecture Instructions (Sheet 1 of 8) Mnemonic Format Opcode Description ADDF 15.080 Add F_floating ADDG...
  • Page 307 Alpha Instruction Summary Table A–2 Architecture Instructions (Sheet 2 of 8) Mnemonic Format Opcode Description CMOVE if ≥ zero CMOVGE 11.46 CMOVGT 11.66 CMOVE if > zero CMOVLBC 11.16 CMOVE if low bit clear CMOVLBS 11.14 CMOVE if low bit set CMOVE if ≤...
  • Page 308 Alpha Instruction Summary Table A–2 Architecture Instructions (Sheet 3 of 8) Mnemonic Format Opcode Description CVTGQ 15.0AF Convert G_floating to quadword CVTLQ 17.010 Convert longword to quadword CVTQF 15.0BC Convert quadword to F_floating CVTQG 15.0BE Convert quadword to G_floating CVTQL 17.030 Convert quadword to longword CVTQL/SV...
  • Page 309 Alpha Instruction Summary Table A–2 Architecture Instructions (Sheet 4 of 8) Mnemonic Format Opcode Description Floating branch if ≥ zero FBGE FBGT Floating branch if > zero Floating branch if ≤ zero FBLE FBLT Floating branch if < zero Floating branch if ≠ zero FBNE FCMOVEQ 17.02A...
  • Page 310 Alpha Instruction Summary Table A–2 Architecture Instructions (Sheet 5 of 8) Mnemonic Format Opcode Description LDBU Load zero-extended byte Load F_floating Load G_floating Load sign-extended longword LDL_L Load sign-extended longword locked Load quadword LDQ_L Load quadword locked LDQ_U Load unaligned quadword Load S_floating Load T_floating LDWU...
  • Page 311 Alpha Instruction Summary Table A–2 Architecture Instructions (Sheet 6 of 8) Mnemonic Format Opcode Description MSKWH 12.52 Mask word high MSKWL 12.12 Mask word low MT_FPCR 17.024 Move to floating-point control register MULF 15.082 Multiply F_floating MULG 15.0A2 Multiply G_floating MULL 13.00 Multiply longword...
  • Page 312 Alpha Instruction Summary Table A–2 Architecture Instructions (Sheet 7 of 8) Mnemonic Format Opcode Description S8SUBQ 10.3B Scaled subtract quadword by 8 SEXTB 1C.00 Sign extend byte SEXTW 1C.01 Sign extend word 12.39 Shift left logical 12.3C Shift right arithmetic 12.34 Shift right logical Store byte Store F_floating Store G_floating...
  • Page 313: Opcodes Reserved For Digital

    Alpha Instruction Summary Table A–2 Architecture Instructions (Sheet 8 of 8) Mnemonic Format Opcode Description UMULH 13.30 Unsigned multiply quadword high UNPKBL 1C.35 Unpack bytes to longwords UNPKBW 1C.34 Unpack bytes to words 18.44 Write memory barrier 11.40 Logical difference 12.30 Zero bytes ZAPNOT...
  • Page 314: Opcodes Reserved For Palcode

    IEEE Floating-Point Instructions A.1.2 Opcodes Reserved for PALcode Table A–4 lists the 21164-specific instructions. For more information, refer to Section 6.6. Table A–4 Opcodes Reserved for PALcode 21164 Architecture Mnemonic Opcode Mnemonic Function HW_LD PAL1B Performs Dstream load instructions. HW_ST PAL1F Performs Dstream store instructions.
  • Page 315 IEEE Floating-Point Instructions Table A–5 IEEE Floating-Point Instruction Function Codes (Sheet 2 of 3) Mnemonic None DIVT MULS MULT SUBS SUBT Mnemonic /SUC /SUM /SUD /SUI /SUIC /SUIM /SUID ADDS ADDT CMPTEQ — — — — — — — CMPTLT —...
  • Page 316: Vax Floating-Point Instruction Function Codes

    VAX Floating-Point Instructions Table A–5 IEEE Floating-Point Instruction Function Codes (Sheet 3 of 3) Mnemonic None /SVC /SVI /SVIC CVTTQ Mnemonic /SVD /SVID /SVM /SVIM CVTTQ Note: Because underflow cannot occur for CMPTxx, there is no difference in function or performance between CMPTxx/S and CMPTxx/SU. It is intended that software generate CMPTxx/SU in place of CMPTxx/S.
  • Page 317: A.4 Opcode Summary

    Opcode Summary Table A–6 VAX Floating-Point Instruction Function Codes (Sheet 2 of 2) Mnemonic None /SUC DIVG MULF MULG SUBF SUBG Mnemonic None /SVC CVTGQ A.4 Opcode Summary Table A–7 lists all Alpha opcodes from 00 (CALL_PAL) through 3F (BGT). In the table, the column headings that appear over the instructions have a granularity of 8 (hexadecimal). The rows beneath the Offset column supply the individual hexadecimal number to resolve that granularity.
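The column-and-offset layout described for Table A–7 amounts to splitting the 6-bit opcode into its upper three and lower three bits. The following is a minimal sketch in C; the function names are illustrative assumptions.

```c
/* Column heading in the opcode summary table for a 6-bit opcode:
 * one of 00, 08, 10, 18, 20, 28, 30, 38 (hexadecimal). */
static unsigned opcode_column(unsigned opcode)
{
    return opcode & 0x38u;
}

/* Offset within that column, 0 through 7. */
static unsigned opcode_offset(unsigned opcode)
{
    return opcode & 0x07u;
}
```

For example, opcode 3F (BGT) falls in column 38 at offset 7, and opcode 00 (CALL_PAL) falls in column 00 at offset 0.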
  • Page 318: Opcode Summary

    Opcode Summary The instruction format is listed under the instruction symbol. Table A–7 Opcode Summary Offset 00 PAL* INTA* MISC* BLBC (pal) (mem) (op) (mem) (mem) (mem) (br) (br) LDAH INTL* \PAL\ FBEQ (mem) (op) (mem) (mem) (br) (br) LDBU INTS* JSR* LDL_L...
  • Page 319: Required Palcode Function Codes

    Required PALcode Function Codes A.5 Required PALcode Function Codes The opcodes listed in Table A–8 are required for all Alpha implementations. The notation used is oo.ffff, where oo is the hexadecimal 6-bit opcode and ffff is the hexadecimal 26-bit function code. Table A–8 Required PALcode Function Codes Mnemonic Type...
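The oo.ffff notation maps onto a 32-bit instruction word with the 6-bit opcode in the upper bits and the 26-bit function code in the lower bits. The following is a minimal sketch in C; the function name is an illustrative assumption, and the bit placement (opcode in bits <31:26>, function in bits <25:0>) follows the Alpha PALcode instruction format rather than anything shown in this excerpt.

```c
#include <stdint.h>

/* Builds a PALcode-format instruction word from oo.ffff notation:
 * opcode (6 bits) in bits <31:26>, function code (26 bits) in bits <25:0>. */
static uint32_t encode_pal(uint32_t opcode, uint32_t function)
{
    return ((opcode & 0x3Fu) << 26) | (function & 0x03FFFFFFu);
}
```

Both fields are masked before they are combined, so an out-of-range function code cannot spill into the opcode field.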
  • Page 320 21164PC Microprocessor IEEE Floating-Point Conformance The divide-by-zero trap is always enabled. If the trap occurs, then the destination register is UNPREDICTABLE. For VAX architecture format, this exception is signaled whenever the numerator is valid and the denominator is zero. For IEEE format, this exception is signaled whenever the numerator is valid and nonzero, with a denominator of ±0.
  • Page 321 21164PC Microprocessor IEEE Floating-Point Conformance If a CVTQL results in an integer overflow (IOV), then FPCR<INE> is automatically set. (The INE trap is never signaled to the IDU because there is no CVTQL opcode that enables the inexact trap.) •...
  • Page 323: 21164Pc Microprocessor Specifications

    21164PC Microprocessor Specifications Table B–1 lists specifications for the 21164PC. Table B–1 21164PC Microprocessor Specifications (Sheet 1 of 2) Feature Description Cycle time range 2.50 ns (400 MHz) to 1.87 ns (533 MHz) Process technology 0.35-µm CMOS Transistor count 3.5 million Die size 8.65 ×...
  • Page 324 Table B–1 21164PC Microprocessor Specifications (Sheet 2 of 2) Feature Description Onchip L1 Dcache 8KB, physical, direct-mapped, write-through, 32-byte block, 32-byte fill Onchip L1 Icache 16KB, virtual, direct-mapped, 64-byte block, 32-byte fill, 128 address space numbers (ASNs) (MAX_ASN=127) Onchip data 64-entry, fully associative, not-last-used replacement, 8K pages, translation buffer 128 ASNs (MAX_ASN=127), full granularity hint support...
  • Page 325 Serial Icache Load Predecode Values The following C code calculates the predecode values of a serial Icache load. A software tool called the SROM Packer converts a binary image into a format suitable for Icache serial loading. This tool is available from DIGITAL. #include <stdio.h>...
  • Page 326 int BHTfillmap[8] = { /* BHT vector 0:7 -- BHTfillmap[0:7] */ 201,200,199,198,197,196,195,194 /* 0:7 */ int predfillmap[20] = { /* predecodes 0:19 -- predfillmap[0:19] */ 108,110,112,114,116, /* 0:4 */ 109,111,113,115,117, /* 5:9 */ 120,122,124,126,128, /* 10:14 */ 121,123,125,127,129 /* 15:19 */ int octawpfillmap = /* octaword parity */ 119;...
  • Page 327 tag, predecodes, owparity; int device_size; ** define the ROM size in bits to determine the maximum number of instructions allowed ** define the number of bits per instruction for 21164PC ICache #define ROMSIZE 262144 #define B_PER_INST 64 main(int argc, char *argv[]) int i, j;...
  • Page 328 if (argc>1) strcpy(filename,argv[1]); if (argc>2) strcpy(ofilename,argv[2]); if (argc>3) strcpy(hfilename,argv[3]); if (NULL == (infile = fopen(filename,"rb"))) printf("input file open error: %s\n", filename); exit(0); if (NULL == (outfile = fopen(ofilename, "wb"))) printf("binary output file open error: %s\n", ofilename); exit(0); if (NULL == (hexfile = fopen(hfilename, "w"))) printf("hex output file open error: %s\n", hfilename);...
  • Page 329 build_vector(instr, outvector, &instatus, &instr_count); /* build the vector */ if (instr_count > MAX_INSTR){ printf("\nev5fmt Warning: input file too long.\n"); printf("\tThere are %d instructions in the input file\n", instr_count); printf("\tTruncated after %d instructions\n\n", MAX_INSTR); fprintf(hexfile,":15%04X00",offset); chksum = (offset & 0xff) + (offset >> 8) + 0x15; for (j=0;...
  • Page 330 /* invert bit 2 to match fill scan chain attribute */ owparity ^= eparity(instr[j]); pdparity = eparity(predecodes); /* bhtvector */ for (j=0;j<7;j++) t = BHTfillmap[j]; outvector[t>>5] |= ((bhtvector >> j) & 1) << (t&0x1f); /* instructions */ for (k=0;k<4;k++) for (j=0;j<32;j++) t = dfillmap[j+k*32];...
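The loops above all use the same idiom: bit j of a field is placed at fill-scan-chain position t, where t >> 5 selects the 32-bit word of outvector and t & 0x1f selects the bit within that word. The following is a minimal sketch in C that factors the idiom into one function; the name `scatter_bits` and its signature are illustrative assumptions and do not appear in the listing.

```c
#include <stdint.h>

/* Scatters the low nbits of value into outvector according to fillmap:
 * bit j of value lands at chain position fillmap[j].
 * Position t lives in word t >> 5, bit t & 0x1f.
 * The caller must size outvector to cover the largest position. */
static void scatter_bits(uint32_t value, const int *fillmap, int nbits,
                         uint32_t *outvector)
{
    for (int j = 0; j < nbits; j++) {
        int t = fillmap[j];
        outvector[t >> 5] |= ((value >> j) & 1u) << (t & 0x1f);
    }
}
```

Because the function only ORs bits in, outvector must be zeroed before the first field is scattered into it.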
  • Page 331 /* asm */ outvector[asmfillmap>>5] |= asm << (asmfillmap&0x1f); /* tag */ for (j=0;j<29;j++) t = tagfillmap[j]; outvector[t>>5] |= ((tag >> j) & 1) << (t&0x1f); int eparity(int x) x = x ^ (x >> 16); x = x ^ (x >> 8); x = x ^ (x >>...
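The eparity routine above is cut off in this excerpt, so its remaining steps and return value are not visible. For orientation, the following is a minimal sketch of the standard XOR-fold parity computation that the visible steps begin; the name `parity_fold`, the unsigned argument type, and the return convention (1 for an odd number of set bits) are assumptions and may differ from the original routine.

```c
#include <stdint.h>

/* XOR-fold parity: each step folds the upper half of the remaining bits
 * onto the lower half, so bit 0 ends up as the XOR of all 32 input bits.
 * Returns 1 if x has an odd number of set bits, 0 otherwise. */
static int parity_fold(uint32_t x)
{
    x ^= x >> 16;
    x ^= x >> 8;
    x ^= x >> 4;
    x ^= x >> 2;
    x ^= x >> 1;
    return (int)(x & 1u);
}
```

The fold takes five steps for a 32-bit word, compared with up to 32 iterations for a bit-by-bit loop.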
  • Page 332 int ld; int store; int br; int call_pal; int bsr; int ret_rei; int jmp; int jsr_cor; int jsr; int cond_br; opcode = EXTV(inst, 31, 26 ); func = EXTV(inst, 12, 5); jsr_type = EXTV(inst, 15,14); ra = EXTV(inst,25,21); e0_only = (opcode == 0x24) || /* STF */ (opcode == 0x25) || /* STG */...
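The EXTV calls above pull bit fields out of the 32-bit instruction word; the macro's definition is not shown in this excerpt. The following is a minimal sketch of an equivalent helper in C, assuming the two numeric arguments are the high and low bit positions, inclusive; the name `extract_field` is illustrative.

```c
#include <stdint.h>

/* Returns bits hi..lo (inclusive) of inst, right-justified.
 * Requires 0 <= lo <= hi <= 31. */
static uint32_t extract_field(uint32_t inst, unsigned hi, unsigned lo)
{
    unsigned width = hi - lo + 1;
    /* Avoid the undefined 32-bit shift when the whole word is requested. */
    uint32_t mask = (width >= 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    return (inst >> lo) & mask;
}
```

With this helper, the opcode is extract_field(inst, 31, 26), the Ra field is extract_field(inst, 25, 21), and the jump type is extract_field(inst, 15, 14).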
  • Page 333 (opcode == 0x3B) || /* BLE */ (opcode == 0x3C) || /* BLBS */ (opcode == 0x3D) || /* BNE */ (opcode == 0x3E) || /* BGE */ (opcode == 0x3F) || /* BGT */ (opcode == 0x1A) || /* JMP,JSR,RET,JSR_COROT */ (opcode == 0x1E) || /* HW_REI */ (opcode == 0x00) ||...
  • Page 334 (opcode == 0x23) || /* LDT */ (opcode == 0x1B); /* HW_LD */ store = (opcode == 0x24) || /* STF */ (opcode == 0x25) || /* STG */ (opcode == 0x26) || /* STS */ (opcode == 0x27) || /* STT */ (opcode == 0x0F) || /* STQ_U */...
  • Page 335 out0 = br || bsr || jmp || jsr || (ee && !ld) || (e0_only && !store); out1 = ret_rei ||(e1_only && !br_type)|| jmp ||jsr_cor|| jsr || lnoop || (fadd && !br_type) || fe;; out2 = call_pal || bsr || jsr_cor || e0_only ||jsr ||fmul || fe; out3 = (e1_only &&...
  • Page 337: Document Revision History

    Errata Sheet Table D–1 lists the revision history for this document. Table D–1 Document Revision History Date: September 29, 1997. Revision: Preliminary version, EC-R2W0A-TE.
  • Page 339 Support, Products, and Documentation If you need technical support, a Digital Semiconductor Product Catalog, or help deciding which documentation best meets your needs, visit the Digital Semiconductor World Wide Web Internet site: http://www.digital.com/semiconductor You can also call the Digital Semiconductor Information Line or the Digital Semiconductor Customer Technology Center.
  • Page 340 Digital Semiconductor Products To order the Digital Semiconductor Alpha 21164PC microprocessor, contact your local distributor. The following table lists some of the semiconductor products available from Digital Semiconductor. Note: The following products and order numbers might have been revised. For the latest versions, contact your local distributor.
  • Page 341 Title Order Number SPICE Models for Alpha Microprocessors and Peripheral Chips: An EC–QA4XE–TE Application Note Alpha Microprocessors SROM Mini-Debugger User’s Guide EC–QHUXC–TE Alpha Microprocessors Motherboard Debug Monitor User’s Guide EC–QHUVE–TE Alpha Microprocessors Motherboard Software Design Tools EC–QHUWC–TE User’s Guide To purchase the Alpha AXP Architecture Reference Manual, contact your local distributor or call Butterworth-Heinemann (Digital Press) at 1-800-366-2665.
  • Page 343 Glossary The glossary defines terms and spells out acronyms associated with the Alpha 21164PC microprocessor and chips in general. abort The unit stops the operation it is performing, without saving status, to perform some other operation. Advanced bipolar/CMOS technology. address space number (ASN) An optionally implemented register used to reduce the need for invalidation of cached address translations for process-specific addresses when a context switch occurs.
  • Page 344 assert To cause a signal to change to its logical true state. See asynchronous system trap. asynchronous system trap (AST) A software-simulated interrupt to a user-defined routine. ASTs enable a user process to be notified asynchronously, with respect to that process, of the occurrence of a specific event.
  • Page 345 BiSr Built-in self-repair. BiSt Built-in self-test. Binary digit. The smallest unit of data in a binary notation system, designated as 0 or Bus interface unit. See CBU. block exchange Memory feature that improves bus bandwidth by paralleling a cache victim write- back with a cache miss fill.
  • Page 346 byte Eight contiguous bits starting on an addressable byte boundary. The bits are num- bered right to left, 0 through 7. byte granularity Memory systems are said to have byte granularity if adjacent bytes can be written concurrently and independently by different processes or processors. cache See cache memory.
  • Page 347 The Alpha 21164PC microprocessor contains two onchip internal caches. See also write-through cache and write-back cache.
  • Page 348 CMOS Complementary metal-oxide semiconductor. A silicon device formed by a process that combines PMOS and NMOS semiconductor material. conditional branch instructions Instructions that test a register for positive/negative or for zero/nonzero. They can also test integer registers for even/odd. control and status register (CSR) A device or controller register that resides in the processor’s I/O space.
  • Page 349 direct memory access (DMA) Access to memory by an I/O device that does not require processor intervention. dirty One status item for a cache block. The cache block is valid and has been written so that it may differ from the copy in system main memory. dirty victim Used in reference to a cache block in the cache of a system bus node.
  • Page 350 EPLD Erasable programmable logic device. external cache A cache memory provided outside of the microprocessor chip, usually located on the same module. Also called board-level or module-level cache. FEPROM Flash-erasable programmable read-only memory. FEPROMs can be bank- or bulk- erased. Contrast with EEPROM. Field-effect transistor.
  • Page 351 granularity A characteristic of storage systems that defines the amount of data that can be read and/or written with a single instruction, or read and/or written independently. VAX systems have byte or multibyte granularities, whereas disk systems typically have 512-byte or greater granularities. For a given storage device, a higher granularity generally yields a greater throughput.
  • Page 352 The term INTnn, where nn is one of 2, 4, 8, 16, 32, or 64, refers to a data field size of nn contiguous NATURALLY ALIGNED bytes. For example, INT4 refers to a NATURALLY ALIGNED longword. internal processor register (IPR) One of many registers internal to the Alpha 21164PC microprocessor. IPGA Interstitial pin grid array. JFET Junction field-effect transistor.
  • Page 353 Large-scale integration. machine check An operating system action triggered by certain system hardware-detected errors that can be fatal to system operation. Once triggered, machine check handler software analyzes the error. Miss address file. main memory The large memory, external to the microprocessor, used for holding most instruction code and data.
  • Page 354 module A board on which logic devices (such as transistors, resistors, and memory chips) are mounted and connected to perform a specific system function. module-level cache See external cache. Metal-oxide semiconductor. MOSFET Metal-oxide semiconductor field-effect transistor. Medium-scale integration. Memory address translation unit. The logic unit within the 21164PC microprocessor that performs address translation, interfaces to the Dcache, and performs several other functions.
  • Page 355 NATURALLY ALIGNED data Data stored in memory such that the address of the data is evenly divisible by the size of the data in bytes. For example, an ALIGNED longword is stored such that the address of the longword is evenly divisible by 4. NMOS N-type metal-oxide semiconductor.
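The NATURALLY ALIGNED definition above reduces to a single divisibility test. The following is a minimal sketch in C; the function name is an illustrative assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if a datum of size_bytes stored at addr is naturally aligned,
 * that is, addr is evenly divisible by size_bytes.
 * A size of zero is rejected rather than dividing by zero. */
static bool is_naturally_aligned(uint64_t addr, uint64_t size_bytes)
{
    return size_bytes != 0 && (addr % size_bytes) == 0;
}
```

For example, a longword (4 bytes) at address 0x1000 is naturally aligned, while one at 0x1002 is not.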
  • Page 356 parity A method for checking the accuracy of data by calculating the sum of the number of ones in a piece of binary data. Even parity requires the correct sum to be an even number. Odd parity requires the correct sum to be an odd number. Pin grid array.
  • Page 357 programmable array logic (PAL) A device that can be programmed by a process that blows individual fuses to create a circuit. PROM Programmable read-only memory. pull-down resistor A resistor placed between a signal line and a negative voltage. pull-up resistor A resistor placed between a signal line and a positive voltage.
  • Page 358 reliability The probability a device or system will not fail to perform its intended functions during a specified time interval when operated under stated conditions. reset An action that causes a logic unit to interrupt the task it is performing and go to its initialized state.
  • Page 359 fully associative organization, in which data from anywhere in main memory can be put anywhere in the cache. An “n-way set-associative” cache allows data from a given address in main memory to be cached in any of n locations. SIMM Single inline memory module.
  • Page 360 superpipelined Describes a pipelined machine that has a larger number of pipe stages and more complex scheduling and control. See also pipeline. superscalar Describes a machine architecture that allows multiple independent instructions to be issued in parallel during a given clock cycle. tag The part of a cache block that holds the address information used to determine if a memory operation is a hit or a miss on that cache block.
  • Page 361 UVPROM Ultraviolet (erasable) programmable read-only memory. valid Allocated. Valid cache blocks have been loaded with data and may return cache hits when accessed. victim Used in reference to a cache block in the cache of a system bus node. The cache block is valid but is about to be replaced due to a cache block resource conflict.
  • Page 362 WRITE BLOCK A transaction in which the 21164PC requests that an external logic unit process write data. write data wrapping System feature that reduces apparent memory latency by allowing write data cycles to differ from the usual low-to-high sequence. Requires cooperation between the 21164PC and external hardware.
  • Page 363 Index Associated documentation Abbreviations ASTER register 5-20 register access ASTRR register 5-20 Aborts 2-17 Absolute Maximum Rating ac coupling Bcache 2-13 addr_bus_req_h errors 4-57 description hit under READ MISS example 4-57 operation 4-38 4-45 interface addr_cmd_par_h introduction 4-2 to 4-6 operation 4-45 9-16...
  • Page 364 Commands 21164PC initiated 4-28 BCACHE VICTIM 4-29 Cache coherency 4-13 to 4-16 INVALIDATE 4-40 flush protocol 4-14 4-28 4-40 Cache organization 2-12 READ MISS0 4-29 READ MISS1 4-29 cack_h WRITE BLOCK 4-29 description operation 4-28 4-29 4-31 4-51 4-52 Commands, sending to 21164PC 4-38 4-54 4-57...
  • Page 365 data_bus_req_h DTB_IS register 5-41 description DTB_PTE register 5-32 operation 4-46 4-47 4-48 4-50 9-13 DTB_PTE_TEMP register 5-34 data_h<127:0> DTB_TAG register 5-32 description operation 4-32 4-45 4-46 4-48 9-10 9-13 data_ram_oe_l description Entry-pointer queues 2-34 operation 4-50 9-17 EXC_ADDR register 5-12 data_ram_we_l<3:0>...
  • Page 366 Input clocks Instruction Hardware decode issue Heat sink 10-3 prefetch Hint bits 2-10 Instruction issue 2-17 HWINT_CLR register 5-22 Instruction translation buffer Instructions classes 2-19 issue rules 2-27 IC_FLUSH_CTL register 5-11 latencies 2-23 Icache 2-13 slotting 2-19 2-21 2-12 2-34 ICM register 5-15 int4_valid_h<3:0>...
  • Page 367 IPRs SIRR 5-21 SL_RCV 5-26 accessibility SL_XMIT 5-25 ALT_MODE 5-49 5-36 ASTER 5-20 VA_FORM 5-37 ASTRR 5-20 CBOX_ADDR 5-62 CBOX_CONFIG 5-59 irq_h<3:0> CBOX_CONFIG2 5-65 description 3-11 CBOX_STATUS 5-63 operation 4-60 5-23 9-15 5-50 CC_CTL 5-51 ISR register 5-23 DC_FLUSH 5-49 Issue rules 2-27 DC_MODE...
  • Page 368 Memory regions PAL_BASE register 5-15 physical 4-11 PALcode Merge PALshadow registers 5-68 write buffer 4-12 PALtemp IPRs 5-68 Merging encoding rules 2-29 Pending-request queue 2-34 Microarchitecture 2-2 to 2-13 Performance counters 2-36 MM_STAT register 5-35 Physical address considerations 4-10 2-10 Physical address regions 4-10 address translation...
  • Page 369 SIRR register 5-21 SL_RCV register 5-26 Queues SL_XMIT register 5-25 entry-pointer 2-34 Slotting 2-21 Specifications mechanical 11-1 Race conditions SROM 2-13 21164PC and system 4-51 srom_clk_h Race examples operation 5-25 9-16 9-19 9-20 12-1 idle_bc_h and cack_h 4-54 srom_data_h READ MISS transaction (no Bcache) 4-31 operation 5-26...
  • Page 370 tag_data_par_h Timing diagrams description 3-13 Bcache hit under READ MISS 4-57 operation 4-58 9-17 bus contention 4-45 FILL to private read or write 4-50 tag_dirty_h idle_bc_h and cack_h 4-54 description 3-13 READ MISS with idle_bc_h asserted 4-55 operation 9-16 READ MISS with victim 4-53 tag_ram_oe_l READ MISS with victim abort...
  • Page 371 WRITE BLOCK command 4-29 WRITE BLOCK command acknowledge 4-51 WRITE BLOCK LOCK transaction 4-37 WRITE BLOCK transaction 4-37 Write buffer 2-12 2-33 to 2-36 entry processing 2-35 Write invalidate protocol commands 4-40 Write ordering 2-36 29 September 1997 – Subject to Change Index–9...
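
The glossary entries for NATURALLY ALIGNED data (page 355) and parity (page 356) each describe a simple arithmetic check. The sketch below restates those two definitions in Python purely as an illustration. The function names are hypothetical and do not come from the manual, and the code does not model how the 21164PC generates or checks parity on its pins.

```python
def is_naturally_aligned(address: int, size_bytes: int) -> bool:
    """Return True when the address is evenly divisible by the datum size in bytes."""
    if size_bytes <= 0:
        raise ValueError("size_bytes must be positive")
    return address % size_bytes == 0


def count_ones(value: int) -> int:
    """Count the 1 bits in a non-negative integer."""
    if value < 0:
        raise ValueError("value must be non-negative")
    return bin(value).count("1")


def parity_bit(value: int, odd: bool = False) -> int:
    """Return the extra bit that makes the total number of ones even (or odd)."""
    even_bit = count_ones(value) % 2  # 1 only if the data already has an odd count
    return even_bit ^ 1 if odd else even_bit


def parity_ok(value: int, bit: int, odd: bool = False) -> bool:
    """Check that the data plus its parity bit has the required even/odd sum of ones."""
    total = count_ones(value) + bit
    return total % 2 == (1 if odd else 0)


if __name__ == "__main__":
    # A longword is 4 bytes, so its address must be divisible by 4.
    print(is_naturally_aligned(0x1004, 4))
    print(is_naturally_aligned(0x1006, 4))
    # 0b1011 has three ones, so even parity needs one more.
    print(parity_bit(0b1011), parity_ok(0b1011, 1))
```

Under these assumptions, a longword at address 0x1004 is naturally aligned and one at 0x1006 is not, matching the divisible-by-4 example in the glossary.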
